1 Introduction

Missing data are an inherent phenomenon in sample surveys and call for an appropriate methodology to handle the resulting data sets. For example, in an experiment comparing three new drinks in the market over 30 days, some of the data may accidentally be missing for some days; in an animal experiment, some of the animals may die during the laboratory study, or the technician may accidentally omit some of the results; in a medical investigation, patients may not turn up, may not co-operate or may die before completing the study period. Similarly, in agricultural experiments, the plants may be eaten by animals or washed away by floods, etc. In all such situations, the experiments result in incomplete data that may mislead the inferences. In the case of missing data, several statisticians have shown that inferences or predictions regarding the population parameters may be highly distorted, especially when the respondents and non-respondents differ. Therefore, knowledge of the appropriate pattern of the missingness mechanism is needed at the estimation stage to overcome the missing data problems. Rubin [17] introduced three fundamental concepts describing the missing patterns of the data: missing at random (MAR), observed at random (OAR) and parameter distribution (PD). The combination of MAR and OAR is termed missing completely at random (MCAR). Heitjan and Basu (1996) have differentiated the meanings of MAR and MCAR very systematically. Following these authors, we assume the MCAR mechanism in the present study to deal with the problem of missing data.

Imputation is one of the effective techniques in surveys to compensate for missing data. In various fields, such as energy storage systems providing a peak reduction service to a local electricity network, food composition databases, clinical trials and industrial databases, the imputation of missing observations plays a contributory role in the estimation of population parameters under non-response. A number of imputation methods using the MCAR mechanism have been discussed by authors including Lee et al. [12], Singh and Horn [23], Singh and Deo [22], Kadilar and Cingi [11], Singh [21], Diana and Perri [8], Al-Omari et al. [1], Gira [9] and, more recently, Bhusan and Pandey [4], Prasad [15], Bhusan and Pandey [5], Singh and Suman [24], Singh et al. [25], Bhusan et al. [6] and Singh and Usman [26]. These authors utilize information available on each unit of an auxiliary variable, which is often used in surveys to increase the precision of the estimate of the population mean. Some other related references are Prasad [16], Bouza et al. [3] and Bouza-Herrera and Viada [2].

The aim of the present study is to develop imputation methods and the subsequent estimators with enhanced precision by incorporating the double use of auxiliary information to estimate the population mean, improving upon relevant estimators that are based on the prime/single use of auxiliary information in case of missing data under the MCAR mechanism. For this, we consider the rank (dual) of an auxiliary variable, which may behave like an additional auxiliary variable. To our knowledge, no one has tried this type of work for imputation to handle missing data in estimating the population mean so far. The rest of the paper is organized as follows: In Sect. 2, the methodology and notations are discussed, and some conventional imputation methods are reviewed in Sect. 3. In Sect. 4, we suggest three general classes of estimators using imputation techniques and study their properties. Section 5 presents the theoretical comparisons of the estimators, and the empirical comparisons based on real data sets are presented in Sect. 6. Finally, some concluding remarks are made in Sect. 7.

2 Methodology and Notations

Consider an identifiable population \(U=\{U_{1},U_{2},U_{3},...,U_{N}\}\) of size N where our goal is to estimate the population mean \({\bar{Y}}\) of a study variable y which possesses a reasonable correlation with an auxiliary variable x. Let \((y_{i}, x_{i})\) be the \(i^{th}\) observation of y and x. Suppose that the information on the auxiliary variable x is readily available for each unit of the population. Let \(R_{x}=\{r_{x,1},r_{x,2},...,r_{x,N}\}\) denote the corresponding ranks of \(X=\{x_{1},x_{2},...,x_{N}\}\) in U. Note that the ranks \(R_{x}\) may also hold an adequate amount of correlation with the study variable y. Let a sample of size n be drawn from the population using the simple random sampling without replacement (SRSWOR) technique and surveyed. Unfortunately, a response is observed only on \(r(<n)\) units for the study variable y. For the remaining \((n-r)\) non-responding units, we propose some new imputation methods using the rank of the auxiliary variable, given in Sect. 4. Let A and \({\bar{A}}\) be the sets of responding units and non-responding units, respectively, in the sample. For the sampled units \(i\in {A}\), the values \(y_{i}\) are observed, while for the units \(i\in {\bar{A}}\) some imputation techniques are used.
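For concreteness, this set-up can be sketched in a few lines of code. The following is a minimal sketch on a synthetic population; the data-generating model and all variable names (y, x, rank_x, etc.) are illustrative assumptions of ours, not part of the paper. It constructs the ranks \(R_{x}\), draws an SRSWOR sample and splits it into the responding set A and the non-responding set \({\bar{A}}\) under MCAR.

```python
# Minimal sketch of the set-up of Sect. 2 on a synthetic population;
# the model and all names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
N, n, r = 500, 60, 45                        # population, sample, respondents

x = rng.gamma(shape=4.0, scale=10.0, size=N)         # auxiliary variable x
y = 5.0 + 2.0 * x + rng.normal(0.0, 8.0, size=N)     # study variable y
rank_x = x.argsort().argsort() + 1.0                 # ranks R_x of x within U

sample = rng.choice(N, size=n, replace=False)        # SRSWOR sample of size n
A = rng.choice(sample, size=r, replace=False)        # responding units (MCAR)
A_bar = np.setdiff1d(sample, A)                      # non-responding units

y_bar_r = y[A].mean()                                # response mean of y
x_bar_n, x_bar_r = x[sample].mean(), x[A].mean()     # sample / response means
r_bar_n, r_bar_r = rank_x[sample].mean(), rank_x[A].mean()
```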

We define some useful notations as follows:

  • \({\bar{Y}}=\sum _{i=1}^{N}y_{i}/N\), \({\bar{y}}_{r}=\sum _{i=1}^{r}y_{i}/r\): The population mean and the response mean of study variable y,

  • \({\bar{X}}=\sum _{i=1}^{N}x_{i}/N\), \({\bar{x}}_{n}=\sum _{i=1}^{n}x_{i}/n\), \({\bar{x}}_{r}=\sum _{i=1}^{r}x_{i}/r\): The population mean, sample mean and the response mean of auxiliary variable x,

  • \({\bar{R}}_{x}=\sum _{i=1}^{N}r_{x,i}/N\), \({\bar{r}}_{x(n)}=\sum _{i=1}^{n}r_{x,i}/n\), \({\bar{r}}_{x(r)}=\sum _{i=1}^{r}r_{x,i}/r\): The population mean, sample mean and the response mean of \(R_{x}\),

  • \(S_{y}^{2}=\sum _{i=1}^{N}(y_{i}-{\bar{Y}})^2/(N-1)\): The population variance of y,

  • \(S_{x}^{2}=\sum _{i=1}^{N}(x_{i}-{\bar{X}})^2/(N-1)\): The population variance of x,

  • \(S_{r_{x}}^{2}=\sum _{i=1}^{N}(r_{x,i}-{\bar{R}}_{x})^2/(N-1)\): The population variance of \(R_{x}\),

  • \(C_{y}=S_{y}/{\bar{Y}}\): The coefficient of variation of y,

  • \(C_{x}=S_{x}/{\bar{X}}\): The coefficient of variation of x,

  • \(C_{r_{x}}=S_{r_{x}}/{\bar{R}}_{x}\): The coefficient of variation of \(R_{x}\),

  • \(\rho _{yx}=S_{yx}/S_{y}S_{x}\): The correlation coefficient between y and x.

  • \(\rho _{yr_{x}}=S_{yr_{x}}/S_{y}S_{r_{x}}\): The correlation coefficient between y and \(R_{x}\).

  • \(\rho _{xr_{x}}=S_{xr_{x}}/S_{x}S_{r_{x}}\): The correlation coefficient between x and \(R_{x}\).

To obtain the biases and mean square errors (MSEs) of the proposed estimators, we define the following error transformations:

$$\begin{aligned} \frac{{\bar{y}}_{r}}{{\bar{Y}}}=(1+\xi _{0}),\;\frac{{\bar{x}}_{r}}{{\bar{X}}}=(1+\xi '_{1}),\;\frac{{\bar{x}}_{n}}{{\bar{X}}}=(1+\xi _{1}),\; \frac{{\bar{r}}_{x(r)}}{{\bar{R}}_{x}}=(1+\xi '_{2}),\;\frac{{\bar{r}}_{x(n)}}{{\bar{R}}_{x}}=(1+\xi _{2}), \end{aligned}$$

such that

$$\begin{aligned} E(\xi _{0})= E(\xi '_{1})= E(\xi _{1})= E(\xi '_{2})= E(\xi _{2})=0 \end{aligned}$$

and

$$\begin{aligned} E(\xi ^{2}_{0})= & {} f_{2}C^{2}_{y},\quad E(\xi ^{'2}_{1})=f_{2}C^{2}_{x},\quad E(\xi ^{2}_{1})=f_{1} C^{2}_{x},\quad E(\xi ^{'2}_{2})=f_{2}C^{2}_{r_{x}},\quad \\&E(\xi ^{2}_{2})=f_{1} C^{2}_{r_{x}},\\ E(\xi _{0}\xi '_{1})= & {} f_{2}\rho _{yx}\,C_{y}C_{x},\quad E(\xi _{0}\xi _{1})=f_{1} \rho _{yx}\,C_{y}C_{x},\quad E(\xi _{0}\xi '_{2})=f_{2}\rho _{yr_{x}}\,C_{y}C_{r_{x}},\\ E(\xi _{0}\xi _{2})= & {} f_{1} \rho _{yr_{x}}\, C_{y}C_{r_{x}},\quad E(\xi '_{2}\xi _{2})=f_{1} C^{2}_{r_{x}},\quad E(\xi '_{1}\xi _{1})=f_{1} C^{2}_{x},\\ E(\xi '_{1}\xi '_{2})= & {} f_{2}\rho _{xr_{x}}\,C_{x}C_{r_{x}},\quad E(\xi '_{1}\xi _{2})=f_{1} \rho _{xr_{x}}\,C_{x}C_{r_{x}} \end{aligned}$$

where \(f_{1}=\left( \frac{1}{n}-\frac{1}{N}\right) \) and \(f_{2}=\left( \frac{1}{r}-\frac{1}{N}\right) \). We also write \(f_{3}=\left( \frac{1}{r}-\frac{1}{n}\right) =f_{2}-f_{1}\).
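In code, these factors are one-liners; the sketch below continues the synthetic example above and is only a convenience helper of ours.

```python
# Finite-population factors f1, f2, f3 used throughout the MSE expressions.
def fpc_factors(N: int, n: int, r: int):
    f1 = 1.0 / n - 1.0 / N
    f2 = 1.0 / r - 1.0 / N
    f3 = 1.0 / r - 1.0 / n     # note f3 = f2 - f1
    return f1, f2, f3

f1, f2, f3 = fpc_factors(N, n, r)     # N, n, r from the sketch in Sect. 2
assert abs(f3 - (f2 - f1)) < 1e-12
```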

3 Some Conventional Imputation Methods

In this section, we discuss some customary imputation methods and the corresponding estimators under three different sampling strategies, defined as follows:

Strategy I: When \({\bar{X}}\) and \({\bar{x}}_{n}\) are used.

Strategy II: When \({\bar{X}}\) and \({\bar{x}}_{r}\) are used.

Strategy III: When \({\bar{x}}_{n}\) and \({\bar{x}}_{r}\) are used.

3.1 Mean Imputation Method

The usual mean method of imputation, which does not use auxiliary information, is given by

$$\begin{aligned} y_{i,m}= {\left\{ \begin{array}{ll} y_{i}, &{}\quad { \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. } \end{aligned}$$
(3.1)

The estimator of the population mean \({\bar{Y}}\) under the mean method of imputation is \(t_{m}={\bar{y}}_{r}\), whose variance is given by

$$\begin{aligned} V(t_{m})=V({\bar{y}}_{r})={\bar{Y}}^{2}f_{2}C^{2}_{y} \end{aligned}$$
(3.2)
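As a quick sketch (continuing the running example above), the imputed data in (3.1) reproduce \({\bar{y}}_{r}\) exactly, and the variance (3.2) is a direct plug-in:

```python
# Mean imputation (3.1): the point estimator t_m equals y_bar_r exactly.
y_dot = y[sample].copy()
y_dot[np.isin(sample, A_bar)] = y_bar_r      # impute y_bar_r for i in A-bar
t_m = y_dot.mean()                           # equals y_bar_r

Y_bar = y.mean()
C_y = y.std(ddof=1) / Y_bar
V_t_m = Y_bar**2 * f2 * C_y**2               # theoretical variance (3.2)
```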

3.2 Lee et al. [12] Imputation Methods

When auxiliary information is available, the ratio method of imputation due to Lee et al. [12] can be considered under the three strategies as:

$$\begin{aligned}&y_{\cdot i,R_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{1}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) -r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. }\end{aligned}$$
(3.3)
$$\begin{aligned}&y_{\cdot i,R_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{1}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) -r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. }\end{aligned}$$
(3.4)
$$\begin{aligned}&y_{\cdot i,R_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{1}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) -r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. } \end{aligned}$$
(3.5)

The corresponding ratio type estimators are defined as:

$$\begin{aligned} t_{R_{1}}= & {} {\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) \end{aligned}$$
(3.6)
$$\begin{aligned} t_{R_{2}}= & {} {\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) \end{aligned}$$
(3.7)
$$\begin{aligned} t_{R_{3}}= & {} {\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) \end{aligned}$$
(3.8)

The MSEs of \(t_{R_{i}}(i=1,2,3)\) to the first-order approximation are given by

$$\begin{aligned} MSE(t_{R_{i}})=V({\bar{y}}_{r})+{\bar{Y}}^{2}f_{i} C_{x}(C_{x}-2\rho _{yx} C_{y}) \end{aligned}$$
(3.9)

The ratio estimators \(t_{R_{i}}(i=1,2,3)\) are better than the mean estimator \({\bar{y}}_{r}\) if \((\rho _{yx} C_{y}/C_{x})>1/2\).
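A sketch of the three point estimators (3.6)-(3.8) and their first-order MSEs (3.9), continuing the running example:

```python
# Ratio estimators of Lee et al. [12] and their first-order MSEs (3.9).
X_bar = x.mean()
C_x = x.std(ddof=1) / X_bar
rho_yx = np.corrcoef(y, x)[0, 1]

t_R1 = y_bar_r * X_bar / x_bar_n             # Strategy I
t_R2 = y_bar_r * X_bar / x_bar_r             # Strategy II
t_R3 = y_bar_r * x_bar_n / x_bar_r           # Strategy III

V_ybar_r = Y_bar**2 * f2 * C_y**2
mse_R = [V_ybar_r + Y_bar**2 * f * C_x * (C_x - 2.0 * rho_yx * C_y)
         for f in (f1, f2, f3)]              # i = 1, 2, 3
```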

3.3 Kadilar and Cingi [11] Imputation Methods

The imputation methods under three strategies are given by

$$\begin{aligned}&y_{i,KC_{1}}= {\left\{ \begin{array}{ll} {\left[ \frac{ny_{i}}{r}+b({\bar{X}}-x_{i})\right] }\frac{{\bar{X}}}{{\bar{x}}_{n}}, &{} {\quad \text {if}\;\;\,i \in A } \\ {[}b({\bar{X}}-x_{i})]\frac{{\bar{X}}}{{\bar{x}}_{n}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.10)
$$\begin{aligned}&y_{i,KC_{2}}= {\left\{ \begin{array}{ll} {\left[ \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}\right] }\frac{{\bar{X}}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{X}}}{n-r}\frac{{\bar{X}}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.11)
$$\begin{aligned}&y_{i,KC_{3}}= {\left\{ \begin{array}{ll} {\left[ \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}\right] }\frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{x}}_{n}}{n-r}\frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.12)

The respective estimators are given as:

$$\begin{aligned} t_{KC_{1}}= & {} \frac{{\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{n})}{{\bar{x}}_{n}}{\bar{X}}\end{aligned}$$
(3.13)
$$\begin{aligned} t_{KC_{2}}= & {} \frac{{\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{r})}{{\bar{x}}_{r}}{\bar{X}}\end{aligned}$$
(3.14)
$$\begin{aligned} t_{KC_{3}}= & {} \frac{{\bar{y}}_{r}+b({\bar{x}}_{n}-{\bar{x}}_{r})}{{\bar{x}}_{r}}{\bar{x}}_{n} \end{aligned}$$
(3.15)

where \(b=s_{yx}/s^{2}_{x}\) is the least squares estimated regression coefficient of y on x. Here \(s_{yx}=\sum _{i=1}^{n}(x_{i}-{\bar{x}}_{n})(y_{i}-{\bar{y}}_{n})/(n-1)\) and \(s^{2}_{x}=\sum _{i=1}^{n}(x_{i}-{\bar{x}}_{n})^{2}/(n-1)\).

The MSEs of the estimators \(t_{KC_{i}}(i=1,2,3)\) to the first-order approximation, are respectively, given as:

$$\begin{aligned} MSE(t_{KC_{i}})=V({\bar{y}}_{r})+f_{i}{\bar{Y}}^{2}(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y}) \end{aligned}$$
(3.16)

The estimators \(t_{KC_{i}} (i=1,2,3)\) are better than the mean estimator \({\bar{y}}_{r}\) if \((\rho _{yx} C_{y}/C_{x})>1\).

3.4 Gira [9] Imputation Methods

Along the lines of Gira [9], we consider three ratio type imputation methods to deal with missing data, given by

$$\begin{aligned}&y_{\cdot i,G_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}\left[ n\left( \frac{\nu _{1}-{\bar{x}}_{n}}{\nu _{1}-{\bar{X}}}\right) -r\right] \frac{x_{i}}{\sum _{i\in {\bar{A}}} x_{i}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.17)
$$\begin{aligned}&y_{\cdot i,G_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}\left[ n\left( \frac{\nu _{2}-{\bar{x}}_{r}}{\nu _{2}-{\bar{X}}}\right) -r\right] \frac{x_{i}}{\sum _{i\in {\bar{A}}} x_{i}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.18)
$$\begin{aligned}&y_{\cdot i,G_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}\left[ n\left( \frac{\nu _{3}-{\bar{x}}_{r}}{\nu _{3}-{\bar{x}}_{n}}\right) -r\right] \frac{x_{i}}{\sum _{i\in {\bar{A}}} x_{i}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.19)

where \(\nu _{i}(i=1,2,3)\) are suitably chosen constants.

The point estimators under (3.17), (3.18) and (3.19) are, respectively, given by

$$\begin{aligned}&t_{G_{1}}={\bar{y}}_{r}\left( \frac{\nu _{1}-{\bar{x}}_{n}}{\nu _{1}-{\bar{X}}}\right) \end{aligned}$$
(3.20)
$$\begin{aligned}&t_{G_{2}}={\bar{y}}_{r}\left( \frac{\nu _{2}-{\bar{x}}_{r}}{\nu _{2}-{\bar{X}}}\right) \end{aligned}$$
(3.21)
$$\begin{aligned}&t_{G_{3}}={\bar{y}}_{r}\left( \frac{\nu _{3}-{\bar{x}}_{r}}{\nu _{3}-{\bar{x}}_{n}}\right) \end{aligned}$$
(3.22)

The minimum MSEs of the estimators \(t_{G_{i}}(i=1,2,3)\) are given by

$$\begin{aligned} min.MSE(t_{G_{i}})={\bar{Y}}^{2}C^{2}_{y}(f_{2}-f_{i} \rho ^{2}_{yx}) \end{aligned}$$
(3.23)

The optimum values are given as: \(\nu _{i(opt)}={\bar{X}}\left[ 1+\frac{C_{x}}{\rho _{yx}C_{y}}\right] \).

From (3.2), (3.9), (3.16) and (3.23), it is clear that \(t_{G_{i}}(i=1,2,3)\) are always better than \({\bar{y}}_{r}\), \(t_{R_{i}}\) and \(t_{KC_{i}}\) in the respective strategies. Note that Gira [9] showed both theoretically and empirically that his method is as efficient as the methods propounded by Singh and Horn [23], Singh and Deo [22] and Singh [21].
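In code, the optimum constant and the minimum MSE (3.23) are immediate; a sketch under the running example:

```python
# Gira [9]: optimum constant nu and minimum MSEs (3.23).
nu_opt = X_bar * (1.0 + C_x / (rho_yx * C_y))            # nu_{i(opt)}
t_G1 = y_bar_r * (nu_opt - x_bar_n) / (nu_opt - X_bar)   # Strategy I
min_mse_G = [Y_bar**2 * C_y**2 * (f2 - f * rho_yx**2)
             for f in (f1, f2, f3)]                      # i = 1, 2, 3
```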

3.5 Diana and Perri [8] Estimators

Diana and Perri [8] established three regression-type imputation methods, under which the resultant data take the form:

$$\begin{aligned}&y_{i,DP_{1}}= {\left\{ \begin{array}{ll} \frac{ny_{i}}{r}+b({\bar{X}}-x_{i}), &{} {\quad \text {if}\;\;\,i \in A } \\ b({\bar{X}}-x_{i}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.24)
$$\begin{aligned}&y_{i,DP_{2}}= {\left\{ \begin{array}{ll} \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{X}}}{n-r}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.25)
$$\begin{aligned}&y_{i,DP_{3}}= {\left\{ \begin{array}{ll} \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{x}}_{n}}{n-r}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.26)

The subsequent estimators are, respectively, given as:

$$\begin{aligned} t_{DP_{1}}= & {} {\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{n})\end{aligned}$$
(3.27)
$$\begin{aligned} t_{DP_{2}}= & {} {\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{r})\end{aligned}$$
(3.28)
$$\begin{aligned} t_{DP_{3}}= & {} {\bar{y}}_{r}+b({\bar{x}}_{n}-{\bar{x}}_{r}) \end{aligned}$$
(3.29)

The MSEs of the estimators \(t_{DP_{i}}(i=1,2,3)\) are given as:

$$\begin{aligned} MSE(t_{DP_{i}})=min.MSE(t_{G_{i}})={\bar{Y}}^{2}C^{2}_{y}\left[ f_{2}-f_{i}\rho ^{2}_{yx}\right] \end{aligned}$$
(3.30)
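A sketch of the regression estimators (3.27)-(3.29), with the slope b computed from the responding pairs (an assumption of ours, since only those y-values are observed):

```python
# Diana and Perri [8] regression estimators; b from the responding pairs.
b = np.cov(y[A], x[A], ddof=1)[0, 1] / x[A].var(ddof=1)
t_DP1 = y_bar_r + b * (X_bar - x_bar_n)      # Strategy I
t_DP2 = y_bar_r + b * (X_bar - x_bar_r)      # Strategy II
t_DP3 = y_bar_r + b * (x_bar_n - x_bar_r)    # Strategy III
mse_DP = [Y_bar**2 * C_y**2 * (f2 - f * rho_yx**2) for f in (f1, f2, f3)]
```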

3.6 Bhusan and Pandey [4] Imputation Methods

Bhusan and Pandey [4] proposed three different types of imputation methods, which provide a parallel improvement over those of Diana and Perri [8] and are given as:

$$\begin{aligned}&y_{i,BP_{1}}= {\left\{ \begin{array}{ll} \mu _{1}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \mu _{1}{\bar{y}}_{r}+\frac{n\lambda _{1}}{n-r}({\bar{X}}-{\bar{x}}_{n}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.31)
$$\begin{aligned}&y_{i,BP_{2}}= {\left\{ \begin{array}{ll} \mu _{2}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \mu _{2}{\bar{y}}_{r}+\frac{n\lambda _{2}}{n-r}({\bar{X}}-{\bar{x}}_{r}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.32)
$$\begin{aligned}&y_{i,BP_{3}}= {\left\{ \begin{array}{ll} \mu _{3}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \mu _{3}{\bar{y}}_{r}+\lambda _{3}(x_{i}-{\bar{x}}_{r}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.33)

where \((\mu _{i}, \lambda _{i}) (i=1,2,3)\) are arbitrarily chosen constants.

The corresponding estimators are defined as:

$$\begin{aligned} t_{BP_{1}}= & {} \mu _{1}{\bar{y}}_{r}+\lambda _{1}({\bar{X}}-{\bar{x}}_{n}) \end{aligned}$$
(3.34)
$$\begin{aligned} t_{BP_{2}}= & {} \mu _{2}{\bar{y}}_{r}+\lambda _{2}({\bar{X}}-{\bar{x}}_{r}) \end{aligned}$$
(3.35)
$$\begin{aligned} t_{BP_{3}}= & {} \mu _{3}{\bar{y}}_{r}+\lambda _{3}({\bar{x}}_{n}-{\bar{x}}_{r}) \end{aligned}$$
(3.36)

The minimum MSEs of the estimators \(t_{BP_{i}} (i=1,2,3)\) are, respectively, given as:

$$\begin{aligned} min.MSE(t_{BP_{i}})=\frac{{\bar{Y}}^2 MSE(t_{DP_{i}})}{{\bar{Y}}^2+MSE(t_{DP_{i}})} \end{aligned}$$
(3.37)

for the optimum values

$$\begin{aligned}&\mu _{1(opt)}=\frac{1}{1+[f_{3}+f_{1}(1-\rho ^{2}_{yx})]C^{2}_{y}},\qquad \mathrm{and} \quad \lambda _{1(opt)}=\left( \rho _{yx}\frac{ S_{y}}{S_{x}}\right) \mu _{1(opt)} \end{aligned}$$
(3.38)
$$\begin{aligned}&\mu _{2(opt)}=\frac{1}{1+[f_{2}(1-\rho ^{2}_{yx})]C^{2}_{y}},\qquad \mathrm{and} \quad \lambda _{2(opt)}=\left( \rho _{yx}\frac{ S_{y}}{S_{x}}\right) \mu _{2(opt)} \end{aligned}$$
(3.39)
$$\begin{aligned}&\mu _{3(opt)}=\frac{1}{1+[f_{1}+f_{3}(1-\rho ^{2}_{yx})]C^{2}_{y}},\qquad \; \mathrm{and} \quad \lambda _{3(opt)}=\left( \rho _{yx}\frac{ S_{y}}{S_{x}}\right) \mu _{3(opt)}\nonumber \\ \end{aligned}$$
(3.40)

Bhusan and Pandey [5] have also given an improvement over the usual ratio type imputation methods due to Lee et al. [12], under which the data become

$$\begin{aligned}&y_{i,BP^{*}_{1}}= {\left\{ \begin{array}{ll} \omega _{1}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{\omega _{1}}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) ^{\eta _{1}}-r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.41)
$$\begin{aligned}&y_{i,BP^{*}_{2}}= {\left\{ \begin{array}{ll} \omega _{2}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{\omega _{2}}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) ^{\eta _{2}}-r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.42)
$$\begin{aligned}&y_{i,BP^{*}_{3}}= {\left\{ \begin{array}{ll} \omega _{3}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{\omega _{3}}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) ^{\eta _{3}}-r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.43)

The respective estimators are given by

$$\begin{aligned}&t^{*}_{BP_{1}}=\omega _{1}{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) ^{\eta _{1}} \end{aligned}$$
(3.44)
$$\begin{aligned}&t^{*}_{BP_{2}}=\omega _{2}{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) ^{\eta _{2}} \end{aligned}$$
(3.45)
$$\begin{aligned}&t^{*}_{BP_{3}}=\omega _{3}{\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) ^{\eta _{3}} \end{aligned}$$
(3.46)

where \((\omega _{i}, \eta _{i}) (i=1,2,3)\) are arbitrarily chosen constants.

The minimum MSEs of the estimators \(t^{*}_{BP_{i}}(i=1,2,3)\) are, respectively, given as:

$$\begin{aligned} min.MSE(t^{*}_{BP_{i}})={\bar{Y}}^2\left( 1-\frac{H^{2}_{i}}{G_{i}}\right) \end{aligned}$$
(3.47)

where

$$\begin{aligned} G_{i}= & {} \{1+f_{2}C^{2}_{y}+2\eta _{i(opt)}^{2}f_{i} C^{2}_{x}+\eta _{i(opt)} f_{i} C_{x}(C_{x}-4\rho _{yx}C_{y})\}\\ H_{i}= & {} \{1+\frac{\eta _{i(opt)}^{2}}{2}f_{i} C^{2}_{x}+\frac{\eta _{i(opt)}}{2}f_{i} C_{x}(C_{x}-2\rho _{yx}C_{y})\} \end{aligned}$$

The optimum values are given as: \(\omega _{i(opt)}=\frac{H_{i}}{G_{i}}(i=1,2,3)\) and \(\eta _{i(opt)}=\rho _{yx}\frac{C_{y}}{C_{x}}\).
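A direct transcription of \(G_{i}\), \(H_{i}\) and the minimum MSE (3.47), continuing the running example:

```python
# Bhusan and Pandey [5]: minimum MSEs (3.47) via G_i and H_i.
eta_opt = rho_yx * C_y / C_x                 # eta_{i(opt)}
min_mse_BPs = []
for f in (f1, f2, f3):                       # i = 1, 2, 3
    G = (1 + f2 * C_y**2 + 2 * eta_opt**2 * f * C_x**2
         + eta_opt * f * C_x * (C_x - 4 * rho_yx * C_y))
    H = (1 + 0.5 * eta_opt**2 * f * C_x**2
         + 0.5 * eta_opt * f * C_x * (C_x - 2 * rho_yx * C_y))
    min_mse_BPs.append(Y_bar**2 * (1 - H**2 / G))    # omega_opt = H / G
```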

The above existing imputation methods and the resultant estimators are based on the single use of an auxiliary variable. We propose some imputation techniques based on the dual use of an auxiliary variable in the next section.

4 Suggested Imputation Methods

In this section, we consider the double (rank) use of an auxiliary variable to impute the missing data in three strategies, i.e., Strategy I, Strategy II and Strategy III, which are defined as follows:

Strategy I: When \({\bar{X}}\), \({\bar{x}}_{n}\) and \({\bar{r}}_{x(n)}\) are used.

Strategy II: When \({\bar{X}}\), \({\bar{x}}_{r}\) and \({\bar{r}}_{x(r)}\) are used.

Strategy III: When \(({\bar{x}}_{n}, {\bar{x}}_{r})\) and \(({\bar{r}}_{x(n)}, {\bar{r}}_{x(r)})\) are used.

We suggest three generalized classes of difference-cum-ratio type imputation methods in the three strategies given above, under which the data, respectively, take the forms:

$$\begin{aligned}&y_{\cdot i,P_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha _{1}{\bar{y}}_{r}+\beta _{1}(x^{*}_{1}-x_{i})+\gamma _{1}(r^{*}_{1}-r_{i})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{n}+v}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.1)
$$\begin{aligned}&y_{\cdot i,P_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha _{2}{\bar{y}}_{r}+\beta _{2}(x^{*}_{2}+\frac{n}{r}x_{i}) +\gamma _{2}(r^{*}_{2}+\frac{n}{r}r_{i})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{r}+v}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.2)
$$\begin{aligned}&y_{\cdot i,P_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha _{3}{\bar{y}}_{r}+\beta _{3}(x^{*}_{3}+\frac{n}{r}x_{i}) +\gamma _{3}(r^{*}_{3}+\frac{n}{r}r_{i})\right] \left[ \frac{u{\bar{x}}_{n}+v}{u{\bar{x}}_{r}+v}\right] , &{} {\quad if\;\;\,i \in {\bar{A}}} \end{array}\right. }\nonumber \\ \end{aligned}$$
(4.3)

where

$$\begin{aligned}&x^{*}_{1}=\frac{n{\bar{X}}-r{\bar{x}}_{r}}{n-r},\qquad r^{*}_{1}=\frac{n{\bar{R}}_{x}-r{\bar{r}}_{x(r)}}{n-r},\\&x^{*}_{2}=\frac{n}{n-r}\left[ {\bar{X}}-\frac{n}{r}{\bar{x}}_{n}\right] ,\qquad r^{*}_{2}=\frac{n}{n-r}\left[ {\bar{R}}_{x}-\frac{n}{r}{\bar{r}}_{x(n)}\right] ,\\&x^{*}_{3}=\frac{n}{n-r}\left( 1-\frac{n}{r}\right) {\bar{x}}_{n},\qquad r^{*}_{3}=\frac{n}{n-r}\left( 1-\frac{n}{r}\right) {\bar{r}}_{x(n)}. \end{aligned}$$

The corresponding estimators of population mean \({\bar{Y}}\) of study variable y in case of missing observations are defined as:

$$\begin{aligned}&t_{P_{1}}=\left[ \alpha _{1}{\bar{y}}_{r}+\beta _{1}({\bar{X}}-{\bar{x}}_{n})+\gamma _{1}({\bar{R}}_{x}-{\bar{r}}_{x(n)})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{n}+v}\right] \end{aligned}$$
(4.4)
$$\begin{aligned}&t_{P_{2}}=\left[ \alpha _{2}{\bar{y}}_{r}+\beta _{2}({\bar{X}}-{\bar{x}}_{r})+\gamma _{2}({\bar{R}}_{x}-{\bar{r}}_{x(r)})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{r}+v}\right] \end{aligned}$$
(4.5)
$$\begin{aligned}&t_{P_{3}}=\left[ \alpha _{3}{\bar{y}}_{r}+\beta _{3}({\bar{x}}_{n}-{\bar{x}}_{r})+\gamma _{3}({\bar{r}}_{x(n)}-{\bar{r}}_{x(r)})\right] \left[ \frac{u{\bar{x}}_{n}+v}{u{\bar{x}}_{r}+v}\right] \end{aligned}$$
(4.6)

Here, \((\alpha _{i}, \beta _{i}, \gamma _{i})(i=1,2,3)\) are arbitrarily chosen constants, and \(u(\ne 0)\), v are real numbers or functions of known parameters of the auxiliary variable, which may be readily known or guessed from past surveys, such as the coefficient of variation \(C_{x}\), the correlation coefficient \(\rho _{yx}\), etc.
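To fix ideas, the sketch below evaluates \(t_{P_{1}}\) of (4.4) in the running example for one admissible choice \((u,v)=(1,C_{x})\) and illustrative (not optimal) values of the constants:

```python
# Proposed estimator t_P1 of (4.4) for illustrative constants.
R_bar = rank_x.mean()                        # population mean of the ranks
u, v = 1.0, C_x                              # one admissible choice of (u, v)
psi = u * X_bar / (u * X_bar + v)            # the psi of Theorem 4.1

alpha1, beta1, gamma1 = 1.0, 0.5, 0.1        # illustrative, not the optima
t_P1 = ((alpha1 * y_bar_r
         + beta1 * (X_bar - x_bar_n)
         + gamma1 * (R_bar - r_bar_n))
        * (u * X_bar + v) / (u * x_bar_n + v))
```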

4.1 Some Special Cases of Proposed Estimators

(i) If we set \(\alpha _{i}=1\,(i=1,2,3)\) and \((u,v)=(0,1)\) in (4.1), (4.2) and (4.3), we get the conventional difference type imputation methods based on the dual use of an auxiliary variable, given as:

$$\begin{aligned}&y_{\cdot i,d_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}+\left[ \phi _{1}(x^{*}_{1}-x_{i})+\varphi _{1}(r^{*}_{1}-r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.7)
$$\begin{aligned}&y_{\cdot i,d_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}+\left[ \phi _{2}(x^{*}_{2}+\frac{n}{r}x_{i})+\varphi _{2}(r^{*}_{2}+\frac{n}{r}r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.8)
$$\begin{aligned}&y_{\cdot i,d_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}+\left[ \phi _{3}(x^{*}_{3}+\frac{n}{r}x_{i})+\varphi _{3}(r^{*}_{3}+\frac{n}{r}r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(4.9)

The respective estimators are given by

$$\begin{aligned}&t_{d_{1}}=\left[ {\bar{y}}_{r}+\phi _{1}({\bar{X}}-{\bar{x}}_{n})+\varphi _{1}({\bar{R}}_{x}-{\bar{r}}_{x(n)})\right] \end{aligned}$$
(4.10)
$$\begin{aligned}&t_{d_{2}}=\left[ {\bar{y}}_{r}+\phi _{2}({\bar{X}}-{\bar{x}}_{r})+\varphi _{2}({\bar{R}}_{x}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.11)
$$\begin{aligned}&t_{d_{3}}=\left[ {\bar{y}}_{r}+\phi _{3}({\bar{x}}_{n}-{\bar{x}}_{r})+\varphi _{3}({\bar{r}}_{x(n)}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.12)

(ii) If we set \((u,v)=(0,1)\) in (4.1), (4.2) and (4.3), we get the improved difference type imputation methods given by

$$\begin{aligned}&y_{\cdot i,D_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha ^{*}_{1}{\bar{y}}_{r}+\beta ^{*}_{1}(x^{*}_{1}-x_{i})+\gamma ^{*}_{1}(r^{*}_{1}-r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.13)
$$\begin{aligned}&y_{\cdot i,D_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha ^{*}_{2}{\bar{y}}_{r}+\beta ^{*}_{2}(x^{*}_{2}+\frac{n}{r}x_{i}) +\gamma ^{*}_{2}(r^{*}_{2}+\frac{n}{r}r_{i})\right] , &{} {\quad if\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.14)
$$\begin{aligned}&y_{\cdot i,D_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha ^{*}_{3}{\bar{y}}_{r}+\beta ^{*}_{3}(x^{*}_{3}+\frac{n}{r}x_{i}) +\gamma ^{*}_{3}(r^{*}_{3}+\frac{n}{r}r_{i})\right] , &{} {\quad if\;\;\,i \in {\bar{A}}} \end{array}\right. }\nonumber \\ \end{aligned}$$
(4.15)

The respective estimators are given by

$$\begin{aligned}&t_{D_{1}}=\left[ \alpha ^{*}_{1}{\bar{y}}_{r}+\beta ^{*}_{1}({\bar{X}}-{\bar{x}}_{n})+\gamma ^{*}_{1}({\bar{R}}_{x}-{\bar{r}}_{x(n)})\right] \end{aligned}$$
(4.16)
$$\begin{aligned}&t_{D_{2}}=\left[ \alpha ^{*}_{2}{\bar{y}}_{r}+\beta ^{*}_{2}({\bar{X}}-{\bar{x}}_{r})+\gamma ^{*}_{2}({\bar{R}}_{x}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.17)
$$\begin{aligned}&t_{D_{3}}=\left[ \alpha ^{*}_{3}{\bar{y}}_{r}+\beta ^{*}_{3}({\bar{x}}_{n}-{\bar{x}}_{r})+\gamma ^{*}_{3}({\bar{r}}_{x(n)}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.18)

which provide a parallel improvement over the estimators \(t_{d_{i}}(i=1,2,3)\).

4.2 Properties of Proposed Estimators

Theorem 4.1

The biases and MSEs of the estimators \(t_{P_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} B(t_{P_{i}})={\bar{Y}}\left[ \alpha _{i}F_{i}+R_{1}\beta _{i}G_{i}+R_{2}\gamma _{i}H_{i}-1\right] \end{aligned}$$
(4.19)

and

$$\begin{aligned} \begin{aligned} MSE(t_{P_{i}})=&{\bar{Y}}^{2}[\alpha ^{2}_{i}A'_{i}+\beta ^{2}_{i}B'_{i}+\gamma ^{2}_{i}C'_{i} +2\alpha _{i}\beta _{i}D'_{i}+2\alpha _{i}\gamma _{i}E'_{i}+2\beta _{i}\gamma _{i}F'_{i}\\&-2\alpha _{i}G'_{i}-2\beta _{i}H'_{i}-2\gamma _{i}I'_{i}+1] \end{aligned} \end{aligned}$$
(4.20)

The minimum MSEs of the estimators \(t_{P_{i}}(i=1,2,3)\) are given by

$$\begin{aligned} min.MSE(t_{P_{i}})={\bar{Y}}^{2}\left[ 1-\frac{Q^{2}_{i}}{P_{i}}\right] \end{aligned}$$
(4.21)

at the optimum values of \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\) \((i=1,2,3)\), given by

$$\begin{aligned}&\alpha _{i(opt)}=\frac{E_{i}-\beta _{i(opt)}L_{i}-\gamma _{i(opt)}M_{i}}{P_{i}}, \end{aligned}$$
(4.22)
$$\begin{aligned}&\beta _{i(opt)}=\frac{C^{*}_{i}E^{*}_{i}-B^{*}_{i}D^{*}_{i}}{C^{*}_{i}D^{*}_{i}-A^{*}_{i}E^{*}_{i}}\gamma _{i(opt)} \end{aligned}$$
(4.23)
$$\begin{aligned}&\gamma _{i(opt)}=\frac{C^{*}_{i}D^{*}_{i}-A^{*}_{i}E^{*}_{i}}{D_{i}} \end{aligned}$$
(4.24)

where

$$\begin{aligned}&A'_{i}=[1+f_{2}C^{2}_{y}+f_{i}\{\psi C_{x}(3\psi C_{x}-4\rho _{yx}C_{y})\}],\\&B'_{i}=R^{2}_{1}f_{i} C^{2}_{x},\qquad R_{1}=\frac{{\bar{X}}}{{\bar{Y}}},\\&C'_{i}=R^{2}_{2}f_{i} C^{2}_{r},\qquad R_{2}=\frac{\bar{R_{x}}}{{\bar{Y}}},\\&D'_{i}=R_{1}f_{i} C_{x}(2\psi C_{x}-\rho _{yx}C_{y}),\quad E'_{i}=R_{2}f_{i} C_{r}(2\psi \rho _{xr}C_{x}-\rho _{yr}C_{y}),\\&F'_{i}=R_{1}R_{2}f_{i} \rho _{xr}C_{x}C_{r},\quad G'_{i}=[1+\psi f_{i} C_{x}(\psi C_{x}-\rho _{yx}C_{y})],\\&H'_{i}=R_{1}\psi f_{i} C^{2}_{x},\quad I'_{i}=R_{2}\psi f_{i} \rho _{xr}C_{x}C_{r},\\&F_{i}=1-\psi f_{i}C_{x}(\rho _{yx}C_{y}-\psi C_{x}),\\&G_{i}=\psi f_{i}C^{2}_{x}, \quad H_{i}=\psi f_{i}\rho _{xr_{x}}C_{x}C_{r},\\&Q^{2}_{i}=E^{2}_{i}-(A^{*}_{i}E^{*2}_{i}+B^{*}_{i}D^{*2}_{i}-2C^{*}_{i}D^{*}_{i}E^{*}_{i})/D_{i},\\&P_{i}=1+f_{2}C^{2}_{y}+f_{i}[\psi C_{x}(3\psi C_{x}-4\rho _{yx}C_{y})],\\&D_{i}=C^{*2}_{i}-A^{*}_{i}B^{*}_{i},\\&A^{*}_{i}=R^{2}_{1}f_{i}C^{2}_{x}[(1+f_{2}C^{2}_{y})-f_{i}(\rho ^{2}_{yx}-\psi ^{2} C^{2}_{x})],\\&B^{*}_{i}=R^{2}_{2}f_{i}C^{2}_{r}\left[ (1+f_{2}C^{2}_{y})-f_{i}\{\rho ^{2}_{yr}-\psi ^{2}C^{2}_{x}(3-4\rho ^{2}_{xr})-4\psi C_{y}C_{x}(\rho _{yx}-\rho _{yr}\rho _{xr})\}\right] ,\\&C^{*}_{i}=R_{2} f_{i}\rho _{xr}C_{x}C_{r}[R_{1}(1+f_{2}C^{2}_{y}+3f_{i}C^{2}_{x}-4f_{i}\rho _{yx}C_{y}C_{x})\\&\quad -R_{2}\psi f_{i}C_{r}(2\psi \rho _{xr}C_{x}-\rho _{yr}C_{y})],\\&D^{*}_{i}=R_{1}f_{i}C_{x}[(\psi C_{x}-\rho _{yx}C_{y})(\psi ^{2}f_{i}C^{2}_{x}-1)+\psi f_{i}C^{2}_{y}C_{x}(1-\rho ^{2}_{yx})],\\&E^{*}_{i}=R_{2}f_{i}C_{r}[\rho _{yr}C_{y}-\psi \rho _{xr}C_{x}(1-\psi ^{2}f_{i}C^{2}_{x})\\&\quad +\psi f_{i}C_{y}C_{x}\{C_{y}(\rho _{xr}-\rho _{yx}\rho _{yr})+\psi C_{x}(\rho _{yr}-2\rho _{yx}\rho _{xr})\}],\\&E_{i}=1+\psi f_{i}C_{x}(\psi C_{x}-\rho _{yx}C_{y}),\quad L_{i}=R_{1}f_{i}C_{x}(2\psi C_{x}-\rho _{yx}C_{y}),\\&\text {and}\;\;M_{i}=R_{2}f_{i}C_{r}(2\psi \rho _{xr}C_{x}-\rho _{yr}C_{y}). \end{aligned}$$

Proof

Expressing (4.4), (4.5) and (4.6) in terms of errors, we get

$$\begin{aligned} t_{P_{1}}= & {} \left[ {\bar{Y}}\alpha _{1}(1+\xi _{0})-{\bar{X}}\beta _{1}\xi _{1}-{\bar{R}}_{x}\gamma _{1}\xi _{2}\right] (1+\psi \xi _{1})^{-1} \end{aligned}$$
(4.25)
$$\begin{aligned} t_{P_{2}}= & {} \left[ {\bar{Y}}\alpha _{2}(1+\xi _{0})-{\bar{X}}\beta _{2}\xi '_{1}-{\bar{R}}_{x}\gamma _{2}\xi '_{2}\right] (1+\psi \xi '_{1})^{-1} \end{aligned}$$
(4.26)
$$\begin{aligned} t_{P_{3}}= & {} \left[ {\bar{Y}}\alpha _{3}(1+\xi _{0})-{\bar{X}}\beta _{3}(\xi _{1}-\xi '_{1})-{\bar{R}}_{x}\gamma _{3}(\xi _{2}-\xi '_{2})\right] \nonumber \\&(1+\psi \xi _{1})(1+\psi \xi '_{1})^{-1} \end{aligned}$$
(4.27)

where \(\psi =\frac{u{\bar{X}}}{u{\bar{X}}+v}\). We assume that \(|\psi \xi _{1}|<1\) and \(|\psi \xi '_{1}|<1\), so that \((1+\psi \xi _{1})^{-1}\) and \((1+\psi \xi '_{1})^{-1}\) are expandable. Now, expanding \((1+\psi \xi _{1})^{-1}\) and \((1+\psi \xi '_{1})^{-1}\) binomially, multiplying out the right-hand sides (r.h.s.) of (4.25), (4.26) and (4.27) and neglecting the error terms of power greater than two, we get, respectively:

$$\begin{aligned}&t_{P_{1}}={\bar{Y}}\left[ \alpha _{1}\left( 1+\xi _{0}-\psi \xi _{1}+\psi ^{2}\xi ^{2}_{1}-\psi \xi _{0}\xi _{1}\right) \right. \nonumber \\&\quad \left. -\beta _{1}R_{1}(\xi _{1}-\psi \xi ^{2}_{1}) -\gamma _{1}R_{2}(\xi _{2}-\psi \xi _{1}\xi _{2})\right] \end{aligned}$$
(4.28)
$$\begin{aligned}&t_{P_{2}}={\bar{Y}}\left[ \alpha _{2}\left( 1+\xi _{0}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi \xi _{0}\xi '_{1}\right) \right. \nonumber \\&\quad \left. -\beta _{2}R_{1}(\xi '_{1}-\psi \xi '^{2}_{1}) -\gamma _{2}R_{2}(\xi '_{2}-\psi \xi '_{1}\xi '_{2})\right] \end{aligned}$$
(4.29)
$$\begin{aligned}&t_{P_{3}}={\bar{Y}}\left[ \alpha _{3}\left( 1+\xi _{0}+\psi \xi _{1}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi ^{2}\xi _{1}\xi '_{1}+\psi \xi _{0}\xi _{1}-\psi \xi _{0}\xi '_{1}\right) \right. \nonumber \\&\quad \left. -\beta _{3}R_{1}\{(\xi _{1}-\xi '_{1})+\psi (\xi _{1}-\xi '_{1})^{2}\} -\gamma _{3}R_{2}\{(\xi _{2}-\xi '_{2})+\psi (\xi _{1}-\xi '_{1})(\xi _{2}-\xi '_{2})\}\right] \end{aligned}$$
(4.30)

The biases of the estimators \(t_{P_{i}}(i=1,2,3)\) can be derived as:

$$\begin{aligned}&\begin{aligned} B(t_{P_{1}})=&E\left[ t_{P_{1}}-{\bar{Y}}\right] \\ =&E\left[ {\bar{Y}}\left[ \alpha _{1}\left( 1+\xi _{0}-\psi \xi _{1}+\psi ^{2}\xi ^{2}_{1}-\psi \xi _{0}\xi _{1}\right) -\beta _{1}R_{1}(\xi _{1}-\psi \xi ^{2}_{1})\right. \right. \\&\left. \left. -\gamma _{1}R_{2}(\xi _{2}-\psi \xi _{1}\xi _{2})\right] -{\bar{Y}}\right] \end{aligned} \qquad \end{aligned}$$
(4.31)
$$\begin{aligned}&\begin{aligned} B(t_{P_{2}})=&E\left[ t_{P_{2}}-{\bar{Y}}\right] \\ =&E\left[ {\bar{Y}}\left[ \alpha _{2}\left( 1+\xi _{0}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi \xi _{0}\xi '_{1}\right) -\beta _{2}R_{1}(\xi '_{1}-\psi \xi '^{2}_{1})\right. \right. \\&\left. \left. -\gamma _{2}R_{2}(\xi '_{2}-\psi \xi '_{1}\xi '_{2})\right] -{\bar{Y}}\right] \end{aligned} \qquad \qquad \end{aligned}$$
(4.32)
$$\begin{aligned}&\begin{aligned} B(t_{P_{3}})=&E\left[ t_{P_{3}}-{\bar{Y}}\right] \\ =&E\left[ {\bar{Y}}\left[ \alpha _{3}\left( 1+\xi _{0}+\psi \xi _{1}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi ^{2}\xi _{1}\xi '_{1}+\psi \xi _{0}\xi _{1}-\psi \xi _{0}\xi '_{1}\right) \right. \right. \\&-\beta _{3}R_{1}\{(\xi _{1}-\xi '_{1})+\psi (\xi _{1}-\xi '_{1})^{2}\}\\&\left. \left. -\gamma _{3}R_{2}\{(\xi _{2}-\xi '_{2})+\psi (\xi _{1}-\xi '_{1})(\xi _{2}-\xi '_{2})\}\right] -{\bar{Y}}\right] \end{aligned}\qquad \qquad \end{aligned}$$
(4.33)

Taking expectations of both sides of (4.31)–(4.33), we get the expressions for biases given in (4.19).

The MSEs of the estimators \(t_{P_{i}}(i=1,2,3)\) to the first-order approximation can be derived as:

$$\begin{aligned}&\begin{aligned} M(t_{P_{1}})=&E\left[ t_{P_{1}}-{\bar{Y}}\right] ^{2}\\ =&E\left[ {\bar{Y}}^{2}\left[ \alpha ^{2}_{1}(1+2\xi _{0}-2\psi \xi _{1}+\xi ^{2}_{0}+3\psi ^{2}\xi ^{2}_{1}-4\psi \xi _{0}\xi _{1}) +\beta ^{2}_{1}R^{2}_{1}\xi ^{2}_{1}+\gamma ^{2}_{1}R^{2}_{2}\xi ^{2}_{2}\right. \right. \\&+2\alpha _{1}\beta _{1}R_{1}(2\psi \xi ^{2}_{1}-\xi _{0}\xi _{1})+2\alpha _{1}\gamma _{1}R_{2}(2\psi \xi _{1}\xi _{2}-\xi _{0}\xi _{2}) +2\beta _{1}\gamma _{1}R_{1}R_{2}\xi _{1}\xi _{2}\\&\left. \left. -2\alpha _{1}(1+\psi ^{2}\xi ^{2}_{1}-\psi \xi _{0}\xi _{1})-2\beta _{1}R_{1}\psi \xi ^{2}_{1}-2\gamma _{1}R_{2}\psi \xi _{1}\xi _{2}+1\right] \right] \end{aligned} \end{aligned}$$
(4.34)
$$\begin{aligned}&\begin{aligned} M(t_{P_{2}})=&E\left[ t_{P_{2}}-{\bar{Y}}\right] ^{2}\\ =&E\left[ {\bar{Y}}^{2}\left[ \alpha ^{2}_{2}(1+2\xi _{0}-2\psi \xi '_{1}+\xi ^{2}_{0}+3\psi ^{2}\xi '^{2}_{1}-4\psi \xi _{0}\xi '_{1}) +\beta ^{2}_{2}R^{2}_{1}\xi '^{2}_{1}+\gamma ^{2}_{2}R^{2}_{2}\xi '^{2}_{2}\right. \right. \\&+2\alpha _{2}\beta _{2}R_{1}(2\psi \xi '^{2}_{1}-\xi _{0}\xi '_{1})+2\alpha _{2}\gamma _{2}R_{2}(2\psi \xi '_{1}\xi '_{2}-\xi _{0}\xi '_{2}) +2\beta _{2}\gamma _{2}R_{1}R_{2}\xi '_{1}\xi '_{2}\\&\left. \left. -2\alpha _{2}(1+\psi ^{2}\xi '^{2}_{1}-\psi \xi _{0}\xi '_{1})-2\beta _{2}R_{1}\psi \xi '^{2}_{1}-2\gamma _{2}R_{2}\psi \xi '_{1}\xi '_{2}+1\right] \right] \end{aligned} \end{aligned}$$
(4.35)
$$\begin{aligned}&\begin{aligned} M(t_{P_{3}})=&E\left[ t_{P_{3}}-{\bar{Y}}\right] ^{2}\\ =&E\left[ {\bar{Y}}^{2}\left[ \alpha ^{2}_{3}(1+2\xi _{0}-2\psi \xi '_{1}+2\psi \xi _{1}+\xi ^{2}_{0}+\psi ^{2}\xi ^{2}_{1}+3\psi ^{2}\xi '^{2}_{1}+4\psi \xi _{0}\xi _{1}\right. \right. \\&\quad -4\psi \xi _{0}\xi '_{1}-4\psi ^{2}\xi _{1}\xi '_{1})\\&+\beta ^{2}_{3}R^{2}_{1}(\xi ^{2}_{1}+\xi '^{2}_{1}-2\xi _{1}\xi '_{1}) +\gamma ^{2}_{3}R^{2}_{2}(\xi ^{2}_{2}+\xi '^{2}_{2}-2\xi _{2}\xi '_{2})\\&+2\alpha _{3}\beta _{3}R_{1}(2\psi \xi ^{2}_{1}+2\psi \xi '^{2}_{1}+\xi _{0}\xi _{1}-\xi _{0}\xi '_{1}-4\psi \xi _{1}\xi '_{1})\\&+2\alpha _{3}\gamma _{3}R_{2}(\xi _{0}\xi _{2}-\xi _{0}\xi '_{2}+2\psi \xi _{1}\xi _{2}-4\psi \xi '_{1}\xi _{2}+2\psi \xi '_{1}\xi '_{2})\\&+2\beta _{3}\gamma _{3}R_{1}R_{2}(\xi _{1}\xi _{2}-\xi _{1}\xi '_{2}-\xi '_{1}\xi _{2}+\xi '_{1}\xi '_{2})\\&-2\alpha _{3}(1+\psi ^{2}\xi '^{2}_{1}+\psi \xi _{0}\xi _{1}-\psi \xi _{0}\xi '_{1}-\psi ^{2}\xi _{1}\xi '_{1})\\&-2\beta _{3}R_{1}(\psi \xi ^{2}_{1}+\psi \xi '^{2}_{1}-2\psi \xi _{1}\xi '_{1})\\&\left. \left. -2\gamma _{3}R_{2}(\psi \xi _{1}\xi _{2}-\psi \xi _{1}\xi '_{2}-\psi \xi '_{1}\xi _{2}+\psi \xi '_{1}\xi '_{2})+1\right] \right] \end{aligned} \end{aligned}$$
(4.36)

Now, taking the expectations of both sides of (4.34)–(4.36), we get the expressions for MSEs given in (4.20).

To obtain the optimum choices of \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\), we differentiate the expressions of the MSEs given in (4.20) partially with respect to \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\) and equate them to zero, which gives

$$\begin{aligned}&\frac{\partial }{\partial \alpha _{i}} MSE(t_{P_{i}}) =\alpha _{i}A'_{i}+\beta _{i}D'_{i}+\gamma _{i}E'_{i}-G'_{i}=0 \end{aligned}$$
(4.37)
$$\begin{aligned}&\frac{\partial }{\partial \beta _{i}}MSE(t_{P_{i}}) =\alpha _{i}D'_{i}+\beta _{i}B'_{i}+\gamma _{i}F'_{i}-H'_{i}=0 \end{aligned}$$
(4.38)
$$\begin{aligned}&\begin{aligned} \frac{\partial }{\partial \gamma _{i}}MSE(t_{P_{i}}) =\alpha _{i}E'_{i}+\beta _{i}F'_{i}+\gamma _{i}C'_{i}-I'_{i}=0 \end{aligned} \end{aligned}$$
(4.39)

Solving Eqs. (4.37)–(4.39) for \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\), we get the optimum values given in (4.22)–(4.24). By putting these optimum values in (4.20), we get the expressions for minimum MSEs of \(t_{P_{i}}\) given in (4.21). \(\square \)

Corollary 4.1

The MSEs of the unbiased estimators \(t_{d_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} MSE(t_{d_{i}})= & {} {\bar{Y}}^{2}\left[ f_{2}C^{2}_{y}+\phi ^{2}_{i}R^{2}_{1}f_{i}C^{2}_{x}+\varphi ^{2}_{i}R^{2}_{2}f_{i}C^{2}_{r}+2\phi _{i}\varphi _{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}\right. \nonumber \\&\left. -2\phi _{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x}-2\varphi _{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}\right] \end{aligned}$$
(4.40)

The minimum MSEs of the estimators \(t_{d_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} min.MSE(t_{d_{i}})={\bar{Y}}^{2}C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}}) \end{aligned}$$
(4.41)

for the optimum values

$$\begin{aligned}&\phi _{i(opt)}=\frac{C_{y}}{C_{x}}\frac{(\rho _{yx}-\rho _{yr_{x}}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.42)
$$\begin{aligned}&\varphi _{i(opt)}=\frac{C_{y}}{C_{r}}\frac{(\rho _{yr_{x}}-\rho _{yx}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.43)

where \(R^{2}_{y.xr_{x}}=\left( \frac{\rho ^{2}_{yx}+\rho ^{2}_{yr_{x}}-2\rho _{yx}\rho _{yr_{x}}\rho _{xr_{x}}}{1-\rho ^{2}_{xr_{x}}}\right) \).

Proof

By putting \(\alpha _{i}=1\) and \(\psi =0\) in (4.20), and writing \(\phi _{i}\), \(\varphi _{i}\) for \(\beta _{i}\), \(\gamma _{i}\), we get the expressions for the MSEs of the estimators \(t_{d_{i}}\) given in (4.40).

To obtain the minimum MSEs of \(t_{d_{i}}\), we differentiate the expressions of the MSEs given in (4.40) partially with respect to \(\phi _{i}\) and \(\varphi _{i}\) and equate them to zero, which gives

$$\begin{aligned} \frac{\partial }{\partial \phi _{i}}MSE(t_{d_{i}})= & {} \phi _{i}R^{2}_{1}f_{i}C^{2}_{x}\nonumber \\&+\varphi _{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r} -R_{1}f_{i}\rho _{yx}C_{y}C_{x}=0 \end{aligned}$$
(4.44)
$$\begin{aligned} \frac{\partial }{\partial \varphi _{i}}MSE(t_{d_{i}})= & {} \phi _{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}\nonumber \\&+\varphi _{i}R^{2}_{2}f_{i}C^{2}_{r} -R_{2}f_{i}\rho _{yr}C_{y}C_{r}=0 \end{aligned}$$
(4.45)

Solving Eqs. (4.44) and (4.45) for \(\phi _{i}\) and \(\varphi _{i}\), we get the optimum values given in (4.42) and (4.43). By putting these optimum values in (4.40), we get the minimum MSE of \(t_{d_{i}}\) given in (4.41). \(\square \)
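The optima (4.42)-(4.43), the multiple correlation coefficient \(R^{2}_{y.xr_{x}}\) and the minimum MSEs (4.41) translate directly into code; a sketch with population-level quantities of the running example:

```python
# Corollary 4.1: optima and minimum MSEs of the estimators t_d_i.
rho_yr = np.corrcoef(y, rank_x)[0, 1]
rho_xr = np.corrcoef(x, rank_x)[0, 1]
C_r = rank_x.std(ddof=1) / rank_x.mean()

R2_yxr = ((rho_yx**2 + rho_yr**2 - 2 * rho_yx * rho_yr * rho_xr)
          / (1 - rho_xr**2))                 # multiple correlation R^2_{y.xr}
phi_opt = (C_y / C_x) * (rho_yx - rho_yr * rho_xr) / (1 - rho_xr**2)
varphi_opt = (C_y / C_r) * (rho_yr - rho_yx * rho_xr) / (1 - rho_xr**2)
min_mse_d = [Y_bar**2 * C_y**2 * (f2 - f * R2_yxr) for f in (f1, f2, f3)]
```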

Corollary 4.2

The biases and MSEs of the estimators \(t_{D_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} B(t_{D_{i}})={\bar{Y}}(\alpha ^{*}_{i}-1) \end{aligned}$$
(4.46)

and

$$\begin{aligned} \begin{aligned}&MSE(t_{D_{i}})\\&\quad ={\bar{Y}}^{2}\left[ \alpha ^{*2}_{i}(1+f_{2}C^{2}_{y})+\beta ^{*2}_{i}R^{2}_{1}f_{i}C^{2}_{x}+\gamma ^{*2}_{i}R^{2}_{2}f_{i}C^{2}_{r}-2\alpha ^{*}_{i}\beta ^{*}_{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x}\right. \\&\qquad \left. -2\alpha ^{*}_{i}\gamma ^{*}_{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}+2\beta ^{*}_{i}\gamma ^{*}_{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}-2\alpha ^{*}_{i}\right] \end{aligned}\nonumber \\ \end{aligned}$$
(4.47)

The minimum MSEs of the estimators \(t_{D_{i}}\) are given by

$$\begin{aligned} min.MSE(t_{D_{i}})=\frac{{\bar{Y}}^{2}min.MSE(t_{d_{i}})}{{\bar{Y}}^{2}+min.MSE(t_{d_{i}})} \end{aligned}$$
(4.48)

for the optimum values

$$\begin{aligned}&\alpha ^{*}_{i(opt)}=\frac{1}{1+C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})} \end{aligned}$$
(4.49)
$$\begin{aligned}&\beta ^{*}_{i(opt)}=\alpha ^{*}_{i(opt)}\frac{C_{y}}{C_{x}}\frac{(\rho _{yx}-\rho _{yr_{x}}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.50)
$$\begin{aligned}&\gamma ^{*}_{i(opt)}=\alpha ^{*}_{i(opt)}\frac{C_{y}}{C_{r}}\frac{(\rho _{yr_{x}}-\rho _{yx}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.51)

Proof

By putting \(\psi =0\) in (4.19) and (4.20), we get the expressions for biases and MSEs of the estimators \(t_{D_{i}}\) given in (4.46) and (4.47).

To obtain the minimum MSEs of \(t_{D_{i}}\), we differentiate the expressions of the MSEs given in (4.47) partially with respect to \(\alpha ^{*}_{i}\), \(\beta ^{*}_{i}\) and \(\gamma ^{*}_{i}\) and equate them to zero, which gives

$$\begin{aligned}&\frac{\partial }{\partial \alpha ^{*}_{i}}MSE(t_{D_{i}}) =\alpha ^{*}_{i}(1+f_{2}C^{2}_{y})-\beta ^{*}_{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x} -\gamma ^{*}_{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}-1=0 \end{aligned}$$
(4.52)
$$\begin{aligned}&\frac{\partial }{\partial \beta ^{*}_{i}}MSE(t_{D_{i}}) =\alpha ^{*}_{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x}-\beta ^{*}_{i}R^{2}_{1}f_{i}C^{2}_{x} -\gamma ^{*}_{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}=0 \end{aligned}$$
(4.53)
$$\begin{aligned}&\frac{\partial }{\partial \gamma ^{*}_{i}}MSE(t_{D_{i}}) =\alpha ^{*}_{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}-\beta ^{*}_{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}-\gamma ^{*}_{i}R^{2}_{2}f_{i}C^{2}_{r}=0\nonumber \\ \end{aligned}$$
(4.54)

Solving Equations (4.52)–(4.54) for \(\alpha ^{*}_{i}\), \(\beta ^{*}_{i}\) and \(\gamma ^{*}_{i}\), we get the optimum values given in (4.49)–(4.51). By putting these optimum values in (4.47), we get the minimum MSE of \(t_{D_{i}}\) given in (4.48). \(\square \)
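The shrinkage relation (4.48) between the minimum MSEs of \(t_{D_{i}}\) and \(t_{d_{i}}\) is a one-liner; this continues the sketch above:

```python
# Corollary 4.2: min MSE(t_D_i) shrinks min MSE(t_d_i) as in eq. (4.48).
min_mse_D = [Y_bar**2 * m / (Y_bar**2 + m) for m in min_mse_d]
```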

4.3 Practicability of the Suggested Estimators

The suggested imputation methods \(t_{P_{i}}(i=1,2,3)\) are designed using the scalars \((\alpha _{i}, \beta _{i}, \gamma _{i})\). Therefore, we have to choose appropriate values for these scalars in order to estimate the population mean. We have seen in (4.22), (4.23) and (4.24) that the optimum values of \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\) depend on the parameters \({\bar{Y}}\), \({\bar{X}}\), \(\rho _{yx}\), \(C_{y}\), \(C_{x}\), etc., which may not always be available. In such situations, they can be estimated using a pilot survey or guessed from a past survey and subsequently employed for the estimation of the population mean, as sketched below. Similarly, the optimum values of the scalars used in the estimators \(t_{d_{i}}\) and \(t_{D_{i}}\) can be obtained at the estimation stage.
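A minimal plug-in sketch, assuming MCAR so that statistics of the responding units consistently estimate the corresponding population parameters:

```python
# Plug-in estimates of the unknown parameters from the responding units.
C_y_hat = y[A].std(ddof=1) / y[A].mean()
C_x_hat = x[A].std(ddof=1) / x[A].mean()
rho_yx_hat = np.corrcoef(y[A], x[A])[0, 1]
rho_yr_hat = np.corrcoef(y[A], rank_x[A])[0, 1]
rho_xr_hat = np.corrcoef(x[A], rank_x[A])[0, 1]
# These estimates replace the unknown parameters in (4.22)-(4.24) at the
# estimation stage; under MCAR they are consistent.
```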

4.4 Some Other Members of the Proposed Classes of Estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\)

Many estimators can be generated from the \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\) families of estimators by choosing various values of u and v in (4.4), (4.5) and (4.6). Some of them, in Strategy I, Strategy II and Strategy III, respectively, are given in Table 1.

Table 1 Some members of proposed family of estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\)

The respective imputations for the data can be formed simply by substituting the suitable values of u and v in (4.1), (4.2) and (4.3). The biases and minimum MSEs of the estimators \(t^{(j)}_{P_{i}}(i=1,2,3\,\mathrm{and}\,j=1,2,...,10)\) can easily be obtained by putting the suitable values in (4.19) and (4.20).

5 Theoretical Comparisons

In this section, we compare the proposed estimators with the existing estimators discussed above on the basis of their theoretical results.

5.1 Parallel Comparisons

5.1.1 Comparisons of \(t_{P_{i}}(i=1,2,3)\) with Other Existing Estimators

From (3.2), (3.9), (3.16), (3.23), (3.30), (3.37), (3.47) and (4.20), we get

(i) \(min.MSE(t_{P_{i}})<MSE({\bar{y}}_{r})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+f_{2}C^{2}_{y}>1 \end{aligned}$$
(5.1)

(ii) \(min.MSE(t_{P_{i}})<MSE(t_{R_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+[f_{2}C^{2}_{y}+f_{i}C_{x}(C_{x}-2\rho _{yx}C_{y})]>1 \end{aligned}$$
(5.2)

(iii) \(min.MSE(t_{P_{i}})<MSE(t_{KC_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+[f_{2}C^{2}_{y}+f_{i}(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y})]>1 \end{aligned}$$
(5.3)

(iv) \(min.MSE(t_{P_{i}})<min.MSE(t_{G_{i}})\,\mathrm{or}\,MSE(t_{DP_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+C^{2}_{y}(f_{2}-f_{i}\rho ^{2}_{yx})>1 \end{aligned}$$
(5.4)

(v) \(min.MSE(t_{P_{i}})<min.MSE(t_{BP_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+\frac{1}{1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}\rho ^{2}_{yx})}}>1 \end{aligned}$$
(5.5)

(vi) \(min.MSE(t_{P_{i}})<min.MSE(t^{*}_{BP_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+\left( 1-\frac{H^{2}_{i}}{G_{i}}\right) >1 \end{aligned}$$
(5.6)

5.1.2 Comparisons of \(t_{d_{i}}(i=1,2,3)\) with Other Existing Estimators

From (3.2), (3.9), (3.16), (3.23), (3.30), (3.37), (3.47) and (4.41), we get

(i) \(min.MSE(t_{d_{i}})<MSE({\bar{y}}_{r})\), if

$$\begin{aligned} R^{2}_{y.xr_{x}}>0 \end{aligned}$$
(5.7)

(ii) \(min.MSE(t_{d_{i}})<MSE(t_{R_{i}})\), if

$$\begin{aligned} R^{2}_{y.xr_{x}}+C_{x}(C_{x}-2\rho _{yx}C_{y})>0 \end{aligned}$$
(5.8)

(iii) \(min.MSE(t_{d_{i}})<MSE(t_{KC_{i}})\), if

$$\begin{aligned} R^{2}_{y.xr_{x}}+(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y})>0 \end{aligned}$$
(5.9)

(iv) \(min.MSE(t_{d_{i}})<min.MSE(t_{G_{i}})\,\mathrm{or}\,MSE(t_{DP_{i}})\), if

$$\begin{aligned} (\rho _{yr_{x}}-\rho _{yx}\rho _{xr_{x}})^{2}>0 \end{aligned}$$
(5.10)

(v) \(min.MSE(t_{d_{i}})<min.MSE(t_{BP_{i}})\), if

$$\begin{aligned} \left( \frac{1}{f_{2}-f_{i}R^{2}_{y.xr_{x}}}-\frac{1}{f_{2}-f_{i}\rho ^{2}_{yx}}\right) -C^{2}_{y}>0 \end{aligned}$$
(5.11)

(vi) \(min.MSE(t_{d_{i}})<min.MSE(t^{*}_{BP_{i}})\), if

$$\begin{aligned} \left( 1-\frac{H^{2}_{i}}{G_{i}}\right) -C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})>0 \end{aligned}$$
(5.12)

From (5.10), it is clear that the estimators \(t_{d_{i}}(i=1,2,3)\) are always better than \(t_{DP_{i}}\), which contradicts the statement of Diana and Perri [8] that "Using the same amount of auxiliary information, no further improvement upon the regression estimator is possible, at least if the first order approximation is considered". That statement does not hold here because \(t_{d_{i}}\) additionally exploits the ranks of the auxiliary variable, which carry information beyond the prime auxiliary values.

5.1.3 Comparisons of \(t_{D_{i}}(i=1,2,3)\) with Other Existing Estimators

From (3.2), (3.9), (3.16), (3.23), (3.30), (3.37), (3.47) and (4.48), we get

(i) \(min.MSE(t_{D_{i}})<MSE({\bar{y}}_{r})\), if

$$\begin{aligned} f_{2}C^{2}_{y}\left( 1-\frac{f_{2}}{f_{i}R^{2}_{y.xr_{x}}}\right) >1 \end{aligned}$$
(5.13)

(ii) \(min.MSE(t_{D_{i}})<MSE(t_{R_{i}})\), if

$$\begin{aligned} \left[ 1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] [f_{2}C^{2}_{y}+f_{i}C_{x}(C_{x}-2\rho _{yx}C_{y})]>1 \end{aligned}$$
(5.14)

(iii) \(min.MSE(t_{D_{i}})<MSE(t_{KC_{i}})\), if

$$\begin{aligned} \left[ 1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] [f_{2}C^{2}_{y}+f_{i}(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y})]>1 \end{aligned}$$
(5.15)

(iv) \(min.MSE(t_{D_{i}})<min.MSE(t_{G_{i}})\,\mathrm{or}\,MSE(t_{DP_{i}})\), if

$$\begin{aligned} \left[ C^{2}_{y}+\frac{1}{(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] (f_{2}-f_{i}\rho ^{2}_{yx})>1 \end{aligned}$$
(5.16)

(v) \(min.MSE(t_{D_{i}})<min.MSE(t_{BP_{i}})\), if

$$\begin{aligned} \frac{R^{2}_{y.xr_{x}}}{\rho ^{2}_{yx}}>1 \end{aligned}$$
(5.17)

(vi) \(min.MSE(t_{D_{i}})<min.MSE(t^{*}_{BP_{i}})\), if

$$\begin{aligned} \left[ 1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] \left( 1-\frac{H^{2}_{i}}{G_{i}}\right) >1 \end{aligned}$$
(5.18)

Thus, the estimators \(t_{P_{i}}(i=1,2,3)\), \(t_{d_{i}}\) and \(t_{D_{i}}\) are better than the traditional estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\) and \(t^{*}_{BP_{i}}\) in parallel if the conditions (5.1)–(5.6), (5.7)–(5.12) and (5.13)–(5.18) are, respectively, satisfied.

5.2 Mutual Comparisons of the Proposed Estimators

Here, we discuss the mutual comparisons of the estimators \(t_{P_{i}}(i=1,2,3)\), \(t_{d_{i}}\) and \(t_{D_{i}}\).

(i) \(min.MSE(t_{\bullet _{1}})<min.MSE(t_{\bullet _{3}})\), if

$$\begin{aligned} r>\left( \frac{nN}{2N-n}\right) \end{aligned}$$
(5.19)

(ii) \(min.MSE(t_{\bullet _{2}})<min.MSE(t_{\bullet _{1}})\), if

$$\begin{aligned} r<n \end{aligned}$$
(5.20)

(iii) \(min.MSE(t_{\bullet _{2}})<min.MSE(t_{\bullet _{3}})\), if

$$\begin{aligned} n<N \end{aligned}$$
(5.21)

This means that, within the respective sets of estimators, the second estimators \(t_{P_{2}}\), \(t_{d_{2}}\) and \(t_{D_{2}}\) are always better than the first estimators \(t_{P_{1}}\), \(t_{d_{1}}\) and \(t_{D_{1}}\) and the third estimators \(t_{P_{3}}\), \(t_{d_{3}}\) and \(t_{D_{3}}\), whereas the first estimators \(t_{P_{1}}\), \(t_{d_{1}}\) and \(t_{D_{1}}\) are better than the estimators \(t_{P_{3}}\), \(t_{d_{3}}\) and \(t_{D_{3}}\), respectively, if condition (5.19) holds; see the sketch below. Similar conditions also hold for the other existing estimators discussed above in the respective strategies.
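Condition (5.19) is easy to check numerically; a tiny sketch (the values used below are the design sizes of Data set-1 in Sect. 6):

```python
# Check of the mutual-comparison condition (5.19): strategy-wise dominance.
def strategy1_beats_strategy3(N: int, n: int, r: int) -> bool:
    return r > n * N / (2 * N - n)           # condition (5.19)

print(strategy1_beats_strategy3(N=60, n=20, r=16))   # Data set-1: True
```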

6 Empirical Comparisons and Computations

To judge the merits of the proposed classes of estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\) over the other considered estimators in the respective strategies, we have chosen 10 real data sets, whose parametric details are given as follows:

Data set-1: [13, p-428]: The data are on capital expenditures (y) and appropriations (x) for the years 1953-1967 on a quarterly basis. These data are from the National Industrial Conference Board. The description of the parameters for this data set is: \(N=60\), \(n=20\), \(r=16\), \({\bar{Y}}=3092.417\), \({\bar{X}}=3319.483\), \(C_{y}=0.3725059\), \(C_{x}=0.4159578\), \(C_{r}=0.5725904\), \(\rho _{yx}=0.8832073\), \(\rho _{yr_{x}}=0.7818964\), \(\rho _{xr_{x}}=0.9592037\).

Data set-2: [13, p-108]: The data present the experience and salary structure of University of Michigan economists in 1983-1984. Let y be the salary (thousands of dollars) and x be the years of experience (defined as years since receiving the Ph.D.). The description of the parameters for these data is: \(N=32\), \(n=12\), \(r=8\), \({\bar{Y}}=47.37812\), \({\bar{X}}=18.375\), \(C_{y}=0.1819515\), \(C_{x}=0.4548528\), \(C_{r}=0.5677532\), \(\rho _{yx}=0.4245114\), \(\rho _{yr_{x}}=0.3368753\), \(\rho _{xr_{x}}=0.9447145\).

Data set-3: [13, p-41]: The data are on the weekly cash inflows (x) and outflows (y) of a business firm for 30 weeks. The description of the parameters for these data is: \(N=30\), \(n=12\), \(r=8\), \({\bar{Y}}=51.73333\), \({\bar{X}}=62.93333\), \(C_{y}=0.4261637\), \(C_{x}=0.3361672\), \(C_{r}=0.5676459\), \(\rho _{yx}=-0.009132783\), \(\rho _{yr_{x}}=-0.02862014\), \(\rho _{xr_{x}}=0.9927513\).

Data set-4: [19, p-108]: A list of 70 villages in a Tehsil of India, along with their population in 1981 and cultivated area (in acres) in the same year, is taken into consideration. Let y be the cultivated area (in acres) and x be the population of the village. The description of the parameters for these data is: \(N=70\), \(n=20\), \(r=15\), \({\bar{Y}}=981.2857\), \({\bar{X}}=1755.529\), \(C_{y}=0.625359\), \(C_{x}=0.8009741\), \(C_{r}=0.57327\), \(\rho _{yx}=0.7779\), \(\rho _{yr_{x}}=0.7588\), \(\rho _{xr_{x}}= 0.8497\).

Data set-5: [18]: Let y be the number of successful students and x be the number of teachers, taken from survey data on 923 districts of Turkey in 2007. The description of the parameters for these data is: \(N=261\), \(n=90\), \(r=70\), \({\bar{Y}}=222.5824\), \({\bar{X}}=306.4483\), \(C_{y}=1.8654\), \(C_{x}=1.7595\), \(C_{r}=0.57623\), \(\rho _{yx}=0.9705\), \(\rho _{yr_{x}}=0.6371\), \(\rho _{xr_{x}}=0.6265\).

Data set-6: [20, p-1111]: Let y be the amount (in $000) of real estate farm loans and x be the amount (in $000) of non-real estate farm loans in different states of the USA during 1997. The details of the parameters for this data set are: \(N=50\), \(n=20\), \(r=8\), \({\bar{Y}}= 878.1626\), \({\bar{X}}=555.4345\), \(C_{y}=1.235167\), \(C_{x}= 1.052916\), \(C_{r}=0.571662\), \(\rho _{yx}=0.8038\), \(\rho _{yr_{x}}=0.7461\), \(\rho _{xr_{x}}=0.9236\).

Data set-7: [7, p-182]: Let y be the number of paralytic polio cases in the placebo group and x be the number of placebo children. The details of the parameters for this data set are: \(N=34\), \(n=12\), \(r=8\), \({\bar{Y}}= 2.588235\), \({\bar{X}}=4.923529\), \(C_{y}=1.233278\), \(C_{x}=1.023331\), \(C_{r}=0.5687383\), \(\rho _{yx}=0.7328235\), \(\rho _{yr_{x}}=0.6571887\), \(\rho _{xr_{x}}=0.8165117\).

Data set-8: [7, p-152]: The data show the number of inhabitants (in 1000s) of large United States cities. Let y be the number of inhabitants in 1930 and x be the number of inhabitants in 1920. The details of the parameters for this data set are: \(N=49\), \(n=15\), \(r=12\), \({\bar{Y}}=127.7959\), \({\bar{X}}=103.1429\), \(C_{y}=0.9634205\), \(C_{x}=1.012237\), \(C_{r}=0.5714601\), \(\rho _{yx}=0.981742\), \(\rho _{yr_{x}}=0.7207159\), \(\rho _{xr_{x}}=0.7915108\).

Data set-9: [7, p-34]: We investigate the food cost of a family as y and the family size as x. The values of the population parameters are: \(N=33\), \(n=10\), \(r=8\), \({\bar{Y}}=27.49091\), \({\bar{X}}=3.727273\), \(C_{y}=0.3685139\), \(C_{x}=0.4094911\), \(C_{r}=0.555573\), \(\rho _{yx}=0.432738\), \(\rho _{yr_{x}}=0.4495658\), \(\rho _{xr_{x}}=0.9820251\).

Data set-10: [14, p-399]: Consider y as the area under wheat in 1964 and x as the area under wheat in 1963. The statistical summary of the population is: \(N=34\), \(n=11\), \(r=8\), \({\bar{Y}}=199.4412\), \({\bar{X}}=208.8824\), \(C_{y}=0.7531797\), \(C_{x}=0.7205298\), \(C_{r}=0.5689992\), \(\rho _{yx}=0.9800867\), \(\rho _{yr_{x}}=0.9152007\), \(\rho _{xr_{x}}=0.9416689\).

We have computed the MSEs of the estimators \({\bar{y}}_{r}\), \(t_{R_{i}}(i=1,2,3)\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\) and the different members of \(t_{P_{i}}\) at their optima on the basis of their theoretical results; they are given in Tables 2 and 3. The relative performance of all the above estimators is computed in terms of the percentage relative efficiency (PRE) with respect to the mean estimator \({\bar{y}}_{r}\). To calculate the PREs of the estimators \({\bar{y}}_{r}\), \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\), \(t_{d_{i}}\), \(t_{D_{i}}\) and \(t_{P_{i}}\) with respect to \({\bar{y}}_{r}\), we have used the formula

$$\begin{aligned} PRE(t., {\bar{y}}_{r})=\frac{V({\bar{y}}_{r})}{MSE(t.)}\times 100 \end{aligned}$$
(6.1)

The results are shown in Table 4.
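As an illustration of (6.1), the following self-contained sketch computes the PRE of the regression estimator \(t_{DP_{1}}\) for Data set-1 from its reported parameters alone; the printed value is our own computation, meant only to illustrate the formula, not to reproduce Table 4.

```python
# PRE of t_DP1 with respect to ybar_r for Data set-1, via eq. (6.1).
N, n, r = 60, 20, 16
Y_bar, C_y, rho_yx = 3092.417, 0.3725059, 0.8832073
f1, f2 = 1 / n - 1 / N, 1 / r - 1 / N

V_ybar_r = Y_bar**2 * f2 * C_y**2                        # eq. (3.2)
mse_DP1 = Y_bar**2 * C_y**2 * (f2 - f1 * rho_yx**2)      # eq. (3.30), i = 1
print(round(100 * V_ybar_r / mse_DP1, 2))                # PRE via eq. (6.1)
```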

Table 2 MSEs of the different estimators
Table 3 MSEs of the different estimators belonging to the proposed families of estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\)
Table 4 PREs of the different estimators with respect to mean estimator \({\bar{y}}_{r}\)

Interpretation of the results:

In Table 2, we see that the lowest MSE values occur for the members \(t^{*}_{P_{i}}(i=1,2,3)\) of the suggested classes of estimators \(t_{P_{i}}\), respectively, in parallel comparison with the estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\), \(t_{d_{i}}\) and \(t_{D_{i}}\) as well as \({\bar{y}}_{r}\). Further, we see that the special members \(t_{D_{i}}(i=1,2,3)\) of the proposed estimators \(t_{P_{i}}\), respectively, have the second lowest MSE values in the parallel comparisons with the other existing estimators. In Table 3, we observe that the MSE values of all the discussed members of the proposed estimators in the respective strategies are the same in Populations 1, 2, 3, 4, 5 and 10 and differ only slightly in Populations 6, 7, 8 and 9, a difference that can be regarded as negligible. Thus, it can be argued that the MSEs of all the members of the proposed classes of estimators (in Table 3) are equal and smaller than those of all the other discussed estimators in the respective strategies. Consequently, their PREs with respect to \({\bar{y}}_{r}\) will also be the same. Therefore, for convenience, we have used the notation \(t_{P_{i}}(i=1,2,3)\) alone for all the members \(t^{(j)}_{P_{i}}(i=1,2,3; j=1,2,...,10)\) in Table 4, which presents the PREs of all the members of the proposed classes of estimators in the respective strategies.

From Table 4, we report that

  1. (i) The performance of the ratio estimators \(t_{R_{i}}(i=1,2,3)\) is good in Populations 1, 4, 5, 6, 7, 8 and 10, while in Populations 2, 3 and 9 it is poor. Note that in Populations 2, 3 and 9 the values of \(\rho _{yx}\frac{C_{y}}{C_{x}}\) are 0.16981, -0.01157 and 0.38943, respectively, which do not satisfy the condition \((\rho _{yx} C_{y}/C_{x})>1/2\) required for \(t_{R_{i}}\) to outperform the mean estimator \({\bar{y}}_{r}\).

  2. (ii) It has been stated theoretically above that the estimators \(t_{G_{i}}(i=1,2,3)\) are always better than \({\bar{y}}_{r}\), \(t_{R_{i}}\) and \(t_{KC_{i}}\), which is confirmed by this empirical study.

  3. (iii) The estimators \(t_{KC_{i}}(i=1,2,3)\) perform well only in Populations 5 and 10, where the numerical values of \(\rho _{yx}\frac{C_{y}}{C_{x}}\) are 1.0289 and 1.02449, respectively. Note that the condition \((\rho _{yx} C_{y}/C_{x})>1\) is satisfied for these populations, while for the remaining populations it does not hold.

  4. (iv) We see that \(t_{DP_{i}}(i=1,2,3)\) and \(t_{G_{i}}\) are equally efficient, and \(t_{BP_{i}}\) provide a parallel improvement over \(t_{DP_{i}}\) in all the populations.

  5. (v) The performance of the estimators \(t^{*}_{BP_{i}}(i=1,2,3)\) is, in parallel, very close to that of \(t_{BP_{i}}\).

  6. (vi) We see that the proposed estimators \(t_{d_{i}}(i=1,2,3)\) are always better than \(t_{DP_{i}}\) and \(t_{BP_{i}}\), respectively, in all the populations. Therefore, it can be argued that the regression type estimator based on the dual use of auxiliary information always outperforms both the conventional regression and difference type estimators, which are based only on the prime information of an auxiliary variable. Hence, the missing data problem can be handled more effectively simply by using the dual (rank) of an auxiliary variable.

  7. (vii) The proposed estimators \(t_{D_{i}}(i=1,2,3)\) provide a parallel improvement over \(t_{d_{i}}\).

  8. (viii) The performance of the proposed classes of estimators \(t_{P_{i}}(i=1,2,3)\) is:

    1. (a) Good (better than \({\bar{y}}_{r}\)) in all the Populations 1-10.

    2. (b) In parallel, more efficient than the estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\) and their special members \(t_{d_{i}}(i=1,2,3)\) and \(t_{D_{i}}\). Thus, \(t_{P_{i}}\) accomplish, in parallel, the maximum gain in efficiency among all the estimators in all the populations considered in this empirical study.

    3. (c) We observe that the second estimator \(t_{P_{2}}\) is always better than the first estimator \(t_{P_{1}}\) and the third estimator \(t_{P_{3}}\).

    4. (d) We also observe that the PREs of the first proposed estimator \(t_{P_{1}}\) are greater than those of the third proposed estimator \(t_{P_{3}}\) in all the populations where condition (5.19) holds.

Thus, the suggested classes of estimators \(t_{P_{i}}(i=1,2,3)\) in parallel outperform all the other estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\), \(t_{d_{i}}\) and \(t_{D_{i}}\) considered in this study. We see that the second proposed estimator \(t_{P_{2}}\) is always better than the first and third proposed estimators \(t_{P_{1}}\) and \(t_{P_{3}}\). We also see that the proposed estimator \(t_{P_{2}}\) is the most efficient among all the estimators discussed in this study.

On the basis of this empirical study, we conclude that the proposed estimators formulated under the double (rank) use of an auxiliary variable are capable of enhancing precision over the relevant estimators based on the single/prime use of an auxiliary variable.

7 Conclusions

Several notable researchers have addressed the problem of missing data efficiently, but no one has so far discussed imputation techniques using the dual of auxiliary information in the literature. In the present study, we have suggested three imputation techniques and the corresponding estimators using auxiliary information as well as its rank in three different sampling strategies (based on different amounts of auxiliary information) under non-response. In the empirical study comprising 10 real data sets, it has been found that the suggested sets of estimators are, in parallel, more efficient than the usual mean estimator and the methods of Lee et al. [12], Kadilar and Cingi [11], Gira [9], Diana and Perri [8], Bhusan and Pandey [4] and Bhusan et al. [6]. The present study is important for estimating the population mean in survey sampling because it handles missing data more effectively than the works discussed above (based on the prime/single use of auxiliary information), simply by using the dual (rank) of an auxiliary variable. Moreover, at their optimal solutions, the proposed estimators are very simple to apply, from both theoretical and practical perspectives. Hence, the suggested classes of estimators are appreciable and recommended to sampling practitioners when non-response cannot be ignored.