1 Introduction

Missing data are an inherent phenomenon in sample surveys and call for an appropriate methodology to handle the resulting data sets. For example, in an experiment comparing three new drinks in the market over 30 days, some of the data may accidentally be missing for some days; in an animal experiment, some of the animals may die during the laboratory study, or the technician may accidentally omit some of the results; in a medical investigation, patients may not turn up, may not co-operate or may die before completing the study period. Similarly, in agricultural experiments, the plants may be eaten by animals or washed away by floods, etc. In all such situations, the experiments result in incomplete data that may mislead the inferences. In the case of missing data, several statisticians have shown that inferences or predictions regarding the population parameters may be highly distorted, especially when the respondents and non-respondents differ. Therefore, knowledge of the appropriate pattern of the missingness mechanism is needed at the estimation stage to overcome the missing data problems. Rubin [17] introduced three fundamental concepts describing the missing patterns of the data: missing at random (MAR), observed at random (OAR) and parameter distribution (PD). The combination of MAR and OAR is termed missing completely at random (MCAR). Heitjan and Basu (1996) have differentiated the meanings of MAR and MCAR very systematically. Following these authors, we assume the MCAR mechanism in the present study to deal with the problem of missing data.

Imputation is one of the effective techniques in surveys to compensate for missing data. In various fields, such as energy storage systems providing a peak reduction service to a local electricity network, food composition databases, clinical trials and industrial databases, the imputation of missing observations plays a contributory role in the estimation of population parameters under non-response. A number of imputation methods using the MCAR mechanism have been discussed by authors including Lee et al. [12], Singh and Horn [23], Singh and Deo [22], Kadilar and Cingi [11], Singh [21], Diana and Perri [8], Al-Omari et al. [1], Gira [9] and, more recently, Bhusan and Pandey [4], Prasad [15], Bhusan and Pandey [5], Singh and Suman [24], Singh et al. [25], Bhusan et al. [6] and Singh and Usman [26]. These authors utilize information available on each unit of an auxiliary variable, which is often used in surveys to increase the precision of the estimate of the population mean. Some other related references are Prasad [16], Bouza et al. [3] and Bouza-Herrera and Viada [2].

The aim of the present study is to develop imputation methods and the subsequent estimators with enhanced precision by incorporating the double use of auxiliary information to estimate the population mean, improving upon relevant estimators that are based on the prime/single use of auxiliary information in case of missing data under the MCAR mechanism. For this, we consider the rank (dual) of an auxiliary variable, which may behave like an additional auxiliary variable. To our knowledge, no one has tried this type of work for imputation to handle missing data in estimating the population mean so far. The rest of the paper is organized as follows: In Sect. 2, the methodology and notations are discussed, and some conventional imputation methods are reviewed in Sect. 3. In Sect. 4, we suggest three general classes of estimators using imputation techniques and study their properties. Section 5 presents the theoretical comparisons of the estimators, and the empirical comparisons based on real data sets are presented in Sect. 6. Finally, some concluding remarks are made in Sect. 7.

2 Methodology and Notations

Consider an identifiable population \(U=\{U_{1},U_{2},U_{3},...,U_{N}\}\) of size N where our goal is to estimate the population mean \({\bar{Y}}\) of a study variable y which possesses a reasonable correlation with an auxiliary variable x. Let \((y_{i}, x_{i})\) be the \(i^{th}\) observation of y and x. Suppose that the information on the auxiliary variable x is readily available for each unit of the population. Let \(R_{x}=\{r_{x,1},r_{x,2},...,r_{x,N}\}\) denote the corresponding ranks of \(X=\{x_{1},x_{2},...,x_{N}\}\) in U. Note that the ranks \(R_{x}\) may also hold an adequate amount of correlation with the study variable y. Let a sample of size n be drawn from the population using the simple random sampling without replacement (SRSWOR) technique and surveyed. Unfortunately, a response is observed only on \(r(<n)\) units for the study variable y. For the remaining \((n-r)\) non-responding units, we propose some new imputation methods using the rank of the auxiliary variable, given in Sect. 4. Let A and \({\bar{A}}\) be the sets of responding units and non-responding units, respectively, in the sample. For the sampled units \(i\in {A}\), the values \(y_{i}\) are observed, while for the units \(i\in {\bar{A}}\) some imputation techniques are used.
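For concreteness, this set-up can be sketched in a few lines of code. The following is a minimal sketch on a synthetic population; the data-generating model and all variable names (y, x, rank_x, etc.) are illustrative assumptions of ours, not part of the paper. It constructs the ranks \(R_{x}\), draws an SRSWOR sample and splits it into the responding set A and the non-responding set \({\bar{A}}\) under MCAR.

```python
# Minimal sketch of the set-up of Sect. 2 on a synthetic population;
# the model and all names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
N, n, r = 500, 60, 45                        # population, sample, respondents

x = rng.gamma(shape=4.0, scale=10.0, size=N)         # auxiliary variable x
y = 5.0 + 2.0 * x + rng.normal(0.0, 8.0, size=N)     # study variable y
rank_x = x.argsort().argsort() + 1.0                 # ranks R_x of x within U

sample = rng.choice(N, size=n, replace=False)        # SRSWOR sample of size n
A = rng.choice(sample, size=r, replace=False)        # responding units (MCAR)
A_bar = np.setdiff1d(sample, A)                      # non-responding units

y_bar_r = y[A].mean()                                # response mean of y
x_bar_n, x_bar_r = x[sample].mean(), x[A].mean()     # sample / response means
r_bar_n, r_bar_r = rank_x[sample].mean(), rank_x[A].mean()
```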

We define some useful notations as follows:

  • \({\bar{Y}}=\sum _{i=1}^{N}y_{i}/N\), \({\bar{y}}_{r}=\sum _{i=1}^{r}y_{i}/r\): The population mean and the response mean of study variable y,

  • \({\bar{X}}=\sum _{i=1}^{N}x_{i}/N\), \({\bar{x}}_{n}=\sum _{i=1}^{n}x_{i}/n\), \({\bar{x}}_{r}=\sum _{i=1}^{r}x_{i}/r\): The population mean, sample mean and the response mean of auxiliary variable x,

  • \({\bar{R}}_{x}=\sum _{i=1}^{N}r_{x,i}/N\), \({\bar{r}}_{x(n)}=\sum _{i=1}^{n}r_{x,i}/n\), \({\bar{r}}_{x(r)}=\sum _{i=1}^{r}r_{x,i}/r\): The population mean, sample mean and the response mean of \(R_{x}\),

  • \(S_{y}^{2}=\sum _{i=1}^{N}(y_{i}-{\bar{Y}})^2/(N-1)\): The population variance of y,

  • \(S_{x}^{2}=\sum _{i=1}^{N}(x_{i}-{\bar{X}})^2/(N-1)\): The population variance of x,

  • \(S_{r_{x}}^{2}=\sum _{i=1}^{N}(r_{x,i}-{\bar{R}}_{x})^2/(N-1)\): The population variance of \(R_{x}\),

  • \(C_{y}=S_{y}/{\bar{Y}}\): The coefficient of variation of y,

  • \(C_{x}=S_{x}/{\bar{X}}\): The coefficient of variation of x,

  • \(C_{r_{x}}=S_{r_{x}}/{\bar{R}}_{x}\): The coefficient of variation of \(R_{x}\),

  • \(\rho _{yx}=S_{yx}/S_{y}S_{x}\): The correlation coefficient between y and x.

  • \(\rho _{yr_{x}}=S_{yr_{x}}/S_{y}S_{r_{x}}\): The correlation coefficient between y and \(R_{x}\).

  • \(\rho _{xr_{x}}=S_{xr_{x}}/S_{x}S_{r_{x}}\): The correlation coefficient between x and \(R_{x}\).

To obtain the biases and mean square errors (MSEs) of the proposed estimators, we define the following error transformations:

$$\begin{aligned} \frac{{\bar{y}}_{r}}{{\bar{Y}}}=(1+\xi _{0}),\;\frac{{\bar{x}}_{r}}{{\bar{X}}}=(1+\xi '_{1}),\;\frac{{\bar{x}}_{n}}{{\bar{X}}}=(1+\xi _{1}),\; \frac{{\bar{r}}_{x(r)}}{{\bar{R}}_{x}}=(1+\xi '_{2}),\;\frac{{\bar{r}}_{x(n)}}{{\bar{R}}_{x}}=(1+\xi _{2}), \end{aligned}$$

such that

$$\begin{aligned} E(\xi _{0})= E(\xi '_{1})= E(\xi _{1})= E(\xi '_{2})= E(\xi _{2})=0 \end{aligned}$$

and

$$\begin{aligned} E(\xi ^{2}_{0})= & {} f_{2}C^{2}_{y},\quad E(\xi ^{'2}_{1})=f_{2}C^{2}_{x},\quad E(\xi ^{2}_{1})=f_{1} C^{2}_{x},\quad E(\xi ^{'2}_{2})=f_{2}C^{2}_{r_{x}},\quad \\&E(\xi ^{2}_{2})=f_{1} C^{2}_{r_{x}},\\ E(\xi _{0}\xi '_{1})= & {} f_{2}\rho _{yx}\,C_{y}C_{x},\quad E(\xi _{0}\xi _{1})=f_{1} \rho _{yx}\,C_{y}C_{x},\quad E(\xi _{0}\xi '_{2})=f_{2}\rho _{yr_{x}}\,C_{y}C_{r_{x}},\\ E(\xi _{0}\xi _{2})= & {} f_{1} \rho _{yr_{x}}\, C_{y}C_{r_{x}},\quad E(\xi '_{2}\xi _{2})=f_{1} C^{2}_{r_{x}},\quad E(\xi '_{1}\xi _{1})=f_{1} C^{2}_{x},\\ E(\xi '_{1}\xi '_{2})= & {} f_{2}\rho _{xr_{x}}\,C_{x}C_{r_{x}},\quad E(\xi '_{1}\xi _{2})=f_{1} \rho _{xr_{x}}\,C_{x}C_{r_{x}} \end{aligned}$$

where \(f_{1}=\left( \frac{1}{n}-\frac{1}{N}\right) \) and \(f_{2}=\left( \frac{1}{r}-\frac{1}{N}\right) \). We also write \(f_{3}=\left( \frac{1}{r}-\frac{1}{n}\right) =f_{2}-f_{1}\).
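In code, these factors are one-liners; the sketch below continues the synthetic example above and is only a convenience helper of ours.

```python
# Finite-population factors f1, f2, f3 used throughout the MSE expressions.
def fpc_factors(N: int, n: int, r: int):
    f1 = 1.0 / n - 1.0 / N
    f2 = 1.0 / r - 1.0 / N
    f3 = 1.0 / r - 1.0 / n     # note f3 = f2 - f1
    return f1, f2, f3

f1, f2, f3 = fpc_factors(N, n, r)     # N, n, r from the sketch in Sect. 2
assert abs(f3 - (f2 - f1)) < 1e-12
```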

3 Some Conventional Imputation Methods

In this section, we discuss some customary imputation methods and the corresponding estimators under three different sampling strategies, defined as follows:

Strategy I: When \({\bar{X}}\) and \({\bar{x}}_{n}\) are used.

Strategy II: When \({\bar{X}}\) and \({\bar{x}}_{r}\) are used.

Strategy III: When \({\bar{x}}_{n}\) and \({\bar{x}}_{r}\) are used.

3.1 Mean Imputation Method

The usual mean method of imputation, which does not use auxiliary information, is given by

$$\begin{aligned} y_{i,m}= {\left\{ \begin{array}{ll} y_{i}, &{}\quad { \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. } \end{aligned}$$
(3.1)

The estimator of the population mean \({\bar{Y}}\) under the mean method of imputation is \(t_{m}={\bar{y}}_{r}\), whose variance is given by

$$\begin{aligned} V(t_{m})=V({\bar{y}}_{r})={\bar{Y}}^{2}f_{2}C^{2}_{y} \end{aligned}$$
(3.2)
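As a quick sketch (continuing the running example above), the imputed data in (3.1) reproduce \({\bar{y}}_{r}\) exactly, and the variance (3.2) is a direct plug-in:

```python
# Mean imputation (3.1): the point estimator t_m equals y_bar_r exactly.
y_dot = y[sample].copy()
y_dot[np.isin(sample, A_bar)] = y_bar_r      # impute y_bar_r for i in A-bar
t_m = y_dot.mean()                           # equals y_bar_r

Y_bar = y.mean()
C_y = y.std(ddof=1) / Y_bar
V_t_m = Y_bar**2 * f2 * C_y**2               # theoretical variance (3.2)
```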

3.2 Lee et al. [12] Imputation Methods

When auxiliary information is available, the ratio method of imputation due to Lee et al. [12] can be considered under the three strategies as:

$$\begin{aligned}&y_{\cdot i,R_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{1}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) -r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. }\end{aligned}$$
(3.3)
$$\begin{aligned}&y_{\cdot i,R_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{1}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) -r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. }\end{aligned}$$
(3.4)
$$\begin{aligned}&y_{\cdot i,R_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{1}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) -r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}}. \end{array}\right. } \end{aligned}$$
(3.5)

The corresponding ratio type estimators are defined as:

$$\begin{aligned} t_{R_{1}}= & {} {\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) \end{aligned}$$
(3.6)
$$\begin{aligned} t_{R_{2}}= & {} {\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) \end{aligned}$$
(3.7)
$$\begin{aligned} t_{R_{3}}= & {} {\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) \end{aligned}$$
(3.8)

The MSEs of \(t_{R_{i}}(i=1,2,3)\) to the first-order approximation are given by

$$\begin{aligned} MSE(t_{R_{i}})=V({\bar{y}}_{r})+{\bar{Y}}^{2}f_{i} C_{x}(C_{x}-2\rho _{yx} C_{y}) \end{aligned}$$
(3.9)

The ratio estimators \(t_{R_{i}}(i=1,2,3)\) are better than the mean estimator \({\bar{y}}_{r}\) if \((\rho _{yx} C_{y}/C_{x})>1/2\).
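A sketch of the three point estimators (3.6)-(3.8) and their first-order MSEs (3.9), continuing the running example:

```python
# Ratio estimators of Lee et al. [12] and their first-order MSEs (3.9).
X_bar = x.mean()
C_x = x.std(ddof=1) / X_bar
rho_yx = np.corrcoef(y, x)[0, 1]

t_R1 = y_bar_r * X_bar / x_bar_n             # Strategy I
t_R2 = y_bar_r * X_bar / x_bar_r             # Strategy II
t_R3 = y_bar_r * x_bar_n / x_bar_r           # Strategy III

V_ybar_r = Y_bar**2 * f2 * C_y**2
mse_R = [V_ybar_r + Y_bar**2 * f * C_x * (C_x - 2.0 * rho_yx * C_y)
         for f in (f1, f2, f3)]              # i = 1, 2, 3
```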

3.3 Kadilar and Cingi [11] Imputation Methods

The imputation methods under three strategies are given by

$$\begin{aligned}&y_{i,KC_{1}}= {\left\{ \begin{array}{ll} {\left[ \frac{ny_{i}}{r}+b({\bar{X}}-x_{i})\right] }\frac{{\bar{X}}}{{\bar{x}}_{n}}, &{} {\quad \text {if}\;\;\,i \in A } \\ {[}b({\bar{X}}-x_{i})]\frac{{\bar{X}}}{{\bar{x}}_{n}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.10)
$$\begin{aligned}&y_{i,KC_{2}}= {\left\{ \begin{array}{ll} {\left[ \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}\right] }\frac{{\bar{X}}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{X}}}{n-r}\frac{{\bar{X}}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.11)
$$\begin{aligned}&y_{i,KC_{3}}= {\left\{ \begin{array}{ll} {\left[ \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}\right] }\frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{x}}_{n}}{n-r}\frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.12)

The respective estimators are given as:

$$\begin{aligned} t_{KC_{1}}= & {} \frac{{\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{n})}{{\bar{x}}_{n}}{\bar{X}}\end{aligned}$$
(3.13)
$$\begin{aligned} t_{KC_{2}}= & {} \frac{{\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{r})}{{\bar{x}}_{r}}{\bar{X}}\end{aligned}$$
(3.14)
$$\begin{aligned} t_{KC_{3}}= & {} \frac{{\bar{y}}_{r}+b({\bar{x}}_{n}-{\bar{x}}_{r})}{{\bar{x}}_{r}}{\bar{x}}_{n} \end{aligned}$$
(3.15)

where \(b=s_{yx}/s^{2}_{x}\) is the least squares estimated regression coefficient of y on x. Here \(s_{yx}=\sum _{i=1}^{n}(x_{i}-{\bar{x}}_{n})(y_{i}-{\bar{y}}_{n})/(n-1)\) and \(s^{2}_{x}=\sum _{i=1}^{n}(x_{i}-{\bar{x}}_{n})^{2}/(n-1)\).

The MSEs of the estimators \(t_{KC_{i}}(i=1,2,3)\) to the first-order approximation, are respectively, given as:

$$\begin{aligned} MSE(t_{KC_{i}})=V({\bar{y}}_{r})+f_{i}{\bar{Y}}^{2}(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y}) \end{aligned}$$
(3.16)

The estimators \(t_{KC_{i}} (i=1,2,3)\) are better than the mean estimator \({\bar{y}}_{r}\) if \((\rho _{yx} C_{y}/C_{x})>1\).

3.4 Gira [9] Imputation Methods

Along the lines of Gira [9], we consider three ratio type imputation methods to deal with missing data, given by

$$\begin{aligned}&y_{\cdot i,G_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}\left[ n\left( \frac{\nu _{1}-{\bar{x}}_{n}}{\nu _{1}-{\bar{X}}}\right) -r\right] \frac{x_{i}}{\sum _{i\in {\bar{A}}} x_{i}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.17)
$$\begin{aligned}&y_{\cdot i,G_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}\left[ n\left( \frac{\nu _{2}-{\bar{x}}_{r}}{\nu _{2}-{\bar{X}}}\right) -r\right] \frac{x_{i}}{\sum _{i\in {\bar{A}}} x_{i}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.18)
$$\begin{aligned}&y_{\cdot i,G_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}\left[ n\left( \frac{\nu _{3}-{\bar{x}}_{r}}{\nu _{3}-{\bar{x}}_{n}}\right) -r\right] \frac{x_{i}}{\sum _{i\in {\bar{A}}} x_{i}}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.19)

where \(\nu _{i}(i=1,2,3)\) are suitably chosen constants.

The point estimators under (3.17), (3.18) and (3.19) are, respectively, given by

$$\begin{aligned}&t_{G_{1}}={\bar{y}}_{r}\left( \frac{\nu _{1}-{\bar{x}}_{n}}{\nu _{1}-{\bar{X}}}\right) \end{aligned}$$
(3.20)
$$\begin{aligned}&t_{G_{2}}={\bar{y}}_{r}\left( \frac{\nu _{2}-{\bar{x}}_{r}}{\nu _{2}-{\bar{X}}}\right) \end{aligned}$$
(3.21)
$$\begin{aligned}&t_{G_{3}}={\bar{y}}_{r}\left( \frac{\nu _{3}-{\bar{x}}_{r}}{\nu _{3}-{\bar{x}}_{n}}\right) \end{aligned}$$
(3.22)

The minimum MSEs of the estimators \(t_{G_{i}}(i=1,2,3)\) are given by

$$\begin{aligned} min.MSE(t_{G_{i}})={\bar{Y}}^{2}C^{2}_{y}(f_{2}-f_{i} \rho ^{2}_{yx}) \end{aligned}$$
(3.23)

The optimum values are given as: \(\nu _{i(opt)}={\bar{X}}\left[ 1+\frac{C_{x}}{\rho _{yx}C_{y}}\right] \).

From (3.2), (3.9), (3.16) and (3.23), it is clear that \(t_{G_{i}}(i=1,2,3)\) are always better than \({\bar{y}}_{r}\), \(t_{R_{i}}\) and \(t_{KC_{i}}\) in the respective strategies. Note that Gira [9] showed both theoretically and empirically that his method is as efficient as the methods propounded by Singh and Horn [23], Singh and Deo [22] and Singh [21].
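In code, the optimum constant and the minimum MSE (3.23) are immediate; a sketch under the running example:

```python
# Gira [9]: optimum constant nu and minimum MSEs (3.23).
nu_opt = X_bar * (1.0 + C_x / (rho_yx * C_y))            # nu_{i(opt)}
t_G1 = y_bar_r * (nu_opt - x_bar_n) / (nu_opt - X_bar)   # Strategy I
min_mse_G = [Y_bar**2 * C_y**2 * (f2 - f * rho_yx**2)
             for f in (f1, f2, f3)]                      # i = 1, 2, 3
```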

3.5 Diana and Perri [8] Estimators

Diana and Perri [8] established three regression-type imputation methods, under which the resultant data take the form:

$$\begin{aligned}&y_{i,DP_{1}}= {\left\{ \begin{array}{ll} \frac{ny_{i}}{r}+b({\bar{X}}-x_{i}), &{} {\quad \text {if}\;\;\,i \in A } \\ b({\bar{X}}-x_{i}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.24)
$$\begin{aligned}&y_{i,DP_{2}}= {\left\{ \begin{array}{ll} \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{X}}}{n-r}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.25)
$$\begin{aligned}&y_{i,DP_{3}}= {\left\{ \begin{array}{ll} \frac{ny_{i}}{r}-b\frac{nx_{i}}{r}, &{} {\quad \text {if}\;\;\,i \in A } \\ b\frac{n{\bar{x}}_{n}}{n-r}, &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.26)

The subsequent estimators are, respectively, given as:

$$\begin{aligned} t_{DP_{1}}= & {} {\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{n})\end{aligned}$$
(3.27)
$$\begin{aligned} t_{DP_{2}}= & {} {\bar{y}}_{r}+b({\bar{X}}-{\bar{x}}_{r})\end{aligned}$$
(3.28)
$$\begin{aligned} t_{DP_{3}}= & {} {\bar{y}}_{r}+b({\bar{x}}_{n}-{\bar{x}}_{r}) \end{aligned}$$
(3.29)

The MSEs of the estimators \(t_{DP_{i}}(i=1,2,3)\) are given as:

$$\begin{aligned} MSE(t_{DP_{i}})=min.MSE(t_{G_{i}})={\bar{Y}}^{2}C^{2}_{y}\left[ f_{2}-f_{i}\rho ^{2}_{yx}\right] \end{aligned}$$
(3.30)
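A sketch of the regression estimators (3.27)-(3.29), with the slope b computed from the responding pairs (an assumption of ours, since only those y-values are observed):

```python
# Diana and Perri [8] regression estimators; b from the responding pairs.
b = np.cov(y[A], x[A], ddof=1)[0, 1] / x[A].var(ddof=1)
t_DP1 = y_bar_r + b * (X_bar - x_bar_n)      # Strategy I
t_DP2 = y_bar_r + b * (X_bar - x_bar_r)      # Strategy II
t_DP3 = y_bar_r + b * (x_bar_n - x_bar_r)    # Strategy III
mse_DP = [Y_bar**2 * C_y**2 * (f2 - f * rho_yx**2) for f in (f1, f2, f3)]
```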

3.6 Bhusan and Pandey [4] Imputation Methods

Bhusan and Pandey [4] proposed three different types of imputation methods, which provide a parallel improvement over those of Diana and Perri [8] and are given as:

$$\begin{aligned}&y_{i,BP_{1}}= {\left\{ \begin{array}{ll} \mu _{1}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \mu _{1}{\bar{y}}_{r}+\frac{n\lambda _{1}}{n-r}({\bar{X}}-{\bar{x}}_{n}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.31)
$$\begin{aligned}&y_{i,BP_{2}}= {\left\{ \begin{array}{ll} \mu _{2}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \mu _{2}{\bar{y}}_{r}+\frac{n\lambda _{2}}{n-r}({\bar{X}}-{\bar{x}}_{r}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(3.32)
$$\begin{aligned}&y_{i,BP_{3}}= {\left\{ \begin{array}{ll} \mu _{3}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \mu _{3}{\bar{y}}_{r}+\lambda _{3}(x_{i}-{\bar{x}}_{r}), &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.33)

where \((\mu _{i}, \lambda _{i}) (i=1,2,3)\) are arbitrarily chosen constants.

The corresponding estimators are defined as:

$$\begin{aligned} t_{BP_{1}}= & {} \mu _{1}{\bar{y}}_{r}+\lambda _{1}({\bar{X}}-{\bar{x}}_{n}) \end{aligned}$$
(3.34)
$$\begin{aligned} t_{BP_{2}}= & {} \mu _{2}{\bar{y}}_{r}+\lambda _{2}({\bar{X}}-{\bar{x}}_{r}) \end{aligned}$$
(3.35)
$$\begin{aligned} t_{BP_{3}}= & {} \mu _{3}{\bar{y}}_{r}+\lambda _{3}({\bar{x}}_{n}-{\bar{x}}_{r}) \end{aligned}$$
(3.36)

The minimum MSEs of the estimators \(t_{BP_{i}} (i=1,2,3)\) are, respectively, given as:

$$\begin{aligned} min.MSE(t_{BP_{i}})=\frac{{\bar{Y}}^2 MSE(t_{DP_{i}})}{{\bar{Y}}^2+MSE(t_{DP_{i}})} \end{aligned}$$
(3.37)

for the optimum values

$$\begin{aligned}&\mu _{1(opt)}=\frac{1}{1+[f_{3}+f_{1}(1-\rho ^{2}_{yx})]C^{2}_{y}},\qquad \mathrm{and} \quad \lambda _{1(opt)}=\left( \rho _{yx}\frac{ S_{y}}{S_{x}}\right) \mu _{1(opt)} \end{aligned}$$
(3.38)
$$\begin{aligned}&\mu _{2(opt)}=\frac{1}{1+[f_{2}(1-\rho ^{2}_{yx})]C^{2}_{y}},\qquad \mathrm{and} \quad \lambda _{2(opt)}=\left( \rho _{yx}\frac{ S_{y}}{S_{x}}\right) \mu _{2(opt)} \end{aligned}$$
(3.39)
$$\begin{aligned}&\mu _{3(opt)}=\frac{1}{1+[f_{1}+f_{3}(1-\rho ^{2}_{yx})]C^{2}_{y}},\qquad \; \mathrm{and} \quad \lambda _{3(opt)}=\left( \rho _{yx}\frac{ S_{y}}{S_{x}}\right) \mu _{3(opt)}\nonumber \\ \end{aligned}$$
(3.40)

Bhusan and Pandey [5] have also given an improvement over the usual ratio type imputation methods due to Lee et al. [12], under which the data become

$$\begin{aligned}&y_{i,BP^{*}_{1}}= {\left\{ \begin{array}{ll} \omega _{1}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{\omega _{1}}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) ^{\eta _{1}}-r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.41)
$$\begin{aligned}&y_{i,BP^{*}_{2}}= {\left\{ \begin{array}{ll} \omega _{2}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{\omega _{2}}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) ^{\eta _{2}}-r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.42)
$$\begin{aligned}&y_{i,BP^{*}_{3}}= {\left\{ \begin{array}{ll} \omega _{3}y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \frac{\omega _{3}}{n-r}\left[ n{\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) ^{\eta _{3}}-r{\bar{y}}_{r}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(3.43)

The respective estimators are given by

$$\begin{aligned}&t^{*}_{BP_{1}}=\omega _{1}{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{n}}\right) ^{\eta _{1}} \end{aligned}$$
(3.44)
$$\begin{aligned}&t^{*}_{BP_{2}}=\omega _{2}{\bar{y}}_{r}\left( \frac{{\bar{X}}}{{\bar{x}}_{r}}\right) ^{\eta _{2}} \end{aligned}$$
(3.45)
$$\begin{aligned}&t^{*}_{BP_{3}}=\omega _{3}{\bar{y}}_{r}\left( \frac{{\bar{x}}_{n}}{{\bar{x}}_{r}}\right) ^{\eta _{3}} \end{aligned}$$
(3.46)

where \((\omega _{i}, \eta _{i}) (i=1,2,3)\) are arbitrarily chosen constants.

The minimum MSEs of the estimators \(t^{*}_{BP_{i}}(i=1,2,3)\) are, respectively, given as:

$$\begin{aligned} min.MSE(t^{*}_{BP_{i}})={\bar{Y}}^2\left( 1-\frac{H^{2}_{i}}{G_{i}}\right) \end{aligned}$$
(3.47)

where

$$\begin{aligned} G_{i}= & {} \{1+f_{2}C^{2}_{y}+2\eta _{i(opt)}^{2}f_{i} C^{2}_{x}+\eta _{i(opt)} f_{i} C_{x}(C_{x}-4\rho _{yx}C_{y})\}\\ H_{i}= & {} \{1+\frac{\eta _{i(opt)}^{2}}{2}f_{i} C^{2}_{x}+\frac{\eta _{i(opt)}}{2}f_{i} C_{x}(C_{x}-2\rho _{yx}C_{y})\} \end{aligned}$$

The optimum values are given as: \(\omega _{i(opt)}=\frac{H_{i}}{G_{i}}(i=1,2,3)\) and \(\eta _{i(opt)}=\rho _{yx}\frac{C_{y}}{C_{x}}\).
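A direct transcription of \(G_{i}\), \(H_{i}\) and the minimum MSE (3.47), continuing the running example:

```python
# Bhusan and Pandey [5]: minimum MSEs (3.47) via G_i and H_i.
eta_opt = rho_yx * C_y / C_x                 # eta_{i(opt)}
min_mse_BPs = []
for f in (f1, f2, f3):                       # i = 1, 2, 3
    G = (1 + f2 * C_y**2 + 2 * eta_opt**2 * f * C_x**2
         + eta_opt * f * C_x * (C_x - 4 * rho_yx * C_y))
    H = (1 + 0.5 * eta_opt**2 * f * C_x**2
         + 0.5 * eta_opt * f * C_x * (C_x - 2 * rho_yx * C_y))
    min_mse_BPs.append(Y_bar**2 * (1 - H**2 / G))    # omega_opt = H / G
```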

The above existing imputation methods and the resultant estimators are based on the single use of an auxiliary variable. We propose some imputation techniques based on the dual use of an auxiliary variable in the next section.

4 Suggested Imputation Methods

In this section, we consider the double (rank) use of an auxiliary variable to impute the missing data in three strategies, i.e., Strategy I, Strategy II and Strategy III, which are defined as follows:

Strategy I: When \({\bar{X}}\), \({\bar{x}}_{n}\) and \({\bar{r}}_{x(n)}\) are used.

Strategy II: When \({\bar{X}}\), \({\bar{x}}_{r}\) and \({\bar{r}}_{x(r)}\) are used.

Strategy III: When \(({\bar{x}}_{n}, {\bar{x}}_{r})\) and \(({\bar{r}}_{x(n)}, {\bar{r}}_{x(r)})\) are used.

We suggest three generalized classes of difference-cum-ratio type imputation methods in the three strategies given above, under which the data, respectively, take the forms:

$$\begin{aligned}&y_{\cdot i,P_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha _{1}{\bar{y}}_{r}+\beta _{1}(x^{*}_{1}-x_{i})+\gamma _{1}(r^{*}_{1}-r_{i})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{n}+v}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.1)
$$\begin{aligned}&y_{\cdot i,P_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha _{2}{\bar{y}}_{r}+\beta _{2}(x^{*}_{2}+\frac{n}{r}x_{i}) +\gamma _{2}(r^{*}_{2}+\frac{n}{r}r_{i})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{r}+v}\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.2)
$$\begin{aligned}&y_{\cdot i,P_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha _{3}{\bar{y}}_{r}+\beta _{3}(x^{*}_{3}+\frac{n}{r}x_{i}) +\gamma _{3}(r^{*}_{3}+\frac{n}{r}r_{i})\right] \left[ \frac{u{\bar{x}}_{n}+v}{u{\bar{x}}_{r}+v}\right] , &{} {\quad if\;\;\,i \in {\bar{A}}} \end{array}\right. }\nonumber \\ \end{aligned}$$
(4.3)

where

$$\begin{aligned}&x^{*}_{1}=\frac{n{\bar{X}}-r{\bar{x}}_{r}}{n-r},\qquad r^{*}_{1}=\frac{n{\bar{R}}_{x}-r{\bar{r}}_{x(r)}}{n-r},\\&x^{*}_{2}=\frac{n}{n-r}\left[ {\bar{X}}-\frac{n}{r}{\bar{x}}_{n}\right] ,\qquad r^{*}_{2}=\frac{n}{n-r}\left[ {\bar{R}}_{x}-\frac{n}{r}{\bar{r}}_{x(n)}\right] ,\\&x^{*}_{3}=\frac{n}{n-r}\left( 1-\frac{n}{r}\right) {\bar{x}}_{n},\qquad r^{*}_{3}=\frac{n}{n-r}\left( 1-\frac{n}{r}\right) {\bar{r}}_{x(n)}. \end{aligned}$$

The corresponding estimators of population mean \({\bar{Y}}\) of study variable y in case of missing observations are defined as:

$$\begin{aligned}&t_{P_{1}}=\left[ \alpha _{1}{\bar{y}}_{r}+\beta _{1}({\bar{X}}-{\bar{x}}_{n})+\gamma _{1}({\bar{R}}_{x}-{\bar{r}}_{x(n)})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{n}+v}\right] \end{aligned}$$
(4.4)
$$\begin{aligned}&t_{P_{2}}=\left[ \alpha _{2}{\bar{y}}_{r}+\beta _{2}({\bar{X}}-{\bar{x}}_{r})+\gamma _{2}({\bar{R}}_{x}-{\bar{r}}_{x(r)})\right] \left[ \frac{u{\bar{X}}+v}{u{\bar{x}}_{r}+v}\right] \end{aligned}$$
(4.5)
$$\begin{aligned}&t_{P_{3}}=\left[ \alpha _{3}{\bar{y}}_{r}+\beta _{3}({\bar{x}}_{n}-{\bar{x}}_{r})+\gamma _{3}({\bar{r}}_{x(n)}-{\bar{r}}_{x(r)})\right] \left[ \frac{u{\bar{x}}_{n}+v}{u{\bar{x}}_{r}+v}\right] \end{aligned}$$
(4.6)

Here, \((\alpha _{i}, \beta _{i}, \gamma _{i})(i=1,2,3)\) are arbitrarily chosen constants, and \(u(\ne 0)\), v are real numbers or functions of known parameters of the auxiliary variable, which may be readily known or guessed from past surveys, such as the coefficient of variation \(C_{x}\), the correlation coefficient \(\rho _{yx}\), etc.
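To fix ideas, the sketch below evaluates \(t_{P_{1}}\) of (4.4) in the running example for one admissible choice \((u,v)=(1,C_{x})\) and illustrative (not optimal) values of the constants:

```python
# Proposed estimator t_P1 of (4.4) for illustrative constants.
R_bar = rank_x.mean()                        # population mean of the ranks
u, v = 1.0, C_x                              # one admissible choice of (u, v)
psi = u * X_bar / (u * X_bar + v)            # the psi of Theorem 4.1

alpha1, beta1, gamma1 = 1.0, 0.5, 0.1        # illustrative, not the optima
t_P1 = ((alpha1 * y_bar_r
         + beta1 * (X_bar - x_bar_n)
         + gamma1 * (R_bar - r_bar_n))
        * (u * X_bar + v) / (u * x_bar_n + v))
```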

4.1 Some Special Cases of Proposed Estimators

(i) If we set \(\alpha _{i}=1\,(i=1,2,3)\) and \((u,v)=(0,1)\) in (4.1), (4.2) and (4.3), we get the conventional difference type imputation methods based on the dual use of an auxiliary variable, given as:

$$\begin{aligned}&y_{\cdot i,d_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}+\left[ \phi _{1}(x^{*}_{1}-x_{i})+\varphi _{1}(r^{*}_{1}-r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.7)
$$\begin{aligned}&y_{\cdot i,d_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}+\left[ \phi _{2}(x^{*}_{2}+\frac{n}{r}x_{i})+\varphi _{2}(r^{*}_{2}+\frac{n}{r}r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.8)
$$\begin{aligned}&y_{\cdot i,d_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ {\bar{y}}_{r}+\left[ \phi _{3}(x^{*}_{3}+\frac{n}{r}x_{i})+\varphi _{3}(r^{*}_{3}+\frac{n}{r}r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. } \end{aligned}$$
(4.9)

The respective estimators are given by

$$\begin{aligned}&t_{d_{1}}=\left[ {\bar{y}}_{r}+\phi _{1}({\bar{X}}-{\bar{x}}_{n})+\varphi _{1}({\bar{R}}_{x}-{\bar{r}}_{x(n)})\right] \end{aligned}$$
(4.10)
$$\begin{aligned}&t_{d_{2}}=\left[ {\bar{y}}_{r}+\phi _{2}({\bar{X}}-{\bar{x}}_{r})+\varphi _{2}({\bar{R}}_{x}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.11)
$$\begin{aligned}&t_{d_{3}}=\left[ {\bar{y}}_{r}+\phi _{3}({\bar{x}}_{n}-{\bar{x}}_{r})+\varphi _{3}({\bar{r}}_{x(n)}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.12)

(ii) If we set \((u,v)=(0,1)\) in (4.1), (4.2) and (4.3), we get the improved difference type imputation methods given by

$$\begin{aligned}&y_{\cdot i,D_{1}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha ^{*}_{1}{\bar{y}}_{r}+\beta ^{*}_{1}(x^{*}_{1}-x_{i})+\gamma ^{*}_{1}(r^{*}_{1}-r_{i})\right] , &{} {\quad \text {if}\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.13)
$$\begin{aligned}&y_{\cdot i,D_{2}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha ^{*}_{2}{\bar{y}}_{r}+\beta ^{*}_{2}(x^{*}_{2}+\frac{n}{r}x_{i}) +\gamma ^{*}_{2}(r^{*}_{2}+\frac{n}{r}r_{i})\right] , &{} {\quad if\;\;\,i \in {\bar{A}}} \end{array}\right. }\end{aligned}$$
(4.14)
$$\begin{aligned}&y_{\cdot i,D_{3}}= {\left\{ \begin{array}{ll} y_{i}, &{} {\quad \text {if}\;\;\,i \in A } \\ \left( 1-\frac{n}{n-r}\right) {\bar{y}}_{r}+\left[ \frac{n}{n-r}\alpha ^{*}_{3}{\bar{y}}_{r}+\beta ^{*}_{3}(x^{*}_{3}+\frac{n}{r}x_{i}) +\gamma ^{*}_{3}(r^{*}_{3}+\frac{n}{r}r_{i})\right] , &{} {\quad if\;\;\,i \in {\bar{A}}} \end{array}\right. }\nonumber \\ \end{aligned}$$
(4.15)

The respective estimators are given by

$$\begin{aligned}&t_{D_{1}}=\left[ \alpha ^{*}_{1}{\bar{y}}_{r}+\beta ^{*}_{1}({\bar{X}}-{\bar{x}}_{n})+\gamma ^{*}_{1}({\bar{R}}_{x}-{\bar{r}}_{x(n)})\right] \end{aligned}$$
(4.16)
$$\begin{aligned}&t_{D_{2}}=\left[ \alpha ^{*}_{2}{\bar{y}}_{r}+\beta ^{*}_{2}({\bar{X}}-{\bar{x}}_{r})+\gamma ^{*}_{2}({\bar{R}}_{x}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.17)
$$\begin{aligned}&t_{D_{3}}=\left[ \alpha ^{*}_{3}{\bar{y}}_{r}+\beta ^{*}_{3}({\bar{x}}_{n}-{\bar{x}}_{r})+\gamma ^{*}_{3}({\bar{r}}_{x(n)}-{\bar{r}}_{x(r)})\right] \end{aligned}$$
(4.18)

which provide a parallel improvement over the estimators \(t_{d_{i}}(i=1,2,3)\).

4.2 Properties of Proposed Estimators

Theorem 4.1

The biases and MSEs of the estimators \(t_{P_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} B(t_{P_{i}})={\bar{Y}}\left[ \alpha _{i}F_{i}+R_{1}\beta _{i}G_{i}+R_{2}\gamma _{i}H_{i}-1\right] \end{aligned}$$
(4.19)

and

$$\begin{aligned} \begin{aligned} MSE(t_{P_{i}})=&{\bar{Y}}^{2}[\alpha ^{2}_{i}A'_{i}+\beta ^{2}_{i}B'_{i}+\gamma ^{2}_{i}C'_{i} +2\alpha _{i}\beta _{i}D'_{i}+2\alpha _{i}\gamma _{i}E'_{i}+2\beta _{i}\gamma _{i}F'_{i}\\&-2\alpha _{i}G'_{i}-2\beta _{i}H'_{i}-2\gamma _{i}I'_{i}+1] \end{aligned} \end{aligned}$$
(4.20)

The minimum MSEs of the estimators \(t_{P_{i}}(i=1,2,3)\) are given by

$$\begin{aligned} min.MSE(t_{P_{i}})={\bar{Y}}^{2}\left[ 1-\frac{Q^{2}_{i}}{P_{i}}\right] \end{aligned}$$
(4.21)

at the optimum values of \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\) \((i=1,2,3)\), given by

$$\begin{aligned}&\alpha _{i(opt)}=\frac{E_{i}-\beta _{i(opt)}L_{i}-\gamma _{i(opt)}M_{i}}{P_{i}}, \end{aligned}$$
(4.22)
$$\begin{aligned}&\beta _{i(opt)}=\frac{C^{*}_{i}E^{*}_{i}-B^{*}_{i}D^{*}_{i}}{C^{*}_{i}D^{*}_{i}-A^{*}_{i}E^{*}_{i}}\gamma _{i(opt)} \end{aligned}$$
(4.23)
$$\begin{aligned}&\gamma _{i(opt)}=\frac{C^{*}_{i}D^{*}_{i}-A^{*}_{i}E^{*}_{i}}{D_{i}} \end{aligned}$$
(4.24)

where

$$\begin{aligned}&A'_{i}=[1+f_{2}C^{2}_{y}+f_{i}\{\psi C_{x}(3\psi C_{x}-4\rho _{yx}C_{y})\}],\\&B'_{i}=R^{2}_{1}f_{i} C^{2}_{x},\qquad R_{1}=\frac{{\bar{X}}}{{\bar{Y}}},\\&C'_{i}=R^{2}_{2}f_{i} C^{2}_{r},\qquad R_{2}=\frac{\bar{R_{x}}}{{\bar{Y}}},\\&D'_{i}=R_{1}f_{i} C_{x}(2\psi C_{x}-\rho _{yx}C_{y}),\quad E'_{i}=R_{2}f_{i} C_{r}(2\psi \rho _{xr}C_{x}-\rho _{yr}C_{y}),\\&F'_{i}=R_{1}R_{2}f_{i} \rho _{xr}C_{x}C_{r},\quad G'_{i}=[1+\psi f_{i} C_{x}(\psi C_{x}-\rho _{yx}C_{y})],\\&H'_{i}=R_{1}\psi f_{i} C^{2}_{x},\quad I'_{i}=R_{2}\psi f_{i} \rho _{xr}C_{x}C_{r},\\&F_{i}=1-\psi f_{i}C_{x}(\rho _{yx}C_{y}-\psi C_{x}),\\&G_{i}=\psi f_{i}C^{2}_{x}, \quad H_{i}=\psi f_{i}\rho _{xr_{x}}C_{x}C_{r},\\&Q^{2}_{i}=E^{2}_{i}-(A^{*}_{i}E^{*2}_{i}+B^{*}_{i}D^{*2}_{i}-2C^{*}_{i}D^{*}_{i}E^{*}_{i})/D_{i},\\&P_{i}=1+f_{2}C^{2}_{y}+f_{i}[\psi C_{x}(3\psi C_{x}-4\rho _{yx}C_{y})],\\&D_{i}=C^{*2}_{i}-A^{*}_{i}B^{*}_{i},\\&A^{*}_{i}=R^{2}_{1}f_{i}C^{2}_{x}[(1+f_{2}C^{2}_{y})-f_{i}(\rho ^{2}_{yx}-\psi ^{2} C^{2}_{x})],\\&B^{*}_{i}=R^{2}_{2}f_{i}C^{2}_{r}\left[ (1+f_{2}C^{2}_{y})-f_{i}\{\rho ^{2}_{yr}-\psi ^{2}C^{2}_{x}(3-4\rho ^{2}_{xr})-4\psi C_{y}C_{x}(\rho _{yx}-\rho _{yr}\rho _{xr})\}\right] ,\\&C^{*}_{i}=R_{2} f_{i}\rho _{xr}C_{x}C_{r}[R_{1}(1+f_{2}C^{2}_{y}+3f_{i}C^{2}_{x}-4f_{i}\rho _{yx}C_{y}C_{x})\\&\quad -R_{2}\psi f_{i}C_{r}(2\psi \rho _{xr}C_{x}-\rho _{yr}C_{y})],\\&D^{*}_{i}=R_{1}f_{i}C_{x}[(\psi C_{x}-\rho _{yx}C_{y})(\psi ^{2}f_{i}C^{2}_{x}-1)+\psi f_{i}C^{2}_{y}C_{x}(1-\rho ^{2}_{yx})],\\&E^{*}_{i}=R_{2}f_{i}C_{r}[\rho _{yr}C_{y}-\psi \rho _{xr}C_{x}(1-\psi ^{2}f_{i}C^{2}_{x})\\&\quad +\psi f_{i}C_{y}C_{x}\{C_{y}(\rho _{xr}-\rho _{yx}\rho _{yr})+\psi C_{x}(\rho _{yr}-2\rho _{yx}\rho _{xr})\}],\\&E_{i}=1+\psi f_{i}C_{x}(\psi C_{x}-\rho _{yx}C_{y}),\quad L_{i}=R_{1}f_{i}C_{x}(2\psi C_{x}-\rho _{yx}C_{y}),\\&\text {and}\;\;M_{i}=R_{2}f_{i}C_{r}(2\psi \rho _{xr}C_{x}-\rho _{yr}C_{y}). \end{aligned}$$

Proof

Expressing (4.4), (4.5) and (4.6) in terms of errors, we get

$$\begin{aligned} t_{P_{1}}= & {} \left[ {\bar{Y}}\alpha _{1}(1+\xi _{0})-{\bar{X}}\beta _{1}\xi _{1}-{\bar{R}}_{x}\gamma _{1}\xi _{2}\right] (1+\psi \xi _{1})^{-1} \end{aligned}$$
(4.25)
$$\begin{aligned} t_{P_{2}}= & {} \left[ {\bar{Y}}\alpha _{2}(1+\xi _{0})-{\bar{X}}\beta _{2}\xi '_{1}-{\bar{R}}_{x}\gamma _{2}\xi '_{2}\right] (1+\psi \xi '_{1})^{-1} \end{aligned}$$
(4.26)
$$\begin{aligned} t_{P_{3}}= & {} \left[ {\bar{Y}}\alpha _{3}(1+\xi _{0})-{\bar{X}}\beta _{3}(\xi _{1}-\xi '_{1})-{\bar{R}}_{x}\gamma _{3}(\xi _{2}-\xi '_{2})\right] \nonumber \\&(1+\psi \xi _{1})(1+\psi \xi '_{1})^{-1} \end{aligned}$$
(4.27)

where \(\psi =\frac{u{\bar{X}}}{u{\bar{X}}+v}\). We assume that \(|\psi \xi _{1}|<1\) and \(|\psi \xi '_{1}|<1\), so that \((1+\psi \xi _{1})^{-1}\) and \((1+\psi \xi '_{1})^{-1}\) are expandable. Now, expanding \((1+\psi \xi _{1})^{-1}\) and \((1+\psi \xi '_{1})^{-1}\) binomially, multiplying out the right-hand sides (r.h.s.) of (4.25), (4.26) and (4.27) and neglecting the error terms of power greater than two, we get, respectively:

$$\begin{aligned}&t_{P_{1}}={\bar{Y}}\left[ \alpha _{1}\left( 1+\xi _{0}-\psi \xi _{1}+\psi ^{2}\xi ^{2}_{1}-\psi \xi _{0}\xi _{1}\right) \right. \nonumber \\&\quad \left. -\beta _{1}R_{1}(\xi _{1}-\psi \xi ^{2}_{1}) -\gamma _{1}R_{2}(\xi _{2}-\psi \xi _{1}\xi _{2})\right] \end{aligned}$$
(4.28)
$$\begin{aligned}&t_{P_{2}}={\bar{Y}}\left[ \alpha _{2}\left( 1+\xi _{0}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi \xi _{0}\xi '_{1}\right) \right. \nonumber \\&\quad \left. -\beta _{2}R_{1}(\xi '_{1}-\psi \xi '^{2}_{1}) -\gamma _{2}R_{2}(\xi '_{2}-\psi \xi '_{1}\xi '_{2})\right] \end{aligned}$$
(4.29)
$$\begin{aligned}&t_{P_{3}}={\bar{Y}}\left[ \alpha _{3}\left( 1+\xi _{0}+\psi \xi _{1}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi ^{2}\xi _{1}\xi '_{1}+\psi \xi _{0}\xi _{1}-\psi \xi _{0}\xi '_{1}\right) \right. \nonumber \\&\quad \left. -\beta _{3}R_{1}\{(\xi _{1}-\xi '_{1})+\psi (\xi _{1}-\xi '_{1})^{2}\} -\gamma _{3}R_{2}\{(\xi _{2}-\xi '_{2})+\psi (\xi _{1}-\xi '_{1})(\xi _{2}-\xi '_{2})\}\right] \end{aligned}$$
(4.30)

The biases of the estimators \(t_{P_{i}}(i=1,2,3)\) can be derived as:

$$\begin{aligned}&\begin{aligned} B(t_{P_{1}})=&E\left[ t_{P_{1}}-{\bar{Y}}\right] \\ =&E\left[ {\bar{Y}}\left[ \alpha _{1}\left( 1+\xi _{0}-\psi \xi _{1}+\psi ^{2}\xi ^{2}_{1}-\psi \xi _{0}\xi _{1}\right) -\beta _{1}R_{1}(\xi _{1}-\psi \xi ^{2}_{1})\right. \right. \\&\left. \left. -\gamma _{1}R_{2}(\xi _{2}-\psi \xi _{1}\xi _{2})\right] -{\bar{Y}}\right] \end{aligned} \qquad \end{aligned}$$
(4.31)
$$\begin{aligned}&\begin{aligned} B(t_{P_{2}})=&E\left[ t_{P_{2}}-{\bar{Y}}\right] \\ =&E\left[ {\bar{Y}}\left[ \alpha _{2}\left( 1+\xi _{0}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi \xi _{0}\xi '_{1}\right) -\beta _{2}R_{1}(\xi '_{1}-\psi \xi '^{2}_{1})\right. \right. \\&\left. \left. -\gamma _{2}R_{2}(\xi '_{2}-\psi \xi '_{1}\xi '_{2})\right] -{\bar{Y}}\right] \end{aligned} \qquad \qquad \end{aligned}$$
(4.32)
$$\begin{aligned}&\begin{aligned} B(t_{P_{3}})=&E\left[ t_{P_{3}}-{\bar{Y}}\right] \\ =&E\left[ {\bar{Y}}\left[ \alpha _{3}\left( 1+\xi _{0}+\psi \xi _{1}-\psi \xi '_{1}+\psi ^{2}\xi '^{2}_{1}-\psi ^{2}\xi _{1}\xi '_{1}+\psi \xi _{0}\xi _{1}-\psi \xi _{0}\xi '_{1}\right) \right. \right. \\&-\beta _{3}R_{1}\{(\xi _{1}-\xi '_{1})+\psi (\xi _{1}-\xi '_{1})^{2}\}\\&\left. \left. -\gamma _{3}R_{2}\{(\xi _{2}-\xi '_{2})+\psi (\xi _{1}-\xi '_{1})(\xi _{2}-\xi '_{2})\}\right] -{\bar{Y}}\right] \end{aligned}\qquad \qquad \end{aligned}$$
(4.33)

Taking expectations of both sides of (4.31)–(4.33), we get the expressions for biases given in (4.19).

The MSEs of the estimators \(t_{P_{i}}(i=1,2,3)\) to the first-order approximation can be derived as:

$$\begin{aligned}&\begin{aligned} M(t_{P_{1}})=&E\left[ t_{P_{1}}-{\bar{Y}}\right] ^{2}\\ =&E\left[ {\bar{Y}}^{2}\left[ \alpha ^{2}_{1}(1+2\xi _{0}-2\psi \xi _{1}+\xi ^{2}_{0}+3\psi ^{2}\xi ^{2}_{1}-4\psi \xi _{0}\xi _{1}) +\beta ^{2}_{1}R^{2}_{1}\xi ^{2}_{1}+\gamma ^{2}_{1}R^{2}_{2}\xi ^{2}_{2}\right. \right. \\&+2\alpha _{1}\beta _{1}R_{1}(2\psi \xi ^{2}_{1}-\xi _{0}\xi _{1})+2\alpha _{1}\gamma _{1}R_{2}(2\psi \xi _{1}\xi _{2}-\xi _{0}\xi _{2}) +2\beta _{1}\gamma _{1}R_{1}R_{2}\xi _{1}\xi _{2}\\&\left. \left. -2\alpha _{1}(1+\psi ^{2}\xi ^{2}_{1}-\psi \xi _{0}\xi _{1})-2\beta _{1}R_{1}\psi \xi ^{2}_{1}-2\gamma _{1}R_{2}\psi \xi _{1}\xi _{2}+1\right] \right] \end{aligned} \end{aligned}$$
(4.34)
$$\begin{aligned}&\begin{aligned} M(t_{P_{2}})=&E\left[ t_{P_{2}}-{\bar{Y}}\right] ^{2}\\ =&E\left[ {\bar{Y}}^{2}\left[ \alpha ^{2}_{2}(1+2\xi _{0}-2\psi \xi '_{1}+\xi ^{2}_{0}+3\psi ^{2}\xi '^{2}_{1}-4\psi \xi _{0}\xi '_{1}) +\beta ^{2}_{2}R^{2}_{1}\xi '^{2}_{1}+\gamma ^{2}_{2}R^{2}_{2}\xi '^{2}_{2}\right. \right. \\&+2\alpha _{2}\beta _{2}R_{1}(2\psi \xi '^{2}_{1}-\xi _{0}\xi '_{1})+2\alpha _{2}\gamma _{2}R_{2}(2\psi \xi '_{1}\xi '_{2}-\xi _{0}\xi '_{2}) +2\beta _{2}\gamma _{2}R_{1}R_{2}\xi '_{1}\xi '_{2}\\&\left. \left. -2\alpha _{2}(1+\psi ^{2}\xi '^{2}_{1}-\psi \xi _{0}\xi '_{1})-2\beta _{2}R_{1}\psi \xi '^{2}_{1}-2\gamma _{2}R_{2}\psi \xi '_{1}\xi '_{2}+1\right] \right] \end{aligned} \end{aligned}$$
(4.35)
$$\begin{aligned}&\begin{aligned} M(t_{P_{3}})=&E\left[ t_{P_{3}}-{\bar{Y}}\right] ^{2}\\ =&E\left[ {\bar{Y}}^{2}\left[ \alpha ^{2}_{3}(1+2\xi _{0}-2\psi \xi '_{1}+2\psi \xi _{1}+\xi ^{2}_{0}+\psi ^{2}\xi ^{2}_{1}+3\psi ^{2}\xi '^{2}_{1}+4\psi \xi _{0}\xi _{1}\right. \right. \\&\quad -4\psi \xi _{0}\xi '_{1}-4\psi ^{2}\xi _{1}\xi '_{1})\\&+\beta ^{2}_{3}R^{2}_{1}(\xi ^{2}_{1}+\xi '^{2}_{1}-2\xi _{1}\xi '_{1}) +\gamma ^{2}_{3}R^{2}_{2}(\xi ^{2}_{2}+\xi '^{2}_{2}-2\xi _{2}\xi '_{2})\\&+2\alpha _{3}\beta _{3}R_{1}(2\psi \xi ^{2}_{1}+2\psi \xi '^{2}_{1}+\xi _{0}\xi _{1}-\xi _{0}\xi '_{1}-4\psi \xi _{1}\xi '_{1})\\&+2\alpha _{3}\gamma _{3}R_{2}(\xi _{0}\xi _{2}-\xi _{0}\xi '_{2}+2\psi \xi _{1}\xi _{2}-4\psi \xi '_{1}\xi _{2}+2\psi \xi '_{1}\xi '_{2})\\&+2\beta _{3}\gamma _{3}R_{1}R_{2}(\xi _{1}\xi _{2}-\xi _{1}\xi '_{2}-\xi '_{1}\xi _{2}+\xi '_{1}\xi '_{2})\\&-2\alpha _{3}(1+\psi ^{2}\xi '^{2}_{1}+\psi \xi _{0}\xi _{1}-\psi \xi _{0}\xi '_{1}-\psi ^{2}\xi _{1}\xi '_{1})\\&-2\beta _{3}R_{1}(\psi \xi ^{2}_{1}+\psi \xi '^{2}_{1}-2\psi \xi _{1}\xi '_{1})\\&\left. \left. -2\gamma _{3}R_{2}(\psi \xi _{1}\xi _{2}-\psi \xi _{1}\xi '_{2}-\psi \xi '_{1}\xi _{2}+\psi \xi '_{1}\xi '_{2})+1\right] \right] \end{aligned} \end{aligned}$$
(4.36)

Now, taking the expectations of both sides of (4.34)–(4.36), we get the expressions for MSEs given in (4.20).

To obtain the optimum choices of \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\), we differentiate the expressions of the MSEs given in (4.20) partially with respect to \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\) and equate them to zero, which gives

$$\begin{aligned}&\frac{\partial }{\partial \alpha _{i}} MSE(t_{P_{i}}) =\alpha _{i}A'_{i}+\beta _{i}D'_{i}+\gamma _{i}E'_{i}-G'_{i}=0 \end{aligned}$$
(4.37)
$$\begin{aligned}&\frac{\partial }{\partial \beta _{i}}MSE(t_{P_{i}}) =\alpha _{i}D'_{i}+\beta _{i}B'_{i}+\gamma _{i}F'_{i}-H'_{i}=0 \end{aligned}$$
(4.38)
$$\begin{aligned}&\begin{aligned} \frac{\partial }{\partial \gamma _{i}}MSE(t_{P_{i}}) =\alpha _{i}E'_{i}+\beta _{i}F'_{i}+\gamma _{i}C'_{i}-I'_{i}=0 \end{aligned} \end{aligned}$$
(4.39)

Solving Eqs. (4.37)–(4.39) for \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\), we get the optimum values given in (4.22)–(4.24). By putting these optimum values in (4.20), we get the expressions for minimum MSEs of \(t_{P_{i}}\) given in (4.21). \(\square \)

Corollary 4.1

The MSEs of the unbiased estimators \(t_{d_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} MSE(t_{d_{i}})= & {} {\bar{Y}}^{2}\left[ f_{2}C^{2}_{y}+\phi ^{2}_{i}R^{2}_{1}f_{i}C^{2}_{x}+\varphi ^{2}_{i}R^{2}_{2}f_{i}C^{2}_{r}+2\phi _{i}\varphi _{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}\right. \nonumber \\&\left. -2\phi _{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x}-2\varphi _{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}\right] \end{aligned}$$
(4.40)

The minimum MSEs of the estimators \(t_{d_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} min.MSE(t_{d_{i}})={\bar{Y}}^{2}C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}}) \end{aligned}$$
(4.41)

for the optimum values

$$\begin{aligned}&\phi _{i(opt)}=\frac{C_{y}}{C_{x}}\frac{(\rho _{yx}-\rho _{yr_{x}}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.42)
$$\begin{aligned}&\varphi _{i(opt)}=\frac{C_{y}}{C_{r}}\frac{(\rho _{yr_{x}}-\rho _{yx}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.43)

where \(R^{2}_{y.xr_{x}}=\left( \frac{\rho ^{2}_{yx}+\rho ^{2}_{yr_{x}}-2\rho _{yx}\rho _{yr_{x}}\rho _{xr_{x}}}{1-\rho ^{2}_{xr_{x}}}\right) \).

Proof

By putting \(\alpha _{i}=1\) and \(\psi =0\) in (4.20), and writing \(\phi _{i}\), \(\varphi _{i}\) for \(\beta _{i}\), \(\gamma _{i}\), we get the expressions for the MSEs of the estimators \(t_{d_{i}}\) given in (4.40).

To obtain the minimum MSEs of \(t_{d_{i}}\), we differentiate the expressions of the MSEs given in (4.40) partially with respect to \(\phi _{i}\) and \(\varphi _{i}\) and equate them to zero, which gives

$$\begin{aligned} \frac{\partial }{\partial \phi _{i}}MSE(t_{d_{i}})= & {} \phi _{i}R^{2}_{1}f_{i}C^{2}_{x}\nonumber \\&+\varphi _{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r} -R_{1}f_{i}\rho _{yx}C_{y}C_{x}=0 \end{aligned}$$
(4.44)
$$\begin{aligned} \frac{\partial }{\partial \varphi _{i}}MSE(t_{d_{i}})= & {} \phi _{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}\nonumber \\&+\varphi _{i}R^{2}_{2}f_{i}C^{2}_{r} -R_{2}f_{i}\rho _{yr}C_{y}C_{r}=0 \end{aligned}$$
(4.45)

Solving Eqs. (4.44) and (4.45) for \(\phi _{i}\) and \(\varphi _{i}\), we get the optimum values given in (4.42) and (4.43). By putting these optimum values in (4.40), we get the minimum MSE of \(t_{d_{i}}\) given in (4.41). \(\square \)
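The optima (4.42)-(4.43), the multiple correlation coefficient \(R^{2}_{y.xr_{x}}\) and the minimum MSEs (4.41) translate directly into code; a sketch with population-level quantities of the running example:

```python
# Corollary 4.1: optima and minimum MSEs of the estimators t_d_i.
rho_yr = np.corrcoef(y, rank_x)[0, 1]
rho_xr = np.corrcoef(x, rank_x)[0, 1]
C_r = rank_x.std(ddof=1) / rank_x.mean()

R2_yxr = ((rho_yx**2 + rho_yr**2 - 2 * rho_yx * rho_yr * rho_xr)
          / (1 - rho_xr**2))                 # multiple correlation R^2_{y.xr}
phi_opt = (C_y / C_x) * (rho_yx - rho_yr * rho_xr) / (1 - rho_xr**2)
varphi_opt = (C_y / C_r) * (rho_yr - rho_yx * rho_xr) / (1 - rho_xr**2)
min_mse_d = [Y_bar**2 * C_y**2 * (f2 - f * R2_yxr) for f in (f1, f2, f3)]
```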

Corollary 4.2

The biases and MSEs of the estimators \(t_{D_{i}}(i=1,2,3)\) to the first-order approximations are given by

$$\begin{aligned} B(t_{D_{i}})={\bar{Y}}(\alpha ^{*}_{i}-1) \end{aligned}$$
(4.46)

and

$$\begin{aligned} \begin{aligned}&MSE(t_{D_{i}})\\&\quad ={\bar{Y}}^{2}\left[ \alpha ^{*2}_{i}(1+f_{2}C^{2}_{y})+\beta ^{*2}_{i}R^{2}_{1}f_{i}C^{2}_{x}+\gamma ^{*2}_{i}R^{2}_{2}f_{i}C^{2}_{r}-2\alpha ^{*}_{i}\beta ^{*}_{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x}\right. \\&\qquad \left. -2\alpha ^{*}_{i}\gamma ^{*}_{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}+2\beta ^{*}_{i}\gamma ^{*}_{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}-2\alpha ^{*}_{i}\right] \end{aligned}\nonumber \\ \end{aligned}$$
(4.47)

The minimum MSEs of the estimators \(t_{D_{i}}\) are given by

$$\begin{aligned} min.MSE(t_{D_{i}})=\frac{{\bar{Y}}^{2}min.MSE(t_{d_{i}})}{{\bar{Y}}^{2}+min.MSE(t_{d_{i}})} \end{aligned}$$
(4.48)

for the optimum values

$$\begin{aligned}&\alpha ^{*}_{i(opt)}=\frac{1}{1+C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})} \end{aligned}$$
(4.49)
$$\begin{aligned}&\beta ^{*}_{i(opt)}=\alpha ^{*}_{i(opt)}\frac{C_{y}}{C_{x}}\frac{(\rho _{yx}-\rho _{yr_{x}}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.50)
$$\begin{aligned}&\gamma ^{*}_{i(opt)}=\alpha ^{*}_{i(opt)}\frac{C_{y}}{C_{r}}\frac{(\rho _{yr_{x}}-\rho _{yx}\rho _{xr_{x}})}{1-\rho ^{2}_{xr_{x}}} \end{aligned}$$
(4.51)

Proof

By putting \(\psi =0\) in (4.19) and (4.20), we get the expressions for biases and MSEs of the estimators \(t_{D_{i}}\) given in (4.46) and (4.47).

To obtain the minimum MSEs of \(t_{D_{i}}\), we differentiate the expressions of the MSEs given in (4.47) partially with respect to \(\alpha ^{*}_{i}\), \(\beta ^{*}_{i}\) and \(\gamma ^{*}_{i}\) and equate them to zero, which gives

$$\begin{aligned}&\frac{\partial }{\partial \alpha ^{*}_{i}}MSE(t_{D_{i}}) =\alpha ^{*}_{i}(1+f_{2}C^{2}_{y})-\beta ^{*}_{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x} -\gamma ^{*}_{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}-1=0 \end{aligned}$$
(4.52)
$$\begin{aligned}&\frac{\partial }{\partial \beta ^{*}_{i}}MSE(t_{D_{i}}) =\alpha ^{*}_{i}R_{1}f_{i}\rho _{yx}C_{y}C_{x}-\beta ^{*}_{i}R^{2}_{1}f_{i}C^{2}_{x} -\gamma ^{*}_{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}=0 \end{aligned}$$
(4.53)
$$\begin{aligned}&\frac{\partial }{\partial \gamma ^{*}_{i}}MSE(t_{D_{i}}) =\alpha ^{*}_{i}R_{2}f_{i}\rho _{yr}C_{y}C_{r}-\beta ^{*}_{i}R_{1}R_{2}f_{i}\rho _{xr}C_{x}C_{r}-\gamma ^{*}_{i}R^{2}_{2}f_{i}C^{2}_{r}=0\nonumber \\ \end{aligned}$$
(4.54)

Solving Equations (4.52)–(4.54) for \(\alpha ^{*}_{i}\), \(\beta ^{*}_{i}\) and \(\gamma ^{*}_{i}\), we get the optimum values given in (4.49)–(4.51). By putting these optimum values in (4.47), we get the minimum MSE of \(t_{D_{i}}\) given in (4.48). \(\square \)
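The shrinkage relation (4.48) between the minimum MSEs of \(t_{D_{i}}\) and \(t_{d_{i}}\) is a one-liner; this continues the sketch above:

```python
# Corollary 4.2: min MSE(t_D_i) shrinks min MSE(t_d_i) as in eq. (4.48).
min_mse_D = [Y_bar**2 * m / (Y_bar**2 + m) for m in min_mse_d]
```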

4.3 Practicability of the Suggested Estimators

The suggested imputation methods \(t_{P_{i}}(i=1,2,3)\) are designed using the scalars \((\alpha _{i}, \beta _{i}, \gamma _{i})\). Therefore, we have to choose appropriate values for these scalars in order to estimate the population mean. We have seen in (4.22), (4.23) and (4.24) that the optimum values of \(\alpha _{i}\), \(\beta _{i}\) and \(\gamma _{i}\) depend on the parameters \({\bar{Y}}\), \({\bar{X}}\), \(\rho _{yx}\), \(C_{y}\), \(C_{x}\), etc., which may not always be available. In such situations, they can be estimated using a pilot survey or guessed from a past survey and subsequently employed for the estimation of the population mean, as sketched below. Similarly, the optimum values of the scalars used in the estimators \(t_{d_{i}}\) and \(t_{D_{i}}\) can be obtained at the estimation stage.
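A minimal plug-in sketch, assuming MCAR so that statistics of the responding units consistently estimate the corresponding population parameters:

```python
# Plug-in estimates of the unknown parameters from the responding units.
C_y_hat = y[A].std(ddof=1) / y[A].mean()
C_x_hat = x[A].std(ddof=1) / x[A].mean()
rho_yx_hat = np.corrcoef(y[A], x[A])[0, 1]
rho_yr_hat = np.corrcoef(y[A], rank_x[A])[0, 1]
rho_xr_hat = np.corrcoef(x[A], rank_x[A])[0, 1]
# These estimates replace the unknown parameters in (4.22)-(4.24) at the
# estimation stage; under MCAR they are consistent.
```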

4.4 Some Other Members of the Proposed Classes of Estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\)

Many estimators can be generated from the \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\) families of estimators by choosing various values of u and v in (4.4), (4.5) and (4.6). Some of them, in Strategy I, Strategy II and Strategy III, respectively, are given in Table 1.

Table 1 Some members of proposed family of estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\)

The respective imputations for the data can be formed simply by substituting the suitable values of u and v in (4.1), (4.2) and (4.3). The biases and minimum MSEs of the estimators \(t^{(j)}_{P_{i}}(i=1,2,3\,\mathrm{and}\,j=1,2,...,10)\) can easily be obtained by putting the suitable values in (4.19) and (4.20).

5 Theoretical Comparisons

In this section, we compare the proposed estimators with the existing estimators discussed above on the basis of their theoretical results.

5.1 Parallel Comparisons

5.1.1 Comparisons of \(t_{P_{i}}(i=1,2,3)\) with Other Existing Estimators

From (3.2), (3.9), (3.16), (3.23), (3.30), (3.37), (3.47) and (4.20), we get

(i) \(min.MSE(t_{P_{i}})<MSE({\bar{y}}_{r})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+f_{2}C^{2}_{y}>1 \end{aligned}$$
(5.1)

(ii) \(min.MSE(t_{P_{i}})<MSE(t_{R_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+[f_{2}C^{2}_{y}+f_{i}C_{x}(C_{x}-2\rho _{yx}C_{y})]>1 \end{aligned}$$
(5.2)

(iii) \(min.MSE(t_{P_{i}})<MSE(t_{KC_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+[f_{2}C^{2}_{y}+f_{i}(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y})]>1 \end{aligned}$$
(5.3)

(iv) \(min.MSE(t_{P_{i}})<min.MSE(t_{G_{i}})\,\mathrm{or}\,MSE(t_{DP_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+C^{2}_{y}(f_{2}-f_{i}\rho ^{2}_{yx})>1 \end{aligned}$$
(5.4)

(v) \(min.MSE(t_{P_{i}})<min.MSE(t_{BP_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+\frac{1}{1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}\rho ^{2}_{yx})}}>1 \end{aligned}$$
(5.5)

(vi) \(min.MSE(t_{P_{i}})<min.MSE(t^{*}_{BP_{i}})\), if

$$\begin{aligned} \frac{Q^{2}_{i}}{P_{i}}+\left( 1-\frac{H^{2}_{i}}{G_{i}}\right) >1 \end{aligned}$$
(5.6)

5.1.2 Comparisons of \(t_{d_{i}}(i=1,2,3)\) with Other Existing Estimators

From (3.2), (3.9), (3.16), (3.23), (3.30), (3.37), (3.47) and (4.41), we get

(i) \(min.MSE(t_{d_{i}})<MSE({\bar{y}}_{r})\), if

$$\begin{aligned} R^{2}_{y.xr_{x}}>0 \end{aligned}$$
(5.7)

(ii) \(min.MSE(t_{d_{i}})<MSE(t_{R_{i}})\), if

$$\begin{aligned} R^{2}_{y.xr_{x}}+C_{x}(C_{x}-2\rho _{yx}C_{y})>0 \end{aligned}$$
(5.8)

(iii) \(min.MSE(t_{d_{i}})<MSE(t_{KC_{i}})\), if

$$\begin{aligned} R^{2}_{y.xr_{x}}+(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y})>0 \end{aligned}$$
(5.9)

(iv) \(min.MSE(t_{d_{i}})<min.MSE(t_{G_{i}})\,\mathrm{or}\,MSE(t_{DP_{i}})\), if

$$\begin{aligned} (\rho _{yr_{x}}-\rho _{yx}\rho _{xr_{x}})^{2}>0 \end{aligned}$$
(5.10)

(v) \(min.MSE(t_{d_{i}})<min.MSE(t_{BP_{i}})\), if

$$\begin{aligned} \left( \frac{1}{f_{2}-f_{i}R^{2}_{y.xr_{x}}}-\frac{1}{f_{2}-f_{i}\rho ^{2}_{yx}}\right) -C^{2}_{y}>0 \end{aligned}$$
(5.11)

(vi) \(min.MSE(t_{d_{i}})<min.MSE(t^{*}_{BP_{i}})\), if

$$\begin{aligned} \left( 1-\frac{H^{2}_{i}}{G_{i}}\right) -C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})>0 \end{aligned}$$
(5.12)

From (5.10), it is clear that the estimators \(t_{d_{i}}(i=1,2,3)\) are always better than \(t_{DP_{i}}\), which contradicts the statement of Diana and Perri [8] that "Using the same amount of auxiliary information, no further improvement upon the regression estimator is possible, at least if the first order approximation is considered". That statement does not hold here because \(t_{d_{i}}\) additionally exploits the ranks of the auxiliary variable, which carry information beyond the prime auxiliary values.

5.1.3 Comparisons of \(t_{D_{i}}(i=1,2,3)\) with Other Existing Estimators

From (3.2), (3.9), (3.16), (3.23), (3.30), (3.37), (3.47) and (4.48), we get

(i) \(min.MSE(t_{D_{i}})<MSE({\bar{y}}_{r})\), if

$$\begin{aligned} f_{2}C^{2}_{y}\left( 1-\frac{f_{2}}{f_{i}R^{2}_{y.xr_{x}}}\right) >1 \end{aligned}$$
(5.13)

(ii) \(min.MSE(t_{D_{i}})<MSE(t_{R_{i}})\), if

$$\begin{aligned} \left[ 1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] [f_{2}C^{2}_{y}+f_{i}C_{x}(C_{x}-2\rho _{yx}C_{y})]>1 \end{aligned}$$
(5.14)

(iii) \(min.MSE(t_{D_{i}})<MSE(t_{KC_{i}})\), if

$$\begin{aligned} \left[ 1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] [f_{2}C^{2}_{y}+f_{i}(C^{2}_{x}-\rho ^{2}_{yx}C^{2}_{y})]>1 \end{aligned}$$
(5.15)

(iv) \(min.MSE(t_{D_{i}})<min.MSE(t_{G_{i}})\,\mathrm{or}\,MSE(t_{DP_{i}})\), if

$$\begin{aligned} \left[ C^{2}_{y}+\frac{1}{(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] (f_{2}-f_{i}\rho ^{2}_{yx})>1 \end{aligned}$$
(5.16)

(v) \(min.MSE(t_{D_{i}})<min.MSE(t_{BP_{i}})\), if

$$\begin{aligned} \frac{R^{2}_{y.xr_{x}}}{\rho ^{2}_{yx}}>1 \end{aligned}$$
(5.17)

(vi) \(min.MSE(t_{D_{i}})<min.MSE(t^{*}_{BP_{i}})\), if

$$\begin{aligned} \left[ 1+\frac{1}{C^{2}_{y}(f_{2}-f_{i}R^{2}_{y.xr_{x}})}\right] \left( 1-\frac{H^{2}_{i}}{G_{i}}\right) >1 \end{aligned}$$
(5.18)

Thus, the estimators \(t_{P_{i}}(i=1,2,3)\), \(t_{d_{i}}\) and \(t_{D_{i}}\) are better than the traditional estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\) and \(t^{*}_{BP_{i}}\) in parallel if the conditions (5.1)–(5.6), (5.7)–(5.12) and (5.13)–(5.18) are, respectively, satisfied.

5.2 Mutual Comparisons of the Proposed Estimators

Here, we discuss the mutual comparisons of the estimators \(t_{P_{i}}(i=1,2,3)\), \(t_{d_{i}}\) and \(t_{D_{i}}\).

(i) \(min.MSE(t_{\bullet _{1}})<min.MSE(t_{\bullet _{3}})\), if

$$\begin{aligned} r>\left( \frac{nN}{2N-n}\right) \end{aligned}$$
(5.19)

(ii) \(min.MSE(t_{\bullet _{2}})<min.MSE(t_{\bullet _{1}})\), if

$$\begin{aligned} r<n \end{aligned}$$
(5.20)

(iii) \(min.MSE(t_{\bullet _{2}})<min.MSE(t_{\bullet _{3}})\), if

$$\begin{aligned} n<N \end{aligned}$$
(5.21)

This means that, within the respective sets of estimators, the second estimators \(t_{P_{2}}\), \(t_{d_{2}}\) and \(t_{D_{2}}\) are always better than the first estimators \(t_{P_{1}}\), \(t_{d_{1}}\) and \(t_{D_{1}}\) and the third estimators \(t_{P_{3}}\), \(t_{d_{3}}\) and \(t_{D_{3}}\), whereas the first estimators \(t_{P_{1}}\), \(t_{d_{1}}\) and \(t_{D_{1}}\) are better than the estimators \(t_{P_{3}}\), \(t_{d_{3}}\) and \(t_{D_{3}}\), respectively, if condition (5.19) holds; see the sketch below. Similar conditions also hold for the other existing estimators discussed above in the respective strategies.
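Condition (5.19) is easy to check numerically; a tiny sketch (the values used below are the design sizes of Data set-1 in Sect. 6):

```python
# Check of the mutual-comparison condition (5.19): strategy-wise dominance.
def strategy1_beats_strategy3(N: int, n: int, r: int) -> bool:
    return r > n * N / (2 * N - n)           # condition (5.19)

print(strategy1_beats_strategy3(N=60, n=20, r=16))   # Data set-1: True
```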

6 Empirical Comparisons and Computations

To judge the merits of the proposed classes of estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\) over the other considered estimators in the respective strategies, we have chosen 10 real data sets, whose parametric details are given as follows:

Data set-1: [13, p-428]: The data are on capital expenditures (y) and appropriations (x) for the years 1953-1967 on a quarterly basis. These data are from the National Industrial Conference Board. The description of the parameters for this data set is: \(N=60\), \(n=20\), \(r=16\), \({\bar{Y}}=3092.417\), \({\bar{X}}=3319.483\), \(C_{y}=0.3725059\), \(C_{x}=0.4159578\), \(C_{r}=0.5725904\), \(\rho _{yx}=0.8832073\), \(\rho _{yr_{x}}=0.7818964\), \(\rho _{xr_{x}}=0.9592037\).

Data set-2: [13, p-108]: The data present the experience and salary structure of University of Michigan economists in 1983-1984. Let y be the salary (thousands of dollars) and x be the years of experience (defined as years since receiving the Ph.D.). The description of the parameters for these data is: \(N=32\), \(n=12\), \(r=8\), \({\bar{Y}}=47.37812\), \({\bar{X}}=18.375\), \(C_{y}=0.1819515\), \(C_{x}=0.4548528\), \(C_{r}=0.5677532\), \(\rho _{yx}=0.4245114\), \(\rho _{yr_{x}}=0.3368753\), \(\rho _{xr_{x}}=0.9447145\).

Data set-3: [13, p-41]: The data are on the weekly cash inflows (x) and outflows (y) of a business firm for 30 weeks. The description of the parameters for these data is: \(N=30\), \(n=12\), \(r=8\), \({\bar{Y}}=51.73333\), \({\bar{X}}=62.93333\), \(C_{y}=0.4261637\), \(C_{x}=0.3361672\), \(C_{r}=0.5676459\), \(\rho _{yx}=-0.009132783\), \(\rho _{yr_{x}}=-0.02862014\), \(\rho _{xr_{x}}=0.9927513\).

Data set-4: [19, p-108]: A list of 70 villages in a Tehsil of India, along with their population in 1981 and cultivated area (in acres) in the same year, is taken into consideration. Let y be the cultivated area (in acres) and x be the population of the village. The description of the parameters for these data is: \(N=70\), \(n=20\), \(r=15\), \({\bar{Y}}=981.2857\), \({\bar{X}}=1755.529\), \(C_{y}=0.625359\), \(C_{x}=0.8009741\), \(C_{r}=0.57327\), \(\rho _{yx}=0.7779\), \(\rho _{yr_{x}}=0.7588\), \(\rho _{xr_{x}}= 0.8497\).

Data set-5: [18]: Let y be the number of successful students and x be the number of teachers, taken from survey data on 923 districts of Turkey in 2007. The description of the parameters for these data is: \(N=261\), \(n=90\), \(r=70\), \({\bar{Y}}=222.5824\), \({\bar{X}}=306.4483\), \(C_{y}=1.8654\), \(C_{x}=1.7595\), \(C_{r}=0.57623\), \(\rho _{yx}=0.9705\), \(\rho _{yr_{x}}=0.6371\), \(\rho _{xr_{x}}=0.6265\).

Data set-6: [20, p-1111]: Let y be the amount (in $000) of real estate farm loans and x be the amount (in $000) of non-real estate farm loans in different states of the USA during 1997. The details of the parameters for this data set are: \(N=50\), \(n=20\), \(r=8\), \({\bar{Y}}= 878.1626\), \({\bar{X}}=555.4345\), \(C_{y}=1.235167\), \(C_{x}= 1.052916\), \(C_{r}=0.571662\), \(\rho _{yx}=0.8038\), \(\rho _{yr_{x}}=0.7461\), \(\rho _{xr_{x}}=0.9236\).

Data set-7: [7, p-182]: Let y be the number of paralytic polio cases in the placebo group and x be the number of placebo children. The details of the parameters for this data set are: \(N=34\), \(n=12\), \(r=8\), \({\bar{Y}}= 2.588235\), \({\bar{X}}=4.923529\), \(C_{y}=1.233278\), \(C_{x}=1.023331\), \(C_{r}=0.5687383\), \(\rho _{yx}=0.7328235\), \(\rho _{yr_{x}}=0.6571887\), \(\rho _{xr_{x}}=0.8165117\).

Data set-8: [7, p-152]: The data show the number of inhabitants (in 1000s) of large United States cities. Let y be the number of inhabitants in 1930 and x be the number of inhabitants in 1920. The details of the parameters for this data set are: \(N=49\), \(n=15\), \(r=12\), \({\bar{Y}}=127.7959\), \({\bar{X}}=103.1429\), \(C_{y}=0.9634205\), \(C_{x}=1.012237\), \(C_{r}=0.5714601\), \(\rho _{yx}=0.981742\), \(\rho _{yr_{x}}=0.7207159\), \(\rho _{xr_{x}}=0.7915108\).

Data set-9: [7, p-34]: We investigate the food cost of a family as y and the family size as x. The values of the population parameters are: \(N=33\), \(n=10\), \(r=8\), \({\bar{Y}}=27.49091\), \({\bar{X}}=3.727273\), \(C_{y}=0.3685139\), \(C_{x}=0.4094911\), \(C_{r}=0.555573\), \(\rho _{yx}=0.432738\), \(\rho _{yr_{x}}=0.4495658\), \(\rho _{xr_{x}}=0.9820251\).

Data set-10: [14, p-399]: Consider y as the area under wheat in 1964 and x as the area under wheat in 1963. The statistical summary of the population is: \(N=34\), \(n=11\), \(r=8\), \({\bar{Y}}=199.4412\), \({\bar{X}}=208.8824\), \(C_{y}=0.7531797\), \(C_{x}=0.7205298\), \(C_{r}=0.5689992\), \(\rho _{yx}=0.9800867\), \(\rho _{yr_{x}}=0.9152007\), \(\rho _{xr_{x}}=0.9416689\).

We have computed the MSEs of the estimators \({\bar{y}}_{r}\), \(t_{R_{i}}(i=1,2,3)\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\) and the different members of \(t_{P_{i}}\) at their optima on the basis of their theoretical results; they are given in Tables 2 and 3. The relative performance of all the above estimators is computed in terms of the percentage relative efficiency (PRE) with respect to the mean estimator \({\bar{y}}_{r}\). To calculate the PREs of the estimators \({\bar{y}}_{r}\), \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\), \(t_{d_{i}}\), \(t_{D_{i}}\) and \(t_{P_{i}}\) with respect to \({\bar{y}}_{r}\), we have used the formula

$$\begin{aligned} PRE(t., {\bar{y}}_{r})=\frac{V({\bar{y}}_{r})}{MSE(t.)}\times 100 \end{aligned}$$
(6.1)

The results are shown in Table 4.
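As an illustration of (6.1), the following self-contained sketch computes the PRE of the regression estimator \(t_{DP_{1}}\) for Data set-1 from its reported parameters alone; the printed value is our own computation, meant only to illustrate the formula, not to reproduce Table 4.

```python
# PRE of t_DP1 with respect to ybar_r for Data set-1, via eq. (6.1).
N, n, r = 60, 20, 16
Y_bar, C_y, rho_yx = 3092.417, 0.3725059, 0.8832073
f1, f2 = 1 / n - 1 / N, 1 / r - 1 / N

V_ybar_r = Y_bar**2 * f2 * C_y**2                        # eq. (3.2)
mse_DP1 = Y_bar**2 * C_y**2 * (f2 - f1 * rho_yx**2)      # eq. (3.30), i = 1
print(round(100 * V_ybar_r / mse_DP1, 2))                # PRE via eq. (6.1)
```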

Table 2 MSEs of the different estimators
Table 3 MSEs of the different estimators belonging to the proposed families of estimators \(t_{P_{1}}\), \(t_{P_{2}}\) and \(t_{P_{3}}\)
Table 4 PREs of the different estimators with respect to mean estimator \({\bar{y}}_{r}\)

Interpretation of the results:

In Table 2, we see that the lowest MSE values occur for the members \(t^{*}_{P_{i}}(i=1,2,3)\) of the suggested classes of estimators \(t_{P_{i}}\), respectively, in parallel comparison with the estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\), \(t_{d_{i}}\) and \(t_{D_{i}}\) as well as \({\bar{y}}_{r}\). Further, we see that the special members \(t_{D_{i}}(i=1,2,3)\) of the proposed estimators \(t_{P_{i}}\), respectively, have the second lowest MSE values in the parallel comparisons with the other existing estimators. In Table 3, we observe that the MSE values of all the discussed members of the proposed estimators in the respective strategies are the same in Populations 1, 2, 3, 4, 5 and 10 and differ only slightly in Populations 6, 7, 8 and 9, a difference that can be regarded as negligible. Thus, it can be argued that the MSEs of all the members of the proposed classes of estimators (in Table 3) are equal and smaller than those of all the other discussed estimators in the respective strategies. Consequently, their PREs with respect to \({\bar{y}}_{r}\) will also be the same. Therefore, for convenience, we have used the notation \(t_{P_{i}}(i=1,2,3)\) alone for all the members \(t^{(j)}_{P_{i}}(i=1,2,3; j=1,2,...,10)\) in Table 4, which presents the PREs of all the members of the proposed classes of estimators in the respective strategies.

From Table 4, we report that

  1. (i) The performance of the ratio estimators \(t_{R_{i}}(i=1,2,3)\) is good in Populations 1, 4, 5, 6, 7, 8 and 10, while in Populations 2, 3 and 9 it is poor. Note that in Populations 2, 3 and 9 the values of \(\rho _{yx}\frac{C_{y}}{C_{x}}\) are 0.16981, -0.01157 and 0.38943, respectively, which do not satisfy the condition \((\rho _{yx} C_{y}/C_{x})>1/2\) required for \(t_{R_{i}}\) to outperform the mean estimator \({\bar{y}}_{r}\).

  2. (ii) It has been stated theoretically above that the estimators \(t_{G_{i}}(i=1,2,3)\) are always better than \({\bar{y}}_{r}\), \(t_{R_{i}}\) and \(t_{KC_{i}}\), which is confirmed by this empirical study.

  3. (iii) The estimators \(t_{KC_{i}}(i=1,2,3)\) perform well only in Populations 5 and 10, where the numerical values of \(\rho _{yx}\frac{C_{y}}{C_{x}}\) are 1.0289 and 1.02449, respectively. Note that the condition \((\rho _{yx} C_{y}/C_{x})>1\) is satisfied for these populations, while for the remaining populations it does not hold.

  4. (iv) We see that \(t_{DP_{i}}(i=1,2,3)\) and \(t_{G_{i}}\) are equally efficient, and \(t_{BP_{i}}\) provide a parallel improvement over \(t_{DP_{i}}\) in all the populations.

  5. (v) The performance of the estimators \(t^{*}_{BP_{i}}(i=1,2,3)\) is, in parallel, very close to that of \(t_{BP_{i}}\).

  6. (vi) We see that the proposed estimators \(t_{d_{i}}(i=1,2,3)\) are always better than \(t_{DP_{i}}\) and \(t_{BP_{i}}\), respectively, in all the populations. Therefore, it can be argued that the regression type estimator based on the dual use of auxiliary information always outperforms both the conventional regression and difference type estimators, which are based only on the prime information of an auxiliary variable. Hence, the missing data problem can be handled more effectively simply by using the dual (rank) of an auxiliary variable.

  7. (vii) The proposed estimators \(t_{D_{i}}(i=1,2,3)\) provide a parallel improvement over \(t_{d_{i}}\).

  8. (viii) The performance of the proposed classes of estimators \(t_{P_{i}}(i=1,2,3)\) is:

    1. (a) Good (better than \({\bar{y}}_{r}\)) in all the Populations 1-10.

    2. (b) In parallel, more efficient than the estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\) and their special members \(t_{d_{i}}(i=1,2,3)\) and \(t_{D_{i}}\). Thus, \(t_{P_{i}}\) accomplish, in parallel, the maximum gain in efficiency among all the estimators in all the populations considered in this empirical study.

    3. (c) We observe that the second estimator \(t_{P_{2}}\) is always better than the first estimator \(t_{P_{1}}\) and the third estimator \(t_{P_{3}}\).

    4. (d) We also observe that the PREs of the first proposed estimator \(t_{P_{1}}\) are greater than those of the third proposed estimator \(t_{P_{3}}\) in all the populations where condition (5.19) holds.

Thus, the suggested classes of estimators \(t_{P_{i}}(i=1,2,3)\) in parallel outperform all the other estimators \(t_{R_{i}}\), \(t_{G_{i}}\), \(t_{KC_{i}}\), \(t_{DP_{i}}\), \(t_{BP_{i}}\), \(t^{*}_{BP_{i}}\), \(t_{d_{i}}\) and \(t_{D_{i}}\) considered in this study. We see that the second proposed estimator \(t_{P_{2}}\) is always better than the first and third proposed estimators \(t_{P_{1}}\) and \(t_{P_{3}}\). We also see that the proposed estimator \(t_{P_{2}}\) is the most efficient among all the estimators discussed in this study.

On the basis of this empirical study, we conclude that the proposed estimators formulated under the double (rank) use of an auxiliary variable are capable of enhancing precision over the relevant estimators based on the single/prime use of an auxiliary variable.

7 Conclusions

Several notable researchers have addressed the problem of missing data efficiently, but no one has so far discussed imputation techniques using the dual of auxiliary information in the literature. In the present study, we have suggested three imputation techniques and the corresponding estimators using auxiliary information as well as its rank in three different sampling strategies (based on different amounts of auxiliary information) under non-response. In the empirical study comprising 10 real data sets, it has been found that the suggested sets of estimators are, in parallel, more efficient than the usual mean estimator and the methods of Lee et al. [12], Kadilar and Cingi [11], Gira [9], Diana and Perri [8], Bhusan and Pandey [4] and Bhusan et al. [6]. The present study is important for estimating the population mean in survey sampling because it handles missing data more effectively than the works discussed above (based on the prime/single use of auxiliary information), simply by using the dual (rank) of an auxiliary variable. Moreover, at their optimal solutions, the proposed estimators are very simple to apply, from both theoretical and practical perspectives. Hence, the suggested classes of estimators are appreciable and recommended to sampling practitioners when non-response cannot be ignored.