Introduction

When the data set does not follow a normal distribution or contains outliers, parameter estimates are badly affected. To overcome this difficulty and draw reliable conclusions from contaminated data, robust estimators have been defined in statistical analysis. This relies on finding proper estimates of the data location and scale ([9, 10, 13]).

Using auxiliary-variable information in the estimators also increases efficiency. We can list some important studies as follows: Hanif and Shahzad [11] considered the issue of estimating the population variance utilizing the trace of the kernel matrix in the absence of non-response under the simple random sampling (SRS) scheme. Shahzad et al. [25] defined a new class of ratio-type estimators for the population mean. Shahzad et al. [26] proposed a new class of exponential-type estimators based on the known median of the study variable.

Several authors have also studied robust ratio-type estimators when the data contain outliers. Zaman and Bulut [30] studied robust estimators in simple random sampling and extended their work to stratified random sampling [31]. Recently, Ali et al. [1] generalized the Zaman and Bulut [30] estimators and studied the sensitive-data case. Subzar et al. [28] adapted various robust regression techniques to ratio estimators. Shahzad et al. [27] defined a class of regression-type estimators utilizing robust regression tools.

Ranked set sampling (RSS) is an effective design introduced by McIntyre [20]. The efficiency of RSS depends on whether the sample allocation is balanced or unbalanced. Balanced RSS features an equal allocation of the ranked order statistics. It has been shown theoretically and empirically that the variance of the balanced RSS estimator is less than that of the SRS estimator regardless of ranking errors ([3]). In the literature, many authors have shown that this design performs better than SRS and have proposed new RSS designs, such as median ranked set sampling (MRSS) by Muttlak [22], double ranked set sampling by Al-Saleh and Al-Kadiri [7], pair ranked set sampling by Muttlak [21], L ranked set sampling by Al-Nasser [2], neoteric ranked set sampling by Zamanzade and Al-Omari [32], and so on.

We can also list some important papers on improving estimation under ranked set sampling designs: Al-Omari [6] studied ratio estimators under MRSS. Koyuncu [17] studied regression-type estimators under different ranked set sampling designs. Koyuncu [18] proposed regression-type estimators and a more general class of estimators under MRSS. Koyuncu [16] studied difference-cum-ratio and exponential-type estimators under MRSS.

The aim of this study is to propose more general estimators of the population mean using robust statistics under RSS and MRSS.

The article is constructed as follows: In "Robust estimators in SRS" section, we review the recent robust literature in SRS. In "Proposed class of robust-regression-type-estimators under SRS, RSS and MRSS" section, the new generalized robust estimators under SRS, RSS and MRSS are presented, and the mean square error (MSE) is derived up to the first order of approximation. The theoretical efficiency comparison is given in "Efficiency comparison" section. In "Simulation study" section, a simulation study is conducted using a real data set, and our findings are summarized in "Conclusion" section.

Table 1 Summary of tree data

Robust estimators in SRS

When information on an auxiliary variable is known, using it in the estimator can yield more efficient estimates. Moreover, when the normality assumption of the data does not hold, we need to use robust statistics. Moving in this direction, Zaman and Bulut [30] suggested the following robust estimators:

$$\begin{aligned} {\bar{y}}_{z_{1}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}}{\bar{X}}, \end{aligned}$$
(2.1)
$$\begin{aligned} {\bar{y}}_{z_{2}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}+C_{x}}({\bar{X}} +C_{x}), \end{aligned}$$
(2.2)
$$\begin{aligned} {\bar{y}}_{z_{3}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}+\beta _{2}(x)}({\bar{X}} +\beta _{2}(x)), \end{aligned}$$
(2.3)
$$\begin{aligned} {\bar{y}}_{z_{4}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}\beta _{2}(x)+C_{x}}({\bar{X}}\beta _{2}(x) +C_{x}), \end{aligned}$$
(2.4)
$$\begin{aligned} {\bar{y}}_{z_{5}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}C_{x}+\beta _{2}(x)}({\bar{X}}C_{x} +\beta _{2}(x)) \end{aligned}$$
(2.5)

where \(b_{\mathrm{rob}(zb)}\) is the slope (regression) coefficient, calculated from a robust regression method such as least absolute deviations (LAD), least median of squares (LMS), least trimmed squares (LTS), Huber-M, Hampel-M, Tukey-M or Huber-MM. When the data contain outliers, these observations cause problems because they may strongly influence the result, and classical methods can be badly affected by them. The aim of using robust statistics in estimators is to describe the majority of the data well and obtain reliable estimates. These well-known robust methods can be summarized as follows:

LAD is a method which minimizes the sum of absolute errors and is described as

$$\begin{aligned} \min \sum _{i=1}^{n} \mid \epsilon _{i} \mid . \end{aligned}$$
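As an illustration, the LAD criterion can be minimized by iteratively reweighted least squares, down-weighting each point by the inverse of its absolute residual. The sketch below is only illustrative: the `lad_slope` helper, the simulated data and the IRLS approach are our own choices, not part of the original study.

```python
import numpy as np

def lad_slope(x, y, n_iter=100, eps=1e-6):
    """LAD fit of y = a + b*x via iteratively reweighted least squares:
    weighting each point by 1/|residual| approximates the L1 objective."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting values
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), eps)         # cap to avoid division by zero
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    return beta                                      # [intercept, slope]

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 + 2.0 * x + rng.normal(0.0, 0.1, 50)
y[0] = 100.0                                         # one gross outlier
b_ols = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0][1]
b_lad = lad_slope(x, y)[1]
```

With the single outlier, the OLS slope is pulled far from the true value 2, while the LAD slope stays close to it.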

Rather than minimizing the sum of squared residuals as in least squares, the LMS method minimizes the median of the squared residuals

$$\begin{aligned} \min \quad \mathrm{median} (\epsilon _{i}^{2}) \end{aligned}$$

LTS proceeds with OLS after eliminating the most extreme positive or negative residuals. LTS orders the squared residuals from smallest to largest: \((E^2)_{(1)}, (E^2)_{(2)},\ldots ,(E^2)_{(n)}\), and then, it calculates b that minimizes the sum of only the smaller half of the residuals

$$\begin{aligned} \sum _{i=1}^{m} (E^2)_{(i)} \end{aligned}$$

where \(m=[n/2]+1\); the square bracket indicates rounding down to the nearest integer.
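A minimal LTS sketch follows, using random two-point starts and "concentration" steps in the spirit of the FAST-LTS algorithm; the `lts_fit` helper and the simulated data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def lts_fit(x, y, n_starts=50, n_csteps=20, seed=0):
    """LTS for y = a + b*x: from random two-point starts, repeatedly refit
    OLS on the h = [n/2]+1 points with the smallest squared residuals
    (concentration steps), keeping the best fit found."""
    rng = np.random.default_rng(seed)
    n = len(x)
    h = n // 2 + 1                      # m = [n/2] + 1 as in the text
    X = np.column_stack([np.ones_like(x), x])
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=2, replace=False)
        beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        for _ in range(n_csteps):
            keep = np.argsort((y - X @ beta) ** 2)[:h]
            beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta

rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 40)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, 40)
y[:5] = 60.0                            # a cluster of outliers
b_lts = lts_fit(x, y)[1]                # slope recovered from the clean majority
```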

Huber-M is based on minimizing another function of the residuals instead of their squares. The objective function of an M-estimator is given as

$$\begin{aligned} \min \sum _{i=1}^{n} \rho (e_{i}) \end{aligned}$$

where \(\rho\) is a symmetric function of the residuals. Huber's function \(\rho\) is defined as

$$\begin{aligned} \rho (e) = \left\{ \begin{array}{ll} \frac{e^{2}}{2} &{} \ \mid e \mid \le k\\ k\mid e \mid -\frac{k^2}{2} &{} \ \mid e \mid > k \end{array} \right. \end{aligned}$$

The influence function is obtained by taking the derivative

$$\begin{aligned}\varphi (e) = \left\{ \begin{array}{ll} e &{} \ \mid e \mid \le k\\ k\,\mathrm{sign}(e)&{} \ \mid e \mid > k \end{array} \right. \end{aligned}$$

where \(\mathrm{sign}(\cdot )\) is the sign function, defined as

$$\begin{aligned}\mathrm{sign}(e) = \left\{ \begin{array}{ll} -1 &{} \ e< 0\\ 0 &{} \ e=0\\ 1 &{} \ e> 0 \end{array} \right. \end{aligned}$$
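The Huber objective can be minimized by IRLS with weights \(w(e)=\min (1, k/|e|)\) derived from the influence function. The sketch below (the `huber_fit` helper, MAD scaling and simulated data are our own assumptions) shows how a gross outlier is down-weighted:

```python
import numpy as np

def huber_fit(x, y, k=1.345, n_iter=50):
    """Huber M-estimate of y = a + b*x via IRLS.  The weight
    w(u) = min(1, k/|u|) follows from the influence function: quadratic
    loss for small residuals, linear beyond the cutoff k.  Residuals are
    standardized by the MAD scale estimate s."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745   # robust scale
        u = np.abs(r) / max(s, 1e-12)
        w = np.minimum(1.0, k / np.maximum(u, 1e-12))
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    return beta

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 + 2.0 * x + rng.normal(0.0, 0.1, 50)
y[0] = 100.0                       # gross outlier, heavily down-weighted
b_hub = huber_fit(x, y)[1]
```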

The Hampel-M estimation function is defined as

$$\begin{aligned}\rho (y) = \left\{ \begin{array}{ll} \frac{y^2}{2} &{} \ 0<\mid y \mid \le a\\ a\mid y \mid -\frac{a^2}{2} &{} \ a<\mid y \mid \le b\\ \frac{-a}{2(c-b)}(c-\mid y \mid )^2+\frac{a}{2}(b+c-a) &{} \ b<\mid y \mid \le c\\ \frac{a}{2}(b+c-a) &{} \ c<\mid y \mid \end{array} \right. \end{aligned}$$

where \(a=1.7\), \(b=3.4\) and \(c=8.5\).

The Tukey-M (biweight) estimation function is given by

$$\begin{aligned}\rho (y) = \left\{ \begin{array}{ll} \frac{1}{6}(1-(1-(\frac{y}{k})^2)^3) &{} \ \mid y \mid \le k\\ \frac{1}{6} &{} \ \mid y \mid >k \end{array} \right. \end{aligned}$$

where \(k=5\) or \(k=6\).

The Huber-MM estimation method is described as follows:

  • A starting estimate with a high breakdown point (0.5 if possible) is chosen.

  • Residuals are computed as \(e_{i}(T_{0})=y_{i}-T_{0}x_{i}, \quad 1\le i \le n\)

where \(T_{0}\) is the starting estimate. Under the constraint \(b/a=0.5\), b is calculated as below

$$\begin{aligned} b=\frac{1}{n}\sum _{i=1}^{n} \rho \left( \frac{e_{i}(\beta )}{s_{n}}\right) \end{aligned}$$

where \(s_{n}\) is the M-scale estimate, calculated as \(s_{n}=s(e(T_{0}))\). For more information on robust estimators, see [12, 24, 29].

Ali et al. [1] generalized the Zaman and Bulut [30] estimators as

$$\begin{aligned} {\bar{y}}_{z}=\dfrac{{\bar{y}}+b_{i}({\bar{X}}-{\bar{x}})}{(F{\bar{x}}+G)}(F{\bar{X}}+G) \quad \mathrm{for} \quad i=1,2,\ldots ,7 \end{aligned}$$
(2.6)

where i indexes the robust regression methods LAD, LMS, LTS, Huber-M, Hampel-M, Tukey-M and Huber-MM, respectively.

\(F \ne 0\) and G are constants, taken either as 0 or 1 or as known characteristics of the population of N identifiable units, such as \(C_x\), the coefficient of variation, or \(\beta _{2}(x)\), the coefficient of kurtosis. Many new estimators can be generated by choosing suitable values for \(b_{i}\), F and G.
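For illustration, Eq. (2.6) can be computed as below; the robust slope `b_rob` would come from any of the methods above, and the function name and example values are our own (hypothetical), not from the original study.

```python
import numpy as np

def ybar_z(y_sample, x_sample, Xbar, b_rob, F=1.0, G=0.0):
    """Generalized robust ratio-type estimator of Eq. (2.6):
    [ybar + b_rob*(Xbar - xbar)] * (F*Xbar + G) / (F*xbar + G)."""
    ybar = np.mean(y_sample)
    xbar = np.mean(x_sample)
    return (ybar + b_rob * (Xbar - xbar)) * (F * Xbar + G) / (F * xbar + G)

# F=1, G=0 gives ybar_z1; F=1, G=C_x gives ybar_z2, and so on.
# With b_rob = 0, F=1, G=0 the estimator reduces to the classical ratio form:
est = ybar_z([2.0, 4.0], [1.0, 3.0], Xbar=2.0, b_rob=0.0)
```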

The MSE of \({\bar{y}}_{z}\) is given as

$$\begin{aligned} \text {MSE}({\bar{y}}_{z})= & {} \left( \dfrac{1-f}{n}\right) \left[ S^{2}_{y}+ R^2_{FG} S^{2}_{x}+2B_{i} R_{FG} S^{2}_{x}+B^2_{i} S^{2}_{x} \right. \nonumber \\&-\left. 2R_{FG} S_{xy}-2B_{i}S_{xy} \right] \end{aligned}$$
(2.7)

where \(B_{i}\) is the population robust regression coefficient, \(R_{FG}=\dfrac{F{\bar{Y}}}{F{\bar{X}}+G}\), \(S^{2}_{y}= \dfrac{\sum _{i=1} ^{N} (y_{i}-{\bar{Y}})^{2}}{N-1}\) and \(S^{2}_{x}= \dfrac{\sum _{i=1} ^{N} (x_{i}-{\bar{X}})^{2}}{N-1}\) are the unbiased variances of Y and X, respectively, and \(S_{xy}= \dfrac{\sum _{i=1} ^{N} (y_{i}-{\bar{Y}})(x_{i}-{\bar{X}})}{N-1}\) is the covariance between X and Y. \({\bar{X}}\), \({\bar{Y}}\) are the population means and \({\bar{x}}\), \({\bar{y}}\) are the sample means of the auxiliary and study variables, respectively.

Ali et al. [1] then proposed classical regression-type estimators using a robust slope b in place of the least-squares estimator, and found these regression-type estimators to be more efficient than those of Zaman and Bulut [30]. The Ali et al. [1] estimator and its MSE are given as

$$\begin{aligned} {\bar{y}}_{a}={\bar{y}}+b_{i}({\bar{X}}-{\bar{x}}) \quad \mathrm{for} \quad i=1,2,\dots ,7 \end{aligned}$$
(2.8)
$$\begin{aligned} \mathrm{MSE}({\bar{y}}_{a})=\left( \dfrac{1-f}{n}\right) \left[ S^{2}_{y}-2B_{i}S_{xy}+B^2_{i} S^{2}_{x}\right] \end{aligned}$$
(2.9)
Table 2 MSE of estimators under SRS

Proposed class of robust-regression-type-estimators under SRS, RSS and MRSS

Rank-based sampling designs have found many applications in other fields, including environmental monitoring [19], clinical trials and genetic quantitative trait loci mapping [8] and medicine ([33, 34]). In this section, the RSS and MRSS designs are described, and a new class of generalized robust estimators is introduced under these designs.

RSS design

The RSS design introduced by McIntyre [20] can be described as follows:

  1. Select a simple random sample of size \(m^{2}\) units from the target finite population and divide it into m samples, each of size m.

  2. Rank the units within each sample in increasing magnitude using personal judgment, eye inspection or a concomitant variable.

  3. Select the ith ranked unit from the ith sample.

  4. Repeat steps 1 through 3, k times if needed, to obtain an RSS of size \(n=mk\).
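The steps above can be sketched as follows; for simplicity each set is drawn independently rather than partitioning a single sample of \(m^2\) units, and ranking on x is taken to be perfect (both are our simplifications).

```python
import numpy as np

def rss_sample(x_pop, y_pop, m, k, seed=0):
    """Balanced RSS of size n = m*k: in each of k cycles, draw m sets of m
    units, rank each set on the auxiliary variable x, and measure the i-th
    ranked unit of the i-th set."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(k):                                       # step 4: k cycles
        for i in range(m):
            idx = rng.choice(len(x_pop), size=m, replace=False)  # step 1
            ranked = idx[np.argsort(x_pop[idx])]             # step 2: rank on x
            pick = ranked[i]                                 # step 3: i-th order statistic
            xs.append(x_pop[pick])
            ys.append(y_pop[pick])
    return np.array(xs), np.array(ys)

x_pop = np.arange(100, dtype=float)
y_pop = 2.0 * x_pop                    # y perfectly concordant with x
x_rss, y_rss = rss_sample(x_pop, y_pop, m=3, k=4)   # n = 12
```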

Let

$$\begin{aligned} (X_{11j},Y_{11j}),(X_{12j},Y_{12j}),\ldots ,(X_{1mj},Y_{1mj});\\ (X_{21j},Y_{21j}),(X_{22j},Y_{22j}),\ldots ,(X_{2mj},Y_{2mj});\\ \vdots \\ (X_{m1j},Y_{m1j}),(X_{m2j},Y_{m2j}),\ldots ,(X_{mmj},Y_{mmj}) \end{aligned}$$

be m independent bivariate random samples with pdf f(x, y), each of size m in the jth cycle, \((j=1,2,\ldots ,k)\). Let

$$\begin{aligned}&(X_{i(1:m)j}, Y_{i[1:m]j}),(X_{i(2:m)j}, Y_{i[2:m]j}),\ldots ,\\&\quad (X_{i(m:m)j}, Y_{i[m:m]j}) \end{aligned}$$

be the order statistics of \(X_{i1j},X_{i2j},\) \(\ldots ,X_{imj}\) and the judgment order of \(Y_{i1j},Y_{i2j},\ldots ,Y_{imj}\) \((i=1,2,\ldots ,m)\), where round parentheses () indicate that the ranking of X is perfect and square brackets [] indicate that the ranking of Y may have errors. Assume the units measured using RSS are

$$\begin{aligned} (X_{1(1:m)j}, Y_{1[1:m]j}),(X_{2(2:m)j}, Y_{2[2:m]j}),\ldots , (X_{m(m:m)j}, Y_{m[m:m]j}). \end{aligned}$$

Then, the RSS estimators of population mean for the study and auxiliary variables can be written as

$$\begin{aligned} {\bar{y}}_{[\mathrm{RSS}]}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}Y_{i[i:m]j}, \end{aligned}$$
(3.1)
$$\begin{aligned} {\bar{x}}_{(\mathrm{RSS})}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}X_{i(i:m)j}. \end{aligned}$$
(3.2)
Table 3 MSE of estimators under RSS

MRSS design

The MRSS design can be described in the following steps:

  1. Select m random samples, each of size m bivariate units, from the population of interest.

  2. Rank the units within each sample by visual inspection or any other cost-free method with respect to the variable of interest.

  3. If m is odd, select the \(((m+1)/2)\)th-smallest ranked unit X together with the associated Y from each set, i.e., the median of each set. If m is even, select the (m/2)th ranked unit X together with the associated Y from the first m/2 sets, and the \(((m+2)/2)\)th ranked unit X together with the associated Y from the other sets.

  4. The whole process can be repeated k times if needed to obtain a sample of size \(n=mk\) units.
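The MRSS steps can be sketched as below; as before, sets are drawn independently for simplicity and ranking on x is assumed perfect (our simplifications, not part of the original design description).

```python
import numpy as np

def mrss_sample(x_pop, y_pop, m, k, seed=0):
    """Median RSS of size n = m*k.  Per cycle, draw m sets of size m and
    rank each on x.  Odd m: take the ((m+1)/2)-th ranked unit from every
    set.  Even m: take rank m/2 from the first m/2 sets and rank (m+2)/2
    from the remaining sets."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(k):
        for i in range(m):
            idx = rng.choice(len(x_pop), size=m, replace=False)
            ranked = idx[np.argsort(x_pop[idx])]
            if m % 2 == 1:
                r = (m + 1) // 2                 # median rank for odd m
            else:
                r = m // 2 if i < m // 2 else (m + 2) // 2
            pick = ranked[r - 1]                 # ranks are 1-based
            xs.append(x_pop[pick])
            ys.append(y_pop[pick])
    return np.array(xs), np.array(ys)

x_pop = np.arange(100, dtype=float)
y_pop = 2.0 * x_pop
x_o, y_o = mrss_sample(x_pop, y_pop, m=3, k=2)   # MRSSO, n = 6
x_e, y_e = mrss_sample(x_pop, y_pop, m=4, k=2)   # MRSSE, n = 8
```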

For odd and even sample sizes, the units measured using MRSS are denoted by MRSSO and MRSSE, respectively. For odd sample size,

$$\begin{aligned}&(X_{1(\frac{m+1}{2}:m)j}, Y_{1[\frac{m+1}{2}:m]j}), (X_{2(\frac{m+1}{2}:m)j}, Y_{2[\frac{m+1}{2}:m]j}),\ldots ,\\&\quad (X_{m(\frac{m+1}{2}:m)j}, Y_{m[\frac{m+1}{2}:m]j}) \end{aligned}$$

and the sample mean estimators using MRSSO are given as

$$\begin{aligned} {\bar{x}}_{(\mathrm{MRSSO})}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}X_{i(\frac{m+1}{2}:m)j} \end{aligned}$$
(3.3)
$$\begin{aligned} {\bar{y}}_{[\mathrm{MRSSO}]}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}Y_{i[\frac{m+1}{2}:m]j}. \end{aligned}$$
(3.4)

For even sample size,

$$\begin{aligned}&(X_{1(\frac{m}{2}:m)j}, Y_{1[\frac{m}{2}:m]j}),(X_{2(\frac{m}{2}:m)j}, Y_{2[\frac{m}{2}:m]j}),\ldots , \\&\quad (X_{\frac{m}{2}(\frac{m}{2}:m)j}, Y_{\frac{m}{2}[\frac{m}{2}:m]j}),\\&\quad (X_{\frac{m+2}{2}(\frac{m+2}{2}:m)j}, Y_{\frac{m+2}{2}[\frac{m+2}{2}:m]j}),\ldots , \\&\quad (X_{m(\frac{m+2}{2}:m)j}, Y_{m[\frac{m+2}{2}:m]j}) \end{aligned}$$

the sample means of X and Y using MRSSE are given as

$$\begin{aligned} {\bar{x}}_{(\mathrm{MRSSE})}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k} \left( \sum \limits _{i=1}^{\frac{m}{2}}X_{i(\frac{m}{2}:m)j} +\sum \limits _{i=\frac{m+2}{2}}^{m}X_{i(\frac{m+2}{2}:m)j}\right) \end{aligned}$$
(3.5)

and

$$\begin{aligned} {\bar{y}}_{[\mathrm{MRSSE}]}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k} \left( \sum \limits _{i=1}^{\frac{m}{2}}Y_{i[\frac{m}{2}:m]j} +\sum \limits _{i=\frac{m+2}{2}}^{m}Y_{i[\frac{m+2}{2}:m]j}\right) \end{aligned}$$
(3.6)

Proposed class of robust-regression-type-estimators in SRS

In this section, we generalize the Zaman and Bulut [30] and Ali et al. [1] estimators under SRS as

$$\begin{aligned} {\bar{y}}_{N(\mathrm{SRS})}=[{\bar{y}}+b_{i}({\bar{X}}-{\bar{x}})]\left( \frac{F{\bar{X}}+G}{F{\bar{x}}+G}\right) ^\alpha \end{aligned}$$
(3.7)

To obtain the MSE of the generalized estimators, let us define the following error terms under SRS:

$$\begin{aligned} e_{0}=\,({\bar{y}}-{\bar{Y}})/{{\bar{Y}}}, \quad e_{1}=({\bar{x}}-{\bar{X}})/{{\bar{X}}}. \end{aligned}$$

We can rewrite \({\bar{y}}_{N(\mathrm{SRS})}\) in terms of the e's, up to the first order of approximation, as follows:

$$\begin{aligned}&({\bar{y}}_{N(\mathrm{SRS})}-{\bar{Y}})^{2}\nonumber \\&\quad \cong {\bar{Y}}^2e_{0}^2+B_{i}^2{\bar{X}}^{2}e_{1}^2+{\bar{Y}}^2 \alpha ^2 \psi ^2e_{1}^2\nonumber \\&\qquad -2B_{i}{\bar{X}}{\bar{Y}}e_{0}e_{1} -2{\bar{Y}}^2 \alpha \psi e_{0}e_{1}+2\alpha B_{i}{\bar{X}}{\bar{Y}} \psi e_{1}^2 \end{aligned}$$
(3.8)

where \(\psi =\dfrac{F{\bar{X}}}{F{\bar{X}}+G}\). The expectations of the e terms are

$$\begin{aligned} E(e_{0}^2)= & {} \frac{(1-f)}{n}\frac{S_{y}^2}{{\bar{Y}}^2}, \quad E(e_{1}^2)=\frac{(1-f)}{n}\frac{S_{x}^2}{{\bar{X}}^2}, \quad \\ E(e_{0}e_{1})= & {} \frac{(1-f)}{n}\frac{S_{yx}}{{\bar{X}}{\bar{Y}}} \end{aligned}$$

Substituting these expectations into Eq. 3.8, we obtain the MSE as

$$\begin{aligned} \mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})\cong & {} \frac{(1-f)}{n} [S_{y}^2+B_{i}^2S_{x}^2+ \alpha ^2 R_{FG}^2S_{x}^2\nonumber \\&-2B_{i}S_{yx}-2\alpha R_{FG}S_{yx}+2\alpha B_{i} R_{FG}S_{x}^2] \end{aligned}$$
(3.9)

Putting \(\alpha =0\) in \(\mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})\), we recover \(\mathrm{MSE}({\bar{y}}_{a})\) of Ali et al. [1], while for \(\alpha =1\) and appropriate constants F and G we recover the MSEs of the Zaman and Bulut [30] estimators.
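This reduction can be checked numerically. The sketch below (function names and the toy parameter values are ours) implements Eqs. (3.9) and (2.9) and verifies that they coincide at \(\alpha =0\):

```python
def mse_proposed_srs(alpha, B, Sy2, Sx2, Sxy, Ybar, Xbar, F, G, n, N):
    """First-order MSE of the proposed estimator under SRS, Eq. (3.9)."""
    f = n / N
    R = F * Ybar / (F * Xbar + G)          # R_FG
    return ((1 - f) / n) * (Sy2 + B ** 2 * Sx2 + alpha ** 2 * R ** 2 * Sx2
                            - 2 * B * Sxy - 2 * alpha * R * Sxy
                            + 2 * alpha * B * R * Sx2)

def mse_ali(B, Sy2, Sx2, Sxy, n, N):
    """MSE of the regression-type estimator of Ali et al. [1], Eq. (2.9)."""
    f = n / N
    return ((1 - f) / n) * (Sy2 - 2 * B * Sxy + B ** 2 * Sx2)

# Toy population quantities (illustrative only)
m_prop = mse_proposed_srs(0.0, B=1.5, Sy2=4.0, Sx2=2.0, Sxy=2.0,
                          Ybar=10.0, Xbar=5.0, F=1.0, G=0.0, n=10, N=100)
m_ali = mse_ali(B=1.5, Sy2=4.0, Sx2=2.0, Sxy=2.0, n=10, N=100)
```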

Table 4 MSE of estimators under MRSS

Proposed class of robust-regression-type-estimators in RSS and MRSS

In RSS and MRSS, some authors have used auxiliary-variable information in the estimators to improve efficiency. Koyuncu [18] proposed regression-type estimators under MRSS. Al-Omari and Bouza [4] studied ratio estimators of the population mean with missing values using RSS. Al-Omari et al. [5] suggested ratio-type estimators of the mean using extreme RSS. Jemain et al. [15] suggested a modified ratio estimator for the population mean using double MRSS.

In this section, we also generalize the estimators proposed in the previous section to new classes based on the RSS and MRSS designs as follows

$$\begin{aligned} {\bar{y}}_{N(j)}= \left[ {\bar{y}}_{(j)}+b_{i(j)} ({\bar{X}}-{\bar{x}}_{(j)})\right] \left( \frac{{F{\bar{X}}+G}}{F{\bar{x}}_{(j)}+G}\right) ^\alpha \end{aligned}$$
(3.10)

where (j) represents the sampling design such as SRS, RSS and MRSS.

To obtain the MSE of the suggested class of estimators in Eq. (3.10), let us define the following error terms and expand as before:

$$\begin{aligned} e_{0(j)}= & {}\, ({\bar{y}}_{(j)}-{\bar{Y}})/{{\bar{Y}}}, \quad e_{1(j)}=({\bar{x}}_{(j)}-{\bar{X}})/{{\bar{X}}} \nonumber \\ \mathrm{MSE}({\bar{y}}_{N(j)})\cong & {} \, E \left[ {\bar{Y}}^2 e_{0(j)}^2+B_{i}^2{\bar{X}}^2 e_{1(j)}^2+ \alpha ^2 \psi ^2{\bar{Y}}^2 e_{1(j)}^2 \right. \nonumber \\&- \left. 2B_{i}{\bar{Y}}{\bar{X}}e_{0(j)}e_{1(j)}-2\alpha {\bar{Y}}^2 \psi e_{0(j)}e_{1(j)} \right. \nonumber \\&+ \left. 2B_{i} \alpha \psi {\bar{Y}}{\bar{X}} e_{1(j)}^2 \right] \end{aligned}$$
(3.11)

where \(\psi =\dfrac{F{\bar{X}}}{F{\bar{X}}+G}\). One can easily obtain the specific MSE from Eq. 3.11 by substituting the expectation terms belonging to each design. For example, if (j) represents the RSS design, we can write the following notations:

$$\begin{aligned} e_{0(\mathrm{RSS})}= & {} \, ({\bar{y}}_{(\mathrm{RSS})}-{\bar{Y}})/{{\bar{Y}}}, \quad \nonumber \\ e_{1(\mathrm{RSS})}= & {} \, ({\bar{x}}_{(\mathrm{RSS})}-{\bar{X}})/{{\bar{X}}} \nonumber \\ E(e_{0(\mathrm{RSS})}^2)= & {} \, \frac{var({\bar{y}}_{\mathrm{RSS}})}{{\bar{Y}}^2}\nonumber \\= & {} \, \frac{1}{{\bar{Y}}^2}\left[ \frac{S_{y}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})^2\right] ,\nonumber \\ E(e_{1(\mathrm{RSS})}^2)= & {} \, \frac{var({\bar{x}}_{\mathrm{RSS}})}{{\bar{X}}^2}\nonumber \\= & {} \, \frac{1}{{\bar{X}}^2}\left[\frac{S_{x}^2}{m} -\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2 \right],\ \nonumber \\ E(e_{0(\mathrm{RSS})}e_{1(\mathrm{RSS})})= & {} \, \frac{cov({\bar{x}}_{\mathrm{RSS}},{\bar{y}}_{\mathrm{RSS}})}{{\bar{X}}{\bar{Y}}}\nonumber \\= & {} \, \frac{1}{{\bar{X}}{\bar{Y}}}\left[ \frac{S_{xy}}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})(\mu _{x(i)}-{\bar{X}})\right] .\nonumber \\ \mathrm{MSE}({\bar{y}}_{N(\mathrm{RSS})})\cong & {} \left[ \left[ \frac{S_{y}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m} (\mu _{y[i]}-{\bar{Y}})^2\right] \right. \nonumber \\&\left. +B_{i}^2\left[ \frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2\right] \right. \nonumber \\&\left. + \alpha ^2 R_{FG}^2\left[ \frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2\right] \right. \nonumber \\&\left. -2B_{i}\left[ \frac{S_{xy}}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})(\mu _{x(i)}-{\bar{X}})\right] \right. \nonumber \\&\left. -2\alpha R_{FG} \left[ \frac{S_{xy}}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})(\mu _{x(i)}-{\bar{X}})\right] \right. \nonumber \\&\left. +2B_{i} \alpha R_{FG} \left[ \frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)} -{\bar{X}})^2\right] \right] \end{aligned}$$
(3.12)

If (j) represents the MRSS design and the sample size n is odd, we can write the following notations:

$$\begin{aligned} e_{0(\mathrm{MRSS}_o)}= & {} \, \frac{{\bar{y}}_{\mathrm{MRSS}_o}-{\bar{Y}}}{{\bar{Y}}} \quad \mathrm{and} \quad \\ e_{1(\mathrm{MRSS}_o)}= & {} \, \frac{{\bar{x}}_{\mathrm{MRSS}_o}-{\bar{X}}}{{\bar{X}}}\ \\ E(e_{0(\mathrm{MRSS}_o)}^2)= & {} \, \frac{1}{n{\bar{Y}}^2}S_{y[\frac{n+1}{2}]}^2,\\ E(e_{1(\mathrm{MRSS}_o)}^2)= & {} \, \frac{1}{n{\bar{X}}^2}S_{x(\frac{n+1}{2})}^2,\ \\ E(e_{0(\mathrm{MRSS}_o)}e_{1(\mathrm{MRSS}_o)})= & {} \, \frac{1}{n{\bar{X}}{\bar{Y}}}S_{xy(\frac{n+1}{2})} \end{aligned}$$

Then, we can get the MSE as follows:

$$\begin{aligned}&\mathrm{MSE}({\bar{y}}_{N(\mathrm{MRSS}_o)})\nonumber \\&\quad \cong \frac{1}{n} \left[ S_{y\left[ \frac{n+1}{2}\right] }^2+B_{i}^2S_{x(\frac{n+1}{2})}^2 + \alpha ^2 R_{FG}^2 S_{x(\frac{n+1}{2})}^2 \right. \nonumber \\&\qquad \left. -2B_{i}S_{xy(\frac{n+1}{2})} -2\alpha R_{FG} S_{xy(\frac{n+1}{2})} +2B_{i} \alpha R_{FG} S_{x(\frac{n+1}{2})}^2 \right] \end{aligned}$$
(3.13)

If (j) represents the MRSS design and the sample size n is even, we can write the following notations:

$$\begin{aligned} E(e_{0(\mathrm{MRSS}_e)}^2)= & {} \, \frac{1}{2n{\bar{Y}}^2} \left( S_{y[\frac{n}{2}]}^2+S_{y[\frac{n+2}{2}]}^2 \right) ,\ \\ E(e_{1(\mathrm{MRSS}_e)}^2)= & {} \, \frac{1}{2n{\bar{X}}^2} \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2 \right) \quad \mathrm{and} \\ E(e_{0(\mathrm{MRSS}_e)}e_{1(\mathrm{MRSS}_e)})= & {} \, \frac{1}{2n{\bar{X}}{\bar{Y}}} \left( S_{yx(\frac{n}{2})}+S_{yx(\frac{n+2}{2})} \right) . \end{aligned}$$

After defining these terms, the procedure is straightforward: substituting them into Eq. 3.11 gives the MSE

$$\begin{aligned}&\mathrm{MSE}({\bar{y}}_{N(\mathrm{MRSS}_e)}) \nonumber \\&\quad \cong \frac{1}{2n}\left[ \left( S_{y[\frac{n}{2}]}^2+S_{y[\frac{n+2}{2}]}^2 \right) +B_{i}^2 \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2 \right) \right. \nonumber \\&\qquad +\left. \alpha ^2 R_{FG}^2 \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2 \right) \right. \nonumber \\&\qquad \left. -2B_{i} \left( S_{yx(\frac{n}{2})}+S_{yx(\frac{n+2}{2})}\right) \right. \nonumber \\&\qquad \left. -2\alpha R_{FG} \left( S_{yx(\frac{n}{2})}+S_{yx(\frac{n+2}{2})}\right) \right. \nonumber \\&\qquad \left. +2B_{i} \alpha R_{FG} \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2\right) \right] \end{aligned}$$
(3.14)
Table 5 Percent relative efficiency (PRE) of proposed estimators under RSS over SRS

Efficiency comparison

In this section, the efficiency comparisons between the MSE equations are obtained as follows:

(1) Comparison of the Ali et al. [1] estimator \({\bar{y}}_{z}\), which is the general form of the Zaman and Bulut [30] estimators, with the proposed estimator under SRS:

\(\mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})<\mathrm{MSE}({\bar{y}}_{z})\) if

$$\begin{aligned}&\alpha ^2 R_{FG}^2S_{x}^2+2\alpha B_{i} R_{FG}S_{x}^2-2\alpha R_{FG}S_{yx}< R^2_{FG} S^{2}_{x}\nonumber \\&\qquad +2B_{i} R_{FG} S^{2}_{x}-2R_{FG} S_{xy}\nonumber \\&(\alpha ^2-1) R_{FG}^2S_{x}^2 +2(\alpha -1) (B_{i} R_{FG}S_{x}^2-R_{FG}S_{yx})< 0 \end{aligned}$$
(4.1)

From Eq. (4.1), one can easily see that for \(\alpha =1\) the two MSEs coincide.

(2) Comparison of the Ali et al. [1] estimator \({\bar{y}}_{a}\) with the proposed estimator under SRS:

\(\mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})< \mathrm{MSE}({\bar{y}}_{a})\) if

$$\begin{aligned} \alpha [\alpha R_{FG}^2S_{x}^2-2R_{FG}S_{yx}+2 B_{i} R_{FG}S_{x}^2] <0 \end{aligned}$$
(4.2)

When conditions (4.1)–(4.2) are satisfied, the proposed estimator is more efficient than the existing estimators.
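Condition (4.2) is easy to check numerically for given population quantities; the helper below is an illustrative sketch (the function name and the toy values are ours):

```python
def proposed_beats_ali(alpha, B, Sx2, Sxy, Ybar, Xbar, F, G):
    """Condition (4.2): MSE(ybar_N(SRS)) < MSE(ybar_a) holds iff
    alpha * (alpha * R^2 * Sx2 - 2 * R * Sxy + 2 * B * R * Sx2) < 0."""
    R = F * Ybar / (F * Xbar + G)          # R_FG
    return alpha * (alpha * R ** 2 * Sx2 - 2 * R * Sxy + 2 * B * R * Sx2) < 0

# With R = 2 the bracket equals 4 - 40 + 4 = -32, so alpha = 1 satisfies
# (4.2); alpha = 0 gives equality with the Ali et al. [1] MSE, not a gain.
gain = proposed_beats_ali(alpha=1.0, B=1.0, Sx2=1.0, Sxy=10.0,
                          Ybar=10.0, Xbar=5.0, F=1.0, G=0.0)
```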

Table 6 PRE of proposed estimators under MRSS over SRS

Simulation study

In this section, a real data set also used by Jemain et al. [14] illustrates the efficiency of the proposed estimators over existing ones. The data consist of the height and the diameter at breast height of 399 trees. Height (H), in feet, is the study variable and diameter (D) is the auxiliary variable. The summary statistics of the data are given in Table 1, and the scatter plot is given in Fig. 1; from the scatter plot, we can say that the data set contains outliers. In the simulation study, we selected 10000 samples of different sizes (\(n=3,4,5,7,8\)) under the SRS, RSS and MRSS designs using R version 3.1.1 [23]. We calculated the MSEs of the Zaman and Bulut [30], Ali et al. [1] and proposed estimators under SRS, given in Table 2, based on linear, Huber-M, LMS, LTS, LAD, S and MM robust estimators; this table also shows the efficiency of different robust betas within the same type of estimator. The same procedure is extended to the RSS and MRSS estimators in Tables 3 and 4. We can summarize the simulation study as follows:

  • From Table 2, the proposed estimator \({\bar{y}}_{N(\mathrm{SRS})_{2}}\), which uses the third quantile of the auxiliary variable and the LMS robust beta, is the best for all sample sizes.

  • From Table 3, the proposed estimator \({\bar{y}}_{N(\mathrm{RSS})_{4}}\), which uses the third quantile of the auxiliary variable and the LMS robust beta, is the best for all sample sizes under the RSS design.

  • From Table 4, under the MRSS design, \({\bar{y}}_{N(\mathrm{MRSS})_{4}}\) with the third quantile and the LMS robust beta is the best when the sample size is \(n=3,4,5\). As the sample size increases, the proposed \({\bar{y}}_{N(\mathrm{MRSS})_{4}}\) with the S robust beta performs better.

  • The PREs of the estimators under the different sampling designs are given in Tables 5 and 6. These tables show that the ranked set sampling designs outperform SRS in all simulation cases.

  • Comparing efficiency across sample sizes, the proposed estimator is more efficient than the existing estimators for all sample sizes; the efficiency gain over existing estimators is especially high for small sample sizes.

Fig. 1

Scatter plot of the tree data

Conclusion

When the data contain outliers or do not satisfy the normality assumption, robust statistics are needed to obtain reliable estimates. Moving in this direction, we have considered robust techniques for estimating the population mean. First, we examined the recently proposed robust estimators in SRS and defined a more general class of estimators in SRS of which these estimators are members. Then, we extended our theoretical findings to other sampling designs, namely RSS and MRSS. A general form of the estimators of the population mean and their MSE formulas are obtained, and the theoretical MSEs of the robust family are given for each design. To assess the performance of our estimators, we conducted a simulation study using a real data set. Comparing the existing estimators with ours, we conclude that the proposed estimators perform better than the recently proposed Zaman and Bulut [30] and Ali et al. [1] estimators. As future work, these robust estimators can be introduced under new RSS designs, and new robust methods can also be proposed.