Generalized robust-regression-type estimators under different ranked set sampling

Koyuncu, Nursel; Al-Omari, Amer Ibrahim

doi:10.1007/s40096-020-00360-7


Original Research
Published: 17 November 2020

Volume 15, pages 29–40, (2021)

Mathematical Sciences


Abstract

In this paper, we propose new generalized robust estimators of the population mean under different ranked set sampling designs. Robust estimators were recently defined by Zaman and Bulut (Commun Stat Theory Methods 48(8):2039–2048, 2019a) and Ali et al. (Commun Stat Theory Methods, 2019. https://doi.org/10.1080/03610926.2019.1645857) under simple random sampling. We generalize these robust-type estimators so that the estimators of Zaman and Bulut (2019a) and Ali et al. (2019) are members of our generalized class. We also extend our results to ranked set and median ranked set sampling designs. The simulation study shows that the proposed robust-type estimators perform better than the existing ones.


Introduction

When a data set does not follow a normal distribution or contains outliers, parameter estimates can be badly affected. To overcome this difficulty and draw reliable conclusions from contaminated data, robust estimators are used in statistical analysis. These rely on finding proper estimates of the location and scale of the data [9, 10, 13].

Using information on an auxiliary variable in the estimators also increases efficiency. Some important studies are as follows: Hanif and Shahzad [11] considered estimating the population variance using the trace of a kernel matrix in the absence of non-response under the simple random sampling (SRS) scheme. Shahzad et al. [25] defined a new class of ratio-type estimators for the population mean. Shahzad et al. [26] proposed a new class of exponential-type estimators based on the known median of the study variable.

Several authors have also studied robust ratio-type estimators for data containing outliers. Zaman and Bulut [30] studied robust estimators in simple random sampling and extended them to stratified simple random sampling [31]. Recently, Ali et al. [1] generalized the estimators of Zaman and Bulut [30] and studied the case of sensitive data. Subzar et al. [28] adapted various robust regression techniques to ratio estimators. Shahzad et al. [27] defined a class of regression-type estimators utilizing robust regression tools.

Ranked set sampling (RSS) is an effective design introduced by McIntyre [20]. The efficiency of RSS depends on whether the sampling allocation is balanced or unbalanced. Balanced RSS features an equal allocation of the ranked order statistics. It has been shown theoretically and empirically that the variance of the balanced RSS estimator is less than that of the SRS estimator regardless of ranking errors [3]. In the literature, many authors have shown that this design performs better than SRS and have proposed new RSS designs such as median ranked set sampling (MRSS) by Muttlak [22], double ranked set sampling by Al-Saleh and Al-Kadiri [7], pair ranked set sampling by Muttlak [21], L ranked set sampling by Al-Nasser [2], neoteric ranked set sampling by Zamanzade and Al-Omari [32] and so on.

Some important papers on improving estimation under ranked set sampling designs are as follows: Al-Omari [6] studied ratio estimators under MRSS. Koyuncu [17] studied regression-type estimators under different ranked set sampling designs. Koyuncu [18] proposed regression-type estimators and a more general class of estimators under MRSS. Koyuncu [16] studied difference-cum-ratio and exponential-type estimators under MRSS.

The aim of this study is to propose more general estimators of the population mean using robust statistics under RSS and MRSS.

The article is organized as follows: In "Robust estimators in SRS" section, we review the recent robust literature in SRS. In "Proposed class of robust-regression-type-estimators under SRS, RSS and MRSS" section, the new generalized robust estimators under SRS, RSS and MRSS are presented and the mean square error (MSE) is derived up to the first order of approximation. The theoretical efficiency comparison is given in "Efficiency comparison" section. In "Simulation study" section, a simulation study is conducted using a real data set, and our findings are summarized in "Conclusion" section.

Table 1 Summary of tree data


Robust estimators in SRS

When information on an auxiliary variable is known, using it in the estimator can yield more efficient estimates. Moreover, when the normality assumption does not hold, robust statistics are needed. In this direction, Zaman and Bulut [30] suggested the following robust estimators:

$$\begin{aligned} {\bar{y}}_{z_{1}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}}{\bar{X}}, \end{aligned}$$

(2.1)

$$\begin{aligned} {\bar{y}}_{z_{2}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}+C_{x}}({\bar{X}} +C_{x}), \end{aligned}$$

(2.2)

$$\begin{aligned} {\bar{y}}_{z_{3}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}+\beta _{2}(x)}({\bar{X}} +\beta _{2}(x)), \end{aligned}$$

(2.3)

$$\begin{aligned} {\bar{y}}_{z_{4}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}\beta _{2}(x)+C_{x}}({\bar{X}}\beta _{2}(x) +C_{x}), \end{aligned}$$

(2.4)

$$\begin{aligned} {\bar{y}}_{z_{5}}= & {} \frac{{\bar{y}}+b_{\mathrm{rob}(zb)}({\bar{X}}-{\bar{x}})}{{\bar{x}}C_{x}+\beta _{2}(x)}({\bar{X}}C_{x} +\beta _{2}(x)) \end{aligned}$$

(2.5)

where $b_{\mathrm{rob}(zb)}$, denoted $b_{i}$ below, is the slope (regression coefficient) calculated from a robust regression method such as least absolute deviations (LAD), least median of squares (LMS), least trimmed squares (LTS), Huber-M, Hampel-M, Tukey-M or Huber-MM. When the data contain outliers, these observations may strongly influence the results of classical methods. The aim of using robust statistics in estimators is to describe the majority of the data well and obtain reliable estimates. These well-known robust methods can be summarized as follows:

LAD is a method which minimizes the sum of absolute errors and is described as

$$\begin{aligned} \min \sum _{i=1}^{n} \mid \epsilon _{i} \mid . \end{aligned}$$

The LMS method, rather than minimizing the sum of squared residuals, minimizes the median of the squared residuals:

$$\begin{aligned} \min \quad \mathrm{median} (\epsilon _{i}^{2}) \end{aligned}$$

LTS proceeds with OLS after eliminating the most extreme positive or negative residuals. LTS orders the squared residuals from smallest to largest: $(E^2)_{(1)}, (E^2)_{(2)},\ldots ,(E^2)_{(n)}$, and then calculates the b that minimizes the sum of only the smaller half of the squared residuals

$$\begin{aligned} \sum _{i=1}^{m} (E^2)_{(i)} \end{aligned}$$

where $m=[n/2]+1$; the square bracket indicates rounding down to the nearest integer.
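The three objective functions above can be sketched as follows. This is a minimal illustration that only evaluates the LAD, LMS and LTS criteria for a given candidate line; it does not search for the minimizing slope. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def residuals(slope, intercept, x, y):
    """Residuals of the candidate line y = intercept + slope * x."""
    return y - (intercept + slope * x)

def lad_objective(r):
    """LAD criterion: sum of absolute residuals."""
    return float(np.sum(np.abs(r)))

def lms_objective(r):
    """LMS criterion: median of the squared residuals."""
    return float(np.median(r ** 2))

def lts_objective(r):
    """LTS criterion: sum of the m = [n/2] + 1 smallest squared residuals."""
    m = r.size // 2 + 1
    return float(np.sum(np.sort(r ** 2)[:m]))

# Toy data (hypothetical): four points on y = 2x and one gross outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 40.0])
r = residuals(2.0, 0.0, x, y)

print(lad_objective(r), lms_objective(r), lts_objective(r), float(np.sum(r ** 2)))
```

For the line $y=2x$, the single outlier contributes its full squared residual to the least-squares criterion, whereas the LMS and LTS criteria ignore it and the LAD criterion grows only linearly with it.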

Huber-M estimation is based on minimizing a function of the residuals other than the sum of squared errors. The objective function of the M-estimator is given as

$$\begin{aligned} \min \sum _{i=1}^{n} \rho (e_{i}) \end{aligned}$$

where $\rho$ is a symmetric function of the residuals. Huber’s function $\rho$ is defined as

$$\begin{aligned} \rho (e) = \left\{ \begin{array}{ll} \frac{e^{2}}{2} &{} \ \mid e \mid \le k\\ k\mid e \mid -\frac{k^2}{2} &{} \ \mid e \mid > k \end{array} \right. \end{aligned}$$

The influence function is obtained by taking the derivative of $\rho$:

$$\begin{aligned}\varphi (e) = \left\{ \begin{array}{ll} e &{} \ \mid e \mid \le k\\ k\,\mathrm{sgn}(e)&{} \ \mid e \mid > k \end{array} \right. \end{aligned}$$

where sgn(.) is the sign function, defined as

$$\begin{aligned}\mathrm{sgn}(e) = \left\{ \begin{array}{ll} -1 &{} \ e< 0\\ 0 &{} \ e=0\\ 1 &{} \ e> 0 \end{array} \right. \end{aligned}$$
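As a small illustration of Huber's $\rho$ and its derivative, the sketch below evaluates both functions with NumPy. The default tuning constant `k=1.345` is a commonly used value and is an assumption here; the paper does not state which constant was used.

```python
import numpy as np

def huber_rho(e, k=1.345):
    """Huber's rho: quadratic for |e| <= k, linear beyond k."""
    e = np.asarray(e, dtype=float)
    return np.where(np.abs(e) <= k, 0.5 * e ** 2, k * np.abs(e) - 0.5 * k ** 2)

def huber_psi(e, k=1.345):
    """Derivative of rho: e for |e| <= k, k * sgn(e) otherwise."""
    e = np.asarray(e, dtype=float)
    return np.clip(e, -k, k)

# With k = 1, a residual of 3 is penalized linearly rather than quadratically.
print(float(huber_rho(0.5, k=1.0)), float(huber_rho(3.0, k=1.0)))
print(float(huber_psi(0.5, k=1.0)), float(huber_psi(-3.0, k=1.0)))
```

Because the derivative is bounded by $k$ in absolute value, a single large residual has only a limited effect on the fitted slope.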

The Hampel-M estimation function is defined as

$$\begin{aligned}\rho (y) = \left\{ \begin{array}{ll} \frac{y^2}{2} &{} \ \mid y \mid \le a\\ a\mid y \mid -\frac{a^2}{2} &{} \ a<\mid y \mid \le b\\ \frac{-a}{2(c-b)}(c-\mid y \mid )^2+\frac{a}{2}(b+c-a) &{} \ b<\mid y \mid \le c\\ \frac{a}{2}(b+c-a) &{} \ c<\mid y \mid \end{array} \right. \end{aligned}$$

where $a=1.7$, $b=3.4$ and $c=8.5$.

Tukey M estimation function is given by

$$\begin{aligned}\rho (y) = \left\{ \begin{array}{ll} \frac{1}{6}(1-(1-(\frac{y}{k})^2)^3) &{} \ \mid y \mid \le k\\ \frac{1}{6} &{} \ \mid y \mid >k \end{array} \right. \end{aligned}$$

where $k=5$ or $k=6$.

Huber MM estimation method is described as follows:

A starting estimate with a high breakdown point (0.5 if possible) is chosen.
Residuals are calculated as $e_{i}(T_{0})=y_{i}-T_{0}x_{i}, \quad 1\le i \le n$,

where $T_{0}$ is the starting estimate. Under the constraint $b/a=0.5$, b is calculated as

$$\begin{aligned} b=\frac{1}{n}\sum _{i=1}^{n} \rho \left( \frac{e_{i}(\beta )}{s_{n}}\right) \end{aligned}$$

where $s_{n}$ is the M-scale estimate, calculated as $s_{n}=s(e(T_{0}))$. For more information on robust estimators, see [12, 24, 29].

Ali et al. [1] generalized Zaman and Bulut [30] estimators as

$$\begin{aligned} {\bar{y}}_{z}=\dfrac{{\bar{y}}+b_{i}({\bar{X}}-{\bar{x}})}{(F{\bar{x}}+G)}(F{\bar{X}}+G) \quad \mathrm{for} \quad i=1,2,\ldots ,7 \end{aligned}$$

(2.6)

where i indexes the robust regression methods LAD, LMS, LTS, Huber-M, Hampel-M, Tukey-M and Huber-MM, respectively.

Here $F \ne 0$ and G are constants that are either 0 or 1 or known characteristics of the population, such as the coefficient of variation $C_x$ and the coefficient of kurtosis $\beta _{2}(x)$, for a population of N identifiable units. Many new estimators can be generated using suitable choices of $b_{i}$, F and G.
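To make the role of F and G concrete, the following sketch evaluates Eq. (2.6) and lists the (F, G) choices that reproduce Eqs. (2.1) to (2.5). The function name and all numerical values are hypothetical and chosen only for illustration; the slope `b` is assumed to have been obtained beforehand from one of the robust regression methods.

```python
def ratio_regression_estimate(y_bar, x_bar, X_bar, b, F=1.0, G=0.0):
    """Eq. (2.6): [y_bar + b (X_bar - x_bar)] (F X_bar + G) / (F x_bar + G)."""
    return (y_bar + b * (X_bar - x_bar)) * (F * X_bar + G) / (F * x_bar + G)

# Hypothetical sample and population summaries.
y_bar, x_bar, X_bar, b = 10.0, 4.0, 5.0, 2.0
C_x, beta2 = 0.5, 3.0

# (F, G) pairs that give Eqs. (2.1)-(2.5), in that order.
choices = {
    "2.1": (1.0, 0.0),
    "2.2": (1.0, C_x),
    "2.3": (1.0, beta2),
    "2.4": (beta2, C_x),
    "2.5": (C_x, beta2),
}
for eq, (F, G) in choices.items():
    print(eq, ratio_regression_estimate(y_bar, x_bar, X_bar, b, F, G))
```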

The MSE of ${\bar{y}}_{z}$ is given as

$$\begin{aligned} \text {MSE}({\bar{y}}_{z})= & {} \left( \dfrac{1-f}{n}\right) \left[ S^{2}_{y}+ R^2_{FG} S^{2}_{x}+2B_{i} R_{FG} S^{2}_{x}+B^2_{i} S^{2}_{x} \right. \nonumber \\&-\left. 2R_{FG} S_{xy}-2B_{i}S_{xy} \right] \end{aligned}$$

(2.7)

where $B_{i}$ is the robust regression coefficient of the population, $f=n/N$ is the sampling fraction, $R_{FG}=\dfrac{F{\bar{Y}}}{F{\bar{X}}+G}$, $S^{2}_{y}= \dfrac{\sum _{i=1} ^{N} (y_{i}-{\bar{Y}})^{2}}{N-1}$ and $S^{2}_{x}= \dfrac{\sum _{i=1} ^{N} (x_{i}-{\bar{X}})^{2}}{N-1}$ are the unbiased variances of Y and X, respectively, and $S_{xy}= \dfrac{\sum _{i=1} ^{N} (y_{i}-{\bar{Y}})(x_{i}-{\bar{X}})}{N-1}$ is the covariance between X and Y. Here ${\bar{X}}$ and ${\bar{Y}}$ are the population means, and ${\bar{x}}$ and ${\bar{y}}$ are the sample means of the auxiliary and study variables, respectively.

Ali et al. [1] then proposed classical regression-type estimators that use a robust slope b in place of the least-squares estimate. They found that their regression-type estimators are more efficient than those of Zaman and Bulut [30]. The estimator of Ali et al. [1] and its MSE are given as

$$\begin{aligned} {\bar{y}}_{a}={\bar{y}}+b_{i}({\bar{X}}-{\bar{x}}) \quad \mathrm{for} \quad i=1,2,\dots ,7 \end{aligned}$$

(2.8)

$$\begin{aligned} \mathrm{MSE}({\bar{y}}_{a})=\left( \dfrac{1-f}{n}\right) \left[ S^{2}_{y}-2B_{i}S_{xy}+B^2_{i} S^{2}_{x}\right] \end{aligned}$$

(2.9)

Table 2 MSE of estimators under SRS


Proposed class of robust-regression-type-estimators under SRS, RSS and MRSS

Rank-based sampling designs have found many applications in fields including environmental monitoring [19], clinical trials and genetic quantitative trait loci mapping [8], and medicine [33, 34]. In this section, the RSS and MRSS designs are described and new generalized robust estimators under these designs are introduced.

RSS design

The RSS design introduced by McIntyre [20] can be described as follows:

1. Select a simple random sample of $m^{2}$ units from the target finite population and divide it into m samples, each of size m.
2. Rank the units within each sample in increasing order of magnitude by personal judgment, eye inspection or a concomitant variable.
3. Select the ith ranked unit from the ith sample.
4. Repeat steps 1 through 3 k times, if needed, to obtain an RSS of size $n=mk$.
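The four steps above can be sketched in code as follows. This is a minimal illustration under stated assumptions: ranking is done on the auxiliary variable X without error, units are drawn without replacement within a cycle, and the function name `draw_rss` and the toy population are hypothetical.

```python
import numpy as np

def draw_rss(x, y, m, k, rng):
    """Draw one ranked set sample of size n = m * k, ranking on x."""
    xs, ys = [], []
    for _ in range(k):                                   # step 4: k cycles
        # Step 1: m*m units, split into m sets of size m.
        sets = rng.choice(x.size, size=m * m, replace=False).reshape(m, m)
        for i in range(m):
            # Step 2: rank the units of set i by their x value.
            ordered = sets[i][np.argsort(x[sets[i]])]
            # Step 3: keep the i-th ranked unit of the i-th set.
            unit = ordered[i]
            xs.append(x[unit])
            ys.append(y[unit])
    return np.array(xs), np.array(ys)

rng = np.random.default_rng(0)
pop_x = np.arange(100.0)          # toy population, not the tree data
pop_y = 2.0 * pop_x + 1.0
xs, ys = draw_rss(pop_x, pop_y, m=3, k=2, rng=rng)
print(xs.mean(), ys.mean())       # RSS sample means of X and Y
```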

Let

$$\begin{aligned}&(X_{11j},Y_{11j}),(X_{12j},Y_{12j}),\ldots ,(X_{1mj},Y_{1mj});\\&(X_{21j},Y_{21j}),(X_{22j},Y_{22j}),\ldots ,(X_{2mj},Y_{2mj});\\&\qquad \vdots \\&(X_{m1j},Y_{m1j}),(X_{m2j},Y_{m2j}),\ldots ,(X_{mmj},Y_{mmj}) \end{aligned}$$

be m independent bivariate random samples with pdf f(x, y), each of size m in the jth cycle, $(j=1,2,\ldots ,k)$. Let

$$\begin{aligned}&(X_{i(1:m)j}, Y_{i[1:m]j}),(X_{i(2:m)j}, Y_{i[2:m]j}),\ldots ,\\&\quad (X_{i(m:m)j}, Y_{i[m:m]j}) \end{aligned}$$

be the order statistics of $X_{i1j},X_{i2j},$ $\ldots ,X_{imj}$ and the judgment order of $Y_{i1j},Y_{i2j},\ldots ,Y_{imj}$ $(i=1,2,\ldots ,m)$, where round parentheses () and square brackets [] indicate that the ranking of X is perfect and ranking of Y has errors. Assume measured units using RSS are

$$\begin{aligned} (X_{1(1:m)j}, Y_{1[1:m]j}),(X_{2(2:m)j}, Y_{2[2:m]j}),\ldots , (X_{m(m:m)j}, Y_{m[m:m]j}). \end{aligned}$$

Then, the RSS estimators of population mean for the study and auxiliary variables can be written as

$$\begin{aligned} {\bar{y}}_{[\mathrm{RSS}]}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}Y_{i[i:m]j}, \end{aligned}$$

(3.1)

$$\begin{aligned} {\bar{x}}_{(\mathrm{RSS})}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}X_{i(i:m)j}. \end{aligned}$$

(3.2)

Table 3 MSE of estimators under RSS


MRSS design

The MRSS design can be described as in the following steps:

1. Select m random samples, each of m bivariate units, from the population of interest.
2. Rank the units within each sample by visual inspection or any other cost-free method with respect to the variable of interest.
3. If m is odd, select from each set the $((m+1)/2)$th smallest ranked unit X together with the associated Y, i.e., the median of each set. If m is even, select from each of the first m/2 sets the $(m/2)$th ranked unit X together with the associated Y, and from each of the remaining sets the $((m+2)/2)$th ranked unit X together with the associated Y.
4. Repeat the whole process k times, if needed, to obtain a sample of size $n=mk$ units.
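The rank selected from each set under MRSS can be sketched as below. As before, this is an illustration that assumes perfect ranking on X and sampling without replacement within a cycle; `mrss_ranks` and `draw_mrss` are hypothetical names.

```python
import numpy as np

def mrss_ranks(m):
    """1-based rank taken from each of the m sets under MRSS (step 3)."""
    if m % 2 == 1:
        return [(m + 1) // 2] * m                        # median of every set
    return [m // 2] * (m // 2) + [(m + 2) // 2] * (m // 2)

def draw_mrss(x, y, m, k, rng):
    """Draw one median ranked set sample of size n = m * k, ranking on x."""
    ranks = mrss_ranks(m)
    xs, ys = [], []
    for _ in range(k):
        sets = rng.choice(x.size, size=m * m, replace=False).reshape(m, m)
        for i in range(m):
            ordered = sets[i][np.argsort(x[sets[i]])]
            unit = ordered[ranks[i] - 1]                 # convert to 0-based index
            xs.append(x[unit])
            ys.append(y[unit])
    return np.array(xs), np.array(ys)

print(mrss_ranks(3))   # odd m: the median rank from every set
print(mrss_ranks(4))   # even m: rank m/2 from the first half, (m+2)/2 from the rest
```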

For odd and even sample sizes, the units measured using MRSS are denoted by MRSSO and MRSSE, respectively. For odd sample size, the measured units are

$$\begin{aligned}&(X_{1(\frac{m+1}{2}:m)j}, Y_{1[\frac{m+1}{2}:m]j}), (X_{2(\frac{m+1}{2}:m)j}, Y_{2[\frac{m+1}{2}:m]j}),\ldots ,\\&\quad (X_{m(\frac{m+1}{2}:m)j}, Y_{m[\frac{m+1}{2}:m]j}) \end{aligned}$$

and the sample mean estimators under MRSSO are

$$\begin{aligned} {\bar{x}}_{(\mathrm{MRSSO})}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}X_{i(\frac{m+1}{2}:m)j} \end{aligned}$$

(3.3)

$$\begin{aligned} {\bar{y}}_{[\mathrm{MRSSO}]}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k}\sum \limits _{i=1}^{m}Y_{i[\frac{m+1}{2}:m]j}. \end{aligned}$$

(3.4)

For even sample size, the measured units are

$$\begin{aligned}&(X_{1(\frac{m}{2}:m)j}, Y_{1[\frac{m}{2}:m]j}),(X_{2(\frac{m}{2}:m)j}, Y_{2[\frac{m}{2}:m]j}),\ldots , (X_{\frac{m}{2}(\frac{m}{2}:m)j}, Y_{\frac{m}{2}[\frac{m}{2}:m]j}),\\&\quad (X_{\frac{m+2}{2}(\frac{m+2}{2}:m)j}, Y_{\frac{m+2}{2}[\frac{m+2}{2}:m]j}), (X_{\frac{m+4}{2}(\frac{m+2}{2}:m)j}, Y_{\frac{m+4}{2}[\frac{m+2}{2}:m]j}),\ldots ,\\&\quad (X_{m(\frac{m+2}{2}:m)j}, Y_{m[\frac{m+2}{2}:m]j}) \end{aligned}$$

and the sample means of X and Y under MRSSE are

$$\begin{aligned} {\bar{x}}_{(\mathrm{MRSSE})}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k} \left( \sum \limits _{i=1}^{\frac{m}{2}}X_{i(\frac{m}{2}:m)j} +\sum \limits _{i=\frac{m+2}{2}}^{m}X_{i(\frac{m+2}{2}:m)j}\right) \end{aligned}$$

(3.5)

and

$$\begin{aligned} {\bar{y}}_{[\mathrm{MRSSE}]}= & {} \frac{1}{mk}\sum \limits _{j=1}^{k} \left( \sum \limits _{i=1}^{\frac{m}{2}}Y_{i[\frac{m}{2}:m]j} +\sum \limits _{i=\frac{m+2}{2}}^{m}Y_{i[\frac{m+2}{2}:m]j}\right) \end{aligned}$$

(3.6)

Proposed class of robust-regression-type-estimators in SRS

In this section, we generalize the estimators of Zaman and Bulut [30] and Ali et al. [1] under SRS as

$$\begin{aligned} {\bar{y}}_{N(\mathrm{SRS})}=\left[ {\bar{y}}+b_{i}({\bar{X}}-{\bar{x}})\right] \left( \frac{F{\bar{X}}+G}{F{\bar{x}}+G}\right) ^\alpha \end{aligned}$$

(3.7)

To obtain the MSE of the generalized estimators, let us define the following relative error terms under SRS:

$$\begin{aligned} e_{0}=\,({\bar{y}}-{\bar{Y}})/{{\bar{Y}}}, \quad e_{1}=({\bar{x}}-{\bar{X}})/{{\bar{X}}}. \end{aligned}$$

Writing $\psi =\dfrac{F{\bar{X}}}{F{\bar{X}}+G}$, we can expand ${\bar{y}}_{N(\mathrm{SRS})}$ in terms of $e_{0}$ and $e_{1}$ to the first order of approximation as follows:

$$\begin{aligned}&({\bar{y}}_{N(\mathrm{SRS})}-{\bar{Y}})^{2}\nonumber \\&\quad \cong {\bar{Y}}^2e_{0}^2+B_{i}^2{\bar{X}}^{2}e_{1}^2+{\bar{Y}}^2 \alpha ^2 \psi ^2e_{1}^2\nonumber \\&\qquad -2B_{i}{\bar{X}}{\bar{Y}}e_{0}e_{1} -2{\bar{Y}}^2 \alpha \psi e_{0}e_{1}+2\alpha B_{i}{\bar{X}}{\bar{Y}} \psi e_{1}^2 \end{aligned}$$

(3.8)
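The intermediate steps behind Eq. (3.8) can be sketched as follows. The sketch writes $\psi =F{\bar{X}}/(F{\bar{X}}+G)$ as in Eq. (3.11), replaces the sample slope $b_{i}$ by its population counterpart $B_{i}$, and drops all terms of order higher than one in $e_{0}$ and $e_{1}$; these are the usual assumptions of a first-order approximation rather than steps stated explicitly in the text.

$$\begin{aligned} {\bar{y}}= & {} {\bar{Y}}(1+e_{0}), \quad {\bar{x}}={\bar{X}}(1+e_{1}), \quad F{\bar{x}}+G=(F{\bar{X}}+G)(1+\psi e_{1}),\\ \left( \frac{F{\bar{X}}+G}{F{\bar{x}}+G}\right) ^\alpha= & {} (1+\psi e_{1})^{-\alpha }\cong 1-\alpha \psi e_{1},\\ {\bar{y}}_{N(\mathrm{SRS})}-{\bar{Y}}\cong & {} {\bar{Y}}e_{0}-B_{i}{\bar{X}}e_{1}-\alpha \psi {\bar{Y}}e_{1}. \end{aligned}$$

Squaring the last line gives the six terms of the first-order expansion of $({\bar{y}}_{N(\mathrm{SRS})}-{\bar{Y}})^{2}$.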

The expectations of e terms are

$$\begin{aligned} E(e_{0}^2)= & {} \frac{(1-f)}{n}\frac{S_{y}^2}{{\bar{Y}}^2}, \quad E(e_{1}^2)=\frac{(1-f)}{n}\frac{S_{x}^2}{{\bar{X}}^2}, \quad \\ E(e_{0}e_{1})= & {} \frac{(1-f)}{n}\frac{S_{yx}}{{\bar{X}}{\bar{Y}}} \end{aligned}$$

Taking the expectation of Eq. (3.8), substituting these terms and using $R_{FG}=\dfrac{F{\bar{Y}}}{F{\bar{X}}+G}$, we obtain the MSE as

$$\begin{aligned} \mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})\cong & {} \frac{(1-f)}{n} [S_{y}^2+B_{i}^2S_{x}^2+ \alpha ^2 R_{FG}^2S_{x}^2\nonumber \\&-2B_{i}S_{yx}-2\alpha R_{FG}S_{yx}+2\alpha B_{i} R_{FG}S_{x}^2] \end{aligned}$$

(3.9)

Putting $\alpha =0$ in $\mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})$ gives the MSE of the Ali et al. [1] estimator ${\bar{y}}_{a}$ in Eq. (2.9). Putting $\alpha =1$ gives the MSE of ${\bar{y}}_{z}$ in Eq. (2.7), so for appropriate constants F and G we also obtain the MSEs of the Zaman and Bulut [30] estimators.
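These special cases can be checked numerically. The sketch below codes Eqs. (3.9), (2.9) and (2.7) as three separate functions and compares them; the function names and the population quantities are hypothetical values chosen for illustration only.

```python
import math

def mse_proposed(alpha, B, R, Sy2, Sx2, Sxy, n, N):
    """Eq. (3.9): first-order MSE of the proposed estimator under SRS."""
    f = n / N
    return (1 - f) / n * (Sy2 + B**2 * Sx2 + alpha**2 * R**2 * Sx2
                          - 2 * B * Sxy - 2 * alpha * R * Sxy
                          + 2 * alpha * B * R * Sx2)

def mse_regression(B, Sy2, Sx2, Sxy, n, N):
    """Eq. (2.9): MSE of the regression-type estimator of Ali et al."""
    f = n / N
    return (1 - f) / n * (Sy2 - 2 * B * Sxy + B**2 * Sx2)

def mse_ratio_regression(B, R, Sy2, Sx2, Sxy, n, N):
    """Eq. (2.7): MSE of the ratio-regression estimator."""
    f = n / N
    return (1 - f) / n * (Sy2 + R**2 * Sx2 + 2 * B * R * Sx2 + B**2 * Sx2
                          - 2 * R * Sxy - 2 * B * Sxy)

# Hypothetical population quantities.
args = dict(B=1.2, R=0.8, Sy2=4.0, Sx2=1.0, Sxy=1.5, n=5, N=100)
print(mse_proposed(0.0, **args), mse_regression(1.2, 4.0, 1.0, 1.5, 5, 100))
print(mse_proposed(1.0, **args), mse_ratio_regression(**args))
```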

Table 4 MSE of estimators under MRSS


Proposed class of robust-regression-type-estimators in RSS and MRSS

To improve the efficiency of estimators in RSS and MRSS, some authors have used information on an auxiliary variable. Koyuncu [18] proposed regression-type estimators under MRSS. Al-Omari and Bouza [4] studied ratio estimators of the population mean with missing values using RSS. Al-Omari et al. [5] suggested ratio-type estimators of the mean using extreme RSS. Jemain et al. [15] suggested a modified ratio estimator for the population mean using double MRSS.

In this section, we also generalize the estimators proposed in the previous section to new classes based on the RSS and MRSS designs as follows:

$$\begin{aligned} {\bar{y}}_{N(j)}= \left[ {\bar{y}}_{(j)}+b_{i(j)} ({\bar{X}}-{\bar{x}}_{(j)})\right] \left( \frac{{F{\bar{X}}+G}}{F{\bar{x}}_{(j)}+G}\right) ^\alpha \end{aligned}$$

(3.10)

where (j) represents the sampling design such as SRS, RSS and MRSS.

To obtain the MSE of the suggested class of estimators in Eq. (3.10) under design (j), let us define the following notation:

$$\begin{aligned} e_{0(j)}= & {}\, ({\bar{y}}_{(j)}-{\bar{Y}})/{{\bar{Y}}}, \quad e_{1(j)}=({\bar{x}}_{(j)}-{\bar{X}})/{{\bar{X}}} \nonumber \\ \mathrm{MSE}({\bar{y}}_{N(j)})\cong & {} \, E \left[ {\bar{Y}}^2 e_{0(j)}^2+B_{i}^2{\bar{X}}^2 e_{1(j)}^2+ \alpha ^2 \psi ^2{\bar{Y}}^2 e_{1(j)}^2 \right. \nonumber \\&- \left. 2B_{i}{\bar{Y}}{\bar{X}}e_{0(j)}e_{1(j)}-2\alpha {\bar{Y}}^2 \psi e_{0(j)}e_{1(j)} \right. \nonumber \\&+ \left. 2B_{i} \alpha \psi {\bar{Y}}{\bar{X}} e_{1(j)}^2 \right] \end{aligned}$$

(3.11)

where $\psi =\dfrac{F{\bar{X}}}{F{\bar{X}}+G}$, so that $R_{FG}={\bar{Y}}\psi /{\bar{X}}$. The MSE for a specific design is obtained from Eq. (3.11) by substituting the expectation terms that belong to that design. For example, if (j) represents the RSS design, we can write:

$$\begin{aligned} e_{0(\mathrm{RSS})}= & {} \, ({\bar{y}}_{(\mathrm{RSS})}-{\bar{Y}})/{{\bar{Y}}}, \quad \nonumber \\ e_{1(\mathrm{RSS})}= & {} \, ({\bar{x}}_{(\mathrm{RSS})}-{\bar{X}})/{{\bar{X}}} \nonumber \\ E(e_{0(\mathrm{RSS})}^2)= & {} \, \frac{var({\bar{y}}_{\mathrm{RSS}})}{{\bar{Y}}^2}\nonumber \\= & {} \, \frac{1}{{\bar{Y}}^2}\left[ \frac{S_{y}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})^2\right] ,\nonumber \\ E(e_{1(\mathrm{RSS})}^2)= & {} \, \frac{var({\bar{x}}_{\mathrm{RSS}})}{{\bar{X}}^2}\nonumber \\= & {} \, \frac{1}{{\bar{X}}^2}\left[\frac{S_{x}^2}{m} -\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2 \right],\ \nonumber \\ E(e_{0(\mathrm{RSS})}e_{1(\mathrm{RSS})})= & {} \, \frac{cov({\bar{x}}_{\mathrm{RSS}},{\bar{y}}_{\mathrm{RSS}})}{{\bar{X}}{\bar{Y}}}\nonumber \\= & {} \, \frac{1}{{\bar{X}}{\bar{Y}}}\left[ \frac{S_{xy}}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})(\mu _{x(i)}-{\bar{X}})\right] .\nonumber \\ \mathrm{MSE}({\bar{y}}_{N(\mathrm{RSS})})\cong & {} \left[ \left[ \frac{S_{y}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m} (\mu _{y[i]}-{\bar{Y}})^2\right] \right. \nonumber \\&\left. +B_{i}^2\left[ \frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2\right] \right. \nonumber \\&\left. + \alpha ^2 R_{FG}^2\left[ \frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2\right] \right. \nonumber \\&\left. -2B_{i}\left[ \frac{S_{xy}}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})(\mu _{x(i)}-{\bar{X}})\right] \right. \nonumber \\&\left. -2\alpha R_{FG} \left[ \frac{S_{xy}}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{y[i]} -{\bar{Y}})(\mu _{x(i)}-{\bar{X}})\right] \right. \nonumber \\&\left. +2B_{i} \alpha R_{FG} \left[ \frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)} -{\bar{X}})^2\right] \right] \end{aligned}$$

(3.12)
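The variance term that appears throughout Eq. (3.12), $\frac{S_{x}^2}{m}-\frac{1}{m^2}\sum _{i=1}^{m}(\mu _{x(i)}-{\bar{X}})^2$, can be illustrated with a case where everything is known in closed form. The sketch assumes an infinite Uniform(0, 1) population with perfect ranking and a single cycle, for which the mean of the ith order statistic in a set of size m is $i/(m+1)$ and the population variance is 1/12. This setting is an illustrative assumption and is not the tree data used in the paper.

```python
from fractions import Fraction

def rss_mean_variance_uniform(m):
    """Variance of the RSS mean for Uniform(0, 1), set size m, one cycle."""
    S2 = Fraction(1, 12)                                  # population variance
    mu = Fraction(1, 2)                                   # population mean
    mu_i = [Fraction(i, m + 1) for i in range(1, m + 1)]  # order-statistic means
    return S2 / m - sum((u - mu) ** 2 for u in mu_i) / m**2

for m in (2, 3):
    srs_var = Fraction(1, 12) / m                         # variance of the SRS mean
    print(m, rss_mean_variance_uniform(m), srs_var)
```

For m = 2 the RSS variance is 1/36 against 1/24 under SRS, and for m = 3 it is 1/72 against 1/36, which shows how the subtracted sum reduces the variance relative to SRS.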

If (j) represents the MRSS design and the sample size n is odd, we can write:

$$\begin{aligned} e_{0(\mathrm{MRSS}_o)}= & {} \, \frac{{\bar{y}}_{\mathrm{MRSS}_o}-{\bar{Y}}}{{\bar{Y}}} \quad \mathrm{and} \quad \\ e_{1(\mathrm{MRSS}_o)}= & {} \, \frac{{\bar{x}}_{\mathrm{MRSS}_o}-{\bar{X}}}{{\bar{X}}}\ \\ E(e_{0(\mathrm{MRSS}_o)}^2)= & {} \, \frac{1}{n{\bar{Y}}^2}S_{y[\frac{n+1}{2}]}^2,\\ E(e_{1(\mathrm{MRSS}_o)}^2)= & {} \, \frac{1}{n{\bar{X}}^2}S_{x(\frac{n+1}{2})}^2,\ \\ E(e_{0(\mathrm{MRSS}_o)}e_{1(\mathrm{MRSS}_o)})= & {} \, \frac{1}{n{\bar{X}}{\bar{Y}}}S_{xy(\frac{n+1}{2})} \end{aligned}$$

Then, we can get the MSE as follows:

$$\begin{aligned}&\mathrm{MSE}({\bar{y}}_{N(\mathrm{MRSS}_o)})\nonumber \\&\quad \cong \frac{1}{n} \left[ S_{y\left[ \frac{n+1}{2}\right] }^2+B_{i}^2S_{x(\frac{n+1}{2})}^2 + \alpha ^2 R_{FG}^2 S_{x(\frac{n+1}{2})}^2 \right. \nonumber \\&\qquad \left. -2B_{i}S_{xy(\frac{n+1}{2})} -2\alpha R_{FG} S_{xy(\frac{n+1}{2})} +2B_{i} \alpha R_{FG} S_{x(\frac{n+1}{2})}^2 \right] \end{aligned}$$

(3.13)

If (j) represents the MRSS design and the sample size n is even, we can write:

$$\begin{aligned} E(e_{0(\mathrm{MRSS}_e)}^2)= & {} \, \frac{1}{2n{\bar{Y}}^2} \left( S_{y[\frac{n}{2}]}^2+S_{y[\frac{n+2}{2}]}^2 \right) ,\ \\ E(e_{1(\mathrm{MRSS}_e)}^2)= & {} \, \frac{1}{2n{\bar{X}}^2} \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2 \right) \quad \mathrm{and} \\ E(e_{0(\mathrm{MRSS}_e)}e_{1(\mathrm{MRSS}_e)})= & {} \, \frac{1}{2n{\bar{X}}{\bar{Y}}} \left( S_{yx(\frac{n}{2})}+S_{yx(\frac{n+2}{2})} \right) . \end{aligned}$$

Substituting these terms in Eq. (3.11), we get the MSE as

$$\begin{aligned}&\mathrm{MSE}({\bar{y}}_{N(\mathrm{MRSS}_e)}) \nonumber \\&\quad \cong \frac{1}{2n}\left[ \left( S_{y[\frac{n}{2}]}^2+S_{y[\frac{n+2}{2}]}^2 \right) +B_{i}^2 \left( S_{x(\frac{n}{2})}^2 +S_{x(\frac{n+2}{2})}^2 \right) \right. \nonumber \\&\qquad \left. +\alpha ^2 R_{FG}^2 \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2 \right) \right. \nonumber \\&\qquad \left. -2B_{i} \left( S_{yx(\frac{n}{2})}+S_{yx(\frac{n+2}{2})}\right) \right. \nonumber \\&\qquad \left. -2\alpha R_{FG} \left( S_{yx(\frac{n}{2})}+S_{yx(\frac{n+2}{2})}\right) \right. \nonumber \\&\qquad \left. +2B_{i} \alpha R_{FG} \left( S_{x(\frac{n}{2})}^2+S_{x(\frac{n+2}{2})}^2\right) \right] \end{aligned}$$

(3.14)

Table 5 Percent relative efficiency (PRE) of proposed estimators under RSS over SRS


Efficiency comparison

In this section, the MSEs of the estimators are compared as follows:

(1) Comparison of the Ali et al. [1] estimator ${\bar{y}}_{z}$, which is the general form of the Zaman and Bulut [30] estimators, with the proposed estimators under SRS

$\mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})<\mathrm{MSE}({\bar{y}}_{z})$ if

$$\begin{aligned}&\alpha ^2 R_{FG}^2S_{x}^2+2\alpha B_{i} R_{FG}S_{x}^2-2\alpha R_{FG}S_{yx}< R^2_{FG} S^{2}_{x}\nonumber \\&\qquad +2B_{i} R_{FG} S^{2}_{x}-2R_{FG} S_{xy},\nonumber \\&\text {that is,} \quad (\alpha ^2-1) R_{FG}^2S_{x}^2 +2(\alpha -1) (B_{i} R_{FG}S_{x}^2-R_{FG}S_{yx})< 0 \end{aligned}$$

(4.1)

From Eq. (4.1), one can easily see that when $\alpha =1$ the two estimators have the same MSE.

(2) Comparison of Ali et al. [1] ${\bar{y}}_{a}$ estimator with proposed estimators under SRS

$\mathrm{MSE}({\bar{y}}_{N(\mathrm{SRS})})< \mathrm{MSE}({\bar{y}}_{a})$ if

$$\begin{aligned} \alpha [\alpha R_{FG}^2S_{x}^2-2R_{FG}S_{yx}+2 B_{i} R_{FG}S_{x}^2] <0 \end{aligned}$$

(4.2)

When conditions (4.1) and (4.2) are satisfied, the proposed estimator is more efficient than the existing estimators.
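Both comparisons can be evaluated numerically by computing the bracketed part of Eq. (3.9) directly, since the MSEs in Eqs. (2.7) and (2.9) have the same bracket with $\alpha =1$ and $\alpha =0$, respectively, and the common factor $(1-f)/n$ cancels. The sketch below scans a grid of $\alpha$ values; all population quantities are hypothetical and the function names are illustrative.

```python
def mse_bracket(alpha, B, R, Sy2, Sx2, Sxy):
    """Bracketed term of Eq. (3.9); alpha=1 gives (2.7), alpha=0 gives (2.9)."""
    return (Sy2 + B**2 * Sx2 + alpha**2 * R**2 * Sx2
            - 2 * B * Sxy - 2 * alpha * R * Sxy + 2 * alpha * B * R * Sx2)

def beats_existing(alpha, B, R, Sy2, Sx2, Sxy):
    """True when the proposed estimator has smaller MSE than both competitors."""
    proposed = mse_bracket(alpha, B, R, Sy2, Sx2, Sxy)
    return (proposed < mse_bracket(1.0, B, R, Sy2, Sx2, Sxy)
            and proposed < mse_bracket(0.0, B, R, Sy2, Sx2, Sxy))

# Hypothetical population quantities.
pop = dict(B=1.2, R=0.8, Sy2=4.0, Sx2=1.0, Sxy=1.5)
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(alpha, round(mse_bracket(alpha, **pop), 4), beats_existing(alpha, **pop))
```

With these illustrative values, $\alpha =0.5$ satisfies both conditions, while $\alpha =0$ and $\alpha =1$ only reproduce the existing estimators and therefore give no gain.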

Table 6 PRE of proposed estimators under MRSS over SRS


Simulation study

In this section, we use a real data set, previously used by Jemain et al. [14], to illustrate the efficiency of the proposed estimators over existing ones. The data consist of the height and the diameter at breast height of 399 trees. We use height (H) in feet as the study variable and diameter (D) as the auxiliary variable. The summary statistics of the data are given in Table 1, and the scatter plot of the data set is given in Fig. 1. The scatter plot shows that the data set contains outliers. In the simulation study, we selected 10000 samples with different sample sizes ($n=3,4,5,7,8$) under the SRS, RSS and MRSS designs using R software version 3.1.1 [23]. We calculated the MSEs of the Zaman and Bulut [30], Ali et al. [1] and proposed estimators under SRS, which are given in Table 2, based on the linear, Huber-M, LMS, LTS, LAD, S and MM robust estimators. This table also shows the efficiency of different robust betas within the same type of estimator. The same procedure is extended to the RSS and MRSS estimators, as given in Tables 3 and 4. The simulation study can be summarized as follows:

From Table 2, the proposed estimator ${\bar{y}}_{N(\mathrm{SRS})_{2}}$, which uses the third quantile of the auxiliary variable and the LMS robust beta, is the best for all sample sizes.
From Table 3, the proposed estimator ${\bar{y}}_{N(\mathrm{RSS})_{4}}$, which uses the third quantile of the auxiliary variable and the LMS robust beta, is the best for all sample sizes under the RSS design.
From Table 4, under the MRSS design, ${\bar{y}}_{N(\mathrm{MRSS})_{4}}$ with the third quantile and the LMS robust beta is the best when the sample size is $n=3,4,5$. For larger sample sizes, the proposed ${\bar{y}}_{N(\mathrm{MRSS})_{4}}$ with the S robust beta performs better.
The PRE of the estimators over different sampling designs is given in Tables 5 and 6. These tables show that the ranked set sampling designs perform better than SRS in all simulation cases.
Comparing efficiency across sample sizes, the proposed estimator is more efficient than the existing estimators for all sample sizes. The gain in efficiency over existing estimators is considerably higher for small sample sizes than for large ones.
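The structure of such a simulation can be sketched as follows. This is only an outline under loudly stated assumptions: the tree data are not reproduced here, so a synthetic population of 399 units stands in for them; the ordinary least-squares slope from `np.polyfit` stands in for the robust slopes; only SRS and the case $\alpha =0$ are shown; and 2000 replications are used instead of 10000 to keep the run short. None of the numbers it prints correspond to the tables of the paper.

```python
import numpy as np

def simulate_mse(x, y, n, reps, rng):
    """Empirical MSE of the sample mean and of y_bar + b (X_bar - x_bar) under SRS."""
    X_bar, Y_bar = x.mean(), y.mean()
    err_mean = np.empty(reps)
    err_reg = np.empty(reps)
    for r in range(reps):
        idx = rng.choice(x.size, size=n, replace=False)   # SRS without replacement
        xs, ys = x[idx], y[idx]
        b = np.polyfit(xs, ys, 1)[0]                      # OLS slope (stand-in)
        err_mean[r] = ys.mean() - Y_bar
        err_reg[r] = ys.mean() + b * (X_bar - xs.mean()) - Y_bar
    return float(np.mean(err_mean ** 2)), float(np.mean(err_reg ** 2))

rng = np.random.default_rng(2020)
# Synthetic population (hypothetical), same size as the tree data.
x = rng.uniform(10.0, 50.0, size=399)
y = 3.0 + 1.5 * x + rng.normal(0.0, 2.0, size=399)

mse_mean, mse_reg = simulate_mse(x, y, n=5, reps=2000, rng=rng)
pre = 100.0 * mse_mean / mse_reg          # percent relative efficiency
print(mse_mean, mse_reg, pre)
```

Replacing the SRS draw with an RSS or MRSS draw and the OLS slope with a robust slope would give the corresponding entries for the other designs and estimators.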

Conclusion

When the data contain outliers or the normality assumption does not hold, robust statistics are needed to obtain reliable estimates. In this study, we have considered robust techniques for estimating the population mean. First, we examined recently proposed robust estimators in SRS and defined a more general class of estimators in SRS of which these estimators are members. Then, we extended our theoretical findings to different sampling designs such as RSS and MRSS. A general form of the estimators of the population mean and the corresponding MSE formula are obtained. Theoretical MSEs of the robust family are also given for each design. To assess the performance of our estimators, we conducted a simulation study using a real data set. We compared the existing estimators with the proposed estimators and concluded that the proposed estimators perform better than the recently proposed estimators of Zaman and Bulut [30] and Ali et al. [1]. As future work, these robust estimators can be introduced under new RSS designs, and new robust methods can also be proposed.

References

Ali, N., Ahmad, I., Hanif, M., Shahzad, U.: Robust-regression-type estimators for improving mean estimation of sensitive variables by using auxiliary information. Commun. Stat. Theory Methods (2019). https://doi.org/10.1080/03610926.2019.1645857
Article Google Scholar
Al-Nasser, A.D.: L ranked set sampling: a generalization procedure for robust visual sampling. Commun. Stat. Simul. Comput. 36(1), 33–43 (2007)
Article MathSciNet Google Scholar
Al-Omari, A.I., Bouza, C.N.: Review of ranked set sampling: modifications and applications. Investigación Operacional 35(3), 215–235 (2014)
MathSciNet MATH Google Scholar
Al-Omari, A.I., Bouza, C.N.: Ratio estimators of population mean with missing values using ranked set sampling. Environmetrics 26(2), 67–76 (2015)
Article MathSciNet Google Scholar
Al-Omari, A.I., Jaber, K., Ibrahim, A.: Modified ratio-type estimators of the mean using extreme ranked set sampling. J. Math. Stat. 4(3), 150 (2008)
Article Google Scholar
Al-Omari, A.I.: Ratio estimation of the population mean using auxiliary information in simple random sampling and median ranked set sampling. Stat. Probab. Lett. 82, 1883–1890 (2012)
Article MathSciNet Google Scholar
Al-Saleh, M.F., Al-Kadiri, M.A.: Double-ranked set sampling. Stat. Probab. Lett. 48(2), 205–212 (2000)
Article MathSciNet Google Scholar
Chen, Z.: Ranked set sampling: its essence and some new applications. Environ. Ecol. Stat. 14(4), 355–363 (2007)
Article MathSciNet Google Scholar
Collins, J.R.: Robust estimation of a location parameter in the presence of asymmetry. Ann. Stat. 4, 68–85 (1976)
Article MathSciNet Google Scholar
Daszykowski, M., Kaczmarek, K., Vander Heyden, Y., Walczak, B.: Robust statistics in data analysis–a review: basic concepts. Chemometr. Intell. Lab. Syst. 85(2), 203–219 (2007)
Article Google Scholar
Hanif, M., Shahzad, U.: Estimation of population variance using kernel matrix. J. Stat. Manag. Syst. 22(3), 563–586 (2019)
Google Scholar
Huber, P.J.: Robust regression: asymptotics, conjectures and Monte Carlo. Ann. Stat. (1973). https://doi.org/10.1214/aos/1176342503
Article MathSciNet MATH Google Scholar
Huber, P.J.: Robust Estimation of a Location Parameter. Breakthroughs in Statistics, pp. 492–518. Springer, New York (1992)
Google Scholar
Jemain, A.A., Al-Omari, A.I., Ibrahim, K.: Balanced groups ranked set sampling for estimating the population median. J. Appl. Stat. Sci. 17(1), 39–46 (2008a)
MATH Google Scholar
Jemain, A.A., Al-Omari, A.I., Ibrahim, K.: Modified ratio estimator for the population mean using double median ranked set sampling. Pak. J. Stat. 24(3), 217–226 (2008b)
MathSciNet MATH Google Scholar
Koyuncu, N.: New difference-cum-ratio and exponential type estimators in median ranked set sampling. Hacet. J. Math. Stat. 45(1), 207–225 (2016)
Koyuncu, N.: Regression estimators in ranked set, median ranked set and neoteric ranked set sampling. Pak. J. Stat. Oper. Res. 14(1), 89–94 (2018)
Koyuncu, N.: A class of estimators in median ranked set sampling. İstatistikçiler Dergisi: İstatistik ve Aktüerya 12(2), 58–71 (2019)
Kvam, P.H.: Ranked set sampling based on binary water quality data with covariates. J. Agric., Biol., Environ. Stat. 8(3), 271 (2003)
McIntyre, G.A.: A method for unbiased selective sampling using ranked sets. Aust. J. Agric. Res. 3, 385–390 (1952)
Muttlak, H.A.: Pair rank set sampling. Biom. J. 38, 879–885 (1996)
Muttlak, H.A.: Median ranked set sampling. J. Appl. Stat. Sci. 6, 245–255 (1997)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2020). https://www.R-project.org/
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley Series in Probability and Mathematical Statistics. Wiley, New York (1987)
Shahzad, U., Perri, P.F., Hanif, M.: A new class of ratio-type estimators for improving mean estimation of nonsensitive and sensitive variables by using supplementary information. Commun. Stat. Simul. Comput. 48(9), 2566–2585 (2019)
Shahzad, U., Al-Noor, N.H., Hanif, M., Sajjad, I.: An exponential family of median based estimators for mean estimation with simple random sampling scheme. Commun. Stat. Theory Methods 1–10 (2020)
Shahzad, U., Al-Noor, N.H., Hanif, M., Sajjad, I., Muhammad Anas, M.: Imputation based mean estimators in case of missing data utilizing robust regression and variance-covariance matrices. Commun. Stat. Simul. Comput. 1–20 (2020)
Subzar, M., Bouza, C.N., Al-Omari, A.I.: Utilization of different robust regression techniques for estimation of finite population mean in srswor in case of presence of outliers through ratio method of estimation. Investigación Operacional 40(5), 600–609 (2019)
Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley, Boston (1977)
Zaman, T., Bulut, H.: Modified ratio estimators using robust regression methods. Commun. Stat. Theory Methods 48(8), 2039–2048 (2019a)
Zaman, T., Bulut, H.: Modified regression estimators using robust regression methods and covariance matrices in stratified random sampling. Commun. Stat. Theory Methods (2019b). https://doi.org/10.1080/03610926.2019.1588324
Zamanzade, E., Al-Omari, A.I.: New ranked set sampling for estimating the population mean and variance. Hacet. J. Math. Stat. 45(6), 1891–1905 (2016)
Zamanzade, E., Mahdizadeh, M.: A more efficient proportion estimator in ranked set sampling. Stat. Probab. Lett. 129, 28–33 (2017)
Zamanzade, E., Wang, X.: Estimation of population proportion for judgment post-stratification. Comput. Stat. Data Anal. 112, 257–269 (2017)

Author information

Authors and Affiliations

Department of Statistics, Hacettepe University, Ankara, Turkey
Nursel Koyuncu
Department of Mathematics, Faculty of Science, Al al-Bayt University, Mafraq, Jordan
Amer Ibrahim Al-Omari

Corresponding author

Correspondence to Nursel Koyuncu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Koyuncu, N., Al-Omari, A.I. Generalized robust-regression-type estimators under different ranked set sampling. Math Sci 15, 29–40 (2021). https://doi.org/10.1007/s40096-020-00360-7

Received: 27 July 2020
Accepted: 01 November 2020
Published: 17 November 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s40096-020-00360-7
