1 Introduction

In many survey sampling studies, in addition to the variable of interest, researchers often have additional auxiliary information to improve statistical inference. In many instances, this auxiliary information may not be accurate, cannot be turned into a numerical covariate, or may be even subjective. Even though it contains valuable information, use of this type of information is ignored in practice since it may require strong modeling assumptions. For example, ratio and regression estimators are constructed based on auxiliary variables under strong modeling assumptions. This paper uses rough information to provide ranks (or approximate ranks) for the population units to improve the information content of the sample by avoiding strong modeling assumptions.

Auxiliary variables are very common in survey sampling studies. Hence, finite population settings provide a natural platform to obtain exact or approximate ranks of all population units. The ranking process can be carried out through the use of auxiliary variables, such as the size of sampling units, previous survey outcomes, census tracts, etc. One such setting is given in Husby et al. (2005), namely, the United States Department of Agriculture’s (USDA) National Agricultural Statistics Service (NASS) county crop estimation program. This program samples farms across the United States from sampling frames that include obvious auxiliary variables, such as acreage in the farm, size of the farm, etc. These auxiliary variables provide a reasonable mechanism to rank the farms based on their crop production. The detailed description of the USDA/NASS county estimation program can be found in Iwig (1993).

In an infinite population setting, the use of subjective information has generated extensive research interest in judgment post stratified (JPS) and ranked set sampling (RSS) designs. Readers can find recent research activities and detailed descriptions of JPS sampling designs in MacEachern et al. (2004), Frey and Feeman (2012, 2013), Frey and Ozturk (2011), Stokes et al. (2007), Wang et al. (2006, 2008, 2012), and Ozturk (2014a, b). The ranked set sampling design was originally developed by McIntyre (1952, reprinted in 2005) to keep the overall cost of data collection minimal in estimating mean pasture yield in agricultural fields in an infinite population setting. In recent years, there has been a surge in research on RSS sampling designs in both finite and infinite population settings. A small sample of the literature in the finite population setting includes Patil et al. (1995), Jafari Jozani and Johnson (2011, 2012), Frey (2011), Gokpinar and Ozdemir (2010), Ozturk and Jafari Jozani (2013), Al-Saleh and Samawi (2007), Ozdemir and Gokpinar (2008) and Ozturk (2014c). A comprehensive up-to-date literature review of both JPS and RSS can be found in the recent review paper by Wolfe (2012).

Both RSS and JPS sampling designs use ranking information from a few units in a set, not from the entire population, to divide the data into homogeneous groups of judgment strata. This ranking process is subjective and does not require strong modeling assumptions. It only needs a consistent ranking scheme to create ranks for the units in a set without requiring an established standard of measurement. On the other hand, RSS and JPS use ranks locally in a set and ignore the global ranking information in the population. This paper, unlike JPS and RSS sampling designs, concentrates on global ranking information in the entire population and creates informative samples.

The paper considers finite population settings and assumes the ranks of population units are available before sampling. It selects a simple random sample (SRS) of n units from a finite population of size N either with or without replacement. For each selected unit in the sample, we measure the characteristic of interest along with its population rank. If the sample is selected with replacement, the rank of the selected unit is determined from the entire population including all population units. If the sample is taken without replacement, the population ranks can be determined in two different ways. In the first approach, before selecting the sample, ranks are assigned in the entire population including all units, and then a simple random sample without replacement is selected along with the population ranks of the sample units. In the second approach, ranks are determined sequentially. The rank of a sample unit is determined among the remaining population units by excluding all the previously selected sample units from the population. Even though these two approaches yield without-replacement SRS samples, they create different ranking structures in the sample. Hence, replacement policies and ranking structures lead to three different designs: design-0, design-1, and design-2. Design-0 selects the sample with replacement and assigns the ranks in the entire population; design-1 selects the sample without replacement and assigns the ranks sequentially; and finally, design-2 selects the sample without replacement and assigns the ranks in the entire population.

Ranking structures in these sampling designs provide a lot of information about the exact (or approximate) population location of the sample units. This location information can be used to borrow additional information from other unmeasured population units to improve the information content of the sample. For each measured unit, we consider selecting additional \(H-1\) unmeasured units without replacement from the remaining population units to form a set of size H. The relative position of each measured unit in these sets can be computed through its rank in the set by conditioning on its population rank. The rank of the measured unit in a set of size H yields a discrete conditional probability distribution given the population rank of the same measured unit. These conditional probabilities of within-set ranks further provide a mechanism to compute conditional inclusion probabilities for the population units. The final sample in this process consists of three pieces of information: measured values, conditional ranking probabilities, and the conditional inclusion probabilities. Even though the measured observations form an SRS sample in design-0, design-1, and design-2, the conditional probabilities are different. Hence, these designs show different characteristics.

Section 2 provides detailed developments for the construction of design-0, design-1, and design-2. For each design, we construct a probability distribution for the approximate location of the measured units among the unmeasured population units in a set of size H and compute the first-order conditional inclusion probabilities of the population units given the population ranks of a sample. Section 3 uses ranking information and conditional inclusion probabilities to construct estimators for the population mean and total. Section 4 provides empirical evidence to compare the proposed estimators with their competitors. Section 5 develops a bootstrap re-sampling procedure to estimate the variance of the estimators and to construct percentile confidence intervals for the population mean and total. Section 6 applies the proposed estimators to USDA 1992 Ohio corn data. Finally, Sect. 7 provides a concluding remark.

2 Construction of sampling designs

Consider a finite population of N units labeled as \(\mathcal{P}=\{u_1,\ldots , u_N\}\). For each population unit \(u_i\), we assume that its population rank \(s_i\), \(1 \le s_i \le N\), is known. If the population ranks are not known, we assume that there exists an auxiliary variable Y highly correlated with the variable of interest X. We then estimate the population rank of X from the population rank of the auxiliary variable Y. In the remaining part of the paper, we use \(s_i\) to denote the population rank of the unit \(u_i\) whether it is estimated or not.

Design-0 We select a simple random sample, \(U=\{u_{s_1},\ldots ,u_{s_n}\}\), with-replacement from \(\mathcal{P}\) and measure all of them for the variable of interest X, \(\varvec{X}=(X_1,\ldots , X_n)\). We then identify the population ranks of the measured units, \(\varvec{S}=\{s_1,\ldots , s_n\}\), where \(s_j\) is the population rank of \(X_j\). Our sample then consists of n measurements and population ranks that correspond to these measured units:

$$\begin{aligned} \varvec{X}_{\varvec{S}}=\left\{ X_j,s_j\right\} , j=1,\ldots , n. \end{aligned}$$
(1)

We note that since the sample is selected with replacement, some pair \((X_i,s_i)\) may appear more than once in the sample. If we ignore the ranks in Eq. (1), \(\varvec{X}_{\varvec{S}}\) becomes a simple random sample, and the inference can be developed based on standard theory in finite population setting. Since \(\varvec{S}\) contains the true (or estimated) population ranks of the sample units, it provides a lot of information about the approximate location (in relative sense) of the sample units in the population. This location indicator allows us to borrow additional information from another \(H-1\), \(1 \le H \le N-1\), unobserved population units without a measurement. To achieve this goal, for each selected unit \(u_{s_j}\), we consider selecting \(H-1\) additional units at random without-replacement from the remaining \(N-1\) population units and form a set of size H

$$\begin{aligned} U_{j,H}=\{u_{s_j},u_{t_1},\ldots , u_{t_{H-1}}\}, t_h \ne s_j, h=1,\ldots , H-1, \end{aligned}$$
(2)

where \(u_{t_h}\) is the \(t_h\)-th smallest unit among N population units. Let \(R_{s_j}\) be the rank of \(X_j\), random variable obtained from \(u_{s_j}\), in the set \(U_{j,H}\). The conditional probability that \(R_{s_j}\) is equal to h given that \(X_j\) is the \(s_j\)-th smallest unit in the population can be computed by

$$\begin{aligned} \alpha ^{(0)} (h|s_j)=P(R_{s_j}=h|X_j=x_{s_j})= \frac{ \left( \begin{array}{c} {s_j-1}\\ {h-1} \end{array}\right) \left( \begin{array}{c} {N-s_j}\\ {H-h} \end{array} \right) }{\left( \begin{array}{c} {N-1} \\ {H-1} \end{array}\right) }. \end{aligned}$$
(3)

The above expression shows that the rank of random variable \(X_j\) in a set of size H has a conditional probability distribution over integers \((1,\ldots ,H)\) given that \(X_j\) is the \(s_j\)-th smallest unit in the population. This conditional distribution helps us to borrow information from additional \(H-1\) unmeasured units in the population in addition to the information each measured unit has in the sample.
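As a small numerical illustration of Eq. (3), the sketch below evaluates \(\alpha ^{(0)}(h|s_j)\) with exact rational arithmetic. The function name `alpha0` and the toy values \(N=10\), \(H=3\), \(s_j=4\) are assumptions chosen for illustration only; they do not come from the paper.

```python
from fractions import Fraction
from math import comb


def alpha0(h, s, N, H):
    """P(within-set rank = h | population rank = s), following Eq. (3)."""
    return Fraction(comb(s - 1, h - 1) * comb(N - s, H - h), comb(N - 1, H - 1))


# Hypothetical toy case: N = 10 units, set size H = 3, population rank s = 4.
N, H, s = 10, 3, 4
pmf = [alpha0(h, s, N, H) for h in range(1, H + 1)]
print(pmf)  # [Fraction(5, 12), Fraction(1, 2), Fraction(1, 12)]
```

The three probabilities sum to one, and most of the mass sits on the lower within-set ranks, as expected for a unit ranked fourth out of ten.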

We now look at the problem from a different perspective. Instead of treating \(\varvec{X}\) as a simple random sample, we treat it as a sample of independent order statistics by conditioning on the rank vector \(\varvec{R}=\{R_{s_1},\ldots , R_{s_n}\}\) generated by Eq. (3). It is clear that the conditional distribution of \(X_j\) given that it has rank \(R_{s_j}=h_j\) in a set of size H is the same as that of the \(h_j\)-th order statistic, \(X_{(h_j)}\mathop {=}\limits ^{D}X_j|R_{s_j}=h_j\). Let \(\varvec{X}_{H|\varvec{S}}=(X_{(h_1)},\ldots ,X_{(h_n)})\) be the n order statistics based on this conditional distribution.

Let \(\beta ^{(0)}(i|h,s_j)\) be the probability that the h-th order statistic in set \(U_{j,H}\) equals the i-th smallest unit in the population given that \(R_{s_j}=h\). This conditional probability can be computed from

$$\begin{aligned} \beta ^{(0)}(i|h,s_j)=P(X_{(R_{s_j})}=x_{i}|R_{s_j}=h)=\frac{ \left( \begin{array}{c} {i-1}\\ {h-1} \end{array}\right) \left( \begin{array}{c} {N-i}\\ {H-h} \end{array}\right) }{\left( \begin{array}{c} {N}\\ {H} \end{array}\right) }. \end{aligned}$$

One can interpret \(\beta ^{(0)}(i|h,s_j)\) as the probability mass function of the h-th order statistic given that the rank of \(X_j\) equals h, \(R_{s_j}=h\). By using \(\alpha ^{(0)}(h|s_j)\) and \(\beta ^{(0)}(i|h,s_j)\), we obtain the conditional probability that the random variable \(X_j\) equals the i-th smallest value in the population given the population rank \((s_j)\) of \(X_j\)

$$\begin{aligned} \beta ^{(0)}(i|s_j)=P(X_j=x_{i}|s_j)=\sum _{h=1}^H \beta ^{(0)}(i|h,s_j) \alpha ^{(0)} (h|s_j). \end{aligned}$$

Then the conditional inclusion probability of the i-th population unit in the sample \(\varvec{X}_{H|\varvec{S}}\) is given by

$$\begin{aligned} \pi ^{(0)}(i|\varvec{S})= 1- \prod _{j=1}^n(1- \beta ^{(0)}(i|s_j)). \end{aligned}$$

Note that \(\pi ^{(0)}(i|\varvec{S})\), \(i=1,\ldots ,N\), are not the inclusion probabilities of the sample \(\varvec{X}\). They are the inclusion probabilities of the sample \(\varvec{X}_{H|\varvec{S}}\). Since i is arbitrary, we can replace it with \(s_j\). In this case, \(\pi ^{(0)}(s_j|\varvec{S})\) would be the probability that population unit \(u_{s_j}\) in sample \(\varvec{X}\) would be included in sample \(\varvec{X}_{H|\varvec{S}}\).
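The design-0 quantities above can be chained together in a few lines. The following is a minimal sketch, assuming hypothetical function names (`alpha0`, `beta0_given_h`, `beta0`, `pi0`) and an invented rank vector; it simply transcribes the displayed formulas and is not the authors' implementation.

```python
from math import comb


def alpha0(h, s, N, H):
    """alpha^(0)(h | s): within-set rank distribution given population rank s."""
    return comb(s - 1, h - 1) * comb(N - s, H - h) / comb(N - 1, H - 1)


def beta0_given_h(i, h, N, H):
    """beta^(0)(i | h, s): the h-th order statistic of a set is the i-th smallest unit."""
    return comb(i - 1, h - 1) * comb(N - i, H - h) / comb(N, H)


def beta0(i, s, N, H):
    """beta^(0)(i | s): total probability over the within-set rank h."""
    return sum(beta0_given_h(i, h, N, H) * alpha0(h, s, N, H) for h in range(1, H + 1))


def pi0(i, S, N, H):
    """pi^(0)(i | S) = 1 - prod_j (1 - beta^(0)(i | s_j))."""
    prod = 1.0
    for s in S:
        prod *= 1.0 - beta0(i, s, N, H)
    return 1.0 - prod


# Hypothetical example: N = 30, H = 3, with-replacement sample ranks (17 drawn twice).
N, H = 30, 3
S = [4, 17, 17, 25]
pi = [pi0(i, S, N, H) for i in range(1, N + 1)]
```

Because the sample is drawn with replacement, a repeated rank such as 17 simply contributes two identical factors to the product.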

Remark 1

If either \(H=1\) or \(H>1\) and \(\alpha ^{(0)}(h|s_j)=1/H\) for \(h=1,\ldots , H\), then \(\beta ^{(0)}(i|s_j)= 1/N\) for \(j=1,\ldots , n\) and \(i=1,\ldots ,N\). The inclusion probabilities in these cases reduce to \(\pi ^{(0)}(i|\varvec{S})= 1-(\frac{N-1}{N})^n, i=1, \ldots , N\).
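The second case of Remark 1 can be verified with Vandermonde's identity, \(\sum _{h=1}^H \binom{i-1}{h-1}\binom{N-i}{H-h}=\binom{N-1}{H-1}\). The short derivation below is a sketch added for completeness and uses only the definitions of \(\alpha ^{(0)}\) and \(\beta ^{(0)}\) given above:

$$\begin{aligned} \beta ^{(0)}(i|s_j)=\frac{1}{H}\sum _{h=1}^H \frac{\binom{i-1}{h-1}\binom{N-i}{H-h}}{\binom{N}{H}} =\frac{1}{H}\cdot \frac{\binom{N-1}{H-1}}{\binom{N}{H}} =\frac{1}{H}\cdot \frac{H}{N}=\frac{1}{N}. \end{aligned}$$

When \(H=1\), the sum has the single term \(\beta ^{(0)}(i|1,s_j)\,\alpha ^{(0)}(1|s_j)=(1/N)\cdot 1\). In both cases, substituting \(\beta ^{(0)}(i|s_j)=1/N\) into the product formula gives \(\pi ^{(0)}(i|\varvec{S})=1-(1-1/N)^n\).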

For design-0, the data structure of the sample will be denoted by

$$\begin{aligned} \left\{ X_j,\alpha ^{(0)}(h|s_j),\pi ^{(0)}(s_j|\varvec{S})\right\} , \quad h=1,\ldots ,H, \quad j=1, \ldots , n. \end{aligned}$$

Design-1 Design-1 selects the sample units without replacement from the population and measures all of them, \(\varvec{X}=(X_{1},\ldots , X_{n})\). Unlike design-0, where ranks in \(\varvec{S}\) are computed from all units in the population, the population ranks of the measured units are identified sequentially by removing all of the previously ranked units in the sample from the population. Let \(s^*_j\) be the rank of the unit \(u_j\) after removing all the previously ranked units from the population. Since the ranks of the selected units are assigned sequentially, we introduce additional notation to accommodate the ranking structure. Let \(\mathcal{P}_{-j}\) be the finite population of size \(N+1-j\) after removing \(j-1\) units from the original population \(\mathcal{P}\)

$$\begin{aligned} \mathcal{P}_{-j}=\{\hbox {All population units excluding the first } j-1 \hbox { units in the sample}\}. \end{aligned}$$

Since the population rank of each selected unit is determined after removing all previously selected units from the population, the sample with this new ranking structure becomes

$$\begin{aligned} U^*=\{ u_{s_1^*}, u_{s_2^*}, \ldots , u_{s_n^*}\}, \end{aligned}$$

where \(u_{s_j^*}\) is the j-th selected unit in the sample that has rank \(s_j^*\) in population \(\mathcal{P}_{-j}\), and \(\varvec{S}^*=\{s_1^*,\ldots , s_n^*\}\) is the set of the ranks obtained from the reduced populations. Since only the first \(j-1\) selected units are removed before the j-th unit is ranked, the expression \(s_j^*=s_j-\sum _{k=1}^{j-1} I(s_k < s_j)\) provides the connection between \(s^*_j\) in the reduced population and \(s_j\) in the full population. In this sample, for each selected unit \(u_{s_j^*}\), we again construct a set of size H to borrow information from additional \(H-1\) unmeasured units from population \(\mathcal{P}_{-j}\):

$$\begin{aligned} U^*_{j,H}=\{u_{s_j^*}, u_{t_1^*}, \ldots , u_{t_{H-1}^*}\}, \quad j=1, \ldots ,n, \end{aligned}$$

where \(u_{t_h^*}\) is the \(t_h^*\)-th smallest unit in population \(\mathcal{P}_{-j}\). Let \(R_{s_j^*}\) be the rank of \(u_{s_j^*}\) in the set \(U^*_{j,H}\). The conditional distribution of \(R_{s_j^*}\) given that \(s_j^* \in \mathcal{P}_{-j}\) is given by \(\alpha ^{(1)}(h|s^*_j)\):

$$\begin{aligned} \alpha ^{(1)}(h_j|s_j^*)= P(R_{s_j^*}=h_j|s_j^* \in \mathcal{P}_{-j})=\frac{ \left( \begin{array}{c} {s_j^*-1}\\ {h_j-1} \end{array} \right) \left( \begin{array}{c} {N-j+1-s^*_j}\\ {H-h_j} \end{array} \right) }{\left( \begin{array}{c} {N-j} \\ {H-1} \end{array}\right) }. \end{aligned}$$
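The sequential ranking of design-1 can be illustrated as follows. This is a minimal sketch under the assumption that \(s_j^*\) is obtained from the full-population rank \(s_j\) by subtracting the number of previously selected units ranked below it, in line with the definition of \(\mathcal{P}_{-j}\); the names `sequential_ranks` and `alpha1` and the toy ranks are hypothetical.

```python
from math import comb


def sequential_ranks(S):
    """Map full-population ranks (in selection order) to ranks in the reduced populations."""
    return [s - sum(1 for prev in S[:j] if prev < s) for j, s in enumerate(S)]


def alpha1(h, s_star, j, N, H):
    """alpha^(1)(h | s*_j) for the j-th draw (j = 1, ..., n); P_{-j} has N - j + 1 units."""
    M = N - j + 1  # size of the reduced population P_{-j}
    return comb(s_star - 1, h - 1) * comb(M - s_star, H - h) / comb(M - 1, H - 1)


# Hypothetical example: N = 10, H = 3, units with full ranks 5, 2, 8 drawn in that order.
N, H = 10, 3
S = [5, 2, 8]
S_star = sequential_ranks(S)
print(S_star)  # [5, 2, 6]

# Within-set rank distribution for each draw in its own reduced population.
pmfs = [[alpha1(h, s, j, N, H) for h in range(1, H + 1)]
        for j, s in enumerate(S_star, start=1)]
```

The third unit drops from rank 8 to rank 6 because two smaller units were already removed when it was ranked.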

As in design-0, we again consider a conditional sample \(\varvec{X}_{H|\varvec{S}^*}\) given the population ranks \(\varvec{S}^*\). By adopting the notation of design-0, the conditional probability that the h-th order statistic in set \(U^*_{j,H}\) equals the i-th smallest unit in the population \(\mathcal{P}_{-j}\) given that \(R_{s_j^*}=h\) is given by

$$\begin{aligned} \beta ^{(1)}(i|h,s_j^*)=P(X_{(h)}=x_i|R_{s_j^*}=h)= \frac{ \left( \begin{array}{c} {i-1}\\ {h-1} \end{array} \right) \left( \begin{array}{c} {N-j+1-i}\\ {H-h} \end{array} \right) }{\left( \begin{array}{c} {N-j+1}\\ {H} \end{array} \right) }, x_i \in \mathcal{P}_{-j}. \end{aligned}$$

In design-1, the conditional probability that the random variable \(X_j\) equals the i-th smallest value in the population \(\mathcal{P}_{-j}\) given its population rank \((s_j^*)\) follows from the law of total probability over the conditional distribution of the rank \(R_{s_j^*}\):

$$\begin{aligned} \beta ^{(1)}(i|s_j^*)= P( X_j = x_i|s^*_j) = \sum _{h=1}^H\beta ^{(1)}(i|h,s_j^*) \alpha ^{(1)}(h|s_j^*), x_i \in \mathcal{P}_{-j}. \end{aligned}$$

To compute the conditional inclusion probability of the i-th population unit in the sample \(\varvec{X}_{H|\varvec{S}^*}\) given the rank vector \(\varvec{S}^*\), \(\pi ^{(1)}(i|\varvec{S}^*)\), we use the sequential algorithm given in Frey (2013) in a slightly different context. To compute \(\pi ^{(1)}(i|\varvec{S}^*)\), we first need to develop some additional notation due to the sequential identification of population ranks. For \(j=0, \ldots , i-1\), let \(W(j,d)\) be the probability that the first d units in the sample include j units smaller than the i-th unit and do not include the i-th unit. It is obvious that, if \(d=0\), the values of \(\{ W(j,0),0 \le j \le i-1\}\) can be computed from

$$\begin{aligned} W(j,0)= \left\{ \begin{array}{ll} 1, &{} j=0 \\ 0, &{}\hbox {otherwise.} \end{array} \right. \end{aligned}$$

The values of \(W(j,d)\) for \(d>0\) can be computed from a recursive relationship between adjacent selection steps. Assume that the values of \(\{ W(j,d), 0 \le j \le i-1 \}\) are known for a fixed d. There are then two ways to obtain the values of \(\{ W(j,d+1), 0 \le j \le i-1 \}\) from stage d: (1) There could be j units smaller than the i-th unit among the first d selected units in the sample and the next selected unit in the sample is larger than the i-th unit in the population. (2) There could be \(j-1\) units smaller than the i-th unit among the first d selected units in the sample and the next selected unit in the sample is smaller than the i-th unit in the population. These two statements define a recursive equation as follows:

$$\begin{aligned} W(0,d+1)=W(0,d) \sum _{\lambda =i+1}^{N-d} \beta ^{(1)}(\lambda |s_d^*) \end{aligned}$$

and for \(j=1, \ldots , i-1\)

$$\begin{aligned} W(j,d+1)=W(j,d) \sum _{\lambda =i+1-j}^{N-d} \beta ^{(1)}(\lambda |s_d^*) +W(j-1,d)\sum _{\lambda =1}^{(i-1)-(j-1)} \beta ^{(1)}(\lambda |s_d^*). \end{aligned}$$

Going through this recursive equation for \(d=1,\ldots , n\), we compute the probabilities \(\{W(j,n), 0 \le j \le i-1\}\). The probability that the i-th unit is not included in the sample is then given by \(\sum _{j=0}^{i-1} W(j,n)\). The first-order conditional inclusion probability of the i-th unit given \(\varvec{S}^*\) and H is then given by

$$\begin{aligned} \pi ^{(1)}(i|\varvec{S}^*)=1- \sum _{j=0}^{i-1} W(j,n), \quad i= 1, \ldots , N. \end{aligned}$$

Note that even though it is not explicitly stated in the notation, \(W(j,d)\) is a conditional probability for a given population rank vector \(\varvec{S}^*\). The data structure of design-1 will be denoted by

$$\begin{aligned} \{X_j,\alpha ^{(1)}(h|s_j^*), \pi ^{(1)}(s^*_j|\varvec{S}^*)\}, h=1,\ldots , H, j=1, \ldots , n \end{aligned}$$
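The recursion for \(W(j,d)\) described above can be sketched in code. The sketch assumes one particular reading of the indices, namely that the k-th selection step uses \(\beta ^{(1)}(\cdot |s_k^*)\) over the reduced population of size \(N-k+1\); this indexing, the function names `beta1` and `pi1`, and the toy inputs are assumptions made for illustration and should not be read as the algorithm of Frey (2013).

```python
from math import comb


def beta1(i, s_star, M, H):
    """beta^(1)(i | s*) in a reduced population of size M = N - j + 1."""
    total = 0.0
    for h in range(1, H + 1):
        a = comb(s_star - 1, h - 1) * comb(M - s_star, H - h) / comb(M - 1, H - 1)
        b = comb(i - 1, h - 1) * comb(M - i, H - h) / comb(M, H)
        total += a * b
    return total


def pi1(i, S_star, N, H):
    """pi^(1)(i | S*) = 1 - sum_j W(j, n), with W updated one selection step at a time."""
    W = [1.0] + [0.0] * (i - 1)              # W(j, 0) for j = 0, ..., i-1
    for k in range(1, len(S_star) + 1):      # k-th draw (assumed indexing, see lead-in)
        M = N - k + 1                        # units left before the k-th draw
        pmf = [0.0] + [beta1(lam, S_star[k - 1], M, H) for lam in range(1, M + 1)]
        new_W = []
        for j in range(i):
            r = i - j                        # rank of unit i once j smaller units are gone
            stay = W[j] * sum(pmf[r + 1:M + 1])                    # next unit is larger
            move = W[j - 1] * sum(pmf[1:r + 1]) if j >= 1 else 0.0  # next unit is smaller
            new_W.append(stay + move)
        W = new_W
    return 1.0 - sum(W)


# Hypothetical example: N = 12, H = 3, sequential ranks of n = 3 sampled units.
N, H, S_star = 12, 3, [5, 2, 6]
pi = [pi1(i, S_star, N, H) for i in range(1, N + 1)]
```

Under this reading exactly n distinct units enter the conditional sample, so the inclusion probabilities should add up to n, which gives a convenient sanity check on any implementation.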

Design-2 We select a simple random sample, \(\varvec{X}=(X_1,\ldots , X_n)\), of size n without replacement and identify their ranks, \(\varvec{S}=(s_1,\ldots , s_n)\), in population \(\mathcal{P}\)

$$\begin{aligned} U=\{u_{s_1},u_{s_2},\ldots , u_{s_n}\}. \end{aligned}$$

To borrow additional information from the unmeasured population units, we select n disjoint sets, each of size \(H-1\). We then randomly match these n sets with selected units in set U to form n sets, each of size H

$$\begin{aligned} U_{j,H}=\{u_{s_j},u_{t_1},\ldots , u_{t_{H-1}}\}, j=1,\ldots , n. \end{aligned}$$

The conditional probability distribution of the rank \(R_{s_j}\) of \(X_j\) in set \(U_{j,H}\) given that \(X_j\) has the rank \(s_j\) in the population \(\mathcal{P}\) is given by

$$\begin{aligned} \alpha ^{(2)}(h_j|s_j)=P(R_{s_j}=h_j|s_j)=\frac{\left( \begin{array}{c} {s_j-1}\\ {h_j-1} \end{array} \right) \left( \begin{array}{c} {N-s_j}\\ {H-h_j} \end{array} \right) }{\left( \begin{array}{c} {N-1} \\ {H-1} \end{array}\right) }, h_j=1,\ldots , H. \end{aligned}$$

In a similar fashion, in the sample \(\varvec{X}_{H|\varvec{S}}\), the conditional probability that the h-th order statistics in set \(U_{j,H}\) equals to the i-th smallest unit in the population \(\mathcal{P}\) given that \(R_{s_j}=h\) is given by

$$\begin{aligned} \beta ^{(2)}(i|h,s_j)=P(X_{(h)}=x_i|R_{s_j}=h)=\frac{\left( \begin{array}{c} {i-1} \\ {h-1} \end{array} \right) \left( \begin{array}{c} {N-i}\\ {H-h} \end{array}\right) }{\left( \begin{array}{c} {N}\\ {H} \end{array} \right) }, x_i \in \mathcal{P}. \end{aligned}$$

In Eq. (4), the sum over possible values of h yields the conditional probability that \(X_j\) equals the i-th smallest unit in the population given the population rank \(s_j\)

$$\begin{aligned} \beta ^{(2)}(i|s_j)=\sum _{h=1}^H \beta ^{(2)}(i|h,s_j) \alpha ^{(2)}(h|s_j). \end{aligned}$$
(4)

The first-order conditional inclusion probabilities given the population ranks of the observed measurements then follow from

$$\begin{aligned} \pi ^{(2)}(i|\varvec{S})= \sum _{j=1}^n \beta ^{(2)}(i|s_j), i=1,\ldots ,N. \end{aligned}$$

Finally, the data structure of the sample from design-2 is denoted with

$$\begin{aligned} \{X_{j},\alpha ^{(2)}(h|s_j),\pi ^{(2)}(s_j|\varvec{S})\}, h=1,\ldots ,H, j=1,\ldots ,n. \end{aligned}$$

Remark 2

If either \(H=1\) or \(H>1\) and \(\alpha ^{(2)}(h|s_j)=1/H\) for \(h=1,\ldots , H\), then \(\beta ^{(2)}(i|s_j)= 1/N\) for \(j=1,\ldots , n\) and \(i=1,\ldots ,N\). The inclusion probabilities in these cases reduce to \(\pi ^{(2)}(i|\varvec{S})= n/N, i=1, \ldots , N\).

3 Estimators for population mean and total

In this section, we introduce three estimators for population mean and total for each sampling design. The estimators for population total use the data structures established in the previous section:

$$\begin{aligned} T_1^{(L)}= & {} \left\{ \begin{array}{ll} {\sum }_{j=1}^n w^{(L)}_1(j|\varvec{S})X_j &{} L=0,2 \\ {\sum }_{j=1}^n w^{(L)}_1(j|\varvec{S}^*)X_j &{} L=1 \end{array} \right. \\ T_2^{(L)}= & {} \left\{ \begin{array}{ll} \frac{N}{H}{\sum }_{h=1}^H{\sum }_{j=1}^nw^{(L)}_2(h,j|\varvec{S}) X_j &{} L=0,2,\\ \ \frac{N}{H}{\sum }_{h=1}^H{\sum }_{j=1}^nw^{(L)}_2(h,j|\varvec{S}^*) X_j &{} L=1 \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} T_3^{(L)}=\left\{ \begin{array}{ll} \frac{N}{H}{\sum }_{h=1}^H{\sum }_{j=1}^nw^{(L)}_3(h,j|\varvec{S}) X_j &{} L=0,2 \\ \frac{N}{H}{\sum }_{h=1}^H{\sum }_{j=1}^nw^{(L)}_3(h,j|\varvec{S}^*) X_j &{} L=1 \end{array} \right. \end{aligned}$$

where

$$\begin{aligned}&w_1^{(L)}(j|\varvec{S})=\frac{1}{\pi ^{(L)}(s_j|\varvec{S})}, \quad w_2^{(L)}(h,j|\varvec{S}) = \frac{\alpha ^{(L)}(h|s_j)}{{\sum }_{j=1}^n\alpha ^{(L)}(h|s_j)}, \\&w_3^{(L)}(h,j|\varvec{S}) = \frac{\frac{\alpha ^{(L)}(h|s_j)}{\pi ^{(L)}(s_j|\varvec{S})}}{{\sum }_{j=1}^n\frac{\alpha ^{(L)}(h|s_j)}{\pi ^{(L)}(s_j|\varvec{S})}}. \end{aligned}$$

We note that design-1 estimators always use population ranks \(\varvec{S}^*\) in the reduced populations \(\mathcal{P}_{-j}\), \(j=1,\ldots ,n\), to compute the conditional probabilities \(\alpha ^{(1)}(h|s^*_j)\) and \(\pi ^{(1)}(s^*_j|\varvec{S}^*)\). Estimator \(T^{(L)}_1\) is motivated by the Horvitz–Thompson estimator (Horvitz and Thompson 1952), where units having smaller inclusion probabilities in the sample are given higher weights. On the other hand, it should be clear that \(T^{(L)}_1\) is not a Horvitz–Thompson estimator since \(\pi ^{(L)}(i|\varvec{S})\) and \(\pi ^{(L)}(i|\varvec{S}^*)\) are not inclusion probabilities for sample \(\varvec{X}\). They are the inclusion probabilities for samples \(\varvec{X}_{H|\varvec{S}}\) and \(\varvec{X}_{H|\varvec{S}^*}\), respectively.

The estimator \(T^{(L)}_2\) is motivated by the JPS estimator in MacEachern et al. (2004), where each measured observation is prorated to H ranking classes. The prorated share is proportional to the probability that the measured unit has rank h in a set of size H. This prorating process creates H strata. Hence, improvement over the simple random sample (SRS) estimator can be anticipated from the theory of stratified sampling designs in survey sampling. Even though the estimators \(T^{(0)}_2\) and \(T^{(2)}_2\) have the same form, they yield different efficiency results since the sample \(\varvec{X}\) is constructed with and without replacement in design-0 and design-2, respectively.

The estimator \(T^{(L)}_3\) uses the same idea as in estimator \(T^{(L)}_2\), but it gives more weight to observations that are less likely to be included in the sample to reduce the variance of the estimator. One then anticipates that the estimator \(T^{(L)}_3\) performs better than the other two estimators.

Estimators for the population mean can be obtained by dividing \(T^{(L)}_r\) by N

$$\begin{aligned} \hat{\mu }^{(L)}_r= \frac{1}{N} T^{(L)}_r, r=1, \ldots , 3, \quad L=0,1,2. \end{aligned}$$
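To make the weight functions concrete, the sketch below evaluates \(T_1^{(2)}\), \(T_2^{(2)}\), and \(T_3^{(2)}\) for design-2, where \(\pi ^{(2)}(s_j|\varvec{S})=\sum _k \beta ^{(2)}(s_j|s_k)\). The function names and the toy data are hypothetical, and the code assumes that for every rank class h at least one sampled unit has \(\alpha ^{(2)}(h|s_j)>0\), so that the normalizing sums are positive.

```python
from math import comb


def alpha(h, s, N, H):
    """alpha^(2)(h | s): within-set rank distribution given population rank s."""
    return comb(s - 1, h - 1) * comb(N - s, H - h) / comb(N - 1, H - 1)


def beta(i, s, N, H):
    """beta^(2)(i | s) = sum_h beta^(2)(i | h, s) * alpha^(2)(h | s)."""
    return sum(comb(i - 1, h - 1) * comb(N - i, H - h) / comb(N, H) * alpha(h, s, N, H)
               for h in range(1, H + 1))


def design2_totals(X, S, N, H):
    """Return (T1, T2, T3) for design-2 from measurements X and population ranks S."""
    pi = [sum(beta(s_j, s_k, N, H) for s_k in S) for s_j in S]  # pi^(2)(s_j | S)
    T1 = sum(x / p for x, p in zip(X, pi))
    T2 = T3 = 0.0
    for h in range(1, H + 1):
        a = [alpha(h, s, N, H) for s in S]
        a_pi = [a_j / p for a_j, p in zip(a, pi)]
        # Assumes sum(a) > 0 and sum(a_pi) > 0 for every rank class h.
        T2 += sum(a_j * x for a_j, x in zip(a, X)) / sum(a)
        T3 += sum(w * x for w, x in zip(a_pi, X)) / sum(a_pi)
    return T1, N / H * T2, N / H * T3


# Hypothetical sample: n = 3 measurements with population ranks in a population of N = 20.
X, S, N = [2.0, 4.0, 9.0], [3, 10, 18], 20
totals_H1 = design2_totals(X, S, N, 1)   # no ranking information used
totals_H2 = design2_totals(X, S, N, 2)
means_H2 = [t / N for t in totals_H2]    # mean estimators: divide each total by N
```

With \(H=1\) all three totals collapse to N times the sample mean, which matches Remark 2 and is a useful check on the weights.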

4 Empirical evidence

In this section, we investigate the efficiency of the estimators. Even though it is theoretically possible to construct the probability distributions of the estimators by computing the weight functions over all possible values of \(\varvec{S}\), this would computationally be intensive even for moderate sample and population sizes. Hence, to reduce the computational burden, we use a simulation study to investigate the properties of the estimators.

The simulation study considered two sample sizes (n) and two set sizes (H), \(n=20,50\) and \(H=2,5\), respectively. Ranking accuracy is controlled by the correlation coefficient \(\rho =1.00,0.75, 0.5\) between X and Y. Populations of size \(N=300\) are generated from discrete normal and exponential distributions by \(x_i= Q((i-0.5)/N)\), \(i=1,\ldots , N\), where Q is the quantile function of either the standard normal or the standard exponential distribution depending on the underlying population. The simulation size is taken to be 1000.

Ranking accuracy is simulated by perceived size ranking model in Dell and Clutter (1972). This model, for the population values \(\varvec{x}=(x_1,\ldots , x_N)\), selects an N dimensional random vector, \(\varvec{\epsilon }=(\epsilon _1, \ldots , \epsilon _N)\), from a normal distribution having mean zero and variance \(\tau ^2\). These two vectors are added to create a ranking vector \(\varvec{y}=\varvec{x}+\varvec{\epsilon }\). The ranks of the observations in the vector \(\varvec{y}\) are used to predict the ranks of the values (\(\varvec{x}\)) of population units. The accuracy of ranking is controlled by the correlation coefficient: \(\rho =\mathrm{corr}(Y,X)=\frac{1}{\sqrt{1+\tau ^2/\sigma ^2}}\), or equivalently by the variance \(\tau ^2\), where \(\sigma ^2\) is the variance of X.
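The population construction and the perceived size ranking model can be sketched as follows. The code solves \(\rho =1/\sqrt{1+\tau ^2/\sigma ^2}\) for \(\tau ^2=\sigma ^2(1/\rho ^2-1)\) and, as an assumption, takes \(\sigma ^2\) to be the finite-population variance of the generated values; the function names and the random seed are hypothetical.

```python
import random
from statistics import NormalDist, pvariance


def discrete_normal_population(N, mu=0.0, sigma=1.0):
    """x_i = Q((i - 0.5) / N), with Q the normal quantile function."""
    q = NormalDist(mu, sigma).inv_cdf
    return [q((i - 0.5) / N) for i in range(1, N + 1)]


def perceived_ranks(x, rho, rng):
    """Ranks of y = x + eps, eps ~ N(0, tau^2), with tau chosen to match rho."""
    sigma2 = pvariance(x)                       # assumption: finite-population variance
    tau = (sigma2 * (1.0 / rho ** 2 - 1.0)) ** 0.5
    y = [xi + rng.gauss(0.0, tau) for xi in x]
    order = sorted(range(len(y)), key=y.__getitem__)
    ranks = [0] * len(y)
    for r, idx in enumerate(order, start=1):    # rank 1 = smallest perceived value
        ranks[idx] = r
    return ranks


N = 300
x = discrete_normal_population(N)
rng = random.Random(2024)                       # hypothetical seed for reproducibility
ranks_perfect = perceived_ranks(x, 1.0, rng)    # rho = 1: perceived ranks are exact
ranks_noisy = perceived_ranks(x, 0.5, rng)      # rho = 0.5: ranks are perturbed
```

A discrete exponential population could be generated the same way by replacing the quantile function with \(Q(p)=-\log (1-p)\).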

Table 1 Biases of the estimators \(\hat{\mu }_{r}^{(L)}\), \(r=1,2,3\) and SRS mean.
Table 2 Biases of the estimators \(\hat{\mu }_{r}^{(L)}\), \(r=1,2,3\) and SRS mean.

Tables 1 and 2 present the biases of the estimators for discrete normal and exponential distributions. It is clear from these tables that the estimator \(\hat{\mu }_1^{(L)}\) has a substantial amount of bias in all sampling designs when the population mean is large, or equivalently when the coefficient of variation, CV\(=\sigma /\mu \), is small. For example, the biases of the estimator \(\hat{\mu }_1^{(L)}\) are practically zero when \(\mu =0\), but they become very large for \(\mu =100\). The other estimators, \(\hat{\mu }_r^{(L)}\), \(r=2,3\); \(L=0,1,2\), appear to be essentially unbiased for all \(\mu \) and sampling designs. Since the biases of the estimators \(\hat{\mu }_1^{(L)}\), \(L=0,1,2\), are very large when the coefficient of variation is small, these estimators are not considered any further in this paper.

Table 3 Relative efficiencies of \(\hat{\mu }_{r}^{(L)}\), \(r=2,3\) and SRS estimators with respect to \(\hat{\mu }_{3}^{(2)}\), \(R^{(L)}_r=\mathrm{MSE}(\hat{\mu }_{r}^{(L)})/\mathrm{MSE}(\hat{\mu }_{3}^{(2)})\) and \(R_{\mathrm{SRS}}=\mathrm{MSE}(\mathrm{SRS})/\mathrm{MSE}(\hat{\mu }_{3}^{(2)})\).
Table 4 Relative efficiencies of \(\hat{\mu }_{r}^{(L)}\), \(r=2,3\) and SRS estimators with respect to \(\hat{\mu }_{3}^{(2)}\), \(R^{(L)}_r=\mathrm{MSE}(\hat{\mu }_{r}^{(L)})/\mathrm{MSE}(\hat{\mu }_{3}^{(2)})\), and \(R_{\mathrm{SRS}}=\mathrm{MSE}(\mathrm{SRS})/\mathrm{MSE}(\hat{\mu }_{3}^{(2)})\).

Tables 3 and 4 present the relative efficiencies of \(\hat{\mu }_r^{(L)}\), \(r=2,3\), and SRS mean with respect to the estimator \(\hat{\mu }_3^{(2)}\)

$$\begin{aligned} R_r^{(L)}= \frac{\mathrm{MSE}(\hat{\mu }_r^{(L)} )}{\mathrm{MSE}(\hat{\mu }_3^{(2)})}, r=2,3; L=0,1,2, \hbox { and } R_{\mathrm{SRS}}= \frac{\mathrm{MSE}(\mathrm{SRS})}{\mathrm{MSE}(\hat{\mu }_3^{(2)})}. \end{aligned}$$

The values of \(R_r^{(L)} > 1\) and \(R_{\mathrm{SRS}} >1\) indicate that the estimator \(\hat{\mu }_3^{(2)}\) outperforms \(\hat{\mu }_r^{(L)}\), and SRS mean, respectively.

There are several important features in Tables 3 and 4 that need to be discussed. It is clear that \(R_{\mathrm{SRS}}/R_r^{(L)} >1\), for \(r=2,3\) and \(L=0,1,2\), which indicates that all of the proposed estimators have higher efficiencies than the SRS mean. The efficiency gain is substantial if the ranking information is accurate and the set size is large. For example, in Table 3, the efficiencies of \(\hat{\mu }_2^{(0)}\), \(\hat{\mu }_3^{(0)}\), \(\hat{\mu }_2^{(1)}\), \(\hat{\mu }_3^{(1)}\), \(\hat{\mu }_2^{(2)}\), and \(\hat{\mu }_3^{(2)}\) with respect to the SRS mean are 5.480 (13.809/2.52), 11.604 (13.809/1.19), 6.529 (13.809/2.115), 14.310 (13.809/0.965), 5.960 (13.809/2.317), and 13.809, respectively, when \(n=20\), \(H=5\), \(\rho =1\), and \(\mu =0\). Even if \(\rho = 0.50\), the new estimators are still better than the SRS mean, although the efficiency gain is not as high as under perfect ranking.

Tables 3 and 4 also reveal that the relative efficiencies of design-0 estimators are generally lower than the efficiencies of design-1 and design-2 estimators. This is mostly due to the replacement policy of the design-0, where units are selected with replacement. Among these three designs, it appears that design-1 is the most efficient one. For example, \( R^{(2)}_2 /R_2^{(1)} >1\). This can be anticipated from the fact that design-1 determines the ranks of sample units sequentially by removing all the previously ranked units in the sample from the population. This sequential ranking provides stronger data structure in the sample in design-1 than the one in design-2, and hence increases the efficiency. On the other hand, design-1 and design-2 are comparable in their efficiency for the estimators \(\hat{\mu }^{(1)}_3\) and \(\hat{\mu }^{(2)}_3\). They practically have the same efficiency, \(R_3^{(1)} \approx 1\). The estimator \(\hat{\mu }^{(2)}_3\) is slightly less efficient when \(\rho =1\), but for the other values of \(\rho \), \(\rho <1\), the estimators are essentially equivalent in their efficiencies.

Fig. 1 The plots of the simulated mean square errors of the estimators \(\hat{\mu }_r^{(L)}\), \(r=2,3\), \(L=0,1,2\), with respect to set size H. Samples (\(n=20\)) are generated from a discrete normal distribution of size \(N=100\) with mean \(\mu =10\), \(\sigma =10\) and perfect ranking \(\rho =1\). Simulation size is 1000.

The efficiencies of the estimators in Tables 3 and 4 appear to be an increasing function of set size H. To investigate the impact of set size further, we performed another simulation study using a discrete normal population (\(\mu =10\) and \(\sigma = 10\)) of size \(N=100\) with \(\rho =1\). In this part of the simulation study, sample and simulation sizes are taken to be 20 and 1000, respectively. Figure 1 plots the mean square errors (MSEs) of the estimators \(\hat{\mu }_r^{(L)}\), \(r=2,3\); \(L=0,1,2\) against set size H. It is clear that the MSEs are decreasing functions of set size H for \(H \le 15\) and almost flat for \(H > 15\). The reason that the MSE plots become flat for large H can be anticipated from the behavior of \(\alpha ^{(L)}(h|s_j)\). For large H, this probability will be very small (or zero) for the values of h that are inconsistent with \(s_j\). For example, if \(s_j\) is small, then large values of h yield \(\alpha ^{(L)}(h|s_j) \approx 0\). Hence, the contribution of these ranking classes to reducing the MSE of the estimators would be negligible.

Figure 1 also indicates that design-1 estimators yield smaller MSEs than the design-0 and design-2 estimators. In design-1, population ranks of the selected units are determined sequentially after removing all the previously selected units. This sequential ranking induces a stronger data structure (presumably negative correlations among measured observations). Hence, design-1 yields higher efficiency than the other designs in a finite population setting.

5 Bootstrap variance estimate and confidence interval

In this section, we develop statistical inference for the population mean, but similar inference can also be developed for the population total with a slight change in the notation. The exact sampling distribution of the estimator is not computationally feasible for reasonable sample and population sizes. Therefore, to reduce the computational burden, we use the bootstrap distribution to draw statistical inference.

Discussion in the previous section indicates that design-1 performs slightly better than the other two designs. On the other hand, the computation of the conditional inclusion probabilities in design-1 requires extensive computing time when the population and/or sample sizes are large. Since the estimators \(\hat{\mu }^{(1)}_3\) and \(\hat{\mu }^{(2)}_3\) are equivalent in their efficiencies for all practical purposes, to increase the bootstrap simulation size, we develop the inference based on design-2 estimator \(\hat{\mu }^{(2)}_3\).

Let \(\theta \) be the parameter of interest. The parameter \(\theta \) can be considered as a statistical functional \(\theta = F(\mathcal{P})\). The estimate of \(\theta \) can then be obtained by the plug-in method, replacing \(\mathcal{P}\) with the empirical population \(\hat{\mathcal{P}}\), \(\hat{\theta } = F(\hat{\mathcal{P}})\). The empirical bootstrap population \(\hat{\mathcal{P}}\) should preserve the without-replacement structure of design-0, design-1, and design-2. Let \(\varvec{x}_S^{(L)}=\{x_j,s_j\}\); \(j=1,\ldots , n\), be the measured values of the simple random sample, \(U=\{u_{s_1},\ldots , u_{s_n}\}\), selected from population \(\mathcal{P}\) based on design-L, \(L=0,1,2\). Let D be the integer part of the ratio N / n. We construct the empirical bootstrap population by repeating the set \(\varvec{x}_S^{(L)}\) D times and selecting \(d=N-Dn\) pairs at random from \(\varvec{x}_S^{(L)}\) to create an empirical population of size N:

$$\begin{aligned} \hat{\mathcal{P}}^{(L)}=\{\varvec{x}_S^{(L)}, \ldots , \varvec{x}_S^{(L)}, z_1,\ldots , z_d\}, \end{aligned}$$

where \(z_t\), \(t=1,\ldots ,d\), are randomly selected pairs from \(\varvec{x}_S^{(L)}\). It is clear that the size of the empirical population is the same as that of the original population. The bootstrap samples are then selected without replacement from population \(\hat{\mathcal{P}}^{(L)}\). Let \(\varvec{x}^*=\{x^*_j,a_j\}\); \(j=1, \ldots , n,\) be a re-sample from population \(\hat{\mathcal{P}}^{(L)}\), where \(a_j \in S\) for \(j=1,\ldots ,n\). For each \(b=1,\ldots , B\), let \(\varvec{x}^*_b=\{x^*_{j,b},a_{j,b}\}\); \(j=1,\ldots ,n\), be a re-sample selected from \(\hat{\mathcal{P}}^{(L)}\). We apply our estimator to each one of these bootstrap re-samples to obtain

$$\begin{aligned} \hat{\mu }_{2,b}^{(L)*}= & {} \frac{N}{H}\sum _{h=1}^H\sum _{j=1}^n w_2^{(L)}(h,j|A_b,H)x^*_{j,b}, \hbox { and } \\ \hat{\mu }_{3,b}^{(L)*}= & {} \frac{N}{H}\sum _{h=1}^H\sum _{j=1}^n w_3^{(L)}(h,j|A_b)x^*_{j,b}, \quad b=1,\ldots ,B, \end{aligned}$$

where \(A_b=(a_{1,b},\ldots , a_{n,b})\). The bootstrap variance estimates of the estimators \(\hat{\mu }_r^{(L)}\), \(r=2,3\), are then given by

$$\begin{aligned} BV(\hat{\mu }_{r}^{(L)})=\frac{1}{B-1} \sum _{b=1}^B \left( \hat{\mu }_{r,b}^{(L)*}-\bar{\hat{\mu }}_r^{(L)*}\right) ^2, r=2,3, \end{aligned}$$

where \(\bar{\hat{\mu }}_r^{(L)*}\) is the mean of \(\hat{\mu }_{r,b}^{(L)*}\), \(b=1,\ldots ,B\).
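The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the simulations: the function names are hypothetical, the d extra pairs are drawn without replacement from the sample (the text only says "at random"), the empirical population is built once rather than once per replication, and `plain_mean` is a placeholder standing in for \(\hat{\mu }_r^{(L)}\), whose weights are defined in an earlier section.

```python
import random
from statistics import variance


def build_pseudo_population(sample, N, rng):
    """Repeat the sample D = N // n times and add d = N - D*n randomly chosen pairs."""
    n = len(sample)
    D, d = divmod(N, n)
    # Assumption: the d extra pairs are drawn without replacement from the sample.
    extra = rng.sample(sample, d)
    return list(sample) * D + extra


def bootstrap_variance(sample, N, estimator, B=1000, seed=0):
    """Bootstrap variance of `estimator` from B re-samples drawn without replacement."""
    rng = random.Random(seed)
    n = len(sample)
    # Assumption: the empirical population is built once and reused for all re-samples.
    population = build_pseudo_population(sample, N, rng)
    estimates = []
    for _ in range(B):
        resample = rng.sample(population, n)  # n pairs (x*, a), without replacement
        estimates.append(estimator(resample))
    return variance(estimates)  # sample variance, divisor B - 1


def plain_mean(resample):
    """Placeholder estimator: averages the x-values and ignores the ranks."""
    return sum(x for x, _ in resample) / len(resample)
```

Any function that maps a list of (value, rank) pairs to a number can be passed as `estimator`, so the rank-weighted estimators can be substituted for `plain_mean` without changing the resampling code.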

A \((1-\gamma )100\,\%\) bootstrap percentile confidence interval is constructed by \((Q^{\gamma /2}_r,Q^{1-\gamma /2}_r)\), where \(Q^{a}_r\) is the a-th quantile of \(\hat{\mu }_{r}^{(L)}\) satisfying \(a= P(\hat{\mu }_{r}^{(L)}<Q^{a}_r | \mathcal{P})\). The quantiles \(Q^{\gamma /2}_r\) and \(Q^{1-\gamma /2}_r\) are obtained from bootstrap distribution of \(\hat{\mu }_{r}^{(L)}\).
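As a rough sketch, the percentile interval can be read off the sorted bootstrap estimates. The helper name `percentile_interval` is hypothetical, and the quantile convention (the order statistic at position \(\lceil Ba \rceil \)) is one of several in common use and is assumed here; the paper does not state which convention it applies.

```python
import math


def percentile_interval(boot_estimates, gamma=0.05):
    """Return the (gamma/2, 1 - gamma/2) empirical quantiles of the bootstrap estimates."""
    ordered = sorted(boot_estimates)
    B = len(ordered)
    # Assumed convention: the a-th quantile is the ceil(B*a)-th order statistic.
    lower_index = max(0, math.ceil(B * gamma / 2) - 1)
    upper_index = min(B - 1, math.ceil(B * (1 - gamma / 2)) - 1)
    return ordered[lower_index], ordered[upper_index]
```

The coverage probability of such an interval can then be estimated as the proportion of simulated samples whose interval contains the true population mean.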

In order to investigate the properties of the bootstrap variance estimate of \(\hat{\mu }_{r}^{(2)}\), for \(r=2,3\), and the bootstrap percentile confidence interval of the population mean, we performed another simulation study. The simulation parameters are taken to be \(n=30,50\), \(H=2,5\), \(\rho =1, 0.75, 0.5\). The shift parameter \(\mu \) is selected to be \(\mu = 0,100\). Datasets are generated from discrete normal and exponential distributions. Simulation and bootstrap replications are selected to be 2000 and 1000, respectively.

Table 5 Coverage probabilities (C) of 95 % bootstrap percentile confidence intervals of population mean based on estimators \(\hat{\mu }_{r}^{(2)}\), \(r=2,3\).

Table 5 presents 95 % coverage probabilities (C) of the bootstrap percentile confidence intervals based on estimators \(\hat{\mu }_{r}^{(2)}\), \(r=2,3\), and SRS mean. Table 5 reveals that coverage probabilities are reasonably close to the nominal coverage probability 0.95 for discrete normal distribution. The coverage probabilities for the discrete exponential distribution appear to be slightly lower than the nominal level for small sample sizes. This may be due to the fact that super-population exponential distribution is a skewed distribution. Hence, it may require larger sample sizes to satisfy the regularity conditions of the bootstrap procedure in Booth et al. (1994).

Table 6 Bootstrap (B) and simulation (S) variance estimates of the estimators \(\hat{\mu }_{r}^{(2)}\), \(r=2,3\).

Table 6 presents bootstrap (B) and simulated (S) variance estimates of the estimators \(\hat{\mu }_{r}^{(2)}\), \(r=2,3\), and the SRS mean. The bootstrap variance estimates are almost identical to those estimated from the simulation study. The simulation results provide convincing evidence that variance estimates of the proposed estimators can be computed from the bootstrap distribution.

6 Application

In this section, we apply the proposed sampling designs and estimators to the 1992 Ohio corn yield data, which were used by the Ohio Agricultural Statistics Department in its county estimation program. This dataset includes responses from farms in the USDA’s National Quarterly Agricultural Survey and from farms responding to the Ohio supplemental survey (Husby et al. 2005; see also Ohio Department of Agriculture, 1993, for published estimates based on these data). The success of the proposed sampling designs depends heavily on accurate ranking information of the population units. To get a reasonably correct rank ordering of the population units, we select our population as one of the counties from the Ohio Corn Data having 202 farms. Hence, the population size is \(N= 202\). In this population, there are five variables: corn production (bushels, X), farm size (acreage, \(Y_1\)), group size (\(Y_2\)), acres planted (\(Y_3\)), and acres harvested (\(Y_4\)). Our interest lies in estimation of the mean corn production in the county. The construction of the proposed sampling designs requires rank ordering of the 202 farms based on their corn production. We use the variables \(Y_1\), \(Y_2\), \(Y_3\), and \(Y_4\) as auxiliary variables to predict the rank ordering of X. There is high correlation between X and the auxiliary variables, \(\rho _k=\mathrm{corr}(X,Y_k)\), \(k=1,\ldots ,4\). The histogram of X reveals that the population is strongly skewed to the right. The parameters of this population are given in Ozturk (2014c) and reproduced in Table 7.

Table 7 Population characteristics of an Ohio county in 1992 Ohio corn data, \(\rho _k=\mathrm{corr}(X,Y_k)\).

The auxiliary variable group size \(Y_2\) is an integer-valued random variable which only takes values 1, 2, and 3. There are also ties in the other auxiliary variables \(Y_1\), \(Y_3\), and \(Y_4\). In order to break the ties, we generated a random vector \(\varvec{\epsilon }\) of size \(N=202\) from a normal distribution with mean 0 and standard deviation 0.001 and constructed \(\varvec{y}^*_j=\varvec{y}_j+\varvec{\epsilon }\), \(j=1,\ldots ,4\). Since all entries in \(\varvec{y}^*_j \) are unique, the rank ordering of vector \(\varvec{x}\) with no ties is estimated from \(\varvec{y}^*_j\), \(j=1,\ldots ,4\).
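The tie-breaking step can be illustrated with a short Python sketch for a single auxiliary variable. The function name `jittered_ranks` and the seed are illustrative assumptions; only the jitter distribution (normal with mean 0 and standard deviation 0.001) comes from the text.

```python
import random


def jittered_ranks(y, sd=0.001, seed=0):
    """Rank the entries of y from 1 (smallest) to N after adding small normal noise."""
    rng = random.Random(seed)
    # Add N(0, sd^2) noise so that tied values of y become distinct.
    y_star = [value + rng.gauss(0.0, sd) for value in y]
    order = sorted(range(len(y)), key=lambda i: y_star[i])
    ranks = [0] * len(y)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks
```

Because the noise is tiny relative to the spacing of distinct values, it orders tied units at random while leaving the ordering of untied units essentially unchanged.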

By treating these 202 farms as a finite population, we performed another simulation study to investigate the biases and efficiencies of \(\hat{\mu }_2^{(2)}\) and \(\hat{\mu }_3^{(2)}\). In this part of the simulation, we also included the ratio (\(\hat{\mu }_\mathrm{Ra}\)) and regression (\(\hat{\mu }_\mathrm{Reg}\)) estimators of the population mean

$$\begin{aligned} \hat{\mu }_\mathrm{Ra}=\frac{1}{N}\frac{\sum _{j=1}^n x_j }{\sum _{j=1}^ns_j}\frac{N(N+1)}{2} \hbox { and } \hat{\mu }_\mathrm{Reg}=\hat{B}_0+\hat{B}_1(N+1)/2, \end{aligned}$$

where \(\hat{B}_0\) and \(\hat{B}_1\) are the estimated regression coefficients, regressing \(x_j\) on \(s_j\). The simulation study also considered the bootstrap estimates of the standard deviations of the estimators and the coverage probabilities of the percentile confidence intervals. Samples in the simulation are selected with the following sample and set size combinations: \((n,H)=(20,10),(30,6)\), and (50, 4). Simulation and bootstrap replication sizes are taken to be 3000 and 2000, respectively.
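For concreteness, the two comparison estimators can be written as follows. This sketch assumes the regression coefficients are ordinary least squares estimates, which the text does not state explicitly, and the function names are illustrative. Both use the fact that the population mean of the ranks \(1,\ldots ,N\) is \((N+1)/2\).

```python
def ratio_estimator(x, s, N):
    """Ratio estimator of the mean: (sum x / sum s) times the mean rank (N + 1) / 2."""
    return (sum(x) / sum(s)) * (N + 1) / 2


def regression_estimator(x, s, N):
    """Regression estimator: fitted line of x on s evaluated at the mean rank."""
    n = len(x)
    s_bar = sum(s) / n
    x_bar = sum(x) / n
    # Assumption: B0 and B1 are ordinary least squares coefficients.
    s_ss = sum((si - s_bar) ** 2 for si in s)
    s_xp = sum((si - s_bar) * (xi - x_bar) for si, xi in zip(s, x))
    b1 = s_xp / s_ss
    b0 = x_bar - b1 * s_bar
    return b0 + b1 * (N + 1) / 2
```

Both estimators rely on a roughly linear relationship between the response and its population rank, which is the kind of modeling assumption the proposed estimators avoid.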

Table 8 Biases (Bi) and relative efficiencies (R) of the estimators \(\hat{\mu }_2^{(2)}\), \(\hat{\mu }_3^{(2)}\), \(\hat{\mu }_\mathrm{Ra}\), \(\hat{\mu }_\mathrm{Reg}\), and SRS mean.

Table 8 presents the biases (Bi) of \(\hat{\mu }_2^{(2)}\), \(\hat{\mu }_3^{(2)}\), \(\hat{\mu }_\mathrm{Ra}\), \(\hat{\mu }_\mathrm{Reg}\), and the SRS mean. It also contains the relative efficiencies of the estimator \(\hat{\mu }_3^{(2)}\) with respect to \(\hat{\mu }_2^{(2)}\), the SRS sample mean, \(\hat{\mu }_\mathrm{Ra}\), and \(\hat{\mu }_\mathrm{Reg}\):

$$\begin{aligned} R_2^{(2)}= & {} \frac{\mathrm{MSE}(\hat{\mu }_2^{(2)})}{\mathrm{MSE}(\hat{\mu }_3^{(2)})}, \quad R_\mathrm{SRS}^{(2)}=\frac{\mathrm{MSE}(\mathrm{SRS})}{\mathrm{MSE}(\hat{\mu }_3^{(2)})}, \\ R_\mathrm{Ra}^{(2)}= & {} \frac{\mathrm{MSE}(\hat{\mu }_\mathrm{Ra})}{\mathrm{MSE}(\hat{\mu }_3^{(2)})} \, \hbox { and } \, R_\mathrm{Reg}^{(2)} = \frac{\mathrm{MSE}(\hat{\mu }_\mathrm{Reg})}{\mathrm{MSE}(\hat{\mu }_3^{(2)})}. \end{aligned}$$

Again, relative efficiencies greater than one indicate that the estimator \(\hat{\mu }_3^{(2)}\) has the smaller mean square error.
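In a simulation, each relative efficiency is a ratio of two Monte Carlo MSE estimates. A minimal sketch, assuming the replicated estimates are held in plain lists and the true population mean \(\mu \) is known (the function names are illustrative):

```python
def mse(estimates, mu):
    """Monte Carlo mean square error of replicated estimates around the true mean mu."""
    return sum((e - mu) ** 2 for e in estimates) / len(estimates)


def relative_efficiency(competitor, reference, mu):
    """MSE(competitor) / MSE(reference); values above one favor the reference estimator."""
    return mse(competitor, mu) / mse(reference, mu)
```

Here `reference` would hold the replicated values of \(\hat{\mu }_3^{(2)}\) and `competitor` those of the estimator it is compared against.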

Biases in Table 8 follow a pattern similar to the ones we have observed in Tables 1 and 2. For large set sizes, the estimators have slightly larger negative biases than the SRS estimator. This can be anticipated from the fact that the proposed estimators borrow information from unmeasured units in a set of size H. For skewed distributions and large set sizes H, the effects of extreme observations in the sample are divided among H different strata. Hence, the influence of extreme observations on the estimator is reduced. For this reason, for skewed distributions, the estimators slightly underestimate the population mean.

The relative efficiencies also follow a pattern similar to that in Tables 3 and 4. The estimator \(\hat{\mu }_3^{(2)}\) is the best estimator, and the efficiencies increase with the quality of ranking information, set size, and sample size. The efficiency gain with respect to the SRS sample mean, ratio, and regression estimators is substantial if the ranking information is reasonably accurate.

The same dataset is analyzed in Ozturk (2014c) using RSS design by combining ranking information from different sources. Sampling designs in this paper are different from the RSS designs used in Ozturk (2014c). Our designs select simple random samples and construct weights based on global ranking information in the entire population, whereas Ozturk (2014c) uses local ranking information in a given set of size H. Since we use global ranking information in the population, our estimator performs better than RSS designs in Ozturk (2014c).

Table 9 Coverage probabilities of bootstrap percentile confidence intervals of population mean based on estimators \(\hat{\mu }_r^{(2)}\), \(r=2,3\), and SRS mean.
Table 10 Bootstrap (B)- and simulation (S)-based estimates of standard deviation of the estimators, \(\hat{\mu }_r^{(2)}\), \(r=2,3\), and SRS mean.

Table 9 presents the coverage probabilities of the bootstrap percentile confidence intervals for the population mean based on the estimators \(\hat{\mu }_2^{(2)}\), \(\hat{\mu }_3^{(2)}\), and the SRS mean. All these coverage probabilities appear to be close to the nominal value 0.95 for large sample sizes. Since the population is strongly skewed to the right, for small sample sizes the coverage probabilities are slightly smaller than the nominal coverage probability 0.95. Table 10 presents the standard error estimates of the estimators. Again, the bootstrap estimates of the standard deviations are close to the estimates of the standard deviations of the estimators from simulation.

7 Concluding remark

We have developed three sampling designs to estimate the population total and mean in a finite population setting. The proposed sampling designs select a simple random sample and identify their population ranks. Selection of the sample could be either with- or without-replacement. Replacement policy and the way that population ranks of selected units are identified define the three different sampling designs: design-0, design-1, and design-2. In these designs, population ranks of the measured units provide information about the relative positions of the sample units in the population. This positional information is used to borrow additional information from other unmeasured units in the population to reduce the uncertainty in the sample. We introduced three different estimators for the population mean and total for each one of these sampling designs. We show that the estimators perform better than simple random sample estimator as long as there is meaningful ranking information to rank the population units. The efficiencies of the estimators are an increasing function of the correlation coefficient between the response and auxiliary variable.

The population ranks of the selected units can be considered as a covariate. These ranks are not strongly attached to the measured values. They can be ignored completely, and the sample can be analyzed as a simple random sample. Similar to a ranked set sample, where a strong tie is established between the rank and the measurement, it is possible to attach ranking information to the measured units strongly to induce further stratification in the data, but the resulting sample may not be reducible to a simple random sample. In this case, the properties of this sample need to be developed further. One of our current projects investigates this type of sampling design.