Estimation and efficiency evaluation of stochastic frontier models with interval dependent variables

Hwu, Shih-Tang; Fu, Tsu-Tan; Tsay, Wen-Jen

doi:10.1007/s11123-021-00609-w

Estimation and efficiency evaluation of stochastic frontier models with interval dependent variables

Published: 26 May 2021

Volume 56, pages 33–44, (2021)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Productivity Analysis Aims and scope Submit manuscript

Estimation and efficiency evaluation of stochastic frontier models with interval dependent variables

Download PDF

Shih-Tang Hwu¹,
Tsu-Tan Fu² &
Wen-Jen Tsay³

395 Accesses
1 Citation
Explore all metrics

Abstract

This paper considers the maximum likelihood estimation of a stochastic frontier production function with an interval outcome. We derive an analytical formula for calculating the likelihood function of interval stochastic frontier models. Monte Carlo experiments reveal that the finite sample performance of our method is promising even when the sample size is relatively moderate. We also provide an exact formula for evaluating technical efficiency with interval outcome and apply our method to measure information inefficiency in the labor market for newly graduated college students in Taiwan.

Estimating efficiency effects in a panel data stochastic frontier model

Article 21 November 2019

Stochastic frontier models using the Generalized Exponential distribution

Article 02 February 2021

Stochastic frontier estimation through parametric modelling of quantile regression coefficients

Article Open access 03 July 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Consider a stochastic frontier model with the following matrix form:

$$y={X}^{\top }\beta +\varepsilon ,$$

(1)

where y and ε are N × 1 vectors of observations on the dependent variable and the random disturbance, respectively; X is an N × K matrix of observations on a constant term and K − 1 regressors; and β is a K × 1 vector of unknown regression coefficients to be estimated. The error specification is:

$$\varepsilon =v+Su,$$

(2)

where the elements of v are independent and identically distributed (iid) as $N(0,{\sigma }_{v}^{2})$, and the elements of u are the absolute value of the variables that are iid as $N(0,{\sigma }_{u}^{2})$. Here, S is a pre-specified number that equals –1 for the production frontier or 1 for the cost frontier. All $v^{\prime} s$ and $u^{\prime} s$ are independent of each other and are also independent of X. We follow the reparameterization of Aigner et al. (1977, ALS hereafter):

$${\sigma }^{2}={\sigma }_{u}^{2}+{\sigma }_{v}^{2},\ \lambda ={\sigma }_{u}/{\sigma }_{v}.$$

(3)

ALS show that the log-likelihood function for the aforementioned stochastic frontier model is:

$$L=\frac{N}{2}{\mathrm{ln}}\,\left(\frac{2}{\pi }\right)-N{\mathrm{ln}}\,(\sigma )+\mathop{\sum }\limits_{i=1}^{N}{\mathrm{ln}}\,\left[{{\Phi }}\left(\frac{S\lambda }{\sigma }{\varepsilon }_{i}\right)\right]-\frac{1}{2{\sigma }^{2}}\mathop{\sum }\limits_{i=1}^{N}{\varepsilon }_{i}^{2}.$$

(4)

Here, ${\varepsilon }_{i}={y}_{i}-{x}_{i}^{\top }\beta$, ${x}_{i}^{\top }$ is the ith row of X, and Φ(⋅) is the standard normal cumulative distribution function (cdf). The maximum likelihood estimator (MLE) is obtained by maximizing Eq. (4) with respect to the parameters ${(\beta ,{\sigma }_{u},{\sigma }_{v})}^{\top }$.

This paper extends the analysis of ALS to the scenario where the dependent variable ${y}_{i}^{* }$ is an ordered interval variable located in J + 1 regimes:

$$\left\{\begin{array}{ll}{y}_{i}^{* }={x}_{i}^{\top }\beta +{\varepsilon }_{i},&\quad i=1,2,\ldots ,N;\\ \qquad{y}_{i}=0,&\quad\,\,\,{\text{if}}\,\ {y}_{i}^{* }\le {w}_{0};\\ \qquad{y}_{i}=1,&\,\,\,\quad{\text{if}}\,\ {w}_{0}\,<\,{y}_{i}^{* }\le {w}_{1};\\ \qquad\quad\,\,\vdots &\\ \qquad{y}_{i}=J,&\,\,\,\quad{\text{if}}\,\ {y}_{i}^{* }\,>\,{w}_{J-1},\end{array}\right.$$

(5)

i.e., the cell limits w₀ < w₁ < w₂ < ⋯ < w_J − 1 are observable even though ${y}_{i}^{* }$ is unobservable. For example, as regularly found in survey questionnaires, we are unable to observe the exact value of some variable, rather just the range of it. This kind of survey structure is ubiquitously prevalent in the literature.

This paper thus suggests an analytic and highly accurate formula for evaluating the likelihood function of the SFA model in Eq. (5) and proposes an exact formula to measure the technical efficiency of the model by extending the method in Jondrow et al. (1982, JLMS hereafter). Since neither a numerical integration nor simulation-based techniques are required in the two formulae proposed in this paper, the computational burden of our approach is lowered to the limit. Amsler et al. (2019, AST hereafter) also provide a detailed discussion of the importance of using highly accurate formula to evaluate the distribution function of ε. We focus on the case where the elements of u are the absolute value of the variables that are iid as $N(0,{\sigma }_{u}^{2})$, because it is directly linked to the distribution assumption of ALS^{Footnote 1}. Interval stochastic frontier model can also be tackled with the Bayesian estimation strategy of Griffiths et al. (2014) given that u is an exponential distribution, but its implementation is not trivial. Moreover, the Monte Carlo experiments conducted herein show that the finite sample performance of our formulae is very promising.

We apply our method to the wage rates of newly graduated college students in Taiwan. The empirical results reveal the phenomenon of “school quality matters,” in which a worker graduating from a higher quality school has a higher potential wage. We also find that the undergraduate major affects the level of potential wage. Furthermore, the students of private universities are found to attain the highest efficiency score, implying private university students perform the best in their job market information search.

The remaining parts of this paper are arranged as follows. “Interval stochastic frontier model” presents the model and analytic formula of the likelihood function. “Interval efficiency estimate” provides an exact formula for measuring the technical efficiency when the conditional variable ε is only observed in a range. “Monte Carlo experiment” investigates the performance of our formulae via Monte Carlo experiments. “Empirical application” applies the proposed method to data on the wage rates of newly graduated college students in Taiwan. “Conclusion” concludes.

2 Interval stochastic frontier model

Defining f( ⋅ ) and F( ⋅ ) as the probability density and distribution functions of ε_i, respectively, the probability of ${y}_{i}^{* }$ in each regime is:

$$\left\{\begin{array}{ll}\qquad\quad{F}_{i0}=F\left({w}_{0}-{x}_{i}^{\top }\beta \right),&\,\qquad{\text{if}}\,\ {y}_{i}^{* }\le {w}_{0};\\ {F}_{i1}=F\left({w}_{1}-{x}_{i}^{\top }\beta \right)-F\left({w}_{0}-{x}_{i}^{\top }\beta \right),&\,{\text{if}}\,\ {w}_{0}\,<\,{y}_{i}^{* }\le {w}_{1};\\ \qquad\qquad\qquad\qquad\vdots &\\ \qquad{F}_{iJ}=1-F\left({w}_{J-1}-{x}_{i}^{\top }\beta \right),&\,\quad{\text{if}}\,\ {y}_{i}^{* }\,>\,{w}_{J-1}.\end{array}\right.$$

(6)

Consequently, the log-likelihood function for the model in Eq. (5) is:

$${\mathrm{ln}}\,L=\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=0}^{J}1({y}_{i}=j){\mathrm{ln}}\,{F}_{ij},$$

(7)

where 1(⋅) is the indicator function.

The major difficulty for the maximum likelihood estimation centers on computing the associated cdf functions in Eq. (7). For example, we have:

$$F\left({w}_{j}-{x}_{i}^{\top }\beta \right)={\displaystyle\int\nolimits_{-\infty }^{{w}_{j}-{x}_{i}^{\top }\beta }}f({\varepsilon }_{i})d{\varepsilon }_{i}=\frac{2}{\sigma }{\displaystyle\int\nolimits_{-\infty }^{{Q}_{ij}}}\left({\displaystyle\int\nolimits_{-\infty }^{a{\varepsilon }_{i}}}\phi (\zeta )d\zeta \right)\phi (b{\varepsilon }_{i})d{\varepsilon }_{i}=\frac{2}{\sigma }I\left({Q}_{ij},a,b\right),$$

(8)

where j = 0, 1, …, J − 1, ${Q}_{ij}={w}_{j}-{x}_{i}^{\top }\beta$, $a=S\frac{\lambda }{\sigma }$, $b=\frac{1}{\sigma }$, and ϕ(⋅) stands for the standard normal density function. Provided that we have a good approximation formula for the value of $I\left({Q}_{ij},a,b\right)$ in Eq. (8), we can accurately evaluate the likelihood function in Eq. (7). The first contribution of this paper is to derive such an approximation formula by extending the methodology of Tsay et al. (2013), who only deal with the cases S = 1 in Eq. (8). Since the formula developed in this paper does not impose a constraint on the value of S, it can be used for both cost and production function scenarios encountered in interval stochastic frontier models. Before presenting I_app(Q_ij, a, b) in the following Proposition 1, we define an error function as:

$$Erf(z)=\frac{2}{\sqrt{\pi }}{\displaystyle\int\nolimits_{0}^{z}}{e}^{-{t}^{2}}dt=2{\displaystyle\int\nolimits_{0}^{\sqrt{2}z}}\phi (t)dt.$$

(9)

Proposition 1 I(Q_ij, a, b) in Eq. (8) can be approximated by I_app(Q_ij, a, b):

$$\begin{array}{ll}{I}_{\rm{app}}({Q}_{ij},a,b)=&\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[1+Erf\left(\frac{\sqrt{{a}^{2}}{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]S}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}+\frac{\sqrt{{b}^{2}}}{2{b}^{2}}\frac{1-S}{2}\\ &-\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[Erf\left(\frac{a{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)+Erf\left(\frac{-a{c}_{1}+\sqrt{2}{Q}_{ij}\left({b}^{2}-{a}^{2}{c}_{2}\right)\rm{sign}(S{Q}_{ij})}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\\ &+\frac{Erf\left(\frac{b{Q}_{ij}}{\sqrt{2}}\right)}{2b}\frac{1+\rm{sign}(S{Q}_{ij})}{2}\end{array}$$

where c₁ = –1.0950081470333 and c₂ = –0.75651138383854.^{Footnote 2}

Proof of Proposition 1 is in the Appendix. AST and Amsler et al. (2020) address the issues of evaluating the cdf of the skew normal distribution or the composed error ε in Eq. (2). In the upper tail of the cdf of ε, the approximation formula of Tsay et al. (2013) and the formula APS-UT of APS works really well. In the lower tail, the approximation formula APS-LT of APS is recommended. We thus have the following proposition.

Proposition 2 I(Q_ij, a, b) in Eq. (8) can also be approximated by the formula APS-LT of APS

$$I({Q}_{ij},a,b)\approx \left\{\begin{array}{ll}\quad\,\,\, 2{{\Phi }}\left(\frac{{Q}_{ij}}{\sigma }\right){{\Phi }}\left(\frac{\lambda {Q}_{ij}}{\sigma }\right)\frac{1}{1+{\lambda }^{2}},&\,\,\,{\text{if}}\,\ S=1;\\ 1-2{{\Phi }}\left(\frac{-{Q}_{ij}}{\sigma }\right){{\Phi }}\left(\frac{-\lambda {Q}_{ij}}{\sigma }\right)\frac{1}{1+{\lambda }^{2}},&\,\text{if}\,\ S=-1;\end{array}\right.$$

when the value of Q_ij approaches the lower tail of the distribution.

For detailed simulations and discussions of Proposition 2, readers are referred to APS. With the combination of I_app(Q_ij, a, b) or APS-LT, the cdf of ε in Eq. (8) can be easily evaluated with a standard statistical package. This implies that the computation of the associated likelihood function in Eq. (7) is straightforward when using these formulae.

3 Interval efficiency estimate

This section addresses the important issue of ranking the relative efficiency level of different decision units given that the parameters in Eq. (5) are estimated. ALS defined the efficiency level of different units as $\exp \left(S{u}_{i}\right)$. Since u_i is unobservable, JLMS propose to estimate the efficiency level of each unit for the case S = −1 as:

$$\exp \left(-{\widehat{u}}_{i}\right),\ \ {\widehat{u}}_{i}=E\left({u}_{i}| {\varepsilon }_{i}\right)=S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}{\varepsilon }_{i}+\frac{{\sigma }_{u}{\sigma }_{v}}{\sigma }\frac{\phi \left(\lambda \frac{{\varepsilon }_{i}}{\sigma }\right)}{{{\Phi }}\left(S\lambda \frac{{\varepsilon }_{i}}{\sigma }\right)}.$$

(10)

This result is widely used in the literature. Nevertheless, it only can be used when the dependent variable is continuous. The interval nature of the dependent variable in Eq. (5) requires a new formula to evaluate the conditional mean of u_i when ε_i is only observed in a range. In particular, for y_i = 0 in Eq. (5), we observe the upper bound of ε_i as:

$${\widehat{\varepsilon }}_{i}\le {w}_{0}-{x}_{i}^{\top }\widehat{\beta },$$

(11)

where $\widehat{\beta }$ are the maximum likelihood estimates. For y_i ∈ {1, …, J − 1}, we observe both the upper and the lower bound of ε_i as:

$${w}_{j-1}-{x}_{i}^{\top }\widehat{\beta }\le {\widehat{\varepsilon }}_{i}\le {w}_{j}-{x}_{i}^{\top }\widehat{\beta },j=1,\ldots ,J-1$$

(12)

For y_i = J, we observe the lower bound of ε_i as:

$${\widehat{\varepsilon }}_{i}\ge {w}_{J-1}-{x}_{i}^{\top }\widehat{\beta }.$$

(13)

Because we can only observe the boundary of ε_i, the conditional expectations cannot be directly evaluated with the method of JLMS. We thus fill the gap in the literature by providing an exact formula to fulfill our goal.

Without loss of generality, we assume the censoring interval of dependent variable is at $\left(a,b\right)$, and the conditional expectation denotes E(u_i∣a < ε_i < b). Defining ${[{\mathbb{D}}(.)]}_{a}^{b}={\mathbb{D}}(b)-{\mathbb{D}}(a)$, we state our theoretical finding in the following proposition.

Proposition 3 Let ${\lambda }^{* }=\sqrt{1+{\lambda }^{2}}$. E(u_i∣a < ε_i < b) is expressed as:

$$\frac{S\frac{{\sigma }_{u}^{2}}{\sigma }\left\{-\sigma {\left[f\left(\varepsilon \right)\right]}_{a}^{b}+\frac{2\lambda }{\sqrt{2\pi }{\lambda }^{* }}{\left[{{\Phi }}\left(\frac{S{\lambda }^{* }}{\sigma }\varepsilon \right)\right]}_{a}^{b}\right\}+\frac{{\sigma }_{u}{\sigma }_{v}}{\sqrt{2\pi }{\lambda }^{* }\sigma }{\left[erf\left(\frac{{\lambda }^{* }}{\sqrt{2}\sigma }\varepsilon \right)\right]}_{a}^{b}}{{[F\left(\varepsilon \right)]}_{a}^{b}}.$$

Derivations of Proposition 3 are in the Appendix. The formula in Proposition 3 is exact and can be easily computed with standard statistical packages, given that the denominator, ${[F\left(\varepsilon \right)]}_{a}^{b}$, can be evaluated with the results in Proposition 1 and Proposition 2. We also establish that the results in Proposition 3 are identical to those in JLMS when the censoring interval degenerates to 0 in the following corollary.

Corollary 1 Let c and ξ be any finite constant such that:

$$\lim\limits_{\xi \to 0}E\left({u}_{i}| c\,<\,{\varepsilon }_{i}\,< \,c+\xi \right)\to E\left({u}_{i}| {\varepsilon }_{i}=c\right),$$

where $E\left({u}_{i}| {\varepsilon }_{i}=c\right)$ is the formula in JLMS.

Proof of the above Corollary is also put in the Appendix.

4 Monte Carlo experiment

We consider two sets of Monte Carlo experiments in this section: (1) the finite sample performance of MLE when the data-generating processes (DGP) are the form in Eq. (5), and (2) the accuracy of the interval efficiency estimate in Proposition 3. All the programs are written in GAUSS.

4.1 Finite sample performance of MLE

Following Olson et al. (1980, p. 76) and Tsay et al. (2013), we consider a set of experiments with a two-regressor model:

$${y}_{i}^{* l}={\beta }_{0}+{\beta }_{1}{x}_{i}+{\varepsilon }_{i}^{l},\ i=1,2,\ldots ,N,\ l=1,2,\ldots ,1000,$$

(14)

where l denotes the lth replication of the data, and the regressors x_i are only drawn once from N(0,1) and kept fixed throughout the simulation. Without loss of generality, we consider the case where J = 3; i.e., we divide the data into four regimes depending on whether ${y}_{i}^{* l}$ is less than w₀, between w₀ and w₁, between w₁ and w₂, or greater than w₂. We consider three sets of (w₀, w₁, w₂, β₀, β₁, σ_u, σ_v) to check the robustness of the simulation results and four sample sizes, N = 100, 400, 1600, and 6400. Since our model is directly linked to the model in ALS and ${y}_{i}^{* l}$ is known to us, we also estimate Eq. (14) using the log-likelihood function in Eq. (4) for comparison purpose. The optimization algorithm used to implement MLE for both models is the quasi-Newton algorithm of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) contained in the GAUSS MAXLIK library. The maximum number of iterations for each replication is 200.

The results in Tables 1 and 2 show the bias and mean squared errors (MSE) of the maximum likelihood estimates of ALS’s model and our model, respectively. For all the combinations of parameters presented in Tables 1 and 2, we find the bias is small and the MSE always decreases with an increasing sample size. The results confirm the promising performance of the proposed I_app approximation formula, and enhance our confidence in using I_app for the interval SFA models. We also find that the bias and MSE in Table 2 are larger than those in Table 1. This is consistent with the fact that our dependent variable is limited and thus contains less information than the continuous variable ${y}_{i}^{* l}$.

Table 1 Finite sample properties of MLE for ALS’s model in Eq. (4)

Full size table

Table 2 Finite sample properties of MLE with the analytic formula in Proposition 1

Full size table

4.2 Accuracy of the interval efficiency estimate

This subsection illustrates the accuracy of the interval efficiency estimator proposed in Proposition 3. Following Kumbhakar and Lothgren (1998), we obtain 100 million draws of u_i and v_i, which are generated from ${N}^{+}\left(0,{\sigma }_{u}^{2}\right)$ and $N\left(0,{\sigma }_{v}^{2}\right)$, respectively. We control the variance of two error terms, σ_u and σ_v, and the variance ratio, λ, that reflects the contribution of the variance of u to the total variance of the error term ε. In particular, three different variance ratios λ = (0.5, 1, 2) and six censoring intervals of ε, $\left(a,b\right)=\left(-4,-2\right)$, $\left(-2,-1\right),\left(-1,-0.5\right),\left(-0.5,0\right),\left(0,1\right)$, and $\left(1,3\right)$, are employed.

Table 3 presents the simulation results. For all 18 combinations of parameters and censoring intervals, the largest difference between the values calculated by Proposition 3 and the values calculated by 100 million draws is 0.0006. The results again confirm the accuracy of Proposition 3.

Table 3 Accuracy of interval efficiency estimate

Full size table

5 Empirical application

This paper applies the proposed method to the earning frontier analysis of newly graduated students in Taiwan. The data used in the empirical example reported below are drawn from a survey conducted by Peng (2005) on Taiwanese college graduates who graduated in 2003. The survey employed a proportionate stratified two-stage random sampling framework in which schools and majors were used as the stratum. To serve the purpose of this study, the survey data are further screened to include only full-time workers, excluding part-time workers, self-employed, family workers, and graduates in military service or in prison. The above screening of the survey data yields a total of 3973 sample workers.

Following the literature of the human capital theory and the framework of Hofler and Polachek (1985), we develop a stochastic earning frontier function to empirically measure the labor market inefficiency of recent Taiwanese college graduates. Hofler and Polachek (1985) define the wage differential between the maximum potential wage that any given individual can attain and the wage the person actually earns to be the inefficiency that may be attributed to incomplete worker information. To accommodate the problem that the wage differential may include pure inefficiency as well as unobserved heterogeneity, previous studies have suggested to include demographic data in the earning function (Hofler and Polachek 1985; Polachek and Yoon 1996).

The dependent variable is the natural logarithm of a worker’s monthly wage (measured by New Taiwan Dollar). It is located in four possible categories^{Footnote 3}: less than ${\mathrm{ln}}\,(22800)$, between ${\mathrm{ln}}\,(22800)$ and ${\mathrm{ln}}\,(28800)$, between ${\mathrm{ln}}\,(28800)$ and ${\mathrm{ln}}\,(36300)$, and more than ${\mathrm{ln}}\,(36300)$. The explanatory variables (X) used in estimating the maximum potential wage cover a set of socio-demographic factors. More specifically, the explanatory variables are related to education or work experience, gender, work status, school type, and individual majors in college.

Education/Work experience:

Experience = 1 if a worker ever had a part-time job during undergraduate schooling; Experience = 0, otherwise. Since all subjects are recent graduates (with one year of work experience), they all have the same education level and number of working years. However, those who had a part-time work experience during college are expected to have more labor market information, which may lead to a better paying job.

Gender:

Male = 1 if male; and Male = 0, otherwise. Females are usually paid less due to sex discrimination in the job market.

Work Status:

Hours = Worker’s weekly working hours. Workers with longer working hours are expected to earn more.

Knowledge Match:

Knowledge Match = 1 if a worker’s undergraduate major matches the job requirement; Knowledge Match = 0, otherwise. The knowledge match is postulated to have a positive impact on the wage rate.

Public Sector:

Public Sector = 1 if working for the government; and Public Sector = 0, otherwise. Government employees in Taiwan on average are better paid than their private counterparts.

School Type:

Public University = 1 if a worker graduated from a public (comprehensive) university; and Public University = 0, otherwise.

Public Vocational College = 1 if a worker graduated from a public vocational college; and Public Vocational College = 0, otherwise.

Private Vocational College = 1 if a worker graduated from a private vocational college; and Private Vocational College = 0, otherwise.

Taiwan’s higher education institutions can be classified by types into public/private or comprehensive/vocational. Among the four different types, Private University is used as the base for regression measurement. School type may reflect school quality. Workers who graduated from better quality schools are expected to earn more (Card and Krueger, 1992). In Taiwan, public schools are found to be of better quality than private schools, whereas comprehensive universities are on average better than vocational colleges in school quality (Fu, 2011).

Undergraduate Major:

Social Science = 1 if a worker graduated with a major in social science; Social Science = 0, otherwise.

Technology = 1 if a worker graduated with a major in engineering and technology; Technology = 0, otherwise.

The majors of undergraduates are classified into Social Science, Technology and Engineering, and Humanities. The Humanities category is used as the base for the regression analysis. In Taiwan’s labor market, workers with a technology major are generally paid well, followed by Humanities and Social Science.

It is worth to note that our control variables did not include the variables of workers’ seniority, i.e., age and experience of working, as our dataset does not provide the observations of these two variables. Nevertheless, this will not cause much loss of information, because the respondents of our dataset are newly graduated college students who have very little formal working experience, and they are about of the same age^{Footnote 4}.

Table 4 shows the mean statistics of variables used in the interval SFA model. Among all 3973 subjects of the four types of schools, 68.92% of them (38.54% from Private University and 30.38% from Private Vocational College) graduated from private schools. Public schools make up only 31.08% of the subjects, with 21.02% from Public University and 10.07% from Public Vocational College. Public University has a relatively high mean value in wage, knowledge match, and working in the public sector, as compared to the other three types of schools. The mean wages of public schools are averagely higher than those in private schools. We also find that the sample graduates from both Public and Private Vocational Schools have a higher percentage (46–55%) of obtaining a Technology and Engineering degree than those from general comprehensive schools, whereas private schools are found to have a higher percentage of graduates who majored in Social Science than do public schools.

Table 4 Summary statistics (sample mean)

Full size table

Table 5 shows that the sign of the estimates in the earning frontier function matches our expectations. Workers obtain a higher potential wage when working in the public sector, taking a job with a knowledge match, and having more working hours. The positive sign of Male indicates the existence of gender discrimination in Taiwan’s labor market. The school type variables show a positive sign for Public University and negative signs for both types of vocational schools, as compared to Private University.

Table 5 Estimating log wage equation with the interval SFA model

Full size table

School type variables are proxies for school quality, and it is widely recognized in Taiwan that the quality of public schools is on average higher than that of private schools, as are general comprehensive universities versus vocational colleges. Therefore, the empirical results indicate the phenomenon of "quality matters," in which a worker graduating from a higher quality school will have a higher potential wage. In our study, the sample graduates from public universities have the highest potential wage. Table 4 also shows that undergraduate major affects the level of potential wage. Those who majored in technology and engineering (social science) have a higher (lower) potential wage than those with a humanities major.

Given that the parameters are estimated, we now apply the results in Proposition 3 to measure the relative efficiency level of different units. The summary statistics and histograms of the resulting estimates are displayed in Table 6 and Figs. 1–4, respectively. From these efficiency estimates, we find that private university graduates achieve the highest average efficiency of 0.8406 as compared to the value of 0.8380 for public university graduates, but the difference is quite small. This implies that private universities try to help their graduates to get a better payoff so as to raise their ability of enrolling new students when facing the difficulty of attracting the best students in Taiwan. Interestingly, the lowest value of efficiency found is 0.5217, which is from graduates of a public university.

Table 6 Summary statistics of the efficiency estimates

Full size table

6 Conclusion

This paper extends the analysis of ALS to the scenario where the dependent variable is an ordered interval variable by recognizing that the interval coding data are commonly collected in surveys to individuals or decision units. In such surveys, the subjects are often asked to answer questions on their performance by choosing among some ordered interval coding values. We thus provide an analytic formula for evaluating the associated likelihood function and propose an exact formula to measure the technical efficiency by generalizing the method in JLMS. Our approaches do not depend on any numerical integration or simulation-based techniques and thus can be implemented straightforwardly.

The current paper illustrates an application of the proposed model for measuring labor market information inefficiency when the earning variable is measured by interval outcomes. Other potential applications are abundant, including survey studies that adopt contingency valuation methods to examine the values of hypothetical quality improvement of public goods or non-market goods (Carson 2012). In particular, we can use the willingness to pay (WTP) for quality improvement of environmental goods such as water, air, or any such amenity. Studies on WTP for hypothetical non-market goods such as national security, conservation of public parks or natural resources, and new medical treatments are also abundant in both the environment and health-related literature. Since WTP is often expressed in an interval form due to the contingency valuation survey design, our proposed model can therefore be used to measure the inefficiency of WTP reporting.

Notes

Other distribution assumptions for u have also been discussed in the literature, including truncated normal distribution (Stevenson 1980), gamma distribution (Greene 1990), and Weibull distribution (Tsionas 2007). Recently, Badade and Ramanathan (2020) propose a probabilistic frontier model using a logit model specification.
For discussion and evaluation of c₁ and c₂, readers are referred to page 261 of Tsay et al. (2013).
The original survey conducted by Peng (2005) has nine categories for worker’s monthly wage: (1) less than 15,840, (2) between 15,840 and 22,800, (3) between 22,800 and 28,800, (4) between 28,800 and 36,300, (5) between 36,300 and 45,800, (6) between 45,800 and 57,800, (7) between 57,800 and 72,800, (8) between 72,800 and 83,900, and (9) more than 83,900. However, some categories contain only small amount of observations. For example, there are only 39 observations in the first category and only 213 observations in total in categories (6)–(9). To keep the number of observations in each category more balanced, we merge the first two categories, and we combine categories (5)–(9) into one.
The efficiency estimates certainly depend on the specification of model used for empirical analysis and the distribution assumption imposed on composite error. Therefore, we treat our empirical results as a starting point of analyzing the market inefficiency of recent Taiwanese college graduates. More empirical works and data collection are useful to enhance our knowledge about this important issue.

References

Abramowitz M, Stegun IA (1970) Handbook of mathematical functions. Dover, New York
Google Scholar
Aigner D, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier production function models. J Econom 6:21–37
Article Google Scholar
Amsler C, Papadopoulos A, Schmidt P (2020) Evaluating the CDF of the skew normal distribution. Empir Econ. https://doi.org/10.1007/s00181-020-01868-6
Amsler C, Schmidt P, Tsay WJ (2019) Evaluating the CDF of the distribution of the stochastic frontier composed error. J Prod Anal 52(1):29–35
Article Google Scholar
Badade M, Ramanathan TV (2020) Probabilistic frontier regression model for multinomial ordinal type output data. J Prod Anal 53:339–354
Article Google Scholar
Card D, Krueger A (1992) Does school quality matter? Returns to education and the characteristics of public schools in the United States. J Political Econ 100(1):1–40
Article Google Scholar
Carson R (2012) Contingent valuation: a comprehensive bibliography and history. Edward Elgar Publishing.
Flecher C, Allard D, Naveau P (2009) Truncated skew-normal distributions: estimation by weighted moments and application to climatic data. Metron 68:331–345
Article Google Scholar
Fu T-T (2011) School quality, operational efficiency, and optimal size: an analysis of higher education institutions in Taiwan. J Res Educ Sci 56(3):181–213
Google Scholar
Greene WH (1990) A Gamma-distributed stochastic frontier model. J Econom 46:141–163
Article Google Scholar
Griffiths W, Zhang X, Zhao X (2014) Estimation and efficiency measurement in stochastic production frontiers with ordinal outcomes. J Prod Anal 42(1):67–84
Article Google Scholar
Hofler RA, Polachek SW (1985) A new approach for measuring wage ignorance in the labor market. J Econ Bus 37(3):267–276
Article Google Scholar
Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical inefficiency in the stochastic frontier production function model. J Econom 19:233–238
Article Google Scholar
Kumbhakar SC, Lothgren M (1998) Monte Carlo analysis of technical inefficiency predictors. Working paper series in Economics and Finance, No. 229, Stockholm School of Economics
Olson JA, Schmidt P, Waldman DM (1980) A Monte Carlo study of estimators of stochastic frontier production functions. J Econom 13(1):67–82
Article Google Scholar
Peng SM (2005) Taiwan higher education data system and its applications. National Science Council research report, National Tsing-Hua University, Taiwan
Polachek SW, Yoon BJ (1996) Panel estimates of a two-tiered earnings frontier. J Appl Econom 11(2):169–178
Article Google Scholar
Stevenson RE (1980) Likelihood functions for generalized stochastic frontier estimation. J Econom 13:57–66
Article Google Scholar
Tsay W-J, Huang CJ, Fu T-T, Ho I-L (2013) A simple closed-form approximation for the cumulative distribution function of the composite error of stochastic frontier models. J Prod Anal 39:259–269
Article Google Scholar
Tsionas EG (2007) Efficiency measurement with the Weibull stochastic frontier. Oxf Bull Econ Stat 69:693–706
Article Google Scholar

Download references

Acknowledgements

We thank for the valuable comments from Peter Schmidt of Michigan State University and the participants of the DEAIC2017 conference at Hefei, China.

Author information

Authors and Affiliations

Department of Economics, California State Polytechnic University, Pomona, CA, USA
Shih-Tang Hwu
Department of Economics, Soochow University, Taipei, Taiwan, Republic of China
Tsu-Tan Fu
Institute of Economics, Academia Sinica, Taipei, Taiwan, Republic of China
Wen-Jen Tsay

Authors

Shih-Tang Hwu
View author publications
You can also search for this author in PubMed Google Scholar
Tsu-Tan Fu
View author publications
You can also search for this author in PubMed Google Scholar
Wen-Jen Tsay
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Shih-Tang Hwu, Tsu-Tan Fu or Wen-Jen Tsay.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

For convenience of exposition, we suppress subindex throughout the Appendix, and two equations from (Abramowitz and Stegun, 1970, Eqs. 7.11 and 7.4.32) are given:

$$Erf(z)=\frac{2}{\sqrt{\pi }}{\displaystyle\int\nolimits_{0}^{z}}\exp (-{t}^{2})=2{\displaystyle\int\nolimits_{0}^{\sqrt{2}z}}\phi (t)dt,$$

$$\displaystyle\int \exp -\left(k{x}^{2}+2mx+n\right)dx=\frac{1}{2}\sqrt{\frac{\pi }{k}}\exp \left(\frac{{m}^{2}-kn}{k}\right)Erf\left(\sqrt{k}x+\frac{m}{\sqrt{k}}\right)+C,\ k\,\ne \,0,$$

where C denotes a finite constant.

1.1 Derivation of Proposition 1

Given that (Q, a, b) ∈ R, b > 0, Erf(− x) = − Erf(x), and define $\varepsilon =\sqrt{2}v$, we have:

$$\begin{array}{ll}I(Q,a,b)&=\sqrt{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\left({\displaystyle\int\nolimits_{-\infty }^{\sqrt{2}av}}\phi \left(\zeta \right)d\zeta \right)\phi \left(\sqrt{2}bv\right)dv\\ &=\frac{\sqrt{2}}{2}{\displaystyle\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\end{array}$$

Tsay et al. (2013) show that Erf(x) can be well approximated by a function, $g(x)=1-\exp \left({c}_{1}x+{c}_{2}{x}^{2}\right)$ for x ≥ 0, where c₁ and c₂ are discussed in “Interval stochastic frontier model.” We divide the derivation into four cases: (Q ≥ 0, a ≥ 0), (Q ≤ 0, a ≥ 0), (Q ≥ 0, a ≤ 0), and (Q ≤ 0, a ≤ 0).

Case 1. (Q ≥ 0, a ≥ 0):

$$\begin{array}{ll}I(Q,a,b)\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\quad\quad\,\,=\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{0}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\quad\quad+\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\quad\quad\,\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{0}^{\infty }}\left(1-Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\qquad+\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\quad\quad\,\,\approx \frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{0}^{\infty }}\exp \left({c}_{1}av+{c}_{2}{a}^{2}{v}^{2}\right)\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad+\,\sqrt{2}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad-\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\exp \left({c}_{1}av+{c}_{2}{a}^{2}{v}^{2}\right)\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\quad\quad\,\,=\frac{1}{2\sqrt{\pi }}{\displaystyle\int\nolimits_{0}^{\infty }}\exp \left(-({b}^{2}-{a}^{2}{c}_{2}){v}^{2}+a{c}_{1}v\right)dv\\ \qquad\qquad\qquad+\,\frac{1}{\sqrt{\pi }}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad-\,\frac{1}{2\sqrt{\pi }}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\exp \left(-({b}^{2}-{a}^{2}{c}_{2}){v}^{2}+a{c}_{1}v\right)dv.\end{array}$$

When we use (7.4.32) of Abramowitz and Stegun (1970), I(Q, a, b) in this case can be approximated by:

$$\begin{array}{ll}I(Q,a,b)\,\approx \frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[1-Erf\left(\frac{-a{c}_{1}+\sqrt{2}Q({b}^{2}-{a}^{2}{c}_{2})}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}+\frac{1}{2b}Erf\left(\frac{bQ}{\sqrt{2}}\right).\end{array}$$

Case 2. (Q ≤ 0, a ≥ 0):

$$\begin{array}{ll}I(Q,a,b)\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\,\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\frac{Q}{\sqrt{2}}}^{\infty }}\left(1-Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\,\,\approx \frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\frac{Q}{\sqrt{2}}}^{\infty }}\exp \left({c}_{1}av+{c}_{2}{a}^{2}{v}^{2}\right)\frac{1}{\sqrt{2\pi }}\exp \left(-{b}^{2}{v}^{2}\right)dv\\ \qquad\qquad\,\,=\frac{1}{2\sqrt{\pi }}{\displaystyle\int\nolimits_{-\frac{Q}{\sqrt{2}}}^{\infty }}\exp \left(-({b}^{2}-{a}^{2}{c}_{2}){v}^{2}+a{c}_{1}v\right)dv.\end{array}$$

When we use (7.4.32) of Abramowitz and Stegun (1970), I(Q, a, b) in this case can be approximated by:

$$I(Q,a,b)\approx \frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[1-Erf\left(\frac{-a{c}_{1}-\sqrt{2}Q({b}^{2}-{a}^{2}{c}_{2})}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}.$$

Case 3. (Q ≥ 0, a ≤ 0):

$$\begin{array}{ll}I(Q,a,b)\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\,\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{0}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\qquad+\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{0}^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\,\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{0}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\qquad+\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\frac{Q}{\sqrt{2}}}^{0}}\left(1-Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\,\,\approx -\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{0}}\exp \left({c}_{1}av+{c}_{2}{a}^{2}{v}^{2}\right)\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad+\,\sqrt{2}{\displaystyle\int\nolimits_{-\infty }^{0}}\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad+\,\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\frac{Q}{\sqrt{2}}}^{0}}\exp \left({c}_{1}av+{c}_{2}{a}^{2}{v}^{2}\right)\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\,\,=-\frac{1}{2\sqrt{\pi }}{\displaystyle\int\nolimits_{-\infty }^{0}}\exp \left(-({b}^{2}-{a}^{2}{c}_{2}){v}^{2}+a{c}_{1}v\right)dv\\ \qquad\qquad\qquad+\,\frac{1}{\sqrt{\pi }}{\displaystyle\int\nolimits_{-\infty }^{0}}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad+\,\frac{1}{2\sqrt{\pi }}{\displaystyle\int\nolimits_{-\frac{Q}{\sqrt{2}}}^{0}}\exp \left(-({b}^{2}-{a}^{2}{c}_{2}){v}^{2}+a{c}_{1}v\right)dv.\end{array}$$

When we use (7.4.32) of Abramowitz and Stegun (1970), I(Q, a, b) in this case can be approximated by:

$$\begin{array}{ll}I(Q,a,b)\,\approx -\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[1+Erf\left(\frac{\sqrt{{a}^{2}}{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}+\frac{1}{2b}\\ \qquad\qquad\quad\,\,-\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[Erf\left(\frac{\sqrt{{a}^{2}}{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)+Erf\left(\frac{-a{c}_{1}-\sqrt{2}Q({b}^{2}-{a}^{2}{c}_{2})}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}.\end{array}$$

Case 4. (Q ≤ 0, a ≤ 0):

$$\begin{array}{ll}I(Q,a,b)\,=\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\left(1+Erf\left(av\right)\right)\phi \left(\sqrt{2}bv\right)dv\\ \qquad\qquad\,\,\approx -\frac{\sqrt{2}}{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\exp \left({c}_{1}av+{c}_{2}{a}^{2}{v}^{2}\right)\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\qquad+\,\sqrt{2}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\frac{1}{\sqrt{2\pi }}\exp (-{b}^{2}{v}^{2})dv\\ \qquad\qquad\,\,=-\frac{1}{2\sqrt{\pi }}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\exp \left(-({b}^{2}-{a}^{2}{c}_{2}){v}^{2}+a{c}_{1}v\right)dv\\ \qquad\qquad\qquad+\,\frac{1}{\sqrt{\pi }}{\displaystyle\int\nolimits_{-\infty }^{\frac{Q}{\sqrt{2}}}}\exp (-{b}^{2}{v}^{2})dv.\end{array}$$

When we use (7.4.32) of Abramowitz and Stegun (1970), I(Q, a, b) in this case can be approximated by:

$$\begin{array}{ll}I(Q,a,b)\,\approx -\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[1+Erf\left(\frac{\sqrt{{a}^{2}}{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}+\frac{1}{2b}\\ \qquad\qquad\quad\,\,-\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[Erf\left(\frac{\sqrt{{a}^{2}}{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)+Erf\left(\frac{-a{c}_{1}+\sqrt{2}Q({b}^{2}-{a}^{2}{c}_{2})}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}+\frac{1}{2b}Erf\left(\frac{bQ}{\sqrt{2}}\right).\end{array}$$

Combining the results in Cases 1–4, we obtain the result in Proposition 1.

1.2 Derivation of Proposition 3

$$\begin{array}{ll}&E\left(u| a\,<\,\varepsilon \,<\, b\right)\\ &=E\left(E\left(u| \varepsilon \right)| a\,<\,\varepsilon \,<\, b\right)\\ &=\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}{\displaystyle\int\nolimits_{a}^{b}}E\left(u| \varepsilon \right)f\left(\varepsilon \right)d\varepsilon \\ &=\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}{\displaystyle\int\nolimits_{a}^{b}}\left(S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}\varepsilon +\frac{{\sigma }_{u}{\sigma }_{v}}{\sigma }\frac{\phi \left(\lambda \frac{\varepsilon }{\sigma }\right)}{{{\Phi }}\left(S\lambda \frac{\varepsilon }{\sigma }\right)}\right)\left(\frac{2}{\sigma }\phi \left(\frac{\varepsilon }{\sigma }\right){{\Phi }}\left(S\lambda \frac{\varepsilon }{\sigma }\right)\right)d\varepsilon \\ &=\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}\left[{\displaystyle\int\nolimits_{a}^{b}}S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}\varepsilon \frac{2}{\sigma }\phi \left(\frac{\varepsilon }{\sigma }\right){{\Phi }}\left(S\lambda \frac{\varepsilon }{\sigma }\right)d\varepsilon +{\displaystyle\int\nolimits_{a}^{b}}\frac{2{\sigma }_{u}{\sigma }_{v}}{{\sigma }^{2}}\phi \left(\frac{\varepsilon }{\sigma }\right)\phi \left(\lambda \frac{\varepsilon }{\sigma }\right)d\varepsilon \right]\\ &={E}_{1}+{E}_{2}.\end{array}$$

We first rewrite E₁ as:

$$\begin{array}{ll}{E}_{1}=\,\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}\frac{S{\sigma }_{u}^{2}}{{\sigma }^{2}}{\displaystyle\int\nolimits_{a}^{b}}\varepsilon \ f\left(\varepsilon \right)d\varepsilon \\ \quad\,\,=S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}E\left(\varepsilon | a\,< \,\varepsilon \,<\, b\right).\end{array}$$

As we can see, the integral part in E₁ is a censored conditional expectation of $f\left({\varepsilon }_{}\right)$ given that ε ∈ (a, b). Using Proposition 1 in Flecher et al. (2009), we can evaluate E₁ as:

$${E}_{1}=S\frac{{\sigma }_{u}^{2}}{\sigma }\left\{-\sigma \frac{{\left[f(\varepsilon )\right]}_{a}^{b}}{{\left[F(\varepsilon )\right]}_{a}^{b}}+\frac{2\lambda }{\sqrt{2\pi }{\lambda }^{* }}\frac{{\left[{{\Phi }}\left(\frac{S{\lambda }^{* }}{\sigma }\varepsilon \right)\right]}_{a}^{b}}{{\left[F(\varepsilon )\right]}_{a}^{b}}\right\},$$

where ${\lambda }^{* }=\sqrt{1+{\lambda }^{2}}$.

We can also recast E₂ as:

$$\begin{array}{ll}{E}_{2}=\,\frac{2{\sigma }_{u}{\sigma }_{v}}{{\sigma }^{2}}\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}{\displaystyle\int\nolimits_{a}^{b}}\phi \left(\frac{\varepsilon }{\sigma }\right)\phi \left(\lambda \frac{\varepsilon }{\sigma }\right)d\varepsilon \\ \quad\,\,=\frac{2{\sigma }_{u}{\sigma }_{v}}{{\sigma }^{2}}\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}{\displaystyle\int\nolimits_{a}^{b}}\frac{1}{2\pi }\exp \left[-\frac{1}{2}{\left(\frac{\varepsilon }{\sigma }\right)}^{2}\right]\exp \left[-\frac{{\lambda }^{2}}{2}{\left(\frac{\varepsilon }{\sigma }\right)}^{2}\right]d\varepsilon \\ \quad\,\,=\frac{{\sigma }_{u}{\sigma }_{v}}{\pi {\sigma }^{2}}\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}{\displaystyle\int\nolimits_{a}^{b}}\exp \left(-\frac{{{\lambda }^{* }}^{2}{\varepsilon }^{2}}{2{\sigma }^{2}}\right)d\varepsilon .\end{array}$$

Using (7.4.32) of Abramowitz and Stegun (1970), we can evaluate E₂ as:

$$\begin{array}{ll}{E}_{2}=\frac{{\sigma }_{u}{\sigma }_{v}}{\pi {\sigma }^{2}}\frac{1}{{[F\left(\varepsilon \right)]}_{a}^{b}}{\left[\frac{1}{2}\sqrt{\frac{\pi }{\frac{{{\lambda }^{* }}^{2}}{2{\sigma }^{2}}}}erf\left(\frac{{\lambda }^{* }}{\sqrt{2}\sigma }\varepsilon \right)\right]}_{a}^{b}\\ \quad\,\,=\frac{{\sigma }_{u}{\sigma }_{v}}{\sqrt{2\pi }\sigma {\lambda }^{* }}\frac{{\left[erf\left(\frac{{\lambda }^{* }}{\sqrt{2}\sigma }\varepsilon \right)\right]}_{a}^{b}}{{\left[F\left(\varepsilon \right)\right]}_{a}^{b}}.\end{array}$$

With the above results, we now obtain the JLMS interval efficiency estimate.

1.3 Proof of Corollary 1

We express the limit of Proposition 3 as:

$$\begin{array}{ll}\lim\limits_{\delta \to 0}E\left(u| c\,<\, \varepsilon \,<\,c+\delta \right)\,=\lim\limits_{\delta \to 0}S\frac{{\sigma }_{u}^{2}}{\sigma }\left\{-\sigma \frac{{\left[f\left(\varepsilon \right)\right]}_{c}^{c+\delta }}{{\left[F\left(\varepsilon \right)\right]}_{c}^{c+\delta }}+\frac{2\lambda }{\sqrt{2\pi }{\lambda }^{* }}\frac{{\left[{{\Phi }}\left(\frac{S{\lambda }^{* }}{\sigma }\varepsilon \right)\right]}_{c}^{c+\delta }}{{\left[F\left(\varepsilon \right)\right]}_{c}^{c+\delta }}\right\}\\ \qquad\qquad\qquad\qquad\qquad\qquad+\lim\limits_{\delta \to 0}\frac{\frac{{\sigma }_{u}{\sigma }_{v}}{\sqrt{2\pi }\sigma {\lambda }^{* }}{\left[erf\left(\frac{{\lambda }^{* }}{\sqrt{2}{\sigma }^{2}}\varepsilon \right)\right]}_{c}^{c+\delta }}{{\left[F\left(\varepsilon \right)\right]}_{c}^{c+\delta }}\\ \qquad\qquad\qquad\qquad\qquad\,\,=\lim\limits_{\delta \to 0}S\frac{{\sigma }_{u}^{2}}{\sigma }\frac{-\frac{2S\lambda }{\sigma }\phi \left(\frac{c+\delta }{\sigma }\right)\phi \left(\lambda \frac{c+\delta }{\sigma }\right)-\frac{2c}{{\sigma }^{2}}\phi \left(\frac{c+\delta }{\sigma }\right){{\Phi }}\left(\lambda \frac{c+\delta }{\sigma }\right)}{f\left(c+\delta \right)}\\ \qquad\qquad\qquad\qquad\qquad\qquad+\lim\limits_{\delta \to 0}S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}\frac{2S\lambda }{\sqrt{2\pi }}\frac{\phi \left({\lambda }^{* }\frac{c+\delta }{\sigma }\right)}{f\left(c+\delta \right)}\\ \qquad\qquad\qquad\qquad\qquad\qquad+\lim\limits_{\delta \to 0}\frac{2}{\sqrt{2\pi }}\frac{{\sigma }_{u}{\sigma }_{v}}{{\sigma }^{2}}\frac{\phi \left({\lambda }^{* }\frac{c+\delta }{\sigma }\right)}{f\left(c+\delta \right)}\\ \qquad\qquad\qquad\qquad\qquad\,\,=S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}c-\frac{\frac{2{S}^{2}\lambda }{\sigma }\phi \left(\frac{c}{\sigma }\right)\phi \left(\lambda \frac{c}{\sigma }\right)-\frac{2{S}^{2}\lambda }{\sqrt{2\pi }\sigma }\phi \left({\lambda }^{* }\frac{c}{\sigma }\right)}{f\left(c\right)}\\ \qquad\qquad\qquad\qquad\qquad\qquad+\frac{2}{\sqrt{2\pi }}\frac{{\sigma }_{u}{\sigma }_{v}}{{\sigma }^{2}}\frac{\phi \left({\lambda }^{* }\frac{c}{\sigma }\right)}{f\left(c\right)}\\ \qquad\qquad\qquad\qquad\qquad\,\,=S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}c+\frac{{\sigma }_{u}{\sigma }_{v}}{\sigma }\frac{\phi \left(\lambda \frac{c}{\sigma }\right)}{{{\Phi }}\left(S\lambda \frac{c}{\sigma }\right)},\end{array}$$

where the second equality is based on L’hospital Rule, and the fourth equality follows from the observation that $\phi \left(\frac{c+\delta }{\sigma }\right)\phi \left(\lambda \frac{c+\delta }{\sigma }\right)=\frac{1}{\sqrt{2\pi }}\phi \left({\lambda }^{* }\frac{c+\delta }{\sigma }\right)$.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hwu, ST., Fu, TT. & Tsay, WJ. Estimation and efficiency evaluation of stochastic frontier models with interval dependent variables. J Prod Anal 56, 33–44 (2021). https://doi.org/10.1007/s11123-021-00609-w

Download citation

Accepted: 28 April 2021
Published: 26 May 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s11123-021-00609-w

Keywords

JEL

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Estimation and efficiency evaluation of stochastic frontier models with interval dependent variables

Abstract

Similar content being viewed by others

Estimating efficiency effects in a panel data stochastic frontier model

Stochastic frontier models using the Generalized Exponential distribution

Stochastic frontier estimation through parametric modelling of quantile regression coefficients

1 Introduction

2 Interval stochastic frontier model

3 Interval efficiency estimate