1 Introduction

Consider a stochastic frontier model with the following matrix form:

$$y={X}^{\top }\beta +\varepsilon ,$$
(1)

where y and ε are N × 1 vectors of observations on the dependent variable and the random disturbance, respectively; X is an N × K matrix of observations on a constant term and K − 1 regressors; and β is a K × 1 vector of unknown regression coefficients to be estimated. The error specification is:

$$\varepsilon =v+Su,$$
(2)

where the elements of v are independent and identically distributed (iid) as \(N(0,{\sigma }_{v}^{2})\), and the elements of u are the absolute value of the variables that are iid as \(N(0,{\sigma }_{u}^{2})\). Here, S is a pre-specified number that equals –1 for the production frontier or 1 for the cost frontier. All \(v^{\prime} s\) and \(u^{\prime} s\) are independent of each other and are also independent of X. We follow the reparameterization of Aigner et al. (1977, ALS hereafter):

$${\sigma }^{2}={\sigma }_{u}^{2}+{\sigma }_{v}^{2},\ \lambda ={\sigma }_{u}/{\sigma }_{v}.$$
(3)

ALS show that the log-likelihood function for the aforementioned stochastic frontier model is:

$$L=\frac{N}{2}{\mathrm{ln}}\,\left(\frac{2}{\pi }\right)-N{\mathrm{ln}}\,(\sigma )+\mathop{\sum }\limits_{i=1}^{N}{\mathrm{ln}}\,\left[{{\Phi }}\left(\frac{S\lambda }{\sigma }{\varepsilon }_{i}\right)\right]-\frac{1}{2{\sigma }^{2}}\mathop{\sum }\limits_{i=1}^{N}{\varepsilon }_{i}^{2}.$$
(4)

Here, \({\varepsilon }_{i}={y}_{i}-{x}_{i}^{\top }\beta\), \({x}_{i}^{\top }\) is the ith row of X, and Φ(⋅) is the standard normal cumulative distribution function (cdf). The maximum likelihood estimator (MLE) is obtained by maximizing Eq. (4) with respect to the parameters \({(\beta ,{\sigma }_{u},{\sigma }_{v})}^{\top }\).

This paper extends the analysis of ALS to the scenario where the dependent variable \({y}_{i}^{* }\) is an ordered interval variable located in J + 1 regimes:

$$\left\{\begin{array}{ll}{y}_{i}^{* }={x}_{i}^{\top }\beta +{\varepsilon }_{i},&\quad i=1,2,\ldots ,N;\\ \qquad{y}_{i}=0,&\quad\,\,\,{\text{if}}\,\ {y}_{i}^{* }\le {w}_{0};\\ \qquad{y}_{i}=1,&\,\,\,\quad{\text{if}}\,\ {w}_{0}\,<\,{y}_{i}^{* }\le {w}_{1};\\ \qquad\quad\,\,\vdots &\\ \qquad{y}_{i}=J,&\,\,\,\quad{\text{if}}\,\ {y}_{i}^{* }\,>\,{w}_{J-1},\end{array}\right.$$
(5)

i.e., the cell limits w0 < w1 < w2 < ⋯ < wJ − 1 are observable even though \({y}_{i}^{* }\) is unobservable. For example, as regularly found in survey questionnaires, we are unable to observe the exact value of some variable, rather just the range of it. This kind of survey structure is ubiquitously prevalent in the literature.

This paper thus suggests an analytic and highly accurate formula for evaluating the likelihood function of the SFA model in Eq. (5) and proposes an exact formula to measure the technical efficiency of the model by extending the method in Jondrow et al. (1982, JLMS hereafter). Since neither a numerical integration nor simulation-based techniques are required in the two formulae proposed in this paper, the computational burden of our approach is lowered to the limit. Amsler et al. (2019, AST hereafter) also provide a detailed discussion of the importance of using highly accurate formula to evaluate the distribution function of ε. We focus on the case where the elements of u are the absolute value of the variables that are iid as \(N(0,{\sigma }_{u}^{2})\), because it is directly linked to the distribution assumption of ALSFootnote 1. Interval stochastic frontier model can also be tackled with the Bayesian estimation strategy of Griffiths et al. (2014) given that u is an exponential distribution, but its implementation is not trivial. Moreover, the Monte Carlo experiments conducted herein show that the finite sample performance of our formulae is very promising.

We apply our method to the wage rates of newly graduated college students in Taiwan. The empirical results reveal the phenomenon of “school quality matters,” in which a worker graduating from a higher quality school has a higher potential wage. We also find that the undergraduate major affects the level of potential wage. Furthermore, the students of private universities are found to attain the highest efficiency score, implying private university students perform the best in their job market information search.

The remaining parts of this paper are arranged as follows. “Interval stochastic frontier model” presents the model and analytic formula of the likelihood function. “Interval efficiency estimate” provides an exact formula for measuring the technical efficiency when the conditional variable ε is only observed in a range. “Monte Carlo experiment” investigates the performance of our formulae via Monte Carlo experiments. “Empirical application” applies the proposed method to data on the wage rates of newly graduated college students in Taiwan. “Conclusion” concludes.

2 Interval stochastic frontier model

Defining f( ⋅ ) and F( ⋅ ) as the probability density and distribution functions of εi, respectively, the probability of \({y}_{i}^{* }\) in each regime is:

$$\left\{\begin{array}{ll}\qquad\quad{F}_{i0}=F\left({w}_{0}-{x}_{i}^{\top }\beta \right),&\,\qquad{\text{if}}\,\ {y}_{i}^{* }\le {w}_{0};\\ {F}_{i1}=F\left({w}_{1}-{x}_{i}^{\top }\beta \right)-F\left({w}_{0}-{x}_{i}^{\top }\beta \right),&\,{\text{if}}\,\ {w}_{0}\,<\,{y}_{i}^{* }\le {w}_{1};\\ \qquad\qquad\qquad\qquad\vdots &\\ \qquad{F}_{iJ}=1-F\left({w}_{J-1}-{x}_{i}^{\top }\beta \right),&\,\quad{\text{if}}\,\ {y}_{i}^{* }\,>\,{w}_{J-1}.\end{array}\right.$$
(6)

Consequently, the log-likelihood function for the model in Eq. (5) is:

$${\mathrm{ln}}\,L=\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=0}^{J}1({y}_{i}=j){\mathrm{ln}}\,{F}_{ij},$$
(7)

where 1(⋅) is the indicator function.

The major difficulty for the maximum likelihood estimation centers on computing the associated cdf functions in Eq. (7). For example, we have:

$$F\left({w}_{j}-{x}_{i}^{\top }\beta \right)={\displaystyle\int\nolimits_{-\infty }^{{w}_{j}-{x}_{i}^{\top }\beta }}f({\varepsilon }_{i})d{\varepsilon }_{i}=\frac{2}{\sigma }{\displaystyle\int\nolimits_{-\infty }^{{Q}_{ij}}}\left({\displaystyle\int\nolimits_{-\infty }^{a{\varepsilon }_{i}}}\phi (\zeta )d\zeta \right)\phi (b{\varepsilon }_{i})d{\varepsilon }_{i}=\frac{2}{\sigma }I\left({Q}_{ij},a,b\right),$$
(8)

where j = 0, 1, …, J − 1, \({Q}_{ij}={w}_{j}-{x}_{i}^{\top }\beta\), \(a=S\frac{\lambda }{\sigma }\), \(b=\frac{1}{\sigma }\), and ϕ(⋅) stands for the standard normal density function. Provided that we have a good approximation formula for the value of \(I\left({Q}_{ij},a,b\right)\) in Eq. (8), we can accurately evaluate the likelihood function in Eq. (7). The first contribution of this paper is to derive such an approximation formula by extending the methodology of Tsay et al. (2013), who only deal with the cases S = 1 in Eq. (8). Since the formula developed in this paper does not impose a constraint on the value of S, it can be used for both cost and production function scenarios encountered in interval stochastic frontier models. Before presenting Iapp(Qij, a, b) in the following Proposition 1, we define an error function as:

$$Erf(z)=\frac{2}{\sqrt{\pi }}{\displaystyle\int\nolimits_{0}^{z}}{e}^{-{t}^{2}}dt=2{\displaystyle\int\nolimits_{0}^{\sqrt{2}z}}\phi (t)dt.$$
(9)

Proposition 1 I(Qij, a, b) in Eq. (8) can be approximated by Iapp(Qij, a, b):

$$\begin{array}{ll}{I}_{\rm{app}}({Q}_{ij},a,b)=&\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[1+Erf\left(\frac{\sqrt{{a}^{2}}{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]S}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}+\frac{\sqrt{{b}^{2}}}{2{b}^{2}}\frac{1-S}{2}\\ &-\frac{\exp \left(\frac{{a}^{2}{c}_{1}^{2}}{4{b}^{2}-4{a}^{2}{c}_{2}}\right)\left[Erf\left(\frac{a{c}_{1}}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)+Erf\left(\frac{-a{c}_{1}+\sqrt{2}{Q}_{ij}\left({b}^{2}-{a}^{2}{c}_{2}\right)\rm{sign}(S{Q}_{ij})}{2\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\right)\right]}{4\sqrt{{b}^{2}-{a}^{2}{c}_{2}}}\\ &+\frac{Erf\left(\frac{b{Q}_{ij}}{\sqrt{2}}\right)}{2b}\frac{1+\rm{sign}(S{Q}_{ij})}{2}\end{array}$$

where c1 = –1.0950081470333 and c2 = –0.75651138383854.Footnote 2

Proof of Proposition 1 is in the Appendix. AST and Amsler et al. (2020) address the issues of evaluating the cdf of the skew normal distribution or the composed error ε in Eq. (2). In the upper tail of the cdf of ε, the approximation formula of Tsay et al. (2013) and the formula APS-UT of APS works really well. In the lower tail, the approximation formula APS-LT of APS is recommended. We thus have the following proposition.

Proposition 2 I(Qij, a, b) in Eq. (8) can also be approximated by the formula APS-LT of APS

$$I({Q}_{ij},a,b)\approx \left\{\begin{array}{ll}\quad\,\,\, 2{{\Phi }}\left(\frac{{Q}_{ij}}{\sigma }\right){{\Phi }}\left(\frac{\lambda {Q}_{ij}}{\sigma }\right)\frac{1}{1+{\lambda }^{2}},&\,\,\,{\text{if}}\,\ S=1;\\ 1-2{{\Phi }}\left(\frac{-{Q}_{ij}}{\sigma }\right){{\Phi }}\left(\frac{-\lambda {Q}_{ij}}{\sigma }\right)\frac{1}{1+{\lambda }^{2}},&\,\text{if}\,\ S=-1;\end{array}\right.$$

when the value of Qij approaches the lower tail of the distribution.

For detailed simulations and discussions of Proposition 2, readers are referred to APS. With the combination of Iapp(Qij, a, b) or APS-LT, the cdf of ε in Eq. (8) can be easily evaluated with a standard statistical package. This implies that the computation of the associated likelihood function in Eq. (7) is straightforward when using these formulae.

3 Interval efficiency estimate

This section addresses the important issue of ranking the relative efficiency level of different decision units given that the parameters in Eq. (5) are estimated. ALS defined the efficiency level of different units as \(\exp \left(S{u}_{i}\right)\). Since ui is unobservable, JLMS propose to estimate the efficiency level of each unit for the case S = −1 as:

$$\exp \left(-{\widehat{u}}_{i}\right),\ \ {\widehat{u}}_{i}=E\left({u}_{i}| {\varepsilon }_{i}\right)=S\frac{{\sigma }_{u}^{2}}{{\sigma }^{2}}{\varepsilon }_{i}+\frac{{\sigma }_{u}{\sigma }_{v}}{\sigma }\frac{\phi \left(\lambda \frac{{\varepsilon }_{i}}{\sigma }\right)}{{{\Phi }}\left(S\lambda \frac{{\varepsilon }_{i}}{\sigma }\right)}.$$
(10)

This result is widely used in the literature. Nevertheless, it only can be used when the dependent variable is continuous. The interval nature of the dependent variable in Eq. (5) requires a new formula to evaluate the conditional mean of ui when εi is only observed in a range. In particular, for yi = 0 in Eq. (5), we observe the upper bound of εi as:

$${\widehat{\varepsilon }}_{i}\le {w}_{0}-{x}_{i}^{\top }\widehat{\beta },$$
(11)

where \(\widehat{\beta }\) are the maximum likelihood estimates. For yi ∈ {1, …, J − 1}, we observe both the upper and the lower bound of εi as:

$${w}_{j-1}-{x}_{i}^{\top }\widehat{\beta }\le {\widehat{\varepsilon }}_{i}\le {w}_{j}-{x}_{i}^{\top }\widehat{\beta },j=1,\ldots ,J-1$$
(12)

For yi = J, we observe the lower bound of εi as:

$${\widehat{\varepsilon }}_{i}\ge {w}_{J-1}-{x}_{i}^{\top }\widehat{\beta }.$$
(13)

Because we can only observe the boundary of εi, the conditional expectations cannot be directly evaluated with the method of JLMS. We thus fill the gap in the literature by providing an exact formula to fulfill our goal.

Without loss of generality, we assume the censoring interval of dependent variable is at \(\left(a,b\right)\), and the conditional expectation denotes E(uia < εi < b). Defining \({[{\mathbb{D}}(.)]}_{a}^{b}={\mathbb{D}}(b)-{\mathbb{D}}(a)\), we state our theoretical finding in the following proposition.

Proposition 3 Let \({\lambda }^{* }=\sqrt{1+{\lambda }^{2}}\). E(uia < εi < b) is expressed as:

$$\frac{S\frac{{\sigma }_{u}^{2}}{\sigma }\left\{-\sigma {\left[f\left(\varepsilon \right)\right]}_{a}^{b}+\frac{2\lambda }{\sqrt{2\pi }{\lambda }^{* }}{\left[{{\Phi }}\left(\frac{S{\lambda }^{* }}{\sigma }\varepsilon \right)\right]}_{a}^{b}\right\}+\frac{{\sigma }_{u}{\sigma }_{v}}{\sqrt{2\pi }{\lambda }^{* }\sigma }{\left[erf\left(\frac{{\lambda }^{* }}{\sqrt{2}\sigma }\varepsilon \right)\right]}_{a}^{b}}{{[F\left(\varepsilon \right)]}_{a}^{b}}.$$

Derivations of Proposition 3 are in the Appendix. The formula in Proposition 3 is exact and can be easily computed with standard statistical packages, given that the denominator, \({[F\left(\varepsilon \right)]}_{a}^{b}\), can be evaluated with the results in Proposition 1 and Proposition 2. We also establish that the results in Proposition 3 are identical to those in JLMS when the censoring interval degenerates to 0 in the following corollary.

Corollary 1 Let c and ξ be any finite constant such that:

$$\lim\limits_{\xi \to 0}E\left({u}_{i}| c\,<\,{\varepsilon }_{i}\,< \,c+\xi \right)\to E\left({u}_{i}| {\varepsilon }_{i}=c\right),$$

where \(E\left({u}_{i}| {\varepsilon }_{i}=c\right)\) is the formula in JLMS.

Proof of the above Corollary is also put in the Appendix.

4 Monte Carlo experiment

We consider two sets of Monte Carlo experiments in this section: (1) the finite sample performance of MLE when the data-generating processes (DGP) are the form in Eq. (5), and (2) the accuracy of the interval efficiency estimate in Proposition 3. All the programs are written in GAUSS.

4.1 Finite sample performance of MLE

Following Olson et al. (1980, p. 76) and Tsay et al. (2013), we consider a set of experiments with a two-regressor model:

$${y}_{i}^{* l}={\beta }_{0}+{\beta }_{1}{x}_{i}+{\varepsilon }_{i}^{l},\ i=1,2,\ldots ,N,\ l=1,2,\ldots ,1000,$$
(14)

where l denotes the lth replication of the data, and the regressors xi are only drawn once from N(0,1) and kept fixed throughout the simulation. Without loss of generality, we consider the case where J = 3; i.e., we divide the data into four regimes depending on whether \({y}_{i}^{* l}\) is less than w0, between w0 and w1, between w1 and w2, or greater than w2. We consider three sets of (w0, w1, w2, β0, β1, σu, σv) to check the robustness of the simulation results and four sample sizes, N = 100, 400, 1600, and 6400. Since our model is directly linked to the model in ALS and \({y}_{i}^{* l}\) is known to us, we also estimate Eq. (14) using the log-likelihood function in Eq. (4) for comparison purpose. The optimization algorithm used to implement MLE for both models is the quasi-Newton algorithm of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) contained in the GAUSS MAXLIK library. The maximum number of iterations for each replication is 200.

The results in Tables 1 and 2 show the bias and mean squared errors (MSE) of the maximum likelihood estimates of ALS’s model and our model, respectively. For all the combinations of parameters presented in Tables 1 and 2, we find the bias is small and the MSE always decreases with an increasing sample size. The results confirm the promising performance of the proposed Iapp approximation formula, and enhance our confidence in using Iapp for the interval SFA models. We also find that the bias and MSE in Table 2 are larger than those in Table 1. This is consistent with the fact that our dependent variable is limited and thus contains less information than the continuous variable \({y}_{i}^{* l}\).

Table 1 Finite sample properties of MLE for ALS’s model in Eq. (4)
Table 2 Finite sample properties of MLE with the analytic formula in Proposition 1

4.2 Accuracy of the interval efficiency estimate

This subsection illustrates the accuracy of the interval efficiency estimator proposed in Proposition 3. Following Kumbhakar and Lothgren (1998), we obtain 100 million draws of ui and vi, which are generated from \({N}^{+}\left(0,{\sigma }_{u}^{2}\right)\) and \(N\left(0,{\sigma }_{v}^{2}\right)\), respectively. We control the variance of two error terms, σu and σv, and the variance ratio, λ, that reflects the contribution of the variance of u to the total variance of the error term ε. In particular, three different variance ratios λ = (0.5, 1, 2) and six censoring intervals of ε, \(\left(a,b\right)=\left(-4,-2\right)\), \(\left(-2,-1\right),\left(-1,-0.5\right),\left(-0.5,0\right),\left(0,1\right)\), and \(\left(1,3\right)\), are employed.

Table 3 presents the simulation results. For all 18 combinations of parameters and censoring intervals, the largest difference between the values calculated by Proposition 3 and the values calculated by 100 million draws is 0.0006. The results again confirm the accuracy of Proposition 3.

Table 3 Accuracy of interval efficiency estimate

5 Empirical application

This paper applies the proposed method to the earning frontier analysis of newly graduated students in Taiwan. The data used in the empirical example reported below are drawn from a survey conducted by Peng (2005) on Taiwanese college graduates who graduated in 2003. The survey employed a proportionate stratified two-stage random sampling framework in which schools and majors were used as the stratum. To serve the purpose of this study, the survey data are further screened to include only full-time workers, excluding part-time workers, self-employed, family workers, and graduates in military service or in prison. The above screening of the survey data yields a total of 3973 sample workers.

Following the literature of the human capital theory and the framework of Hofler and Polachek (1985), we develop a stochastic earning frontier function to empirically measure the labor market inefficiency of recent Taiwanese college graduates. Hofler and Polachek (1985) define the wage differential between the maximum potential wage that any given individual can attain and the wage the person actually earns to be the inefficiency that may be attributed to incomplete worker information. To accommodate the problem that the wage differential may include pure inefficiency as well as unobserved heterogeneity, previous studies have suggested to include demographic data in the earning function (Hofler and Polachek 1985; Polachek and Yoon 1996).

The dependent variable is the natural logarithm of a worker’s monthly wage (measured by New Taiwan Dollar). It is located in four possible categoriesFootnote 3: less than \({\mathrm{ln}}\,(22800)\), between \({\mathrm{ln}}\,(22800)\) and \({\mathrm{ln}}\,(28800)\), between \({\mathrm{ln}}\,(28800)\) and \({\mathrm{ln}}\,(36300)\), and more than \({\mathrm{ln}}\,(36300)\). The explanatory variables (X) used in estimating the maximum potential wage cover a set of socio-demographic factors. More specifically, the explanatory variables are related to education or work experience, gender, work status, school type, and individual majors in college.

Education/Work experience:

 Experience = 1 if a worker ever had a part-time job during undergraduate schooling; Experience = 0, otherwise. Since all subjects are recent graduates (with one year of work experience), they all have the same education level and number of working years. However, those who had a part-time work experience during college are expected to have more labor market information, which may lead to a better paying job.

Gender:

 Male = 1 if male; and Male = 0, otherwise. Females are usually paid less due to sex discrimination in the job market.

Work Status:

 Hours = Worker’s weekly working hours. Workers with longer working hours are expected to earn more.

Knowledge Match:

 Knowledge Match = 1 if a worker’s undergraduate major matches the job requirement; Knowledge Match = 0, otherwise. The knowledge match is postulated to have a positive impact on the wage rate.

Public Sector:

 Public Sector = 1 if working for the government; and Public Sector = 0, otherwise. Government employees in Taiwan on average are better paid than their private counterparts.

School Type:

 Public University = 1 if a worker graduated from a public (comprehensive) university; and Public University = 0, otherwise.

 Public Vocational College = 1 if a worker graduated from a public vocational college; and Public Vocational College = 0, otherwise.

 Private Vocational College = 1 if a worker graduated from a private vocational college; and Private Vocational College = 0, otherwise.

 Taiwan’s higher education institutions can be classified by types into public/private or comprehensive/vocational. Among the four different types, Private University is used as the base for regression measurement. School type may reflect school quality. Workers who graduated from better quality schools are expected to earn more (Card and Krueger, 1992). In Taiwan, public schools are found to be of better quality than private schools, whereas comprehensive universities are on average better than vocational colleges in school quality (Fu, 2011).

Undergraduate Major:

 Social Science = 1 if a worker graduated with a major in social science; Social Science = 0, otherwise.

 Technology = 1 if a worker graduated with a major in engineering and technology; Technology = 0, otherwise.

 The majors of undergraduates are classified into Social Science, Technology and Engineering, and Humanities. The Humanities category is used as the base for the regression analysis. In Taiwan’s labor market, workers with a technology major are generally paid well, followed by Humanities and Social Science.

It is worth to note that our control variables did not include the variables of workers’ seniority, i.e., age and experience of working, as our dataset does not provide the observations of these two variables. Nevertheless, this will not cause much loss of information, because the respondents of our dataset are newly graduated college students who have very little formal working experience, and they are about of the same ageFootnote 4.

Table 4 shows the mean statistics of variables used in the interval SFA model. Among all 3973 subjects of the four types of schools, 68.92% of them (38.54% from Private University and 30.38% from Private Vocational College) graduated from private schools. Public schools make up only 31.08% of the subjects, with 21.02% from Public University and 10.07% from Public Vocational College. Public University has a relatively high mean value in wage, knowledge match, and working in the public sector, as compared to the other three types of schools. The mean wages of public schools are averagely higher than those in private schools. We also find that the sample graduates from both Public and Private Vocational Schools have a higher percentage (46–55%) of obtaining a Technology and Engineering degree than those from general comprehensive schools, whereas private schools are found to have a higher percentage of graduates who majored in Social Science than do public schools.

Table 4 Summary statistics (sample mean)

Table 5 shows that the sign of the estimates in the earning frontier function matches our expectations. Workers obtain a higher potential wage when working in the public sector, taking a job with a knowledge match, and having more working hours. The positive sign of Male indicates the existence of gender discrimination in Taiwan’s labor market. The school type variables show a positive sign for Public University and negative signs for both types of vocational schools, as compared to Private University.

Table 5 Estimating log wage equation with the interval SFA model

School type variables are proxies for school quality, and it is widely recognized in Taiwan that the quality of public schools is on average higher than that of private schools, as are general comprehensive universities versus vocational colleges. Therefore, the empirical results indicate the phenomenon of "quality matters," in which a worker graduating from a higher quality school will have a higher potential wage. In our study, the sample graduates from public universities have the highest potential wage. Table 4 also shows that undergraduate major affects the level of potential wage. Those who majored in technology and engineering (social science) have a higher (lower) potential wage than those with a humanities major.

Given that the parameters are estimated, we now apply the results in Proposition 3 to measure the relative efficiency level of different units. The summary statistics and histograms of the resulting estimates are displayed in Table 6 and Figs. 14, respectively. From these efficiency estimates, we find that private university graduates achieve the highest average efficiency of 0.8406 as compared to the value of 0.8380 for public university graduates, but the difference is quite small. This implies that private universities try to help their graduates to get a better payoff so as to raise their ability of enrolling new students when facing the difficulty of attracting the best students in Taiwan. Interestingly, the lowest value of efficiency found is 0.5217, which is from graduates of a public university.

Table 6 Summary statistics of the efficiency estimates
Fig. 1
figure 1

The wage efficiency of public vocational college graduates

Fig. 2
figure 2

The wage efficiency of public university graduates

Fig. 3
figure 3

The wage efficiency of private vocational college graduates

Fig. 4
figure 4

The wage efficiency of private university graduates

6 Conclusion

This paper extends the analysis of ALS to the scenario where the dependent variable is an ordered interval variable by recognizing that the interval coding data are commonly collected in surveys to individuals or decision units. In such surveys, the subjects are often asked to answer questions on their performance by choosing among some ordered interval coding values. We thus provide an analytic formula for evaluating the associated likelihood function and propose an exact formula to measure the technical efficiency by generalizing the method in JLMS. Our approaches do not depend on any numerical integration or simulation-based techniques and thus can be implemented straightforwardly.

The current paper illustrates an application of the proposed model for measuring labor market information inefficiency when the earning variable is measured by interval outcomes. Other potential applications are abundant, including survey studies that adopt contingency valuation methods to examine the values of hypothetical quality improvement of public goods or non-market goods (Carson 2012). In particular, we can use the willingness to pay (WTP) for quality improvement of environmental goods such as water, air, or any such amenity. Studies on WTP for hypothetical non-market goods such as national security, conservation of public parks or natural resources, and new medical treatments are also abundant in both the environment and health-related literature. Since WTP is often expressed in an interval form due to the contingency valuation survey design, our proposed model can therefore be used to measure the inefficiency of WTP reporting.