1 Introduction

1.1 Data Envelopment Analysis (DEA)

The success of an organization or unit, whether government or private, depends on its efficiency. Measuring the efficiency of units of a similar type not only helps to identify the shortcomings of a unit, but also supports its development by eradicating or minimizing those shortcomings, thereby contributing to the development of the country as a whole.

The first and foremost principle of efficiency is to attain the best outcome through the minimum utilization of resources. Efficiency measurement and efficiency enhancement are a crucial part of every organization's future development [1]. However, measuring efficiency, though extremely important, is not an easy task. In simple cases, when units have a single output and a single input, efficiency can be measured as the ratio of the two. Things become challenging when efficiency has to be calculated from multiple inputs and multiple outputs. This problem has attracted scientists and researchers working in decision-making and operations research for many years.

In 1957, Farrell introduced the assessment of productive efficiency in his classical paper [2]. About twenty years later, in 1978, Charnes et al. [3] introduced a methodology named Data Envelopment Analysis (DEA) to calculate the relative efficiency of units based on multiple outputs and multiple inputs. They defined efficiency as the ratio of the weighted sum of outputs to the weighted sum of inputs, and gave the mathematical representation that is described in detail below. Some notable advantages of DEA are [4]:

  • It can handle multiple output and multiple input variables, which may have different units.

  • It is capable of handling qualitative as well as quantitative data.

  • It acts as an effective decision-making tool by directing the attention of management to the areas that can be improved.

The literature reveals that publications on DEA grew steadily from 1978 to 1995, and that from 1995 onwards they have risen exponentially, both in terms of theoretical development and in terms of diverse applications such as banking [5], transportation [6], health sector [7], education [8], tourism [9], finance [10], sports [11], and many more. This trend is clearly reflected in Fig. 1.

Fig. 1 DEA publications since 1980

1.2 Basic DEA Model

We now come to the basic DEA model, which laid the foundation for several modified DEA variants. The idea was conceptualized by Charnes et al. in their highly cited paper published in 1978 [3]. DEA is a two-step process: in the first step the problem is formulated in fractional form, which is simplified to a linear form in the second step [12].

Efficiency, by definition, is the ratio of output to input, as given in Eq. (1).

$$\begin{aligned} \text{ Efficiency } =\frac{ \text{ Outputs } }{ \text{ Inputs } } \end{aligned}$$
(1)

However, things become tricky in the case of multiple inputs and multiple outputs, which is usually the situation in real-life scenarios.

This is where DEA comes to the rescue: efficiency is calculated as the ratio of the weighted sum of outputs to the weighted sum of inputs, as represented in Eq. (2).

$$\begin{aligned} \text{ Efficiency } =\frac{ \text{ Weighted } \text{ sum } \text{ of } \text{ the } \text{ Output } }{\text{ Weighted } \text{ sum } \text{ of } \text{ the } \text{ Inputs } } \end{aligned}$$
(2)
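As a minimal sketch of Eq. (2), the ratio can be evaluated for a single unit under two different sets of weights. The function name `weighted_efficiency`, the data, and the weights are illustrative assumptions and do not come from the cited papers.

```python
def weighted_efficiency(outputs, inputs, out_weights, in_weights):
    """Ratio of the weighted sum of outputs to the weighted sum of inputs (Eq. 2)."""
    if len(outputs) != len(out_weights) or len(inputs) != len(in_weights):
        raise ValueError("each output and input needs exactly one weight")
    weighted_out = sum(w * y for w, y in zip(out_weights, outputs))
    weighted_in = sum(w * x for w, x in zip(in_weights, inputs))
    if weighted_in <= 0:
        raise ValueError("weighted sum of inputs must be positive")
    return weighted_out / weighted_in


# Hypothetical unit with two outputs and two inputs.
outputs, inputs = [40.0, 10.0], [20.0, 5.0]

# The same unit scored under two arbitrary (assumed) weight sets.
score_a = weighted_efficiency(outputs, inputs, [0.01, 0.02], [0.03, 0.08])
score_b = weighted_efficiency(outputs, inputs, [0.02, 0.00], [0.04, 0.04])
print(score_a, score_b)
```

The two scores differ although the data are identical, which illustrates why the choice of weights matters and why DEA lets each unit select the weights most favourable to itself.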

To explain Eq. (2) more concretely, assume that there are T output criteria \(y_t\) \((t=1,\ldots ,T)\) and S input criteria \(x_s\) \((s=1,\ldots ,S)\), where the outputs are to be maximized and the inputs are to be minimized, and that M DMUs, indexed by \(y=1,2,\ldots ,M\), are considered. The efficiency of the \(z^{th}\) DMU is then maximized through the following fractional program:

$$\begin{aligned} \begin{array}{ll} \max E_{z}=\frac{\sum _{t=1}^{T}c_{t z} y_{t z}}{\sum _{s=1}^{S}d_{s z} x_{s z}}\\ \text {s.t.}\\ 0 \le \frac{\sum _{t=1}^{T} c_{t z} y_{t y}}{\sum _{s=1}^{S} d_{s z} x_{s y}} \le 1,\quad y=1,2, \ldots , M\\ c_{t z}, d_{s z} \ge 0, s=1,2, \ldots , S,\quad t=1,2, \ldots , T \end{array} \end{aligned}$$
(3)

If the efficiency score of a DMU is one, it is called a relatively efficient DMU; otherwise it is called a relatively inefficient DMU [13]. Once the problem has been formulated with the input and output criteria, it is simplified into a linear programming (LP) model by fixing the weighted sum of the inputs of the zth DMU to one, as follows.

$$\begin{aligned} \begin{array}{ll} \max E_{z}=\sum _{t=1}^{T} c_{t z} y_{t z}\\ \text {s.t.}\\ \sum _{t=1}^{T} c_{t z} y_{t y}-\sum _{s=1}^{S} d_{s z} x_{s y} \le 0,\quad y=1,2, \ldots , M \\ \sum _{s=1}^{S} d_{s z} x_{s z}=1 \\ c_{t z}, d_{s z} \ge 0, s=1,2, \ldots , S,\quad t=1,2, \ldots , T \end{array} \end{aligned}$$
(4)

where \(E_{z}\) is the efficiency of the zth DMU, \(y_{tz}\) is the tth output of the zth DMU, \(c_{tz}\) is the weight of the tth output, \(x_{sz}\) is the sth input of the zth DMU, and \(d_{sz}\) is the weight of the sth input.
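The LP above can be sketched with a general-purpose solver. The example below is a minimal illustration, assuming SciPy's `linprog` as the solver (the source does not prescribe one); the function name `ccr_efficiency` and the four-DMU data set (two inputs, one unit output) are hypothetical. The normalization constraint is applied to the DMU under evaluation and the ratio constraint to every DMU.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_efficiency(X, Y, z):
    """CCR efficiency of DMU z from the multiplier LP.

    X has shape (M, S) with inputs, Y has shape (M, T) with outputs,
    one row per DMU.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    M, S = X.shape
    T = Y.shape[1]
    # Decision vector: [c_1..c_T, d_1..d_S]. linprog minimizes, so negate.
    cost = np.concatenate([-Y[z], np.zeros(S)])
    # Weighted outputs minus weighted inputs <= 0 for every DMU.
    A_ub = np.hstack([Y, -X])
    b_ub = np.zeros(M)
    # Normalization: weighted inputs of DMU z equal one.
    A_eq = np.concatenate([np.zeros(T), X[z]]).reshape(1, -1)
    b_eq = [1.0]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (T + S))
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


# Hypothetical data: DMUs A, B, C, D with two inputs and one output each.
X = [[2, 4], [4, 2], [4, 4], [3, 3]]
Y = [[1], [1], [1], [1]]
ccr_scores = [ccr_efficiency(X, Y, z) for z in range(len(X))]
print(ccr_scores)
```

In this toy data set A, B and D lie on the frontier, while C uses more of both inputs than D for the same output and therefore scores below one. The model is solved once per DMU.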

1.3 Significance of the Present Study

Several authors have surveyed different aspects of the DEA literature in the past. Cooper et al. [14] discussed the theory of DEA models in their review. Adler et al. [15] reviewed different ranking methods in DEA. Kuah et al. [16] focused on the methodological development of DEA. Liu et al. [17] conducted a citation-based survey covering 1978-2010 and also reviewed the literature on DEA applications. Seiford [18] and Emrouznejad et al. [19] provided bibliometric surveys taking into account 30 years of its development. Mardani et al. [20] and Xu et al. [21] reviewed DEA-based energy efficiency evaluation. The present study aims to identify a set of papers that cover the development of the DEA area over the last 40 years. It reviews various DEA models, the major application areas of DEA, details of the software, and a bibliometric analysis. The study is based on the following research questions (RQ):

  • RQ1: What are the different variants of DEA that have emerged during the 40 years of its existence?

  • RQ2: What are the advantages and disadvantages of different DEA models?

  • RQ3: What are the different application areas where DEA has been applied?

  • RQ4: What are the major trends of DEA in terms of publications?

The paper is organized into nine sections. Section 1 introduces DEA in detail. Section 2 describes how the information was collected. Section 3 reviews selected DEA models, and Sect. 4 discusses the advantages and disadvantages of DEA models. Section 5 reviews the applications of DEA and its variants. Section 6 discusses the software and conferences related to DEA. A bibliometric analysis is presented in Sect. 7, and the conclusion and future directions are described in Sects. 8 and 9.

2 Collecting the Information

The purpose of this study is to acquaint the reader with the historical and recent developments of DEA, for which a large number of papers were surveyed through the standard ISI Web of Science (WOS) database.

WOS is a leading citation database with multidisciplinary coverage of high-impact journals in the sciences and social sciences, as well as international proceedings. The selection of the publications is based on the SCIE and SSCI indexes, the Conference Proceedings Citation Index-Science (CPCI-S), the Arts and Humanities Citation Index (A&HCI), and the Conference Proceedings Citation Index-Social Science & Humanities (CPCI-SSH). More details are given in Table 1. We have tried to bring together the different models of DEA, which are discussed in detail in Sect. 3.

The selection of research articles included in this paper is based on the Journal Citation Report (JCR) and the journals' impact factors (shown in Fig. 12).

Choice of Keywords: We started the search by considering the following keywords:

‘Data Envelopment Analysis’, ‘Additive DEA approach’, ‘Slacks-Based DEA technique’, ‘Russel Measure DEA approach’, ‘Cross efficiency approach’, ‘Super efficiency DEA technique’, ‘Network DEA Models’, ‘Hierarchical DEA Models’, ‘Supply Chain DEA procedures’, ‘Stochastic DEA models’, ‘Fuzzy DEA model’, ‘Banking and DEA’, ‘Sports and DEA’, ‘Agriculture & farm and DEA’, ‘Transportation and DEA’, ‘Education and DEA’, ‘Communication and DEA’, ‘Fishery and DEA’, ‘Tourism and DEA’, ‘Health care and DEA’, ‘Automobile and DEA’, ‘Forestry and DEA’, ‘Water and DEA’, ‘Real estate and DEA’, ‘E-business and DEA’.

The numbers of articles retrieved with the above keywords are listed in Table 1 for the time window 1981 – 2021. The data obtained by searching the database according to the applications of DEA are illustrated in Fig. 2. The subsequent sections discuss the modified DEA models, their applications, and related aspects, following the workflow shown in Fig. 3.

Table 1 Distribution of the papers based on science index

Fig. 2 Science index of the DEA

Fig. 3 Steps of the workflow for reviewing the literature of DEA

3 DEA Models

No method is flawless, and DEA is no exception. Although it provided the initial concept of efficiency measurement for units with multiple inputs and outputs, researchers realized that, despite its advantages, DEA has certain drawbacks, which are listed below.

  • It is a non-parametric approach, so statistical hypothesis testing is difficult.

  • Lack of discriminating power between the DMUs.

  • It cannot predict the performance of other DMUs.

Since the initial conceptualization of DEA, several models have been proposed to improve its performance. DEA models can be classified in different ways, one of which is radial versus non-radial [4]. Radial models assume a proportional change of the output/input variables, and the remaining slacks are generally accounted for only indirectly in the inefficiency measure, whereas non-radial models deal with the slacks of each output/input variable separately and combine them into an efficiency measure. In the present study, we have grouped the models on the basis of their development as follows:

  1. Classical DEA methods

  2. Extended DEA methods

  3. Multistage and multi-level DEA models

  4. Stochastic and Fuzzy DEA techniques

  5. Hybrid DEA models

  6. AI/ML assisted DEA models

3.1 Classical Models of DEA

The three models, viz. CCR, BCC and SE, are placed in the category of classical models. These models shaped DEA in its initial stage and also laid the foundation for other DEA variants. The following subsections describe the models one by one.

3.1.1 CCR (CRS) Model

CCR is the first DEA approach, introduced by Charnes et al. in 1978 [3]; the CCR model is named after the first letters of the authors' names. The mathematical structure of the CCR model is shown in Eq. (4). Since there are M DMUs, the model is run M times to obtain the relative efficiency values of all DMUs. Each DMU chooses the set of output weights \((c_{tz})\) and input weights \((d_{sz})\) that maximizes its efficiency value. This model assumes constant returns to scale (CRS). The resulting efficiency is also called the overall technical efficiency (OTE) score.

3.1.2 BCC Model

The CCR approach is appropriate only when all DMUs operate at an optimal scale. Factors such as imperfect competition and restrictions on finance may cause DMUs to operate away from the optimal scale. An extension of the CCR technique was proposed by Banker, Charnes and Cooper in 1984 [13]; it is also known as the variable returns to scale (VRS) method. The efficient frontier of VRS is formed by the convex hull of the available DMUs [16]. The efficiency measure obtained from the BCC technique is called the pure technical efficiency (PTE) value. The mathematical form of the model is given below:

$$\begin{aligned} \begin{aligned} \max E_{z}&=\sum _{t=1}^{T} c_{t z} y_{t z}+w_{q z} \\&\text{ s.t. } \\&\sum _{s=1}^{S} d_{s z} x_{s z}=1 \\&\sum _{t=1}^{T} c_{t z} y_{t y}-\sum _{s=1}^{S} d_{s z} x_{s y}+w_{q z} \le 0, \quad y=1,2, \ldots , M \\&c_{t z}, d_{s z} \ge \epsilon , s=1,2, \ldots , S, t=1,2, \ldots , T \\&w_{qz}\text { is unrestricted in sign} \end{aligned} \end{aligned}$$
(5)

The main difference between the CCR and BCC techniques is the convexity constraint. In the envelopment (dual) form of the BCC model, the intensity variables \(\delta _{u}\) attached to the DMUs must satisfy \(\sum _{u}\delta _{u}=1\); the free variable \(w_{q z}\) in Eq. (5) is the dual variable associated with this constraint. If the constraint is relaxed to \(\sum _{u}\delta _{u}\le 1\), the model takes the Non-Increasing Returns to Scale (NIRS) form; if it is replaced by \(\sum _{u}\delta _{u}\ge 1\), the model is called Non-Decreasing Returns to Scale (NDRS). The CCR efficiency score is always less than or equal to the BCC efficiency score.

The difference between the two models is illustrated in Fig. 4.

Fig. 4 Efficient frontier of CCR and BCC model

In Fig. 4, the straight line “opqr” shows the CRS surface and the line “lsqwz” shows the VRS surface. Under the CCR model, DMU q is efficient, and the remaining DMUs are inefficient. Under the VRS model, the efficient DMUs are s, q, w, and z. These efficient units are called BCC-efficient. The line segment between s and q exhibits increasing returns to scale (IRS), point q exhibits CRS, and the points to the right of q experience decreasing returns to scale (DRS).
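The BCC multiplier model can be sketched in the same way as the CCR model by adding one free variable for the returns-to-scale term. The example below is a minimal illustration, assuming SciPy's `linprog` as the solver; the function name `bcc_efficiency`, the default `eps=0.0` (used in place of the small positive lower bound on the weights), and the four-DMU data set with one input and one output are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linprog


def bcc_efficiency(X, Y, z, eps=0.0):
    """BCC (VRS) efficiency of DMU z from the multiplier LP with a free term w."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    M, S = X.shape
    T = Y.shape[1]
    # Decision vector: [c_1..c_T, d_1..d_S, w]. linprog minimizes, so negate.
    cost = np.concatenate([-Y[z], np.zeros(S), [-1.0]])
    # Weighted outputs - weighted inputs + w <= 0 for every DMU.
    A_ub = np.hstack([Y, -X, np.ones((M, 1))])
    b_ub = np.zeros(M)
    # Normalization: weighted inputs of DMU z equal one.
    A_eq = np.concatenate([np.zeros(T), X[z], [0.0]]).reshape(1, -1)
    b_eq = [1.0]
    # Weights are bounded below by eps; w is unrestricted in sign.
    bounds = [(eps, None)] * (T + S) + [(None, None)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


# Hypothetical data: one input and one output per DMU.
X = [[2], [4], [8], [6]]
Y = [[1], [4], [6], [3]]
bcc_scores = [bcc_efficiency(X, Y, z) for z in range(len(X))]
print(bcc_scores)
```

For these data the first three DMUs are BCC-efficient and the fourth scores 5/9. Under CRS the output/input ratios (0.5, 1, 0.75, 0.5) give CCR scores of 0.5, 1, 0.75 and 0.5, so each CCR score is no larger than the corresponding BCC score.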

3.1.3 Scale Efficiency (SE) Model

The ratio of the CCR score to the BCC score gives the scale efficiency [22]. It measures the influence of the scale of operation on the efficiency of a DMU.

$$\begin{aligned} \text {Scale efficiency}=\frac{E_{C C R}}{E_{B C C}} \end{aligned}$$
(6)

where \(E_{CCR}\) is the CCR score of the DMU, and \(E_{BCC}\) is the BCC score of the DMU.
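Equation (6) can be illustrated with a short helper. The function name `scale_efficiency`, the tolerance argument, and the example scores (a CCR score of 0.5 and a BCC score of 5/9 for a hypothetical DMU) are assumptions made for this sketch.

```python
def scale_efficiency(e_ccr, e_bcc, tol=1e-9):
    """Scale efficiency as the ratio of the CCR score to the BCC score (Eq. 6)."""
    if not 0.0 < e_bcc <= 1.0 + tol:
        raise ValueError("BCC score must lie in (0, 1]")
    if not 0.0 <= e_ccr <= e_bcc + tol:
        raise ValueError("CCR score must lie in [0, BCC score]")
    return e_ccr / e_bcc


# Hypothetical DMU: overall technical efficiency 0.5, pure technical efficiency 5/9.
se = scale_efficiency(0.5, 5.0 / 9.0)
print(round(se, 4))
```

A scale efficiency of 0.9 here would mean that part of the overall inefficiency of the unit comes from its scale of operation and the rest from pure technical inefficiency; a value of one means the unit operates at its most productive scale.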

3.2 Extended Versions of DEA Model

The traditional DEA approaches determine the relative efficiency of the DMUs, but they cannot rank the efficient units. This lack of discriminating power is the major drawback of the model. If the number of input and output variables is greater than the number of decision units, the discriminating power of the technique may be reduced.

To deal with this drawback, researchers extended the basic model and proposed modifications in DEA to enhance its discriminatory power. Some extended models of the DEA are discussed in this section.

3.2.1 Additive DEA Model

The additive DEA model is non-radial: it moves an inefficient DMU to the efficient frontier by reducing inputs and increasing outputs simultaneously. This technique, also called the Pareto-Koopmans model, was presented by Charnes, Cooper, Golany, Seiford and Stutz in 1985 [23]. The BCC and CCR models can also be expressed in the form of the additive model. The mathematical form is shown in Eq. (7).

$$\begin{aligned} \begin{array}{ll} E_{0}=\max \sum _{z} s_{z}^{-}+\sum _{t} s_{t}^{+}\\ \text {s.t.}\\ \sum _{u} \delta _{u} x_{z u}-x_{z 0}+s_{z}^{-}=0 \\ \sum _{u} \delta _{u} y_{t u}-y_{t 0}-s_{t}^{+}=0\\ \sum _{u} \delta _{u}=1 \\ \delta _{u}, s_{z}^{-}, s_{t}^{+} \ge 0 \end{array} \end{aligned}$$
(7)

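Here \(s_{z}^{-}\) and \(s_{t}^{+}\) are slack variables: \(s_{z}^{-}\) is the excess in input z and \(s_{t}^{+}\) is the shortfall in output t. The variable \(\delta _{u}\) is the intensity weight of DMU u, and the subscript 0 marks the DMU under evaluation.

If all slacks are zero at the optimal solution, the DMU is called additive-efficient.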
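The additive model is itself a linear program and can be sketched as follows. This is a minimal illustration, assuming SciPy's `linprog` as the solver; the function name `additive_slacks` and the four-DMU data set with one input and one output are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def additive_slacks(X, Y, o):
    """Solve the additive model for DMU o; return (input slacks, output slacks)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    M, S = X.shape
    T = Y.shape[1]
    n = M + S + T
    # Decision vector: [delta_1..delta_M, s_minus (S), s_plus (T)].
    # Maximize the sum of slacks, i.e. minimize its negative.
    cost = np.concatenate([np.zeros(M), -np.ones(S + T)])
    # Inputs: sum_u delta_u x_u + s_minus = x_o
    A_in = np.hstack([X.T, np.eye(S), np.zeros((S, T))])
    # Outputs: sum_u delta_u y_u - s_plus = y_o
    A_out = np.hstack([Y.T, np.zeros((T, S)), -np.eye(T)])
    # Convexity: sum_u delta_u = 1
    A_cvx = np.concatenate([np.ones(M), np.zeros(S + T)]).reshape(1, -1)
    A_eq = np.vstack([A_in, A_out, A_cvx])
    b_eq = np.concatenate([X[o], Y[o], [1.0]])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[M:M + S], res.x[M + S:]


# Hypothetical data: one input and one output per DMU.
X = [[2], [4], [8], [6]]
Y = [[1], [4], [6], [3]]
s_in_d, s_out_d = additive_slacks(X, Y, 3)   # fourth DMU
s_in_b, s_out_b = additive_slacks(X, Y, 1)   # second DMU
total_d = s_in_d.sum() + s_out_d.sum()
total_b = s_in_b.sum() + s_out_b.sum()
print(total_d, total_b)
```

For these data the fourth DMU (input 6, output 3) is projected onto the second DMU (input 4, output 4), giving an input excess of 2 and an output shortfall of 1, while the second DMU has zero slacks and is additive-efficient.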

3.2.2 Slacks-Based Model (SBM)

The SBM technique was developed to overcome the drawback of the additive approach. It is a variation of the additive model and was proposed by Tone [24]. The efficiency measure of the SBM model is shown in Eq. (8).

$$\begin{aligned} \begin{array}{ll} \min E=\frac{1-\frac{1}{a} \sum _{z} s_{z}^{-} / x_{z 0}}{1+\frac{1}{b} \sum _{t} s_{t}^{+} / y_{t 0}}\\ \text {s.t.}\\ \sum _{u} \delta _{u} x_{z u}-x_{z 0}+s_{z}^{-}=0 \\ \sum _{u} \delta _{u} y_{t u}-y_{t 0}-s_{t}^{+}=0 \\ \sum _{u} \delta _{u}=1 \\ \delta _{u}, s_{z}^{-}, s_{t}^{+} \ge 0 \end{array} \end{aligned}$$
(8)

Here a and b denote the numbers of inputs and outputs, respectively. This fractional form can be transformed into a linear form, as shown in [25]. When all slacks are equal to 0, the optimal value of the SBM model is 1. The SBM model is an example of a non-radial model.

3.2.3 Russell Measure Model (RM)

One of the other important DEA models is RM, introduced by Fare and Lovell [25] in 1978. Computationally, the RM model is very complicated because it is formulated as a non-linear programming (NLP) model [26].

Pastor et al. revisited the RM model in 1999 and proposed the Enhanced Russell Graph Measure (ERGM) model [26]. ERGM is an improved version of the RM model and is calculated as the ratio of the average efficiency of the inputs to the average efficiency of the outputs. ERGM is attractive because it overcomes the computational complexity of the RM model. Alternatively, without relying on ERGM, RM can be solved directly with a second-order cone programming (SOCP) model, which provides the exact RM score.

3.2.4 Free Disposal Hull Model (FDH)

FDH [27] is an easy-to-use model that can be derived from the VRS or CRS model by adding the restriction \(\delta _{u} \in \{0,1\}\). In 1999, Thrall [28] challenged this idea from the perspective of economic theory; this criticism was strongly rebutted by Cherchye et al. [29], and FDH remains a noteworthy technique for measuring efficiency. Figure 5 shows the “hull” formed by the boundary of the sets and how it relates to the FDH model [16].

Fig. 5 Efficient frontier of FDH model
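Because the binary restriction on \(\delta _{u}\), combined with the convexity constraint of the VRS model, selects exactly one observed DMU as the peer, an input-oriented FDH score can be obtained by simple enumeration instead of an LP. The sketch below assumes this VRS-based, input-oriented variant with strictly positive inputs; the function name `fdh_input_efficiency` and the data are hypothetical.

```python
def fdh_input_efficiency(X, Y, o):
    """Input-oriented FDH score of DMU o by enumeration over observed DMUs."""
    x0, y0 = X[o], Y[o]
    best = float("inf")
    for xu, yu in zip(X, Y):
        # A peer must produce at least as much of every output.
        if all(a >= b for a, b in zip(yu, y0)):
            # Smallest uniform input contraction that still covers the peer.
            ratio = max(a / b for a, b in zip(xu, x0))
            best = min(best, ratio)
    return best  # DMU o is always its own peer, so the score is at most 1


# Hypothetical data: two inputs and one unit output per DMU.
X = [[2.0, 4.0], [4.0, 2.0], [4.0, 4.0], [3.5, 3.5]]
Y = [[1.0], [1.0], [1.0], [1.0]]
fdh_scores = [fdh_input_efficiency(X, Y, o) for o in range(len(X))]
print(fdh_scores)
```

In this data set the fourth DMU (3.5, 3.5) is FDH-efficient because no single observed unit dominates it, although under a convex frontier it would be dominated by a combination of the first two units. The third DMU is dominated by the fourth and scores 0.875.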

3.2.5 Cross Efficiency DEA Model

This model was introduced by Sexton, Silkman, and Hogan in 1986 [30]. First, the weights of the outputs and inputs of each DMU are evaluated with the help of DEA. Next, each DMU is evaluated using the M sets of weights, which yields M efficiency values for each DMU. Finally, the cross-efficiency value of a DMU is the average of these efficiency scores, and the DMUs are ranked according to their cross-efficiency values. A further discussion of the cross-efficiency model is given in [31]. Some improved models, along with applications, are discussed in [32, 33].
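The averaging step can be sketched as follows. To keep the example self-contained, the optimal weights of each DMU are supplied directly rather than solved by an LP; the function name `cross_efficiency`, the data, and the weights are illustrative assumptions. The average here includes each DMU's self-evaluation, which is one of several conventions.

```python
def cross_efficiency(X, Y, out_weights, in_weights):
    """Return (matrix, averages): matrix[k][j] scores DMU j with DMU k's weights."""
    M = len(X)
    matrix = []
    for k in range(M):
        row = []
        for j in range(M):
            num = sum(c * y for c, y in zip(out_weights[k], Y[j]))
            den = sum(d * x for d, x in zip(in_weights[k], X[j]))
            row.append(num / den)
        matrix.append(row)
    # Cross-efficiency of DMU j: mean of column j over all weight sets.
    averages = [sum(matrix[k][j] for k in range(M)) / M for j in range(M)]
    return matrix, averages


# Hypothetical data: one unit input and two outputs per DMU.
X = [[1.0], [1.0], [1.0]]
Y = [[4.0, 1.0], [1.0, 4.0], [2.0, 2.0]]
# Assumed DEA-optimal output and input weights chosen by each DMU.
out_w = [[0.25, 0.0], [0.0, 0.25], [0.2, 0.2]]
in_w = [[1.0], [1.0], [1.0]]

matrix, cross_scores = cross_efficiency(X, Y, out_w, in_w)
ranking = sorted(range(len(X)), key=lambda j: -cross_scores[j])
print(cross_scores, ranking)
```

The diagonal of the matrix holds the usual self-evaluated scores (1, 1 and 0.8 here), while the column averages (0.75, 0.75 and 0.6) also reflect how each unit fares under the weights preferred by its peers.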

3.2.6 Super Efficiency Model (SE-DEA)

The SE-DEA model was introduced by Andersen and Petersen [34]. It works in the same way as the standard DEA model, except that the DMU under evaluation is omitted from the peer set. The super-efficiency model has been used extensively in various applications, for example, measuring efficiency regions [35], ranking efficient DMUs [34], and detecting extreme efficient DMUs [36]. The mathematical form of the super-efficiency model defined under DEA-CCR is expressed as [37]:

$$\begin{aligned} \begin{array}{ll} \max E_{z}=\sum _{t=1}^{T} c_{t z} y_{t z} \\ \text {s.t.} \\ \sum _{s=1}^{S} d_{s z} x_{s z}=1 \\ \sum _{t=1}^{T} c_{t z} y_{t i}-\sum _{s=1}^{S} d_{s z} x_{s i} \le 0, \quad \forall i \ne z \\ c_{t z}, d_{s z} \ge 0 ; s=1,2, \ldots , S,\quad t=1,2, \ldots , T \end{array} \end{aligned}$$
(9)

\(y_{ti}\) and \(x_{si}\) are the \(t\)th output and \(s\)th input respectively of the \(i\)th DMU, where \(i = 1, 2,\ldots ,I,\) and \(s=1,2,\ldots ,S, t=1,2,\ldots ,T\).

Linear programming is used to solve the above formulation for each decision-making unit [38]. The difference between the classical DEA (DEA-CCR) and the super-efficiency technique is the exclusion of unit z from the constraint set in Eq. (9). When unit z is included in the constraint set, the maximum efficiency score is 1. When the unit under evaluation no longer appears in the second constraint, its weighted output is no longer capped at 1, so an efficient unit can obtain a score greater than 1.
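The exclusion of the evaluated unit can be sketched by dropping its row from the inequality constraints of the CCR multiplier LP. This is a minimal illustration, assuming SciPy's `linprog` as the solver; the function name `super_efficiency` and the four-DMU data set (two inputs, one unit output) are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def super_efficiency(X, Y, z):
    """CCR-based super-efficiency of DMU z: DMU z is left out of the peer set."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    M, S = X.shape
    T = Y.shape[1]
    peers = [i for i in range(M) if i != z]  # every DMU except z
    # Decision vector: [c_1..c_T, d_1..d_S]. linprog minimizes, so negate.
    cost = np.concatenate([-Y[z], np.zeros(S)])
    # Ratio constraints only for the peers i != z.
    A_ub = np.hstack([Y[peers], -X[peers]])
    b_ub = np.zeros(len(peers))
    # Normalization: weighted inputs of DMU z equal one.
    A_eq = np.concatenate([np.zeros(T), X[z]]).reshape(1, -1)
    b_eq = [1.0]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (T + S))
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


# Hypothetical data: DMUs A, B, C, D with two inputs and one output each.
X = [[2, 4], [4, 2], [4, 4], [3, 3]]
Y = [[1], [1], [1], [1]]
super_scores = [super_efficiency(X, Y, z) for z in range(len(X))]
print(super_scores)
```

In this data set A, B and D all have a standard CCR score of 1, but their super-efficiency scores (1.5, 1.5 and 1.0) separate them, while the inefficient unit C keeps its score of 0.75.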

3.3 Multistage and Multilevel Models

So far, the discussion has focused on single-level DEA models. The concept of single-level models has been extended further to multistage and multilevel DEA models. These models can be grouped into three categories: Network DEA approaches, Hierarchical DEA approaches and Supply Chain DEA models. They are explained briefly below:

3.3.1 Network DEA Models

Network models apply DEA to estimate the relative efficiency of DMUs while taking into account their internal structure, which is not considered by the classical DEA models. Grosskopf and Fare [39] and Fare and Whittaker [40] proposed similar network models. Those models define the peer group for a DMU, as is common practice in DEA, as the set of all convex linear combinations of DMUs. Classical DEA is sometimes referred to as a “black box” approach because the DMU is treated as a black box. Fare et al. introduced three general network DEA models: (i) a static network model, (ii) a dynamic network model, and (iii) a technology adoption network model.

Lewis and Sexton [41] used Major League Baseball as an example to show the advantage of the network DEA model over the classical DEA technique. The network DEA model can detect inefficiencies that the standard DEA model misses. Examples of network DEA models include the slack-based network DEA model [42] and a model measuring the customer satisfaction and productivity of pharmacies [43]. The number of studies on network DEA is increasing, which makes it a promising direction for future research.

3.3.2 Hierarchical DEA Models

Some multistage efficiency assessments involve a hierarchical structure. The hierarchical model was developed to evaluate DMUs that fall naturally into a hierarchy. In this technique, measurement is carried out through pairwise comparisons, with the factors organized into homogeneous clusters. The priority scales are determined by expert judgment. One application of the hierarchical model is assessing the efficiency of power plants with several operating units [44].

The researchers determined the relative efficiency of each power unit and power plant, treating the hierarchical structure as a multistage procedure. Green and Cook [45] modified the model so that the hierarchical efficiency assessment could be viewed at all levels concurrently.

3.3.3 Supply Chain DEA Model

Supply chain management is an important competitive strategy for modern enterprises because it enables the production of a variety of products at low cost, with high quality and short lead times. Recently, DEA has been extended to assess the performance of supply chain operations, and the recently proposed ‘supply chain model’ is applied within the multistage DEA framework. Optimization criteria mentioned in supply chain models include cost, inventory levels, profit, fill rates, stock-out probability, product demand variance and system capacity. Present DEA-based supply chain models describe and calculate the efficiency of the supply chain. The authors of [46] showed that the DEA-based supply chain technique measures the efficiency of its members. Another example is Wong and Wong [47], who studied the efficiency of supply chains with the help of the DEA model.

3.4 Stochastic and Fuzzy DEA Model

One of the weaknesses of traditional DEA models is that the input and output variables are treated as fixed, ignoring stochastic fluctuations in the input-output data, such as data entry and computational mistakes [36]. Two families of DEA models, viz. stochastic DEA models and fuzzy DEA models, can deal with the uncertainties associated with the data. These models are described below:

3.4.1 Stochastic DEA Model

Stochastic DEA models combine the non-parametric, axiom-based convex hull reference technology of DEA with a statistical foundation that allows the reference technology to be estimated [48]. Some researchers have incorporated stochastic output and input variables into the DEA methodology.

Stochastic models allow not only for inefficiency but also for statistical noise. Most of them rely on assumptions about the composed error term. These models allow for two-sided random disturbances, such as measurement and specification errors in the output and input data.

These techniques acknowledge the existence of errors in the data and give probability-based outcomes. Applications include technical efficiency evaluated through stochastic DEA and the Chance-Constrained Programming (CCP) approach [49]. Khodabakhshi and Asgharian [50] used a stochastic DEA model based on input relaxation to measure efficiency. Cooper et al. [49] used the CCP approach within stochastic DEA to handle congestion issues.

3.4.2 Fuzzy DEA Models

To deal with the imprecision of data in real-life scenarios, concepts of fuzzy mathematical programming are integrated with DEA to provide a more realistic solution. For example, the authors of [51] incorporated fuzzy mathematical programming into the DEA model and expressed the performance of a DMU as a fuzzy efficiency.

DEA efficiency scores lie in (0, 1], and DEA methodologies assess the performance of different DMUs. Qin and Liu [52] used fuzzy inputs and outputs in the DEA methodology. Azadeh et al. [53] used an integrated approach combining computer simulation, fuzzy C-means, and fuzzy DEA to optimize operator allocation.

The fuzzy DEA models introduced by Peijun [54] extended the CCR technique to more general forms in which crisp, fuzzy and hybrid data can be handled easily. Because of the uncertainty inherent in human thinking and judgement, fuzzy DEA techniques can play a significant role in the perceptual evaluation problems that are widespread in real life.

Fig. 6 Timeline of DEA Models

3.5 Hybrid DEA Models

In this paper, we are referring to hybrid DEA models as the ones in which DEA is integrated with other techniques like soft-computing, machine learning, dimensionality reduction methods and other MCDM techniques.

Fig. 7 Hybrid DEA models

3.5.1 DEA + Soft-Computing Techniques

Soft computing (SC), a vital area of study in computer science since the early 1990s, originally served as an umbrella term for Fuzzy Logic, Genetic Algorithms and Artificial Neural Networks. With the passage of time, however, other nature-inspired algorithms such as Particle Swarm Optimization (PSO) and Differential Evolution (DE) have also come to be considered part of SC methods. For more details on SC techniques, interested readers may refer to [55,56,57].

Fuzzy-logic-embedded DEA is not discussed here because part of it has already been covered in Sect. 3.4.2. The papers reviewed in this section cover the years 2016 to 2021. Mogha and Yadav [58] proposed a simple DE (SDE) to solve DEA-based models. Pelesaraei et al. [59] hybridized DEA with a Multi-objective Genetic Algorithm (MOGA) to estimate the energy efficiency and GHG emissions of wheat production. Misiunas et al. [60] proposed DEANN, a hybrid of DEA and ANN, and applied it to predict the functional status of patients. In this technique, DEA prepares the data for the ANN (the training set), an additional ANN-interpolated DMU is generated for each original DMU, and DEA then assesses efficiencies on the expanded set, using all DMUs to construct the efficiency frontier.

Jauhar et al. [61] combined DEA with Differential Evolution (DE) and multi-objective differential evolution (MODE) and showed that the integration enhances the discriminatory power of DEA.

Pendharkar [62] introduced a hybrid GA and DEA framework to solve the fixed cost allocation (FCA) problem. This framework makes it possible to combine various FCA sub-objectives for efficient and inefficient DMUs and solves the FCA problem such that the total entropy of resource allocation is maximized for efficient DMUs. Jauhar et al. [8] applied DE-integrated DEA to evaluate the performance of a remarkable educational supply chain.

Shabanpour et al. [63] proposed a hybrid of dynamic DEA and ANN to forecast the future efficiency of DMUs. Modhej et al. [64] integrated inverse DEA with a neural network to preserve the relative efficiency value. Vlontzos and Pardalos [65] considered a combination of DEA with ANN to measure and predict the greenhouse gas emissions of EU countries. Mozaffari et al. [66] presented a hybrid GA and ratio-DEA (R-DEA) technique to determine the efficiency of a sustainable supply chain.

Tsolas et al. [67] used a two-phase hybrid DEA and ANN technique (DEA-ANN), with DEA used as a preprocessor, to handle the large-scale branch data of Greek commercial banks. Pendharkar [68] showed that an ANN trained on the “efficient” training data subset has stronger predictive performance than an ANN trained on the “inefficient” training data subset. Bose et al. [69] smoothed the efficiency frontier of DEA by training an artificial neural network on the input/output data of multiple DMUs. Of the various soft computing techniques, ANN is the one most often integrated with DEA.

3.5.2 DEA + Machine Learning Techniques

The potential of ML techniques has been leveraged by researchers in diverse areas, and the domain of DEA is no exception. In this section, we go through selected papers in which the researchers have applied ML concepts to enhance the performance of DEA.

Dariush et al. [70] proposed a new framework to reduce the computation time of efficiency scores when the number of DMUs is large. Zhu et al. [71] proposed an algorithm to accelerate the computational procedure in a big data environment, with DEA serving as a tool for data-oriented analysis in evaluation and benchmarking. Although computational algorithms have been developed on the basis of traditional DEA to handle large-scale data (DMUs, inputs, and outputs), valuable knowledge hidden in big data, as represented by network structures, should also be extracted by DEA [71]. The papers selected for this section are from 2020 and 2021 only. More information on machine learning techniques can be found in [72, 73].

Tayal et al. [74] carried out an efficiency analysis for stochastic dynamic facility layout problems using meta-heuristics, DEA, and machine learning techniques. They proposed a three-stage approach in which DEA was augmented with supervised and unsupervised ML techniques to obtain unique ranks and to predict the efficiency values of the layouts. Qu et al. [75] investigated the redevelopment of urban centre blocks with a combined DEA and deep learning model, in which the deep learning model was used to predict output-oriented indicators of non-intensive blocks.

Mirmozaffari et al. [76] integrated DEA with clustering algorithms to improve the accuracy of results. Aydin and Yurdakul [77] analyzed the performance of 142 countries against COVID-19 through a new three-stage framework combining the DEA approach with four ML algorithms, viz. k-means, hierarchical clustering, decision tree, and random forest. Rebai et al. [78] predicted the performance of secondary schools in Tunisia using machine learning techniques in a two-stage analysis. In the first stage, the directional distance function (DDF), estimated through DEA, was used to deal with the desirable outputs, while in the second stage, decision tree and random forest algorithms were used to identify and visualize the variables of high-school performance.

Salehi et al. [79] analyzed and improved the level of adaptive capacity of a petrochemical plant. They used the DEA method to compute and analyze the resilience engineering indicators, redundancy, and teamwork, and implemented a multilayer perceptron (MLP) to estimate the level of adaptive capacity on the basis of the data.

Nandy and Singh [80] examined farm efficiency using machine learning and DEA techniques to predict the impact of environmental factors on farm performance. Jomthanachai et al. [81] introduced a cross-efficiency DEA model to evaluate a set of risk factors obtained from failure mode and effect analysis (FMEA) and used ML techniques to predict the degree of remaining risk from simulated data corresponding to the risk treatment scenario. Thaker et al. [82] integrated random forest regression with DEA in a two-phase model to measure corporate governance and bank efficiency.

Zhu et al. [83] combined machine learning and DEA, using BPNN-DEA, GANN-DEA, ISVM-DEA, and SVM-DEA techniques to measure and predict the efficiency of manufacturing companies in China. Jomthanachai et al. [84] used DEA models with ML techniques to examine risk management; they combined cross-efficiency DEA with an ANN to predict the level of risk. Zhong et al. [85] used an integrated DEA and machine learning technique, Super SBM-DEA-BPNN, with the aim of improving fusion efficiency.

Taherinezhad and Alinezhad [86] integrated DEA and ML techniques to predict the efficiency scores of nations from COVID-19 data. They used GANN, BPNN, SVM, and ISVM techniques to predict the regression target (efficiency values). In the farm-efficiency study of Nandy and Singh [80], a two-stage DEA model was combined with random forest and logistic regression techniques.
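The two-phase pattern shared by several of the studies above (efficiency scores from DEA first, a predictive model second) can be illustrated with a deliberately small sketch. It makes strong simplifying assumptions that are not taken from any cited paper: each DMU has one input and one output, so the CCR efficiency reduces to the output/input ratio divided by the best ratio in the sample, and an ordinary least-squares line stands in for the ML model (random forest, neural network, etc.). All data and names are hypothetical.

```python
from statistics import mean

def ratio_efficiency(inputs, outputs):
    """Stage 1: CCR efficiency for the single-input, single-output case.

    Each unit's output/input ratio is divided by the best ratio observed,
    so the best unit scores 1.0 and all others score below 1.0.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

def fit_line(z, scores):
    """Stage 2 stand-in: least-squares fit of scores = a + b * z.

    A real study would use an ML regressor here; OLS keeps the sketch
    dependency-free.
    """
    z_bar, s_bar = mean(z), mean(scores)
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (si - s_bar) for zi, si in zip(z, scores))
    b = sxy / sxx
    a = s_bar - b * z_bar
    return a, b

# Hypothetical data: four DMUs with one input, one output,
# and one contextual variable used as the predictor.
inputs = [10.0, 20.0, 30.0, 40.0]
outputs = [5.0, 16.0, 30.0, 20.0]
context = [1.0, 2.0, 3.0, 4.0]

scores = ratio_efficiency(inputs, outputs)   # stage 1: DEA scores
a, b = fit_line(context, scores)             # stage 2: fit predictor
predicted = a + b * 2.5                      # predicted score for a new unit
```

With multiple inputs and outputs, stage 1 would require solving one linear program per DMU; only the overall structure carries over from this sketch.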

3.5.3 DEA + MCDM Techniques

Despite the availability of various multi-criteria decision-making (MCDM) techniques [87], the application of DEA as a discrete alternative MCDM method is also an accepted way to reach reasonable decisions. The foundation of the MCDM techniques was laid in the 1950s and 1960s [88]. For more details on MCDM techniques, the interested reader may refer to [89, 90]. This section presents a succinct review of recent work on the integration of DEA with MCDM techniques. The selected papers are from the years 2020 and 2021 only.

Kaewfak et al. [91] used a fuzzy-AHP (FAHP) and DEA model to identify and measure quantitative risk. Bajec et al. [92] developed a distance-based AHP-DEA super-efficiency technique to meet the demands of the target groups by applying slack variables. The purpose of the distance-based AHP model was to obtain a hierarchy structure and weights for the criteria, while the aim of the DEA approach was the efficiency evaluation of 21 fictional providers of an electric bike-sharing system.

RezaHoseini et al. [93] used Z-AHP, Z-DEA, and Z-AHP-DEA models in their study. The Z-AHP model aimed to identify the conventional sustainability criteria and calculate their importance weights. The Z-DEA and Z-AHP-DEA models were introduced to measure project efficiency, and the authors concluded that the Z-AHP-DEA model gave the more effective solution.

Banihashemi and Khalilzadeh [94] used the network DEA model to choose the best mode to mitigate environmental impacts. The different execution modes of each activity were ranked using TOPSIS. Wang et al. [95] presented a hybrid approach combining DEA window analysis and FTOPSIS to assess the capabilities of 42 countries in terms of renewable energy capacity. Karasakal et al. [96] proposed two novel PROMETHEE-based ranking approaches in which the weights and threshold values are determined by the DEA method.

Hoseini et al. [97] measured the performance of research and development (R&D) organizations by hybridizing DEA and ANP. Lee et al. [98] utilized various DEA models and VIKOR to evaluate 12 manufacturing industries in Taiwan. They compared different DEA models (CCR, super efficiency, cross efficiency, Doyle-Green efficiency) with the MCDM technique VIKOR. Karami et al. [99] proposed a three-step integrated procedure, DEA-PCA-VIKOR, to evaluate the garment industry. PCA reduced the criteria, the additive DEA model measured the efficiency of the suppliers and identified the efficient ones, and finally VIKOR was used to rank the efficient suppliers.

Mei and Chen [100] used the rough-fuzzy best-worst method (BWM) and rough-fuzzy DEA to evaluate and select sustainable hydrogen production technologies (HPTs). Here, the rough-fuzzy BWM procedure was introduced for measuring the relative weights of the sustainability criteria, and rough-fuzzy DEA was introduced to prioritize the alternative HPTs. Thanassoulis et al. [87] introduced a comprehensive technique for higher education teaching evaluation that combines AHP and DEA (fuzzy AHP is included). AHP allows the different importance of each teaching performance criterion to be taken into account, while DEA compares teaching as perceived by the tutor and by the students to identify the scope for improvement for each tutor.

Lee et al. [101] developed an integrated two-phase MCDM technique to determine the relative weights of criteria and examine the relative efficiency of energy technologies against high oil prices. In the first phase, fuzzy AHP allocates the relative weights to the criteria. In the second phase, the DEA technique determines the relative efficiency of energy technologies versus high oil prices from an economic standpoint.
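Several of the studies above pass DEA results to a distance-based ranking method such as TOPSIS. The following is a minimal sketch of plain (crisp) TOPSIS, assuming vector normalisation and Euclidean distances; it is not the exact procedure of any cited paper. The decision matrix, the weights, and the choice of a DEA efficiency score as one criterion are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores in [0, 1], higher is better.

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True where larger values are better, False for cost criteria
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*v))
    # Ideal and anti-ideal points, respecting benefit/cost direction.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal point
        d_neg = math.dist(row, anti)    # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical alternatives: column 0 is a DEA efficiency score (benefit),
# column 1 is a cost figure (cost criterion).
matrix = [[1.0, 100.0], [0.8, 150.0], [0.6, 200.0]]
scores = topsis(matrix, weights=[0.6, 0.4], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy data the first alternative is best on both criteria, so it coincides with the ideal point and ranks first.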

3.5.4 Miscellaneous

This section covers techniques that did not fit under the previous sections but play an important role in improving the performance of DEA. Only papers from 2018 onwards are considered here. Zelenyuk [102] explored price-based aggregation to address the problem of large dimensions. Lee and Cai [103] recommended the Least Absolute Shrinkage and Selection Operator (LASSO), a variable selection technique from data science, to extract significant factors. They combined LASSO with sign-constrained convex nonparametric least squares (SCNLS), which can be represented as DEA estimators. The proposed LASSO-SCNLS methods and their variants provided useful guidelines for DEA with small datasets.

Chen et al. [104] adapted a more recent variant of LASSO, known as elastic net (EN), to DEA and proposed EN-DEA to reduce a large number of dimensions to a sparser set. Limleamthong and Gosalbez [105] used a mixed-integer programming (MIP) technique for dimensionality reduction of the input and output variables. The main purpose of the authors was to improve the discriminatory power of the classical DEA technique. Davoudabadi et al. [106] and Deng et al. [107] also used the PCA technique to reduce the dimensions of the data.

Smirlis et al. [108] and Azizi [109] addressed the missing-value problem in crisp datasets with an interval DEA approach. Kuosmanen [110] used an alternative method, constructing the best-practice frontier, to handle missing values. The approach of Tao et al. [111] relaxes the convexity assumption on the production possibility set (PPS) in DEA benchmarking and provides a powerful managerial instrument for improving performance. Ren et al. [112] used DEA to identify congestion in university projects.

4 Advantages and Disadvantages of the Models

So far we have seen that a fair number of DEA models exist. Each model has advantages as well as disadvantages. These are summarized in Tables 2 and 3 [16, 113, 114, 115, 116], so that the user can identify the most appropriate model for a particular problem.

Table 2 Advantages and disadvantages of the DEA models
Table 3 Advantages and disadvantages of the hybrid DEA models

5 Applications of the DEA and Its Variants

This section is divided into two parts. In the first part we summarize the implementation of different DEA models on different application problems, while in the second part we provide an overview of the different real-life problems that have been solved through DEA or its variants.

The DEA models described above have been implemented for solving a wide range of real-life problems, listed in Tables 4 and 5. While Table 4 lists the applications of traditional DEA models, Table 5 lists the applications of hybrid DEA models. Among the papers reviewed for this study, hierarchical DEA models and stochastic DEA models have been used more often for real-life problem-solving than other methods. Hybrid DEA models, particularly DEA hybridized with soft-computing methods and DEA models hybridized with each other, appear to be more popular than the traditional DEA models. These tables were prepared on the basis of the most cited papers.

Table 4 Applications of the different models of DEA
Table 5 Application of the hybrid DEA techniques
Fig. 8 Application of various DEA models

5.1 Applications of DEA

This subsection reviews the literature by application area and notes which models have been applied in each.

5.1.1 Banking

One of the earliest applications of DEA was in the banking sector, in the year 1984. David Sherman and Franklin Gold [162] determined the efficiency of 14 branch offices of a bank. They used the CCR model and compared operating efficiency between the branches. In 1987, Celik Parken [163] measured the efficiency of thirty-five branches of a Canadian bank.

Rangan et al. [164] recommended the production frontier technique to determine the technical efficiency of 215 US banks. Pastor et al. [165] combined DEA with the Malmquist index to study bank efficiency and productivity growth. Thompson et al. [166] integrated the concept of the assurance region with the DEA model (DEA/AR) and evaluated the 100 largest US banks. Soteriou and Zenios [167] used the benchmarking technique of DEA to estimate the efficiency and a credible cost of bank products at the branch level. Dekker and Post [168] used the Free Disposal Hull (FDH) model to find the efficiency of Dutch banks. Sueyoshi [169] used an extended DEA model, DEA-Discriminant Analysis (DEA-DA), to evaluate the performance of Japanese banks.
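Unlike the convex DEA models, the Free Disposal Hull model mentioned above compares each unit only with actually observed units, so its input-oriented efficiency can be computed by enumeration rather than linear programming. The sketch below assumes strictly positive inputs and uses hypothetical branch data; it illustrates the general FDH idea and is not the specification used in [168].

```python
def fdh_input_efficiency(X, Y):
    """Input-oriented FDH efficiency score for every unit.

    X : list of input vectors (strictly positive values assumed)
    Y : list of output vectors
    For unit o, consider every observed unit j that produces at least as
    much of every output. The score is the smallest factor by which o's
    inputs could be scaled while still using at least j's inputs.
    """
    scores = []
    for x_o, y_o in zip(X, Y):
        best = 1.0  # the unit itself is always a valid comparator
        for x_j, y_j in zip(X, Y):
            if all(yj >= yo for yj, yo in zip(y_j, y_o)):
                # Largest input ratio: the binding input for this comparator.
                ratio = max(xj / xo for xj, xo in zip(x_j, x_o))
                best = min(best, ratio)
        scores.append(best)
    return scores

# Hypothetical branches: one input (staff) and one output (transactions).
X = [[2.0], [4.0], [5.0]]
Y = [[1.0], [2.0], [1.0]]
scores = fdh_input_efficiency(X, Y)
```

In this toy data the third branch produces no more than the first while using 2.5 times the input, so its score is 0.4, whereas the first two branches are FDH-efficient.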

Chen [170] examined the technical efficiency of 39 banks in Taiwan using chance-constrained data envelopment analysis (CC-DEA) and stochastic frontier analysis (SFA). Emrouznejad and Anouze [171] calculated the efficiency of bank branches in the Gulf Cooperation Council countries. They proposed a DEA framework with a classification and regression model. Azad et al. [172] examined the efficiency of Islamic and conventional banks in Malaysia. They used the conventional black-box notion of efficiency together with a three-stage network DEA (NDEA) model. Thaker et al. [82] investigated the efficiency of Indian banks using random forest regression, a machine learning technique, with DEA in a two-phase model.

5.1.2 Sports

Sueyoshi et al. [173] examined the efficiency of baseball players. They used a slack-adjusted DEA (SA-DEA) model to determine the players' ranking scores. Lozano et al. [174] determined the efficiency of nations at five recent Summer Olympic Games. Garcia-Sanchez [175] applied a three-phase DEA technique to determine the efficiency of Spanish football teams. Gutierrez and Ruiz [176] used DEA and cross-efficiency evaluation to measure the individual game performance of Spanish Premier League handball players. Dhordjevic et al. [177] evaluated the technical efficiency of the national football teams at the 2010 FIFA World Cup using a multi-stage DEA model.

Villa and Lozano [178] determined the efficiency of basketball performance on the basis of the assessment of the players and the performance of the teams, using the dynamic network DEA model. Adhikari et al. [179] used a modified DEA technique, namely super-efficiency DEA, to select cricket players.

5.1.3 Agriculture & Farm

The first paper observed in the field of agriculture is by Fare et al. [180], who determined the technical efficiency (TE) of agriculture in the Philippines. Cloutier and Rowley [181] measured the productive efficiency of dairy farms in Quebec. Zaibet and P.S. Dharmapala used stochastic production frontier (SPF) and DEA methods to determine the percentage of the farmers. Majiwa et al. [182] applied the network DEA technique to determine the PH efficiency of milling using data from the rice procuring industry. Stefanos et al. [183] applied the fuzzy DEA technique in the agriculture sector to determine the efficiency of organic farms. Sefeedpari et al. [184] assessed the efficiency of dairy farming in Iran by implementing a window data envelopment analysis (W-DEA) model with energy use as inputs and milk production as output for 25 provinces. Candemir [185] determined the efficiency of cotton enterprises in Kahramanmaras, Turkey. He obtained technical efficiency (TE), pure technical efficiency (PTE), scale efficiency (SE), allocative efficiency (AE), and economic efficiency (EE). Cecchini et al. [186] used the two-stage DEA model to measure the efficiency and super efficiency of 76 meat-producing sheep farms. Their second objective was to examine the effect of animal welfare and management indicators on TE values.
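Window DEA, as used for the panel of provinces above, treats each unit in each period as a separate DMU and evaluates it within overlapping windows of consecutive periods. A minimal sketch of how the windows and their pooled reference sets can be constructed is given below; the unit names, years, and window width are hypothetical, and the DEA model that would then be solved inside each window is omitted.

```python
def dea_windows(periods, width):
    """Return the overlapping windows of `width` consecutive periods.

    With T periods and width w there are T - w + 1 windows.
    """
    if not 1 <= width <= len(periods):
        raise ValueError("width must be between 1 and the number of periods")
    return [periods[s:s + width] for s in range(len(periods) - width + 1)]

def window_dmus(units, periods, width):
    """Pooled reference set of (unit, period) pairs for each window.

    Each window contains len(units) * width DMUs, because the same unit
    in different periods is treated as a different DMU.
    """
    return [[(u, p) for p in window for u in units]
            for window in dea_windows(periods, width)]

# Hypothetical panel: two units observed over five years, window width 3.
periods = [2015, 2016, 2017, 2018, 2019]
windows = dea_windows(periods, 3)
pooled = window_dmus(["A", "B"], periods, 3)
```

A narrower window gives scores that track period-to-period change more closely, while a wider window gives each evaluation more comparison units.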

5.1.4 Transportation

Schefczyk [187] used the DEA methodology to examine the efficiency of 15 airlines and analyzed the factors behind high profitability and performance of airlines. Oum and Yu [188] used a two-step procedure in their study. First, they computed the gross efficiency index from the panel data of 19 railways through DEA, and then they applied Tobit regression to identify the effect of public subsidies. Cowie and Riddington [189] measured the efficiency of European railways. Adler and Golany [190] evaluated deregulated airline networks by combining DEA and principal component analysis (PCA), studying the West European air transportation industry. Olfat and Pishdar [191] proposed a fuzzy dynamic NDEA model based on the double production frontier for evaluating the sustainability of Iranian airports. Sarmento et al. [192] measured the efficiency of seven highway projects in Portugal using DEA and the Malmquist index. Song et al. [193] introduced a three-stage DEA technique and measured the operational efficiency of the air transport industry in China. Chen et al. [194] utilized the inverse DEA technique for analyzing road safety issues in China.

5.1.5 Education

Charnes et al. [195] used the DEA model to measure the efficiency of public school education. Another study in the school area was that of Bessent and Bessent [196]. Sarrico and Dyson [197] introduced the DEA model to evaluate the performance of UK universities. Casu and Thanassoulis [198] measured cost efficiency in the central administrative services of UK universities. Kongar et al. [199] used a two-step DEA approach to determine the relative efficiency of applicants for graduate programs in engineering. Miranda et al. [200] used DEA and stochastic frontier analysis (SFA) models to examine the technical performance of higher education institutions in the field of business administration courses. Thanassoulis et al. [87] evaluated higher education in Greece, combining the analytic hierarchy process with DEA. Si and Qiao [201] determined the performance of financial expenditure on science and maths education in China. Iulian [202] used CRS DEA and Malmquist index models to measure the performance of Romanian public higher education institutions. Foladi et al. [203] introduced the inverse dynamic DEA (IDDEA) approach for evaluating the faculties of different departments of Urmia University. Chen and Chang [204] measured the efficiency of the departments of National Chung Cheng University in Taiwan.

5.1.6 Communication

The use of DEA in communication has attracted great attention in recent years. The studies are mostly in the areas of computer networks, the telecom industry, information and communication technology (ICT), etc.

Giokas and Pentzaropoulos [205] evaluated the operational efficiency of large-scale computer networks. Uri [206] calculated the technical and allocative efficiency of local exchange carriers. Fernandez-Menendez et al. [207] examined the level of information and communication technology (ICT) within organizations and measured the technical efficiency of Spanish firms through DEA. Goto [208] studied the financial performance of the world telecommunications industry via DEA-Discriminant Analysis (DEA-DA). Liao and Lien [209] measured the technology gaps related to the efficiency of Asia Pacific Economic Cooperation (APEC) integrated telecommunications operators using the DEA meta-frontier technique. Lu et al. [210] investigated the influence of ICT progress on construction labour productivity (CLP).

5.1.7 Fishery

Vestergaard et al. [211] measured capacity and capacity utilization (CU) in the fishery through DEA. Maravelias and Tsitsika [212] applied a simple DEA technique to determine the fleet capacity and economic efficiency of Mediterranean fisheries. Rowe et al. [213] recommended a combination of life cycle assessment (LCA) and DEA to link socioeconomic and environmental evaluations of fisheries. Kim et al. [214] measured the productive efficiency of the sandfish coastal gillnet fishery through stochastic frontier analysis (SFA). Espino et al. [215] examined fishing capacity and the structural excess of fishing capacity over the sustainable level of the fleet. Wang et al. [216] evaluated the performance of fisheries enterprises using the Malmquist model.

5.1.8 Tourism

Onut and Soner [217] evaluated the energy efficiency of 32 five-star hotels in the Antalya region. Sun and Lu [218] evaluated the Taiwan hotel industry using the weighted slack-based technique. Chou et al. [219] measured human resource management in tourism agencies through Malmquist productivity index analysis. Taheri and Ansari [220] determined the technical performance of the cultural-historical museums in Tehran through DEA. Chiu and Wu [221] analyzed the efficiency of 49 international tourist hotels (ITHs). Shieh [222] investigated the connection between green efficiency and cost efficiency in the international tourist hotels in Taiwan. Huang et al. [223] improved the two-stage DEA model to investigate the effect on the productive efficiency, occupancy, and catering service of Taiwan’s hotels. Cuccia et al. [224] explored the influence of cultural heritage in fostering tourism, examining the effect of inscription on the UNESCO World Heritage List on tourism destination efficiency in Italy. In their empirical study, Yang et al. [225] took eight ethnic regions of China and evaluated tourism investment and poverty alleviation.

5.1.9 Health care

The first paper observed in the area of healthcare is that of Nunamaker [226], who applied DEA to calculate the efficiency of routine nursing services at Wisconsin hospitals. Jiménez and Smith [227] assessed the performance of primary health care through DEA. Johnston and Gerard [228] measured the relative efficiency of breast screening units in the UK. Kontodimopoulos and Niakas [229] measured the efficiency of hemodialysis units in Greece. Rouse and Swales [230] identified the efficient expenditure level of public health care services through DEA. Ballestero and Maldonado [231] employed DEA to rank hospital activities. Nayar and Ozcan [232] determined the technical efficiency of hospitals and measured their quality performance. Jat and Sebastian [233] determined the technical efficiency of the district hospitals in Madhya Pradesh, India. Arteaga et al. [234] evaluated the efficiency of kidney transplantation using DEA. Cinaroglu [235] examined the effect of hospital size on changes in the efficiency of public hospitals. More recently, Hamzah [236] measured the efficiency of the healthcare system in containing COVID-19 through the network DEA model.

5.1.10 Automobile

Brey et al. [237] applied a non-parametric technique to analyze the development of technologies in the automobile sector. Parameshwaran et al. [238] measured the performance of automobile repair shops by integrating fuzzy AHP and DEA techniques. Hwang et al. [239] investigated the relationship between the dynamics of price discounts at the dealership level and production efficiency in the Spanish auto market. Wang et al. [240] analysed the performance of the world’s top 20 automakers using the Malmquist productivity index (MPI), a DEA model. In the automobile area, the reviewed papers show that the CCR model, the integration of fuzzy AHP and DEA, the Malmquist productivity index (MPI), and VRS multiplier DEA techniques have been used.

5.1.11 Forestry

Vitala and Hanninen [241] measured the efficiency of regional forestry organizations through DEA. Bogetoft et al. [242] evaluated the different offices of the Danish forestry extension service. Safak et al. [243] determined the efficiency of the forest sub-districts in the Denizli Forestry Regional Directorate using a fuzzy DEA model. Limaei [244] examined the efficiency of the Iranian forest industry. Li et al. [245] determined the performance of China’s forestry resources and the sustainable development of forestry. Lu et al. [246] studied the non-dynamic slack-based DEA model and evaluated the energy efficiency of forestry areas.

5.1.12 Water

Lambert et al. [247] investigated the relative efficiencies of publicly and privately owned water utilities. Thanassoulis [248] estimated the cost savings for water companies in England. Díaz et al. [249] studied the irrigation districts in Andalusia, Southern Spain. Lilienfeld and Asmild [250] examined the influence of irrigation system type. Alsharif et al. [251] determined the efficiency of the water supply systems of the West Bank and Gaza. Frija et al. [252] investigated the technical efficiency of unheated greenhouse farms and measured the efficiency of water used for irrigation. Rosas et al. [253] estimated the operational productivity of Mexican water utilities and identified the context variables that influence their efficiency. Soltani et al. [254] developed a new water quality index (WQI) for 47 Algerian dams.

5.1.13 Real estate

Lins et al. [255] measured the value range for real estate units. Wang examined the performance of government real estate investment. Xue et al. [256] applied a DEA-based MPI technique to assess the performance of the construction industry in China. Chancellor and Lu [257] analyzed the regional productivity of the Chinese construction industry. Li et al. [258] determined the performance of contractors in China and Hong Kong. Li et al. [259] studied the real estate market of China and determined the supply-side efficiency of 29 provinces. Liu et al. [260] utilized the two-stage MPI and calculated the productivity efficiency of the real estate industry in China. Liu et al. [261] studied the same problem using the meta-frontier SBM technique. The Malmquist productivity index (MPI), slack-based DEA, double perspective DEA (DP-DEA), and Färe-Primont DEA techniques were used in the real estate applications.
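Since the Malmquist productivity index recurs in many of the applications above, its usual DEA-based form is recalled here for reference. The notation is an assumption introduced for this illustration and is not taken from the cited papers: D^t(x, y) denotes the output distance function measured against the period-t frontier, and (x^t, y^t) is a unit's input-output bundle in period t.

```latex
% Malmquist productivity index between periods t and t+1
% (geometric mean of the indices based on the two period frontiers)
M\left(x^{t+1}, y^{t+1}, x^{t}, y^{t}\right)
  = \left[
      \frac{D^{t}\left(x^{t+1}, y^{t+1}\right)}{D^{t}\left(x^{t}, y^{t}\right)}
      \cdot
      \frac{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\left(x^{t}, y^{t}\right)}
    \right]^{1/2}

% Decomposition into efficiency change (EC) and technical change (TC)
M = \underbrace{\frac{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D^{t}\left(x^{t}, y^{t}\right)}}_{\text{EC}}
    \times
    \underbrace{\left[
      \frac{D^{t}\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}
      \cdot
      \frac{D^{t}\left(x^{t}, y^{t}\right)}{D^{t+1}\left(x^{t}, y^{t}\right)}
    \right]^{1/2}}_{\text{TC}}
```

Under this output-oriented convention, M > 1 indicates productivity growth between the two periods, M < 1 a decline, and each distance function is obtained by solving a DEA linear program.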

5.1.14 E-business

Cinca et al. [262] determined the efficiency of dot-com firms, whose objective is to obtain revenues from their activities and make an impact on the internet. Fernandez-Menendez et al. [207] examined the technical efficiency of Spanish firms in relation to information and communication technology (ICT). An et al. [263] measured the performance of Chinese high-tech industries through the dynamic network DEA technique. Table 6 shows the various models used in the different application areas.

Table 6 A list of the applied models in the different applications

6 Software

We were able to identify 18 software packages that can be used for the application of DEA. A description of the software is provided in Table 7 [264, 265].

Table 7 Details of software

7 Bibliometric Analysis

Bibliometric analysis is an excellent way of identifying the trends in a research area. Therefore, the remaining sections of this article are devoted to it. The bibliometric analysis is based on the Web of Science database for the years 1981 through 2021. The statistics are based on: year of publication, application area, journal quartile and publications, area of research, country-wise distribution, and researcher statistics (Figs. 6, 7, 8, 9).

7.1 Year-Wise Statistics

From 1981 to 2021, the total number of papers listed in the Web of Science database is 10,691. The first paper, published in 1981 [178], has more than 733 citations. In 1984, two papers were published on the basic concept of the efficiency measurement of DMUs. From 1994 onward, the number of papers on DEA started increasing. This trend is shown in Table 8, and a pictorial representation is given in Fig. 9. Table 9 shows the number of articles of each type published from 1981 to 2021, and a graphical representation is given in Fig. 10.

Table 8 Year of publications
Fig. 9 Year-wise publication of DEA papers

Table 9 Types of articles published
Fig. 10 Quantitative analysis of the articles on the basis of article type

7.2 Statistical Analysis Based on the Quartile

Tables 10 and 11 list the journals in which DEA articles have been published, along with their quartiles (Fig. 10). A graphical representation of the impact factor of the different journals is shown in Fig. 11.

Table 10 Journal names and ranking
Table 11 Journal names and ranking
Fig. 11 IF of different journals based on quartiles

7.3 Researchers Statistics

In this section, information about the most cited authors is listed in Table 12. We have listed only the papers whose citation count is above ten thousand.

Table 12 Highest cited publications

7.4 Recent Trends

In this section, we review the current trends of research in DEA, which includes articles from the years 2020 and 2021. The search results are taken from the WoS database (Table 13).

Table 13 Area of research of the DEA
Fig. 12 Number of publications by area of research

8 Conclusion

The main objective of this paper is to review the 40 years of existence of DEA and to generate interest in DEA among new researchers by bringing forward its variants, its applications, and above all its potential. This article presents an overview of DEA (1981–2021). From the collected data and statistical analysis, the following conclusions may be drawn:

  • Since 1981, the number of DEA publications has risen steadily, with more than 4000 papers in the period 2018–2021 compared with only 9 publications in the period 1981–1987. W.W. Cooper is the most cited author, with around 40,000 (forty thousand) citations to date.

  • Due to the applicability of DEA to diverse domains, one can find articles related to DEA in several journals. It was observed that around 169 journals have published papers related to DEA, out of which 81 were first-quartile (Q1) journals at the time of writing this report.

  • DEA has been hybridized with other methods so as to improve its efficiency and to increase its discriminatory power, the lack of which is a major shortcoming of DEA. Mostly DEA is used in conjunction with other MCDM methods such as AHP and TOPSIS, but it has also been combined with ANN.

  • From the application point of view, the two areas where DEA has been applied most frequently are education and banking. One of the reasons for this could be the availability of data that may be shared. In the last two years, business economics and operations research have been the leading areas, while some newly emerging areas where DEA is being applied are remote sensing and polymer science.

  • In the last two years, the highest numbers of publications were in the areas of business economics (388), operational research (279), engineering (266), and science technology (229). These areas have seen the application of DEA earlier as well, but it was interesting to see newer areas such as geology, chemistry, physics, polymer science, and remote sensing where the application of DEA has started.

  • Out of the various DEA models proposed so far, network DEA, hierarchical DEA, and stochastic DEA models are apparently the most frequently used models in different application areas. All these models are extensions of the classical DEA models. However, DEA has also been hybridized with other decision-making techniques such as AHP and TOPSIS.

  • Integration of DEA with Machine Learning models is one of the emerging areas of DEA which may have a lot of potential in the coming years due to the increase in volume of data, availability of good computing facilities and increase in the complexities of real world problems.

9 Future Directions

  • DEA is a versatile technique with immense scope from the development as well as the application point of view. In the present review article, we saw areas like polymer science and remote sensing where DEA is applied; likewise, there can be new avenues where the potential of DEA can be explored.

  • Integration of DEA with ML algorithms is another interesting area which seems to have a lot of potential. Both are decision making techniques but are conceptually different. DEA is mainly employed for determining the efficiency of the DMUs but after integrating it with ML techniques, one can also forecast the future efficiency of DMUs. Also for handling big data, ML and DEA can be combined for more efficient decision making.

  • Fuzzy DEA is an important concept which tries to incorporate the uncertainties of the real world scenarios. In this article we have devoted a small section to Fuzzy DEA, which probably is not justified considering the significance of the method. A separate detailed study can be prepared for Fuzzy DEA in the form of a review article.

  • Some of the important areas not included in this study are ranking methods and time series methods of DEA. A detailed analysis of these techniques would make an interesting article. Further, a study on identifying and handling undesirable input and output variables would also provide useful information to researchers working in this area.