1 Introduction

In 2007, the European Commission’s Freight Transport Logistics Action Plan introduced the concept of Green Transport Corridors (GTCs) for freight transportation between major hubs: integrated multimodal corridors that use road, rail, waterways, and intelligent technologies to reduce environmental impact (European Commission, 2007).

GTCs promote environmentally friendly freight transportation through the efficient management of investments, operations, and the integration of transportation routes and modes (land, water, and air) (Panagakos, 2015). To achieve and manage maximum efficiency, it is necessary to measure GTC performance through a Logistics Performance Index (LPI), especially for economically relevant hubs and long-distance routes (Panagakos, 2015). The LPI from the World Bank (2018) was developed using Principal Component Analysis (PCA) and is applied at the country level, without considering regional specificities or in-country transportation routes. Hence, it differs from the LPIs developed using Data Envelopment Analysis (DEA) to evaluate routes and corridors. In practical terms, however, proposing a widely accepted LPI-development methodology has faced different obstacles, depending on whether the LPI is to be applied at a country level or at a regional level (Alves Junior et al., 2021; Melo et al., 2018, 2020; Rentizelas et al., 2019).

Specifically, previously proposed LPIs that rely on DEA models to assess and compare regions have faced the common challenge of factoring “route length” into their models. Firstly, some papers (Alves Junior et al., 2021; Rentizelas et al., 2019) excluded the transported length from the model, considering only the indirect variables that depend on it, such as transportation costs, fuel consumption, and emissions. This approach has the drawback of ignoring one of the most influential logistic characteristics.

For example, a DEA Slack-Based Measure (SBM) model with variable returns to scale (VRS) was applied for choosing alternatives in the international biomass supply chain (Rentizelas et al., 2019). Three variables were considered: costs and energy input (as inputs) and emissions (as undesirable output). Three variables are arguably too few to capture the complexity of the system. Alves Junior et al. (2021) proposed a single multi-criteria Logistics Composite Index (LCI) for GTCs. The authors applied it to Brazilian agricultural bulk transport export corridors, considering the existing and planned infrastructure in the harvest year of 2018/2019. They used seven variables (classified as desired and undesired inputs as well as desired and undesired outputs) but did not include the length in the DEA model.

Secondly, it is possible to consider the length as a totally non-discretionary (tND) measure, i.e., to assume that decision- and policy-makers cannot change the length of the route, regardless of their investments or efforts. This was proposed by Melo et al. (2018) for investigating 102 soybean haulage routes in Brazil and the USA. The authors considered nine variables and classified them into inputs, outputs, undesirable outputs, and length as a tND measure.

Thirdly, it is also possible to consider the length as a DEA input, i.e., a measure to be minimized (Cook et al., 2014). Classifying the length as an input implies the assumption that the transported length is fully under the control of decision- and policy-makers, depending exclusively on their interests and the focus of their efforts.

We argue that, in real-world applications, decision- and policy-makers can change the transported length, although they are usually limited by external constraints beyond a certain level. Hence, we investigate the possibility of integrating the length into the model as a partially Non-Discretionary (pND) measure, i.e., a measure that can be reduced only up to a certain percentage.

Along these lines, we aim to propose a long-distance cargo haulage performance index (LPI) methodology that integrates pND measures. For the first time in an LPI application, the pND characteristic was incorporated in a Double-Frontier Slack-Based Measure (SBM) under Variable Return to Scale (VRS) assumptions. The application covers 12 GTCs (encompassing 245 routes) for soybean transportation in Brazil during the harvest of 2018/2019.

Hence, the LCI proposed here is a novelty: it incorporates pND measures and applies a Double-Frontier Data Envelopment Analysis (DEA) Slack-Based Measure (SBM) under the Variable Return to Scale (VRS) assumption (pND-DF-SBM-VRS) to evaluate GTCs and their multimodal routes, resulting in an innovative methodology with practical implications for public investments.

Subsequently, the results of the proposed methodology were compared to the results obtained when the length is considered as a tND measure and as an input. The pND efficiency results were similar to the efficiency results with the length as an input. However, the pND assumption proved to be useful for constructing efficiency-improvement goals, because goals constructed under the input assumption can be physically unachievable (such as proposing an 18% length reduction to reach efficiency on a route that passes through a natural reserve area).

The long-distance cargo haulage performance index integrating pND measures may be used to guide future investments in infrastructure. The methodology can also be a useful tool in other contexts of application (such as other countries and other transported cargos).

2 Literature Review

Based on multi-attribute utility theory (MAUT) and decision theory, Dyckhoff and Souren (2020) proposed the multi-criteria production theory (MCPT) for applying methods such as Data Envelopment Analysis (DEA) to multi-criteria decision-making (MCDM) problems in production systems. Many previous authors tried to formulate special DEA-MCDM models (Belton & Vickers, 1993; Doyle & Green, 1993; Joro et al., 1998) with some specific characteristics from Multi-Objective Linear Programming (MOLP). In general, DEA is a method for measuring the efficiency of decision-making units (DMUs) (Charnes et al., 1978). Its concept relies on production theory (Charnes et al., 1985) and also on decision theory, although part of the DEA literature has ignored the latter aspect.

For example, Li and Reeves (1999) presented a Multiple Criteria DEA that can be used to improve discrimination power. Sarkis (1997) and Dvorakova and Klicnarova (2017) also applied DEA as an MCDM tool. Besides, it is argued that assigning arbitrary weights leads to subjectivity problems in some MCDM approaches, such as AHP, TOPSIS, and VIKOR (Hu et al., 2017; Noryani et al., 2018; Shen et al., 2018), because they require subjective assessments by the decision-maker to prioritize performance attributes (Alinezhad et al., 2011). According to Jahedi and Méndez (2014), although subjectivity can be useful in some situations, mainly when objective data is difficult to obtain, subjective measures suffer from systematic biases, can be uncorrelated or negatively correlated with the objective data, and can be difficult to interpret. DEA is less subjective because it does not rely on the decision-makers’ preferences, so it is more suitable in the present context (Greco et al., 2018).

Among DEA models, Dyckhoff and Souren (2020) highlighted the adequacy and relevance of non-oriented additive DEA models for MCDM, especially, the slack-based measure (SBM), created by Tone (2001). These models take all slacks into account for efficiency measurement. Consequently, they directly identify strongly efficient solutions without the additional calculations necessary in radial models. In addition, as it is often hard to justify an orientation of a DEA model, the absent orientation of SBM represents yet another advantage.

One of the seminal assumptions of DEA is the homogeneity among DMUs. The acceptable limits of heterogeneity remain under discussion. Li et al. (2016) proposed a non-homogeneous DEA model for solving non-homogeneity problems. Among DEA pitfalls, Cook et al. (2014) pointed out the misjudgment of efficiency when inputs and outputs mix ratio and raw data. However, the authors stated that, under certain circumstances, dealing with different types of data in the same DEA model is acceptable. The present paper did not impose a restriction on data type as a condition for constructing the index.

The discrimination power in DEA is affected by the ratio between the number of DMUs and the number of variables. Banker et al. (1989) stated that there should be at least three times as many DMUs as variables. Nevertheless, this is not an imperative rule; it is accepted by convenience (Cook et al., 2014). It was adopted as a desirable target here.
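
As a quick arithmetic check of this rule of thumb against the present application (245 DMUs and eight variables, as reported in the Methods), a minimal Python sketch follows. The function name and the `factor` parameter are illustrative and do not come from the cited works.

```python
def meets_dmu_rule(n_dmus: int, n_variables: int, factor: int = 3) -> bool:
    """Rule of thumb: at least `factor` times as many DMUs as variables."""
    return n_dmus >= factor * n_variables

# Figures taken from this chapter: 245 routes (DMUs) and 8 variables.
print(meets_dmu_rule(245, 8))
```

With these figures the threshold is 24 DMUs, so the application sits far above the desirable target.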

Besides outputs and inputs, DEA may also include variables classified as undesirable outputs, e.g., pollution. Readers interested in this variable type may consult Hua and Bian (2007), Liu et al. (2010), and Seiford and Zhu (2002). Among the possible treatments, this paper chose to invert the role of emissions and insert them as inputs (for minimization), based on the judgment of specialists.

There are also variables (inputs and outputs) classified as partially non-discretionary (pND). Melo et al. (2018) incorporated the concepts of non-discretion of Saen (2005) to the SBM, assuming no control of the variable (i.e., totally non-discretionary, tND). This paper goes a step further, incorporating a pND (e.g., assuming up to 5% of control of the variable) under VRS in a Double-Frontier-SBM applied to the context of Green Transport Corridors. This incorporation came from the assumption of the possibility of reducing the length of the route in a certain percentage by public investment strategies. The value of 5% was assumed because the percentage of yearly changes in the road (from 2001 to 2017) was up to 4.21% (DNIT, 2020).

3 Methods

The current investigation involved: (1) defining DMUs (routes from GTCs) and collecting data, (2) analyzing available variables and classifying them into DEA measures, (3) applying the pND-SBM model, and, finally, (4) applying the tiebreaking tool.

3.1 DMUs Definition and Data Collection

We considered a total of 245 DMUs (routes from 12 GTCs) during the harvest year of 2018/2019. Since Alves Junior et al. (2021) have already studied several routes and Green Transport Corridors in Brazil, we are using the same databases described in their paper, so the results of the present chapter can be compared to the literature. We considered only the currently existing infrastructure and not the planned projects with estimated values. The DMUs originated from producing mid-sized regions in all Brazilian macro-sized regions (IBGE, 2019) and are destined for the 12 main exporting ports. Figure 1 shows the ports and multimodal infrastructure and transport network in the main soybean export corridors (Ministry of Infrastructure, 2021).

Fig. 1 Ports and multimodal infrastructure and transport network in the main soybean export corridors. Source: Ministry of Infrastructure (2021)

3.2 Variable Classification and Analysis

In DEA literature, the most usual variable classification is formulated considering desirable outputs (O) (measures to be maximized), and desirable inputs (I) (measures to be minimized) (Cook et al., 2014). Though, in real-life problems, there are also undesirable outputs (UO) (to be minimized) and undesirable inputs (UI) (to be maximized) (Liu et al., 2015). Finally, some variables can be classified as a partially non-discretionary (pND) measure, i.e., including a quasi-fixed factor that is almost not under control (Saen, 2005). The criterion for choosing a variable was the systematic judgment of specialists about the relevance of a variable for the model (Golany & Roll, 1989), considering the whole scenario, previous papers (Alves Junior et al., 2021; Melo et al., 2018), and the objective of the index. Table 1 shows the classification and the descriptive statistics of the eight used variables.

Table 1 Classification and descriptive statistics of selected variables

As can be noted in Table 1, EXPORTS (O) is the only measure whose standard deviation is greater than its mean. This is caused by differences in productive capacity in Brazil, which lead to the same ports being used for exporting (for example, Santos, Santarém, Paranaguá, and Rio Grande). Conversely, PAVED (UI) is the measure with the smallest standard deviation relative to the mean. The only pND measure (LENGTH) presents a standard deviation of 68.42% of the mean, similar to STORAGE (69.54%). However, because LENGTH is established as a pND measure, the dispersion of its observed values may have a more restricted impact on the final DEA rank results. In other words, the dispersion of STORAGE is expected to have more impact on the results, as discussed further in the Findings.
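
The ratio of standard deviation to mean quoted above is the coefficient of variation. A minimal sketch of that calculation is given below; the sample values are made up for illustration, and the use of the population standard deviation is an assumption, since the chapter does not state which estimator was used.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation relative to the mean, in percent (population std assumed)."""
    return 100.0 * pstdev(values) / mean(values)

# Hypothetical observations, NOT taken from Table 1.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(coefficient_of_variation(sample), 2))
```

A value above 100% corresponds to the situation described for EXPORTS, where the standard deviation exceeds the mean.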

As in Alves Junior et al. (2021), whose databases we use (available through the GitHub link given in their paper; see the data for the DEA application to evaluate Brazilian GTCs at GITHUB (2021)), EXPORTS is the amount of soybeans and corn exported by each port (\(10^{3}\) tons) and is classified as an Output (O). DEPTH is the highest draft depth of each port (meters), STORAGE is the grain storage capacity in the catchment area (\(10^{3}\) tons), and PAVED is the paved road density (\(10^{2}\) km of road/km\(^{2}\) of area). DEPTH, STORAGE, and PAVED are classified as Undesirable Inputs (UI). COST is the weighted average freight cost of the flows arriving at each export port (BRL/ton) and is classified as an Input (I). CO2 is the weighted average CO2 emission (kg of CO2/ton) and ACCIDENT is the estimated number of accidents per kilometer in the transportation corridor (accidents/km). CO2 and ACCIDENT are classified as Undesirable Outputs (UO).

LENGTH is the length of the route from the origin to the final destination (km). Here we propose to classify LENGTH as a partially non-discretionary (pND) measure. In most cases, we assumed that it is not physically possible to meaningfully shorten the transportation distance by moving the most productive areas or the main infrastructure poles and destinations (ports), by planting in similar areas with shorter routes, or by investing in the construction of a straighter route.

3.3 Slack-Based Measure Model with Partially Non-discretionary Measures (pND–SBM)

The formulation of the SBM model (Tone, 2001) with incorporated non-discretionary measures (Saen, 2005) has the objective function in Eq. (1), constrained by Eqs. (2)–(4) and (7) (SBM constraints) and by Eqs. (5) and (6) (non-discretionary constraints) (Saen, 2005):

$$ \operatorname{Minimize}\ \tau =\mathrm{t}-\frac{1}{\mathrm{m}}\sum_{\mathrm{i}=1}^{\mathrm{m}}\frac{{\mathrm{S}}_{\mathrm{i}}^{-}}{{\mathrm{x}}_{\mathrm{i}0}} $$
(1)

Subject to:

$$ \mathrm{t}+\frac{1}{\mathrm{s}}\sum_{\mathrm{r}=1}^{\mathrm{s}}\frac{{\mathrm{S}}_{\mathrm{r}}^{+}}{{\mathrm{y}}_{\mathrm{r}0}}=1 $$
(2)
$$ \sum_{\mathrm{k}=1}^{\mathrm{z}}{\Lambda}_{\mathrm{k}}{\mathrm{x}}_{\mathrm{i}\mathrm{k}}+{\mathrm{S}}_{\mathrm{i}}^{-}-\mathrm{t}\,{\mathrm{x}}_{\mathrm{i}0}=0,\quad \mathrm{i}=1,2,\dots, \mathrm{m} $$
(3)
$$ \sum_{\mathrm{k}=1}^{\mathrm{z}}{\Lambda}_{\mathrm{k}}{\mathrm{y}}_{\mathrm{r}\mathrm{k}}-{\mathrm{S}}_{\mathrm{r}}^{+}-\mathrm{t}\,{\mathrm{y}}_{\mathrm{r}0}=0,\quad \mathrm{r}=1,2,\dots, \mathrm{s} $$
(4)
$$ {\mathrm{S}}_{\mathrm{i}}^{-}\le {\beta}_{\mathrm{i}}{\mathrm{x}}_{\mathrm{i}0},\quad \mathrm{i}=1,2,\dots, \mathrm{m} $$
(5)
$$ {\mathrm{S}}_{\mathrm{r}}^{+}\le {\gamma}_{\mathrm{r}}{\mathrm{y}}_{\mathrm{r}0},\quad \mathrm{r}=1,2,\dots, \mathrm{s} $$
(6)
$$ {\Lambda}_{\mathrm{k}}\ge 0,\ {\mathrm{S}}_{\mathrm{i}}^{-}\ge 0,\ {\mathrm{S}}_{\mathrm{r}}^{+}\ge 0\ \mathrm{and}\ \mathrm{t}>0 $$
(7)

where \( \tau \) is the efficiency, \( t \) is the model linearization variable, \( {S}_i^{-} \) is the slack of the ith input, \( {S}_r^{+} \) is the slack of the rth output, \( {\Lambda}_k \) is the contribution of the kth DMU to the analyzed DMU, \( {x}_{i0} \) is the ith input of the DMU under analysis, \( {y}_{r0} \) is the rth output of the DMU under analysis, \( {x}_{ik} \) is the ith input of the kth DMU, \( {y}_{rk} \) is the rth output of the kth DMU, \( m \) is the number of inputs, \( s \) is the number of outputs, \( z \) is the number of DMUs, and \( {\beta}_i \) and \( {\gamma}_r \) are the constants of discretion for inputs and outputs, respectively. A constant equal to 0 represents a tND measure; an infinite constant, or the exclusion of the constraint, represents a totally discretionary measure, i.e., a standard SBM model.
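
For orientation, Eqs. (1)–(4) and (7) can be read as the linearized form of the fractional SBM of Tone (2001). A sketch of that fractional program is given below, written with lowercase slacks \( {s}_i^{-} \), \( {s}_r^{+} \) and intensities \( {\lambda}_k \); the symbol \( \rho \) for the fractional objective is introduced here only for illustration.

$$ \min\ \rho =\frac{1-\frac{1}{\mathrm{m}}\sum_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{s}}_{\mathrm{i}}^{-}/{\mathrm{x}}_{\mathrm{i}0}}{1+\frac{1}{\mathrm{s}}\sum_{\mathrm{r}=1}^{\mathrm{s}}{\mathrm{s}}_{\mathrm{r}}^{+}/{\mathrm{y}}_{\mathrm{r}0}}\quad \mathrm{s.t.}\quad \sum_{\mathrm{k}=1}^{\mathrm{z}}{\lambda}_{\mathrm{k}}{\mathrm{x}}_{\mathrm{i}\mathrm{k}}+{\mathrm{s}}_{\mathrm{i}}^{-}={\mathrm{x}}_{\mathrm{i}0},\quad \sum_{\mathrm{k}=1}^{\mathrm{z}}{\lambda}_{\mathrm{k}}{\mathrm{y}}_{\mathrm{r}\mathrm{k}}-{\mathrm{s}}_{\mathrm{r}}^{+}={\mathrm{y}}_{\mathrm{r}0},\quad {\lambda}_{\mathrm{k}},{\mathrm{s}}_{\mathrm{i}}^{-},{\mathrm{s}}_{\mathrm{r}}^{+}\ge 0 $$

Setting \( t=1/\left(1+\frac{1}{s}\sum_{r}{s}_r^{+}/{y}_{r0}\right) \), \( {\Lambda}_k=t{\lambda}_k \), \( {S}_i^{-}=t{s}_i^{-} \), and \( {S}_r^{+}=t{s}_r^{+} \) turns the ratio into the linear objective of Eq. (1), gives the normalization in Eq. (2), and turns the two equality constraints into Eqs. (3) and (4).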

As explained in the Literature Review, it was assumed that up to 5% of the length of the route is controllable, owing to slight changes in the routes. For example, even within a microregion, the length of the route differs depending on how distant the starting point is from the center of the origin, and it can also be changed by public investments in transportation infrastructure and land use (DNIT, 2020). In other words, we assumed \( {\beta}_i=0.05 \) for LENGTH in Eq. (5).
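
To make the three treatments of LENGTH concrete, the sketch below computes the shortest target length each one would admit for a single route, reading Eq. (5) as a cap of β times the observed length on the reduction. The route length of 1000 km and the function name are hypothetical.

```python
import math

def min_target_length(length_km: float, beta: float) -> float:
    """Shortest projected length when the reduction is capped at beta * length_km."""
    if math.isinf(beta):
        return 0.0  # fully discretionary: Eq. (5) imposes no cap
    return length_km * (1.0 - beta)

route_km = 1000.0  # hypothetical route length
for label, beta in [("tND", 0.0), ("pND", 0.05), ("controllable", math.inf)]:
    print(label, min_target_length(route_km, beta))
```

Under the tND treatment the target equals the observed length, under pND it cannot fall below 95% of it, and under the controllable treatment Eq. (5) places no floor on it.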

According to Cook et al. (2014), mixing raw data with ratios is permissible in DEA, but the Variable Return to Scale (VRS) assumption is preferable, mainly if the ratio data is in percentages, because the Constant Return to Scale (CRS) assumption does not always keep the projection between 0% and 100%. As the present application requires the VRS assumption, it was necessary to add the constraint in Eq. (8).

$$ \sum_{\mathrm{k}=1}^{\mathrm{z}}{\Lambda}_{\mathrm{k}}=\mathrm{t} $$
(8)
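
The constraints can be made tangible with a small feasibility checker. The sketch below does not solve the linear program (an LP solver would be needed for that); it only verifies a candidate solution against Eqs. (2)–(8) as written and evaluates Eq. (1). The two-DMU data set and all names are hypothetical, and strictly positive data are assumed.

```python
import math

def check_pnd_sbm(X, Y, o, t, lam, s_in, s_out, beta, gamma, tol=1e-9):
    """Check a candidate (t, Lambda, S-, S+) against Eqs. (2)-(8); return (feasible, tau).

    X[k][i] / Y[k][r]: i-th input / r-th output of DMU k; o: index of the DMU analyzed.
    beta[i], gamma[r]: discretion constants (0 -> tND, math.inf -> fully discretionary).
    """
    z, m, s = len(X), len(X[o]), len(Y[o])
    x0, y0 = X[o], Y[o]
    ok = t > 0 and all(v >= -tol for v in [*lam, *s_in, *s_out])       # Eq. (7)
    ok &= abs(t + sum(s_out[r] / y0[r] for r in range(s)) / s - 1.0) <= tol  # Eq. (2)
    for i in range(m):
        lhs = sum(lam[k] * X[k][i] for k in range(z)) + s_in[i] - t * x0[i]
        ok &= abs(lhs) <= tol                                          # Eq. (3)
        ok &= s_in[i] <= beta[i] * x0[i] + tol                         # Eq. (5)
    for r in range(s):
        lhs = sum(lam[k] * Y[k][r] for k in range(z)) - s_out[r] - t * y0[r]
        ok &= abs(lhs) <= tol                                          # Eq. (4)
        ok &= s_out[r] <= gamma[r] * y0[r] + tol                       # Eq. (6)
    ok &= abs(sum(lam) - t) <= tol                                     # Eq. (8), VRS
    tau = t - sum(s_in[i] / x0[i] for i in range(m)) / m               # Eq. (1)
    return bool(ok), tau

# Hypothetical data: one input (e.g., length) and one output, two DMUs.
X = [[2.0], [4.0]]
Y = [[1.0], [1.0]]
INF = math.inf

# DMU 1 projected fully onto DMU 0: allowed only if the input is discretionary.
print(check_pnd_sbm(X, Y, 1, 1.0, [1.0, 0.0], [2.0], [0.0], [INF], [INF]))
print(check_pnd_sbm(X, Y, 1, 1.0, [1.0, 0.0], [2.0], [0.0], [0.05], [INF]))
# With beta = 0.05 the input slack may not exceed 0.05 * 4.0 = 0.2.
print(check_pnd_sbm(X, Y, 1, 1.0, [0.1, 0.9], [0.2], [0.0], [0.05], [INF]))
```

In this toy case the same DMU scores 0.5 when the input is treated as controllable, while the pND cap limits the admissible slack so that the score cannot drop below 0.95.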

From the optimum solution \( \left({\tau}^{\ast },{t}^{\ast },{\Lambda}_k^{\ast },{S}_i^{-\ast },{S}_r^{+\ast}\right) \) of the linearized model, the solution in the original variables is recovered according to Eq. (9):

$$ {\tau}_{\mathrm{optimal}}={\tau}^{\ast },\quad {\lambda}_{\mathrm{k}}^{\ast }=\frac{{\Lambda}_{\mathrm{k}}^{\ast }}{{\mathrm{t}}^{\ast }},\quad {\mathrm{s}}_{\mathrm{i}}^{-\ast }=\frac{{\mathrm{S}}_{\mathrm{i}}^{-\ast }}{{\mathrm{t}}^{\ast }},\quad {\mathrm{s}}_{\mathrm{r}}^{+\ast }=\frac{{\mathrm{S}}_{\mathrm{r}}^{+\ast }}{{\mathrm{t}}^{\ast }} $$
(9)

In this model, a DMU is considered efficient when τ = 1, and \( {\lambda}_k^{\ast } \), \( {s}_i^{-\ast } \), and \( {s}_r^{+\ast } \) are the optimal solutions in the original variables (before linearization). In the model, we treated UO as a negative factor and UI as a positive factor. In other words, UO is mathematically treated as the opposite of an output, i.e., as an input, so the goal in a post-efficiency analysis is to decrease the UO. Similarly, UI is mathematically treated as the opposite of an input, i.e., as an output, so the goal in a post-efficiency analysis is to increase the UI. This approach was already adopted and discussed in previous papers (Alves Junior et al., 2021; Melo et al., 2018).
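
The role inversion described above can be sketched as a simple mapping from the variable classes of Sect. 3.2 to the side of the model on which each variable enters. Placing the pND measure LENGTH on the input side (with its slack capped by Eq. (5)) is an assumption of this sketch, as are the dictionary and function names.

```python
# Classification reported in Sect. 3.2 (O, I, UO, UI, pND).
CLASSIFICATION = {
    "EXPORTS": "O", "DEPTH": "UI", "STORAGE": "UI", "PAVED": "UI",
    "COST": "I", "CO2": "UO", "ACCIDENT": "UO", "LENGTH": "pND",
}

def model_side(var_class: str) -> str:
    """Side of the SBM program on which a variable class enters."""
    # I and UO are minimized; pND is assumed to be minimized within its cap.
    return "input" if var_class in {"I", "UO", "pND"} else "output"

inputs = [v for v, c in CLASSIFICATION.items() if model_side(c) == "input"]
outputs = [v for v, c in CLASSIFICATION.items() if model_side(c) == "output"]
print("inputs:", inputs)
print("outputs:", outputs)
```

Under this reading, the model has four variables on each side (m = 4 and s = 4).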

3.4 Tiebreaking Method: Double-Frontier Logistic Composite Index (LCI)

The tiebreaking method of the composite index (Leta et al., 2005), also known as the Double-Frontier method, was applied according to Eq. (10). It is the arithmetic average of the standard efficiency and the complement of the inverted efficiency, normalized by the maximum composite index in the analyzed population.

$$ {\mathrm{LCI}}_{\mathrm{k}}=\frac{\left[{\mathrm{E}}_{\mathrm{k}}^{\mathrm{standard}}+\left(1-{\mathrm{E}}_{\mathrm{k}}^{\mathrm{inverted}}\right)\right]/2}{\max\limits_{\mathrm{j}=1,\dots, \mathrm{z}}\left\{\left[{\mathrm{E}}_{\mathrm{j}}^{\mathrm{standard}}+\left(1-{\mathrm{E}}_{\mathrm{j}}^{\mathrm{inverted}}\right)\right]/2\right\}},\quad \mathrm{k}=1,2,\dots, \mathrm{z} $$
(10)

where \( {E}_k^{standard} \) is the standard efficiency resulting from the application of the DEA model to the kth DMU, and \( {E}_k^{inverted} \) is the inverted efficiency of the kth DMU, i.e., the efficiency obtained when inputs are inserted in the SBM model as outputs and vice versa.
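
A minimal sketch of Eq. (10) follows, showing how the inverted frontier breaks a tie between two DMUs that are both efficient on the standard frontier. The efficiency values are hypothetical and the function name is illustrative.

```python
def lci_scores(e_standard, e_inverted):
    """Double-frontier composite index of Eq. (10), normalized by its maximum."""
    composite = [(es + (1.0 - ei)) / 2.0 for es, ei in zip(e_standard, e_inverted)]
    top = max(composite)
    return [c / top for c in composite]

# Hypothetical efficiencies for three DMUs; the first two tie on the standard frontier.
e_std = [1.0, 1.0, 0.6]
e_inv = [0.2, 0.5, 1.0]
scores = lci_scores(e_std, e_inv)
print(scores)
```

The first DMU keeps the top score because it is also far from the inverted (worst-practice) frontier, whereas the second one loses ground despite being efficient on the standard frontier.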

4 Findings

Table 6 in the Appendix presents the resulting LCI when considering LENGTH as pND, for each DMU, as well as the Rank position based on the LCI. For demonstrating and discussing the proposed approach, Table 6 also presents LCI and Rank results, when considering LENGTH as a Controllable measure (input) as well as considering LENGTH as a totally non-discretionary (tND) measure.

Observing the results (Table 6, in the Appendix) and the descriptive statistics of the data (Table 1), it is possible to see that the five routes with the best performance in all configurations were not those with the greatest EXPORTS. The routes with the greatest STORAGE were related to Belém (PA), Itacoatiara (AM), Santarém (PA), Santos (SP), and São Luís (MA). As can be seen in Table 6, when LENGTH is treated as a controllable measure, the most efficient routes are those with the shortest length. When LENGTH is treated as tND, there is a relative performance improvement for routes that stand out in other desired measures (for example, those with the greatest STORAGE). Finally, when LENGTH is treated as pND, there is a balance between routes with short LENGTH and routes that stand out in other desired measures.

For a faster and easier visualization, Table 2 presents the same results of Table 6, but aggregated by GTC, through the arithmetic average of the results of the DMUs in the same GTC.

Table 2 Aggregated GTC’s LCI results considering LENGTH as a Controllable (control) measure (input), a totally non-discretionary (tND) measure, and a partially non-discretionary (pND) measure, followed by their respective rank positions

Table 2 shows that the main differences among the aggregated results of the models with pND, tND, and controllable measures concern the GTCs of Santarém (PA) and Itacoatiara (AM). Santarém (PA) is second and Itacoatiara (AM) is fourth in the controllable and pND ranks, whereas Santarém (PA) is fourth and Itacoatiara (AM) is second in the tND rank. This happens because the tND treatment neutralizes the measure as a source of inefficiency. The average route length is 1360.98 km for Santarém (PA) and 1880.92 km for Itacoatiara (AM); this large difference in distance is completely ignored in the tND model, whereas the pND model treats the length as almost uncontrollable while still allowing it to be a source of inefficiency.

On the other hand, São Luís (MA) was the fifth GTC under the controllable assumption and São Francisco do Sul (SC) was the sixth. Under both the tND and pND assumptions, they swapped positions: São Luís (MA) is sixth and São Francisco do Sul (SC) is fifth. The rank thus changes from the controllable (discretionary) assumption to the partially Non-Discretionary (pND) and totally Non-Discretionary (tND) assumptions. This is explained by the fact that the São Luís (MA) corridor presents desired observed values for other target measures (e.g., DEPTH of the port, multimodal infrastructure, and very low ACCIDENT). Because São Luís (MA) and São Francisco do Sul (SC) present similar LENGTH of roads, but São Luís (MA) has more multimodal infrastructure, when LENGTH is constrained, its better observed values in these three aspects improve its relative position. However, aggregation through the arithmetic average has limitations. One of them is its dependency on the number of routes in a GTC. Although the aggregated values are useful for fast visualization and understanding, it is recommended to investigate the routes’ (DMUs’) results (Table 6) when taking decisions and making policies. Other types of aggregation and models, such as network models, can also be explored.
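
The aggregation step and its sensitivity to the number of routes can be sketched as follows. The corridor labels and LCI values are hypothetical and are not taken from Table 2 or Table 6.

```python
from collections import defaultdict
from statistics import mean

def aggregate_by_gtc(route_scores):
    """route_scores: iterable of (gtc_name, lci). Returns {gtc_name: mean LCI}."""
    groups = defaultdict(list)
    for gtc, lci in route_scores:
        groups[gtc].append(lci)
    return {gtc: mean(vals) for gtc, vals in groups.items()}

# Hypothetical routes: corridor "A" has one strong and two weak routes, "B" a single route.
routes = [("A", 0.9), ("A", 0.3), ("A", 0.3), ("B", 0.6)]
gtc_lci = aggregate_by_gtc(routes)
ranking = sorted(gtc_lci, key=gtc_lci.get, reverse=True)
print(gtc_lci, ranking)
```

Here corridor "A" contains the best route overall, yet it ranks below "B" after averaging, which is why the route-level results remain necessary for decision-making.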

Given these results, one could argue that the models with controllable and pND measures do not differ much. However, the difference becomes clear in a deeper investigation of the percentage of variation needed to reach the efficient frontier, computed as a post-efficiency analysis and shown in Table 3.

Table 3 GTCs’ % of variation to achieve the goal to be in the frontier

As can be seen in Table 3, the model with LENGTH as a controllable measure shows changes (reductions) of up to 18.74% in the length of the routes in a GTC. In long-distance haulage, such a reduction could be enough to reach another state, so the outcome is sometimes not implementable in practice. The model with LENGTH as a pND measure shows similar final average ranks for the GTCs, but with length reductions limited by the assumed discretion level (βi = 0.05 in Eq. (5)), which is a more viable outcome. The model with LENGTH as a tND measure does not allow any change in it, which is sometimes not in accordance with practice (e.g., when the farmer is far away from the center of an origin region).
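
As a sketch of how a percentage of variation can be derived in a post-efficiency analysis, the snippet below assumes the usual SBM projection, in which the target of an input is its observed value minus its optimal slack. The route length and slack values are hypothetical and chosen only to reproduce the orders of magnitude discussed above.

```python
def percent_variation(observed: float, slack: float) -> float:
    """Reduction needed to reach the target, as a percentage of the observed value."""
    return 100.0 * slack / observed

def target_value(observed: float, slack: float) -> float:
    """Projected (goal) value of an input: observed value minus its slack."""
    return observed - slack

length_km = 2000.0  # hypothetical route length
for label, slack_km in [("controllable", 374.8), ("pND", 100.0), ("tND", 0.0)]:
    print(label, target_value(length_km, slack_km), percent_variation(length_km, slack_km))
```

For this hypothetical route, an 18.74% goal means removing almost 375 km, whereas a 5% goal means removing 100 km.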

5 Discussion

As stated in the Findings, the five routes with the best performance in all configurations were not those with the greatest EXPORTS. As in Alves Junior et al. (2021), the model configuration proposed here dealt well with avoiding bias due to productive inequalities. This represents one step further in methodological evolution, one that was not achieved by Melo et al. (2018, 2020), who reported, among the admitted limitations of their results, that the greatest producers were also among the most efficient routes. The current paper, like Alves Junior et al. (2021), is focused on the destination (exporting port) instead of the origin (production).

The aggregated results in Table 2 show the GTC of Rio Grande in the top position regardless of the LENGTH treatment. Paranaguá also maintained the third position in the three treatments. Besides, the worst performers (Belém, Ilhéus, Salvador, and Fortaleza) did not change rank positions. These relative positions agree with the previous literature, which demonstrated that, in general, routes and corridors in the South of Brazil are more efficient than those in the North and Northeast (Alves Junior et al., 2021; Garcia et al., 2019; Branco et al., 2020; Melo et al., 2018, 2019; Rentizelas et al., 2019).

Table 2 shows that the last four corridors (average of routes) are from the North and Northeast regions of Brazil. With this in mind, and comparing with Alves Junior et al. (2021), Branco et al. (2020), and the Brazilian Planning and Logistics Company (2021), it is possible to suggest public policies to improve the performance of the corridors in the North and Northeast regions, for example, investing in new railways. Investing in new waterways, multimodal routes, and GTCs also enables connecting these regions to other productive areas. In this regard, the synergy of integration supports the mitigation of CO2 emissions. Four railways can be highlighted as priorities: Ferrogrão (connecting the Center-West productive region to the state of PA and providing alternative access to the Port of Santarém (PA) using the Tapajós waterway); Ferrovia Norte-Sul (connecting the North and Northeast regions to the Southeast); Ferrovia de Integração Oeste-Leste (connecting the west of the state of BA to the Port of Ilhéus (BA)); and Ferrovia Nova Transnordestina (connecting the North to the Northeast region).

Another point to discuss is that, although the aggregated values are useful for fast visualization and understanding, it is recommended to investigate the routes’ (DMUs’) results (Table 6). These detailed results allow decision-makers to understand where in the GTC (and how) more effort is required to improve local and aggregated performance.

For example, the GTC of Itacoatiara has 12 routes (DMUs 18–29) (Table 4). One of them (DMU24) is the fourth best ranked among the 245 DMUs under analysis, whereas other DMUs are among the worst ranked (18, 20, 23, 26, and 27). Efforts directed at improving the efficiency of the worst-ranked routes will result in better GTC performance as well as promote regional development.

Table 4 DMUs’ results of the GTC of Itacoatiara (AM)
Table 5 DMUs’ results of the GTC of Santarém (PA)

In parallel, the GTC of Santarém has eight routes (DMUs 88–95) (Table 5). Following the proposed methodology, efforts should be directed to the worst-ranked DMUs. The aggregation through the arithmetic average may have benefited the GTCs with fewer routes. In practical terms, it may not be possible to build more routes due to natural barriers such as mountains and forests. In this case, the DMUs’ results indicate which existing routes should be the focus of efforts. For example, in the case of Santarém (in the Amazon Forest), they are DMUs 88, 93, and 92. In cases where it is possible to build more routes, however, planned routes can also be incorporated into the analysis and their expected performance can be investigated.

6 Conclusions

We presented a methodology for building a long-distance cargo-haulage performance index, named Logistic Composite Index (LCI). In this context, the LCI brings the novelty of incorporating partially Non-Discretionary (pND) measures in a Double-Frontier Data Envelopment Analysis (DEA) Slack-Based Measure (SBM) under the Variable Return to Scale (VRS) assumption to study Green Transport Corridors and their routes.

To deepen the discussion about the impact of the partially non-discretionary treatment, we also ran the model considering two other possibilities: (1) route transport distances as controllable measures (inputs), i.e., assuming decision-makers and policy-makers can shorten the physical transport distance between producers and exporting ports without any external constraints (this was the most adopted assumption in previous studies); and (2) route transport distances as totally non-discretionary (tND), i.e., assuming decision-makers and policy-makers have no possibility of shortening the physical transport distance between producers and exporting ports because they are completely limited by external constraints.

The three results were aligned with the previous literature, pointing to routes and corridors in Southern Brazil as more efficient than those in the North and Northeast. However, treating the length of the route as a partially Non-Discretionary (pND) measure proved to be more accurate, mainly when calculating the percentages of variation needed to reach the frontier. The top-ranked DMUs under the pND assumption also presented better-ranked positions under the controllable assumption, whereas these DMUs presented worse-ranked positions under the tND assumption. Also, both extreme assumptions (tND and controllable) are not achievable in real life for the studied context.

To create a Green Transport Corridor (GTC) index while avoiding the low number of GTCs as units of analysis, we considered the routes as DMUs, computed their LCIs, and aggregated the routes’ LCIs of each GTC through an arithmetic average. Although the GTC values are useful for fast visualization and understanding, the DMUs’ results should be considered when planning efforts to improve a GTC’s efficiency as well as to promote regional development.

For future investigations, in terms of application, we recommend studies focused on logistic operators, for instance on the availability of return freight. We also recommend the use of big data and real-time logistic data when they are available. Such applications can improve the model developed here through the incorporation of other techniques, such as hierarchical network models and deep learning. For example, we recommend the development of a model in which the discretionary level of a measure can be customized for each DMU. In this way, the same model could assume (for the same measure) a higher discretionary level for those DMUs where the measure is less externally constrained. Once the data is available, the discretionary level of each DMU could be calculated through deep learning and other techniques. Similarly, it is also possible to improve the aggregation method from routes to corridors, for instance by proposing (dynamic) network-DEA and hierarchical-DEA models. Finally, we suggest that future studies investigate the impact of the dispersion of the data (standard deviation) on the efficiency results, as well as other aggregation methods or network models applied to evaluate the GTCs.