1 Introduction

Composite indicators are becoming increasingly important within discourse on benchmarking the performance of countries, regions or institutions. In fact, their huge variety reveals their recognition as tools for policy evaluation and communication. According to Nardo et al. (2008), the number of composite indicators in existence around the world is growing year after year, especially due to their aims of summarising, focusing and condensing the complexity of our dynamic environment. According to Paruolo et al. (2013) and Singh et al. (2009), they are much easier to interpret than a common trend sought across many single indicators. In practice, they have been applied to relevant dimensions of reality such as a country’s competitiveness (World Economic Forum 2017a), the quality of its governance (World Justice Project 2016), the freedom of its press (Freedom House 2017), global, regional and national human development (The United Nations Development Programme 2016), global peacefulness (Institute For Economics & Peace 2017), travel and tourism competitiveness (World Economic Forum 2017b), the state of a country’s economy (World Development Indicators: The World Bank 2017), the efficiency of its universities (the Academic Ranking of World Universities, the Times Higher Education World University Ranking or the QS World University Ranking), etc.

According to Saisana and Tarantola (2002), a composite indicator is the mathematical combination of single indicators that represent different dimensions of a concept whose description is the objective of the analysis (see Saltelli 2008; Nardo et al. 2005, for a detailed discussion of their pros and cons). The construction of composite indicators has been dealt with from several angles, and the literature provides a wide range of methodological approaches. In general, however, constructing composite indicators involves three main processes:

  • Normalisation. Prior to any data aggregation, a normalisation is generally required, as the indicators in a data set often have different measurement units. For example, standardisation (or z-scores) converts indicators to a common scale with a mean of zero and a standard deviation of one; a categorical scale assigns a score to each indicator; range normalisation uses a min–max transformation so that values range from 0 (minimum) to 1 (maximum); and distance to a reference measures the relative position of a given indicator with respect to a reference level.

  • Weighting. When building composite indicators, weights can have a significant effect on the overall composite indicator and the results obtained. Many composite indicators rely on equal weighting, meaning that all variables are given the same weight. In other cases, the weights are directly obtained from the data, for example, with factor analysis, principal component analysis or data envelopment analysis (DEA), while other approaches estimate the weights externally to the data using, for example, participatory methods like conjoint analysis (CA). MCDM methods have also been used. Singh et al. (2007) calculate the composite sustainability performance index for the year 2003–2004 for the steel industry adopting Principal Component Analysis, using an Analytic Hierarchy Process (AHP, Saaty 1977, 1980) model to determine the weights at various levels. Similarly, Krajnc and Glavic (2005) present a model for designing a composite sustainable development index that depicts the performance of companies along all three dimensions of sustainability, using AHP to determine the weights of the indicators. Then, the sustainability sub-indices are combined into the composite sustainable development index using the weighted average.

  • Aggregation. Aggregation methods also vary. A basic distinction is often made between additive aggregation and geometric aggregation. There are also techniques based on multicriteria decision making (MCDM) methods. According to Paruolo et al. (2013), linear aggregation rules have been criticized because bad performances in certain single indicators can be compensated by good performances in others; this characteristic is called compensation. In opposition to this, non-compensatory and non-linear aggregation rules have been advocated by the MCDM literature (see Billaut et al. 2010; Munda and Nardo 2009; Munda 2008). Depending on the aggregation method chosen, the compensatory character between indicators varies (Munda 2008). Compensation can thus be: full (additive methods and compensatory MCDM methods), partial (geometric methods and partially-compensatory multicriteria methods), or zero (non-compensatory MCDM techniques).
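As a rough illustration of the normalisation and aggregation steps listed above, the following minimal Python sketch implements min–max and z-score normalisation together with additive and geometric aggregation. The function names and the two hypothetical units are assumptions introduced only for this example; they are not taken from any of the cited works. The z-score uses the population standard deviation, which is one possible convention.

```python
import math

def min_max(values):
    """Range normalisation: rescale values to 0 (minimum) .. 1 (maximum).
    Assumes the values are not all equal (otherwise division by zero)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_scores(values):
    """Standardisation: mean zero, standard deviation one.
    Uses the population standard deviation (an assumption of this sketch)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def additive(scores, weights):
    """Weighted arithmetic mean (full compensation)."""
    return sum(w * s for w, s in zip(weights, scores))

def geometric(scores, weights):
    """Weighted geometric mean (partial compensation)."""
    return math.prod(s ** w for w, s in zip(weights, scores))

# Hypothetical units with two already-normalised indicators each.
weights = [0.5, 0.5]
balanced = [0.5, 0.5]   # average on both indicators
uneven = [0.9, 0.1]     # very good on one, very bad on the other

# Additive aggregation scores both units at about 0.5: the bad
# performance of the uneven unit is fully compensated.
# Geometric aggregation scores the uneven unit at about 0.3.
additive_scores = [additive(u, weights) for u in (balanced, uneven)]
geometric_scores = [geometric(u, weights) for u in (balanced, uneven)]
```

With equal weights, the additive rule cannot tell the two units apart, whereas the geometric rule penalises the unbalanced profile, which is the sense in which compensation is only partial.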

According to Becker et al. (2016), the construction of a composite indicator involves making choices when combining criteria of different natures, and it requires a number of steps in which the decision maker must make decisions. In this aspect, some authors claim that MCDM techniques are highly suitable in multidimensional frameworks when aggregating single indicators into a composite one (see Nardo et al. 2005, 2008; Jacobs et al. 2004; Freudenberg 2003; Saisana and Tarantola 2002), since MCDM is a well known branch of decision making, consisting of analysing (ranking, classifying, choosing) a series of possible alternatives, taking into account different criteria simultaneously.

Therefore, the aim of this study is to carry out a review of the literature in order to identify the different MCDM methods used for aggregating single indicators into composite ones. This has been achieved by conducting a literature review of papers published after 2002 in leading international journals indexed in recognised databases (JCR). To do so, the keywords used have been composite/synthetic indicator, multicriteria decision making, indicator framework and aggregation, and equivalent formulations of these keywords.

After the introduction, Sect. 2 gives a short overview of the main MCDM methods applied to the construction of composite indicators. The literature review is detailed in Sect. 3, followed by a discussion about some remarkable aspects found in the search in Sect. 4. The conclusions are included in Sect. 5.

2 MCDM methods

This section presents an overview characterizing the main features of the MCDM methodologies used to construct composite indicators. Multicriteria decision making is a set of methods that can be used to support the process of decision making in a flexible manner when more than one criterion is being considered (Cinelli et al. 2014). Within MCDM approaches, one of the most widespread classifications differentiates between Multi-Objective Decision-Making (MODM), used in problems with an infinite (continuous) or large number of alternatives, and Multi-Attribute Decision-Making (MADM), designed for situations with a limited (discrete) number of alternatives. In our case, for simplicity and clarity, we have decided to classify the MCDM methodologies used to construct composite indicators into five categories. It must be pointed out that we have focused this classification on the technique used for the aggregation of the criteria (indicators), regardless of the procedure considered for weighting these criteria.

  1.

    First, the elementary methods appear, which form the basis of initial MCDA assessments. According to Lai et al. (2008), they can reduce complex problems to a singular basis for selection of a preferred option by simple conditions. The most common elementary methods are the Simple Additive Weighting (SAW) and the Weighted Product (WP). The basic idea underlying these methods is that the global performance of an alternative is computed as the weighted sum (or the weighted product in the WP case) of its values for each criterion. This means a total compensation among the different criteria in the case of the SAW method, and a partial compensation in the case of the WP method. The SAW and WP methods normally require normalising variables before aggregating. In general, standardisation (or z-scores) and min–max normalisation are the most frequently used. According to Díaz-Balteiro et al. (2017), SAW and WP represent to some extent practical simplifications of the Multi-Attribute Utility Theory (MAUT) and the Multi-Attribute Value Theory (MAVT), which are described next.

  2.

    Second, the idea of the value and utility based methods consists of designing a means of associating a real number with each alternative and producing a preference order for the alternatives, based on decision-makers’ value judgements (Belton and Stewart 2002; Azapagic and Perdan 2005). Within this group, the Multi-Attribute Utility Theory (MAUT) involves the determination of partial utility functions to calculate a global utility function U, while the Multi-Attribute Value Theory (MAVT) involves the determination of partial value functions and the establishment of weights for each criterion to calculate a global value function V. In both cases, additive and multiplicative models are used to obtain the global utility/value function, allowing for a partial compensation. A key reference for MAUT and MAVT is Keeney and Raiffa (1976). The Utility Theory Additive (UTA) method was proposed by Jacquet-Lagrèze and Siskos (1982) to assess a set of value or utility functions, assuming the axiomatic basis of MAUT and adopting the preference disaggregation principle. It uses linear programming techniques in order to optimally infer additive value or utility functions. The Simple Multi-Attribute Rating Technique (SMART) is a simple way to implement MAUT by using weighted linear averages, which produce close approximations to utility functions. In this case, the weights are defined using ratio estimation. UTA and SMART allow for a total compensation. MACBETH (Measuring Attractiveness by a Categorical Based Evaluation Technique) was introduced by Bana-e-Costa and Vansnick (1994), and it requires qualitative judgements about value differences to help an individual or group quantify the relative attractiveness of options. It uses a simple question-and-answer protocol that involves only two options for each question, resulting in a pairwise comparison. The difference in attractiveness is measured in terms of seven semantic categories ranging from null to extreme.

    In general, when constructing composite indicators, these methods associate a partial value or utility function with each indicator, according to their values and the decision makers’ judgements. Then, the global function (using the corresponding weights) plays the role of the composite indicator, and the partial utility functions imply, in fact, an implicit normalisation.

  3.

    Third, the outranking relation approach involves methods based on comparisons between pairs of options to determine whether “alternative a is at least as good as alternative b”. Within this family, the two most used methods are ELECTRE (Elimination and Choice Expressing Reality) and PROMETHEE (Preference Ranking Organization Method for Enrichment Evaluations). The former was introduced by Roy (1968, 1991); it selects the best action(s) from a given set of actions, and it is applied to three main problems: choosing, ranking and sorting. Concordance and discordance indices and threshold values are used in this method. The latter, PROMETHEE, was proposed by Brans et al. (1986), and it performs a pairwise comparison of alternatives to rank them with respect to some criteria. Both methods allow for a partial compensation among the criteria. The application of these methods to construct composite indicators implies that the decision-maker must associate the corresponding parameters (thresholds or pseudo-criteria) with each indicator, in addition to the weights. The final value provided by the method is taken as the composite indicator. These methods do not require a normalisation before aggregating variables, since they use the original data for the comparisons. Subsequently, the outranking relation methods convert each indicator to a 0–1 scale.

  4.

    Fourth, the Data Envelopment Analysis (DEA) based methods rely on a well-known non-parametric approach that uses linear programming as an instrumental element, originally for the purpose of evaluating the efficiency of a set of comparable units (Charnes et al. 1978). DEA allows for full compensation among the criteria. A composite indicator can be obtained with these methods in two different ways. The first consists of considering the partial indicators as input or output variables, depending on whether they are “the less, the better” or “the more, the better” type indicators, and then obtaining a composite indicator using the usual DEA procedure. The second approach consists of creating a dummy output (or input) and considering all the partial indicators as inputs (or outputs). This approach is also known as the Benefit of the Doubt (BoD) model, which was originally proposed for evaluating macroeconomic performance (Melyn and Moesen 1991) and was later adapted to index theory (Cherchye and Kuosmanen 2004; Cherchye et al. 2004). Nowadays, it is increasingly recognised as a method for constructing composite indicators. According to Nardo et al. (2005), using this method, the composite indicator ranges between zero (lowest possible performance) and 1 (the benchmark), and it is defined as the ratio of a variable’s actual performance over its benchmark performance, where indicators are often based on the normalised values from linear scaling in the min–max range. Also, this technique allows the analyst to endogenously determine the weighting of the partial indicators. Interesting links between DEA and MCDM methods can be seen in Stewart (1996), Joro et al. (1998) and Cooper (2005).

  5.

    Finally, the distance functions based methods appear. According to Díaz-Balteiro et al. (2017), the basic idea underlying these methods consists of substituting the maximization of a function comprising the preferences of the decision maker, as a utility function, by the minimization of the distance existing between an alternative and a point or points enjoying good preferential properties. The use of these methods to construct composite indicators requires the assessment of the corresponding reference levels by the decision maker, in addition to the weights. In some cases, a prior normalisation is required, while in others the achievement functions produce normalised values. Within this family, we can distinguish:

    • The goal programming method was introduced by Ijiri (1965) and Ignizio (1976), where decision-makers establish goals for each criterion that they consider good. Then, with the help of two deviation variables, measuring under and over achievements, a preferred solution is defined as the one which minimises the deviations from the set goals.

    • In the compromise programming method, the reference point is an ideal or utopian vector defined by the optimal values of the criteria considered. The components of this vector receive the name of “anchor values”. This method was proposed by Yu (1973) and Zeleny (1974).

    • Within the reference point method context, an achievement function measures the deviation between the values of each indicator and their corresponding reference levels. It was originally proposed by Wierzbicki (1980), and later on adapted to build composite indicators by Ruiz et al. (2011) and Cabello et al. (2014b). It allows for different compensation degrees among the criteria depending on the aggregation scenario.

    • The Technique for Order Preferences by Similarity to Ideal Solutions (TOPSIS) considers two reference points, the ideal and the anti-ideal point, in such a way that a preferred solution should have the minimum distance with respect to the ideal and, simultaneously, the maximum distance with respect to the anti-ideal. It was introduced by Hwang and Yoon (1981) and it allows for a full compensation.

    • Finally, the principle of the Grey Relational Analysis (GRA) method, introduced by Deng (1989), is similar to TOPSIS. The grey relation degree is defined to show the closeness between the alternatives. Usually, the ideal solution is defined and the relation degree of each alternative with it is calculated.
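To make the distance-based idea more concrete, the following Python sketch computes a TOPSIS-style composite score from the distances to the ideal and anti-ideal points described above. The text only states the two reference points; the vector normalisation, the Euclidean distance and the closeness ratio used here follow a commonly used textbook formulation and should be read as assumptions of this sketch, as should the function name `topsis` and the small hypothetical data set.

```python
import math

def topsis(matrix, weights, benefit):
    """TOPSIS-style closeness scores (a sketch, not a reference implementation).

    matrix  : list of rows, one per alternative, one column per indicator
    weights : one weight per indicator
    benefit : one bool per indicator; True for "the more, the better"
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Vector normalisation of each column, then weighting (assumed convention).
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*v))
    # Ideal and anti-ideal points, indicator by indicator.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)   # distance to the ideal point
        d_minus = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_minus / (d_plus + d_minus))
    return scores

# Hypothetical data: three alternatives, two "more is better" indicators.
scores = topsis([[1, 1], [2, 2], [3, 3]], [0.5, 0.5], [True, True])
```

In this toy case the third alternative coincides with the ideal point and the first with the anti-ideal point, so their scores are 1 and 0, and the middle alternative lies halfway between them.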

3 A review of MCDM methodologies to build composite indicators

In this section, we carry out a literature review of MCDM methodologies used to aggregate single indicators into composite ones, using the classification of MCDM methodologies proposed in the above section.

  • The elementary methods. Haider et al. (2018) develop a set of suitable sustainability indicators covering all the sustainability aspects of small-sized neighbourhoods from state-of-the-art neighbourhood sustainability assessment tools and expert SAW methods. Zhou et al. (2006) present an objective measure that may be used to compare different aggregating methods (the SAW, the WP and the weighted displaced ideal) for constructing a composite environmental index, based on the concept of “loss of information”. Kropp and Lein (2012) focus on the use of a spatial MCDA as a means of modelling the spatial pattern of urban sustainable development under contrasting objectives, constructing aggregated indexes of sustainability. To do this, they use the SAW method to design the model, given its easy implementation within a Geographic Information System using map algebra operations and its transparency in aiding decision making. Giannetti et al. (2009) analyse a procedure that uses different experts’ opinions in constructing a composite environmental index, based on the Environmental Sustainability Index 2005, which uses the SAW method. Similarly, Arbolino et al. (2018) use the SAW method and the Principal Component Analysis model to measure ecological industrial policies, designing the Industrial Environmental Sustainability Index.

    Kang et al. (2002) adopt the weighted-sum form to construct a Composite Environmental Index, where the weights of environmental problems are obtained through an AHP. Similarly, Torres-Sibille et al. (2009) develop an indicator to assess the magnitude of the objective aesthetic impact on the landscape caused by the installation of the wind farm. Their formula takes the form of a weighted sum, in which the weights have been given by expert judgement and analysed by means of the AHP. Krajnc and Glavic (2005) present a model for designing a composite sustainable development index that depicts performance of companies along all the three dimensions of sustainability using AHP to determine the weights of the indicators. Then, the sustainability sub-indices are combined into the composite sustainable development index using the weighted average. Azevedo et al. (2012) propose an index to assess the agility and leanness of individual companies and the corresponding supply chain. They use the SAW method to aggregate individual indicators and two rounds of the Delphi questionnaire to determine the weights. Zeng et al. (2017) apply the DEA-linked approach by Kao (2010) and a modified SAW approach by Wang and Luo (2010) to construct integrated energy security indicators for the Baltic States.

  • The value and utility based methods. Hajkowicz (2006) develops a water service index for the Great Barrier Reef region in Queensland based on stakeholder consultation combined with MAUT. van Calker et al. (2006) use MAUT to develop an overall sustainability function for Dutch dairy farming systems by aggregating preferences of stakeholders and experts using a goal programming approach. Frank et al. (2016), based on a multi-criteria approach which uses an S-curve scale transformation for standardizing the indicators, and MAUT for consolidating the indicators, propose an assessment framework that consolidates environmental sustainability indicators reported by oil and gas companies into an integrative index. Cracolici and Nijkamp (2008) assess the relative attractiveness of competing tourist destinations on the basis of individual visitors’ perceptions regarding a holiday destination. They use individual survey questionnaires on the tourists’ evaluation of the quality of tourist facilities and attributes in a given area as the basis for constructing an aggregate expression for the relative attractiveness of that area. According to them, this aggregate expression can be interpreted in the spirit of MAUT. Würtenberger et al. (2006) present a coupled Material Flow Analysis–MAUT method to assess the environmental and socio-economic impact of agricultural trade by using the concept of virtual land use.

    Dantsis et al. (2010) use the MAVT to assess and compare the sustainability level of agricultural plant production systems on a regional scale, combining 21 individual indicators to obtain a unique indicator. They test this methodology on two geographical regions in Greece, through an empirical study, utilizing questionnaires completed during interviews with farm managers. In this line, Langhans et al. (2014) consider different aggregation value methods (the weighted arithmetic mean, the weighted geometric mean, the weighted harmonic mean, the minimum aggregation and a mixture of them) to compare hypothetical examples from ten river reaches in Switzerland, illustrating the consequences of the different aggregation methods on ecological quality assessments.

    Carayannis et al. (2016) present an integrated assessment and classification framework for national and regional innovation efficiency in 23 European countries. They propose to use a multiobjective DEA model in order to consider the objectives and constraints of the different stages and levels of the innovation process. Then, UTASTAR is applied in order to study how environmental factors on innovation and entrepreneurship affect the estimated efficiency scores. In the same context, Grigoroudis et al. (2012) present the development of a performance measurement system for public health care organisations, in the context of balanced scorecard methodology, where the UTASTAR method is used in order to aggregate the marginal performance of key performance indicators. Also, Papapostolou et al. (2017) implement the UTASTAR MCDM method in order to evaluate the attractiveness of seven Western Balkan countries in terms of implementing the Directive 2009/28/EC joint project development mechanism, promoting the use of energy from renewable sources.

    Plakas et al. (2016) develop a composite sustainability indicator based on the participatory method called Simple Multiattribute Rating Technique Exploiting Ranks (SMARTER), introduced by Edwards (1977) for comparing wastewater treatment processes.

    Bezama et al. (2007) combine several multicriteria analysis tools in order to develop a balanced scorecard model for an Austrian remediation project: AHP has been applied for the evaluation of weights of the first level of sub-goals, the SMART process supports the weighting of the bottom goals of the balanced scorecard, and the final performance composite measures have been evaluated using MAUT.

    Clivillé et al. (2007) consider MACBETH to quantify the elementary performances and the aggregated one in a coherent way from qualitative knowledge given by decision-makers, and then they extend it to the Choquet integral operators. They apply the methodology to an industrial case study. Similarly, Berrah and Clivillé (2007) propose to build performance measurement systems by linking an overall performance expression to elementary ones, adopting an aggregation methodology based on the Choquet integral operator and the MACBETH framework. Marques et al. (2010) propose a multi-dimensional Project Performance Measurement System focused on decision support in the context of product and service development projects. Then, they use MACBETH as an aggregation tool to analyse the performance measures according to project managers’ own performance interests. Similarly, Lauras et al. (2010) discuss the difficulty of controlling a complex project caused by the great number of performance indicators, adopting MACBETH to aggregate performance expressions. Madeira-Junior et al. (2012) use factor analysis together with MACBETH in the calculation of port index attractiveness, using all the operational indicators contained in the technical report from the Brazilian Port Authority for the period from 2006 to 2008. Lavoie et al. (2016) create an index with a group of experts to assess the ecological value of wetlands in Quebec City using the MACBETH approach. Rodrigues et al. (2017) propose extending the Reasoning Maps approach through a multi-linear evaluation model structure, built with the MACBETH multicriteria method. They apply the method to a real-world evaluation problem, where it was necessary to aggregate several performance indicators into an overall index to evaluate the health of the populations of Portuguese municipalities.

  • The outranking relation approach. Attardi et al. (2018) define and test an ELECTRE III based approach for the construction of the Land-Use Policy Efficiency Index, resulting in non-compensatory composite indicators. Petrovic et al. (2014) use ELECTRE to establish the hierarchical position of benchmarking European Union Member countries against Digital Agenda key performance targets. Also, Yunna et al. (2016) propose an application for the site selection of an off-shore wind power station in China using ELECTRE-III, in order to avoid compensation among criteria, deal with incomplete information, and interact with decision-makers so as to obtain a proper definition of the decision model.

    Antanasijevic et al. (2017) analyse 38 composite indicators over a 10-year time period (2004–2014) in order to determine theme-specific and overall sustainability performance for 30 European countries using PROMETHEE. Hernandez-Perdomo et al. (2017) present a methodological approach for integrating real options analysis into multicriteria analysis in order to evaluate and holistically rank a portfolio of multiple firms’ projects. They divide the decision-making problem into three main areas that consider aspects related to decision makers’ preferences, data analysis, and projects’ rankings based on the use of PROMETHEE.

  • The DEA based methods. Cherchye et al. (2007) propose the BoD methodology to construct “technology creation” composite indicators. Cherchye et al. (2008) use the Technology Achievement Index as an illustration of DEA-based composite indicators. Murias et al. (2008) present a synthetic indicator for quality assessment in the Spanish public university system using DEA. Dobos and Vörösmarty (2014) propose a theoretical approach to investigate the environmental aspects in supplier selection and evaluation problems based on a DEA-like composite indicators model.

    Martí et al. (2017) propose the use of DEA as a tool for MCDM to compute a synthetic index of overall logistics performance and benchmark the logistics performance of the countries against the Logistics Performance Index. In a similar context, Amado et al. (2016) propose a methodological approach based on DEA to enhance the measurement of active ageing in the European Union countries. They use a model with 22 indicators grouped into four domains, and by introducing different types of virtual weight restrictions, they combine normative judgements with an optimisation procedure. Hatefi and Torabi (2010) propose a common weight MCDA-DEA approach to construct composite indicators. In order to validate this approach, they apply their model to construct the Sustainable Energy Index and the Human Development Index.

    Rogge (2018) proposes an extension to the BoD model that derives from Färe and Zelenyuk (2003) to aggregate individual composite indicators into a group composite indicator. To illustrate the procedure, an aggregate composite indicator for human development for the global EU-region is computed using the EU Member States’ HDI sub-indicators. Verbunt and Rogge (2018) build an “indirect” multiplicative BoD index number framework, in which the linear, data-driven BoD model is used to estimate the importance of various sub-indicators within a geometric composite index. Similarly, Van Puyenbroeck and Rogge (2017) adopt a two-step procedure (“indirect approach”) to combine multiplicative aggregation and BoD weighting in the construction of composite indicators. In a first step, information on the importance weights of the sub-indicators in the composite indicator is derived from a linear BoD model. Then, this weight information is used to impute shadow price based budget share expressions in the construction of the composite indicator as a geometric quantity index.

    Kao and Hung (2007) propose a methodology to measure the management performance of the firms of an industry, aggregating the performances of all subjects using the weighted arithmetic sum. Then, they combine it with a mathematical programming problem, proposing a two-stage method to determine a common set of weights associated with the subjects. Zhou et al. (2007) propose two classical DEA-like models to determine the best and worst sets of weights for the underlying sub-indicators and a SAW method for data aggregation, modelling the sustainable energy development of eighteen Asia-Pacific Economic Cooperation economies. Zhou et al. (2010) propose similar DEA-like models but consider the WP method instead of the SAW one for data aggregation, analysing the data aggregation problem in composite indicator construction from the point of view of information loss.

  • The distance functions based methods. Blancas et al. (2010) formulate the Goal Programming Synthetic Indicator, analysing its properties and applying it to the analysis of sustainable tourism. Similarly, Molinos-Senante et al. (2016) discuss a process of indicator aggregation using two approaches based on multi-criteria decision analysis (distance-principal component and goal programming synthetic indicator) to evaluate and compare the sustainability of water companies from a holistic perspective. Then, a synthetic indicator embracing economic, environmental and social performance indicators is computed for a sample of 154 Portuguese water companies. Jadidi et al. (2015) formulate a single product supplier selection problem as a multi-objective optimization model, and then propose a new multi-choice goal programming (MCGP) approach in order to provide the decision makers with more control over their preferences. Voces et al. (2012) establish a ranking of the European countries analysed in terms of sustainability, aggregating several indicators of diverse nature (economic, environmental and social) into a composite or synthetic index using a binary goal programming model. Trenado et al. (2014) formulate a model that combines goal programming with “goal games” against nature in a scenario where social responsibility is defined through the introduction of a battery of sustainability indicators amalgamated into a synthetic index. Carrillo and Jorge (2017) propose the construction of the Spanish Tourism Sustainability Performance composite indicator using the Compromise Programming approach to aggregate the single indicators considered.

    Blancas et al. (2011) focus their study on obtaining sustainable tourism indexes combining Principal Component Analysis (PCA) and the distance to a reference point concept. Moreover, Ruiz et al. (2011) propose the double (reservation–aspiration) reference point approach to develop synthetic sustainability indicators. Later, Cabello et al. (2014b) use the double Reference Point Method to assist individual investors in their investment decisions, providing them with a composite indicator of mutual funds’ environmental responsibility. Navarro et al. (2012) develop a multicriteria analysis, based on the reference point methodology, to assess the carrying capacity of tourist destinations, and this method is then applied to the management and planning of an open tourist resort, based on synthetic indicators applied to two scenarios: weak and strong sustainability. Cabello et al. (2014a) propose a dual reference point method to provide a synthetic measurement of sustainability that serves to analyse overall sustainability, in addition to sustainability in each separate aspect, making it possible to detect whether regions are failing in one particular dimension. The method also incorporates the opinions of various experts by means of a system of preference aggregation. Also, Ruiz et al. (2017) propose the use of a Double Reference Point based methodology to obtain synthetic indicators allowing for different degrees of compensation, comparing them with the system of Ease of Doing Business indicators proposed by the World Bank. They make comparisons taking into account the imprecision, ambiguity and uncertainty of the data by means of the Fuzzy Degree of Similarity between two rankings.

    Gu et al. (2012) provide a case-based approach for the Management System safety assessment decision making of thermal power plants, proposing a method that integrates grey system theory and the Delphi method into case-based reasoning methodologies. To do this, they use an additive weighted aggregation scheme, and they make different proposals for an individual assessment of the indicators using achievement functions.

    Díaz-Balteiro and Romero (2004) propose a sustainability index to be attached to any natural system evaluated according to several indicators of sustainability using ideal and anti-ideal values. Similarly, Garcia et al. (2016) develop an integrated environmental evaluation model of sustainability in business, which calculates a balanced single measure to assess sustainability performance using the distance from the current state of the system to an ideal point that corresponds to the best value for the Ecology and Socio-economics dimensions. Wang et al. (2012) propose a novel method for establishing a mixed model, which adopts TOPSIS to comprehensively and objectively integrate different criteria into a composite indicator. Wang et al. (2017) propose a smart MCDM framework based on TOPSIS to evaluate multiple factors of air pollutants and economic development of Wuhan City in China. According to them, this novel technique can overcome the drawbacks of conventional TOPSIS methods by using Bayesian regularization and the Back-Propagation neural network to optimize the weight training process. Escrig-Olmedo et al. (2017) adopt fuzzy TOPSIS to assess environmental, social and governance performance. First, they elaborate synthetic indicators for each corporate sustainability domain, and second, the investor’s preferences related to environmental, social, governance and financial domains are integrated to design a unique investment decision solution. Chen et al. (2015) use the TOPSIS–Rank Sum Ratio methodology to conduct the road safety risk evaluation process from an overall perspective, based on a composite Road Safety Risk Index. Similarly, Wang et al. (2018) use the rough set-TOPSIS–Rank Sum Ratio methodology and construct a composite coal miners’ livelihood vulnerability index to assess coal workers’ livelihood vulnerability in 33 coal-mining cities. Boggia et al. (2018) present a new model, called GeoUmbriaSUIT, integrating Multicriteria Analysis (TOPSIS) and Geographic Information Systems. They treat the indicators representing each of the three dimensions of sustainability (environmental, economic, social) individually, and they compute three different indices for each geographic unit analyzed: Environmental Index, Economic Index and Social Index. Mi et al. (2015) develop a climate change mitigation index with 15 objective indicators. They apply this index to assess China’s provincial performance in climate protection based on the TOPSIS method. Also, Bao et al. (2012) propose an improved hierarchical fuzzy TOPSIS model to combine the multilayer safety performance indicators into one overall index by incorporating experts’ knowledge. Similarly, Hu et al. (2017) construct a comprehensive evaluation index system suitable for carbon market construction based on a thorough analysis of Beijing’s carbon emission market using the TOPSIS model. Jun et al. (2011) develop a framework to quantify spatial vulnerability for sustainable water resources management, modifying and improving four hydrologic vulnerability indices using TOPSIS.

  • Hybrid approaches. Some papers use different MCDM methods simultaneously. In this line, Zhang et al. (2017) propose a hybrid MCDM approach, where the Decision-Making Trial and Evaluation Laboratory (DEMATEL) method is applied to improve the normalisation process in ANP, and GRA is integrated with TOPSIS to obtain the final rank of each alternative and select the optimal green material for sustainability based on the product’s needs. Additionally, a non-linear programming model with constraints is proposed to obtain the integrated closeness index based on the similarity closeness index from GRA and the distance closeness index from TOPSIS.

    Siskos et al. (2014) evaluate e-government, where the overall evaluation is obtained through an additive value model which is assessed with the involvement of a single decision maker–evaluator and the use of a multicriteria ordinal regression approach. Specifically, they use UTA II, whose interactive application process is divided into two phases (to construct the marginal value functions, they adopt MAUT and MACBETH).

    Morais and Camanho (2011), based on DEA, construct a composite indicator of quality of life and assess local management performance, contextualised by the GDP per capita, to measure the ability of local authorities to promote quality of life given the economic condition of the country. Then, they use goal programming to compare cities’ and countries’ performance. Similarly, Despotis (2005) estimates an ideal value of the composite human development index for each country in the region of Asia and the Pacific by a DEA-like index-maximizing model. Then, he extends his calculations through a goal-programming model to obtain global estimates of human development, based on optimal common weights for the component indicators.

    Rosic et al. (2017) use composite indexes obtained from different models, based on DEA and TOPSIS. Then, based on three parameters (average correlation, average rank variation and average cluster variation), they propose the PROMETHEE-RS method to choose the optimal road safety composite index by comparing 27 police departments in Serbia.

    De-Mare et al. (2015) test the simplified linear aggregative model SMART and the PROMETHEE II model with the aim of verifying their utility in the elaboration of synthetic indexes for the choice or ranking of investments in urban development.

    Kim and Chung (2014) develop an index-based robust decision making framework for integrated water management dealing with water quantity and quality issues in a changing climate, using SAW and TOPSIS. Oltean-Dumbrava et al. (2016) use SAW, PROMETHEE and ELECTRE III to assess the absolute sustainability of the Spanish and Italian Transport Noise Reducing Devices projects.

    Gómez-Limón and Riesgo (2009) apply Principal Components Analysis, AHP and an MCDM technique based on the augmented Tchebycheff distance function to build composite indicators of agricultural sustainability in the Duero River basin in Spain. Li et al. (2017) propose a hybrid approach based on fuzzy AHP and the 2-tuple fuzzy linguistic method in order to evaluate in-flight service quality. The 2-tuple linguistic approach, proposed by Herrera and Martinez (2000), is an effective method for handling uncertain information.
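
Since TOPSIS recurs throughout the distance-based and hybrid papers reviewed above, a minimal sketch of its textbook form may help fix ideas. It assumes vector normalisation, weights supplied by the analyst and a Euclidean distance; the function name `topsis`, the data and the weights are hypothetical, and the reviewed papers may use fuzzy, hierarchical or otherwise modified variants.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness to the ideal solution for each alternative (row).

    matrix  : rows are units, columns are single indicators
    weights : one weight per indicator
    benefit : True if larger values are better, False for cost indicators
    Assumes no all-zero column and at least two distinct rows.
    """
    n_cols = len(matrix[0])
    # Step 1: vector normalisation (divide each column by its Euclidean norm)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    # Step 2: weighted normalised values
    v = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    # Step 3: ideal and anti-ideal points, indicator by indicator
    cols = [[r[j] for r in v] for j in range(n_cols)]
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Step 4: relative closeness = d_minus / (d_plus + d_minus)
    scores = []
    for r in v:
        d_plus = math.dist(r, ideal)   # distance to the ideal point
        d_minus = math.dist(r, anti)   # distance to the anti-ideal point
        scores.append(d_minus / (d_plus + d_minus))
    return scores

# Hypothetical data: three units, one benefit and one cost indicator
data = [[8, 2], [5, 5], [2, 8]]
scores = topsis(data, weights=[0.5, 0.5], benefit=[True, False])
```

In this toy data set the first unit is best on both indicators and the third is worst on both, so they coincide with the ideal and anti-ideal points respectively, while the second unit lies halfway between them.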

4 Discussion

Next, we will discuss some relevant aspects found in our search. Section 4.1 discusses two important aspects related to the construction of composite indicators using MCDM methods: the weighting and the compensation issues. Section 4.2 gives an overview of the categories of MCDM methods used to construct composite indicators, discussing the time-based evolution of published papers, and Sect. 4.3 is devoted to the scopes of application and the journals’ categories of published papers.

4.1 The weighting and the compensation issues

According to Nardo et al. (2005), weighting and aggregation are key steps in constructing composite indicators. Weights can have a significant effect on the overall composite indicator and the results obtained. When building composite indicators, weighting methods can be grouped into three main categories: equal weighting, where all the indicators are given the same weight; data-based methods, which derive weights from the characteristics of the data; and participatory based methods, where the subjective opinions of experts and/or stakeholders are taken into account.

In our literature review, concerning weighting methods used to construct composite indicators in the field of MCDM methods, we found that data-based methods and participatory based methods are the most frequently used approaches.

Within data-based methods, a weighting technique based on DEA is the most widely used approach (Despotis 2005; Morais and Camanho 2011; Hatefi and Torabi 2010; Dobos and Vörösmarty 2014; Carayannis et al. 2016; Amado et al. 2016; Martí et al. 2017; Carrillo and Jorge 2017; Van Puyenbroeck and Rogge 2017; Verbunt and Rogge 2018; Rosic et al. 2017; Rogge 2018). Besides, the entropy method (Chen et al. 2015; Rosic et al. 2017; Zinatizadeh et al. 2017; Wang et al. 2017), the principal component analysis (Gómez-Limón and Riesgo 2009; Blancas et al. 2011; Arbolino et al. 2018), the distance principal component (Molinos-Senante et al. 2014) and the programming model (Kao and Hung 2008; Wang et al. 2012) are also used.
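
To illustrate how a data-based method derives weights from the data alone, the following minimal sketch implements the common Shannon entropy weighting scheme. It assumes strictly non-negative indicator values and at least one indicator with some dispersion; the function name and the data are hypothetical, and the cited papers may apply a different normalisation before computing entropy.

```python
import math

def entropy_weights(matrix):
    """Entropy-based weights: indicators with more dispersion get more weight."""
    m = len(matrix)          # number of units (rows)
    n = len(matrix[0])       # number of indicators (columns)
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]
        # Shannon entropy scaled to [0, 1]; terms with a zero share contribute 0
        e = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        # Degree of divergence; clamp tiny negative rounding errors to zero
        divergences.append(max(0.0, 1.0 - e))
    total_div = sum(divergences)
    return [d / total_div for d in divergences]

# Hypothetical data: the second indicator varies much more across units
data = [[4, 1], [5, 5], [6, 9]]
weights = entropy_weights(data)
```

Here the second indicator discriminates more strongly between the units, so it receives the larger weight; an indicator taking the same value for every unit would receive a weight of zero.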

Concerning participatory based methods, weighting based on the opinions of experts and decision makers is widely used (Díaz-Balteiro and Romero 2004; Kao and Hung 2008; Giannetti et al. 2009; Jun et al. 2011; Navarro et al. 2012; Bao et al. 2012; Cabello et al. 2014a; Kim and Chung 2014; Oltean-Dumbrava et al. 2016; Li et al. 2017). Also, AHP/ANP (Krajnc and Glavic 2005; Torres-Sibille et al. 2009; Gómez-Limón and Riesgo 2009; Grigoroudis et al. 2012; Zhang et al. 2017), the Delphi technique (Azevedo et al. 2012; Gu et al. 2012), SMARTER (Plakas et al. 2016) and MACBETH (Cabello et al. 2014b; Marques et al. 2010; Lauras et al. 2010; Madeira-Junior et al. 2012; Lavoie et al. 2016) are adopted.

Finally, the equal weighting approach is also widely used (Zhou et al. 2006, 2010; Hajkowicz 2006; van Calker et al. 2006; Würtenberger et al. 2006; Blancas et al. 2010; Petrovic et al. 2014; Langhans et al. 2014; De-Mare et al. 2015; Oltean-Dumbrava et al. 2016; Antanasijevic et al. 2017; Ruiz et al. 2017; Rosic et al. 2017; Haider et al. 2018).

On the other hand, when aggregating single indicators into a composite one, an important aspect to emphasize is the degree of compensation among the different criteria. According to Asadzadeh et al. (2017), the techniques most widely applied in the literature on composite indicators can be classified into compensatory and non-compensatory techniques. The former imply the existence of trade-offs among the different criteria and adopt additive aggregation, or simple summing-up techniques, for constructing composite indicators. In contrast, the latter comprise methods that apply geometric aggregation techniques and non-compensatory multicriteria methods for constructing composite indicators.

In the field of MCDM methods used to construct composite indicators, some techniques allow for full compensation among the criteria, such as the SAW, UTA, SMART, DEA and TOPSIS methods. Other techniques limit the degree of compensation, such as MAUT, MAVT or the WP method, which adopt multiplicative instead of additive functions and only allow compensation between indicators within certain limits. The outranking methods (ELECTRE and PROMETHEE), according to Attardi et al. (2018), limit or completely prevent compensation thanks to the presence of veto thresholds and the use of only ordinal comparisons among alternatives (see Attardi et al. (2018), who adopt ELECTRE III to construct non-compensatory composite indicators, and Yunna et al. (2016), who adopt the same method in order to avoid compensation among criteria).
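
The difference between additive and multiplicative aggregation can be seen in a small numerical sketch. It assumes indicators already normalised to the interval (0, 1] and equal weights; the two units and their values are hypothetical.

```python
import math

def saw(values, weights):
    """Simple additive weighting: a weighted arithmetic mean."""
    return sum(w * x for w, x in zip(weights, values))

def weighted_product(values, weights):
    """Weighted product: a weighted geometric mean (values must be positive)."""
    return math.prod(x ** w for w, x in zip(weights, values))

weights = [0.5, 0.5]
balanced = [0.5, 0.5]      # hypothetical unit, average on both indicators
unbalanced = [0.9, 0.1]    # hypothetical unit, strong on one, weak on the other

saw_balanced = saw(balanced, weights)
saw_unbalanced = saw(unbalanced, weights)
wp_balanced = weighted_product(balanced, weights)
wp_unbalanced = weighted_product(unbalanced, weights)
```

Under additive aggregation the two units tie, because the high value on the first indicator fully offsets the low value on the second. Under the weighted product the unbalanced unit scores lower, which is the sense in which multiplicative functions restrict compensation.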

According to Gan et al. (2017), the non-compensatory multicriteria approaches are based on decision maker preferences and they are centred around the fact that a general objective of most composite indicators is to create rankings. Therefore, the core of this approach is to construct a ranking algorithm that is more consistent than the linear aggregation rule (Munda 2005), where no compensation is allowed among the criteria, and thus, all the weights reflect the relative importance of each indicator instead of a trade-off ratio.

Moreover, some techniques allow for different compensation degrees depending on the aggregation scenario chosen. A clear example of this is the double reference point method proposed by Ruiz et al. (2011) and Cabello et al. (2014b). This method allows for full compensation (when the weak aggregation scenario is adopted), zero compensation (when the strong aggregation scenario is adopted) or partial compensation (when the mixed aggregation scenario is adopted developing composite indicators for different permitted levels of aggregation).
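
The three aggregation scenarios can be sketched as follows. This is a simplified illustration and not the formulation of Ruiz et al. (2011): it assumes that each indicator has already been converted into an achievement value on a common scale, takes the weak indicator as a weighted sum, the strong indicator as the minimum, and the mixed indicator as a convex combination controlled by a hypothetical parameter `lam`. How the original method incorporates weights into the strong scenario is not reproduced here.

```python
def weak_indicator(achievements, weights):
    """Full compensation: weighted sum of the achievement values."""
    return sum(w * s for w, s in zip(weights, achievements))

def strong_indicator(achievements):
    """No compensation: the worst achievement value determines the score."""
    return min(achievements)

def mixed_indicator(achievements, weights, lam):
    """Partial compensation: lam = 1 gives the weak scenario, lam = 0 the strong one."""
    weak = weak_indicator(achievements, weights)
    strong = strong_indicator(achievements)
    return lam * weak + (1 - lam) * strong

# Hypothetical achievement values for one unit on three indicators
achievements = [2.5, 0.4, 1.8]
weights = [1 / 3, 1 / 3, 1 / 3]

weak = weak_indicator(achievements, weights)
strong = strong_indicator(achievements)
mixed = mixed_indicator(achievements, weights, lam=0.5)
```

In this example the poor performance on the second indicator is averaged away in the weak scenario, dominates completely in the strong scenario, and is only partly offset in the mixed scenario.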

4.2 Categories of MCDM methods and time-based evolution

In general, a clear tendency towards an increasing number of papers that use MCDM methods to build composite indicators is observed since 2014 (see Fig. 1). More specifically, our review has shown that, since 2012, there has been an increasing tendency to adopt the distance functions based methods (within this category, 83% of the papers reviewed are published after 2012).

Fig. 1 Time-based evolution of published papers using MCDM methods to construct composite indicators

Our review of papers that build composite indicators using MCDM methods shows that the largest share of papers adopt the distance functions based methods (29%), as can be seen in Fig. 2. Within this category, TOPSIS (50%) and the reference point method (29%) are the most widely used MCDM methods. The results in Fig. 2 also indicate that 23% of the papers published adopt the value and utility based methods, while the remaining categories of MCDM methods have been used in roughly the same proportion to construct composite indicators (the elementary methods (12%), the DEA based methods (16%)), with the exception of the outranking relation approach (6%), which is the least used category.

Fig. 2 Categories of MCDM methods used to construct composite indicators

Furthermore, it can also be observed in Fig. 2 that the hybrid MCDM approaches account for 14% of the total. This category deserves special attention, since our review confirms a growing number of papers in recent years that use hybrid approaches to construct composite indicators (75% of the papers reviewed within this category were published from 2014 onwards).

4.3 Scopes of application and journals’ categories

As mentioned by Zhou et al. (2010) and Paruolo et al. (2013), composite indicators have been widely advocated and increasingly accepted as a useful tool for performance comparisons, public communication and decision support in a wide spectrum of fields. In this line, MCDM methods are highly suitable in multidimensional frameworks when aggregating single indicators into a composite one. As shown in Fig. 3, MCDM methods to construct composite indicators have been applied in a wide variety of fields: sustainability (29%), environment (12%), business (8%), energy (6%), tourism (5%), human development (5%), water service (4%) and investment (4%). They have also been applied to assess other fields, such as air pollution (Wang et al. 2017), road safety (Chen et al. 2015; Rosic et al. 2017), coal-mining (Wang et al. 2018), logistics (Martí et al. 2017), universities (Murias et al. 2008), health (Grigoroudis et al. 2012; Rodrigues et al. 2017), climate change (Mi et al. 2015), etc.

Fig. 3 Scopes of application of MCDM methods used to construct composite indicators

The large scope of applications of MCDM methods to construct composite indicators has led to publications in 37 different journals indexed in JCR since 2002. As shown in Fig. 4, most of the papers reviewed have been published in journals belonging to the category of “Environmental Sciences” (48%), which seems logical since sustainability and environment are the most assessed fields using MCDM methods to construct composite indicators. Within this category, Ecological Economics (26%), Journal of Cleaner Production (21%), Ecological Indicators (10%), Journal of Environmental Management (8%), Sustainability (8%) and Science of the Total Environment (8%) are the most frequent journals. Other papers have chosen journals focusing on the category of “Operations Research and Management Science” (27%). Within this category, the European Journal of Operational Research is the most widely used one (37%).

Fig. 4 Journals’ categories of published papers using MCDM methods to construct composite indicators

Finally, it can be said that, in general, the large scope of applications highlights once again the multidisciplinary nature of the MCDM methods and, in particular, their ability to construct composite indicators for different application fields.

5 Conclusions

Composite indicators are becoming an increasingly important way of benchmarking countries’, regions’ or institutions’ performance. In general, their construction involves making choices and combining criteria of different natures. In this line, many works have constructed composite indicators by drawing on MCDM methods. In this paper, we have conducted a literature review of papers published after 2002 in leading international journals indexed in a recognised database (JCR), in order to identify the different MCDM methods used for aggregating single indicators into composite ones.

MCDM methods have been widely used to construct composite indicators. In this paper, we have classified them into five categories: the elementary methods, the value and utility based methods, the outranking relation approach, the DEA based methods and the distance functions based methods. We found that the largest share of the papers published adopt the distance functions based methods to construct composite indicators, with an increasing tendency since 2012. Furthermore, some papers use MCDM methods from different categories simultaneously (hybrid approaches). Concerning this category, our review confirms a growing number of papers in recent years, especially from 2014 onwards.

In general, our review has shown a clear tendency towards an increasing number of papers that use MCDM methods to construct composite indicators since 2014. Furthermore, we found that these methods have been applied in a wide variety of fields, especially in sustainability and environment. Papers have been published in many different journals indexed in JCR since 2002, most of them in the category of “Environmental Sciences”.