1 Introduction

Group decision making (GDM) is defined as a situation in which several members or experts are involved. These group members have their own backgrounds and motivations, all face a common problem, and are all attempting to reach a collective decision (Bilbao-Terol et al. 2014). As experts may have a different view of a problem because of differing interests and experience, it is normal that conflicts and disagreements occur. The goal of a GDM problem is to reconcile the different points of view expressed by the individual experts to find an alternative (or set of alternatives) which is most acceptable to the group as a whole (Cabrerizo et al. 2013; Spyridakos and Yannacopoulos 2014).

The evaluation and assessment of lean performance is just one problem that can be solved using GDM, yet such a problem has not yet been addressed sufficiently in research. In the last few decades, lean management has attracted a great deal of attention within academic and practitioner literature (Cil and Turkan 2013; Khanchanapong et al. 2014; Moyano-Fuentes and Sacristán-Díaz 2012). Many manufacturing companies with mass production strategies have come to realize the importance of adopting lean production strategies. Lean production as a philosophy has been adopted by companies in a variety of economic sectors to continually improve operations (Powell et al. 2013; Womack and Jones 1996). Lean practices are concerned with manufacturing tools and techniques and include such practices as total quality management (TQM), just-in-time (JIT), total productive maintenance (TPM), Kaizen, Kanban, single minute exchange of dies (SMED) and value stream mapping (VSM) (Forno et al. 2014; Modrak and Seman 2014; Vinodh and Chintha 2011). The relationship between the implementation of these tools and improved performance is well established in the literature (Hajmohammad et al. 2013; Khanchanapong et al. 2014; Womack and Jones 1996).

Lean practices are focused on the elimination of waste (such as the seven forms of production waste in the manufacturing process), thereby reducing costs and increasing productivity. A poor understanding of the main attributes of leanness, lean performance and its measurement contributes to the failure of lean practices (Anvari et al. 2013). One problem for these companies and organizations is therefore the development of methods which could assist in leanness quantification and lean practice performance evaluation. Only after lean practices have been assessed can the decision makers take appropriate actions to improve the identified areas.

Assessment of lean practices can be regarded as a complex multiple attribute group decision making (MAGDM) problem. In the current literature, which is reviewed in Sect. 2, there are few papers using a GDM methodology. For instance, since the logistics distribution centers of a tobacco commercial industry involve several departments such as storage, sorting and delivery, it is better to include several experts both from the internal and external environment to execute the evaluation task. During the assessment process, it is often difficult for the experts to provide crisp values for the attribute ratings. The subjectivity and vagueness inherent in the experts’ preferences are better addressed with linguistic variables. One limitation of some previous methods for evaluating lean practices is that they are not suited to handling linguistic variables. In Vinodh and Chintha (2011), the authors considered the leanness assessment in a GDM setting. However, they did not consider consensus among the experts. Since various departments are involved in the decision making process, a consensus process is needed to reconcile these different views and interests.

Summarizing the motivations above, a MAGDM framework is desirable to enhance current approaches for the lean practices evaluation. The aim of this paper is to provide such a MAGDM framework to help the decision makers in achieving the assessment. The remainder of this paper is organized as follows. Section 2 briefly reviews the related literature. Section 3 introduces the 2-tuple fuzzy linguistic approach. Section 4 describes the consensus reaching process in detail, which is then incorporated into the MAGDM framework. Section 5 presents the extended entropy method and the general MAGDM framework to solve the lean practices evaluation problem. Section 6 illustrates the proposed MAGDM framework on a lean practices evaluation problem in a commercial tobacco company in China. Finally, Sect. 7 concludes the paper and gives some directions for future research.

2 Literature review

In this section, we briefly review the MAGDM methodology and lean performance assessment.

2.1 MAGDM methodology

MAGDM problems address decision situations in which a group of experts express preferences about multiple attributes and attempt to find a common solution. In these problems, quantitative aspects are seen as objective information and can be assessed using real data with the corresponding attribute values taking precise numerical values. However, qualitative aspects usually include the subjective judgments of the experts, which cannot be expressed precisely in a quantitative form and can only be stated in linguistic terms (that is, using linguistic variables). For example, if estimating the relevance of documents in information retrieval systems, terms such as “relevant” and “very relevant” may be used (Herrera-Viedma and López-Herrera 2007). To evaluate the Kansei attribute fun, terms such as “solemn”, “fairly” and “funny” can be used (Yan et al. 2012). In leanness assessments, terms like “very poor”, “good”, and “excellent” are widely used (Vinodh and Vimal 2012). Linguistic variables are very useful in situations where the decision making problems are too complex or ill-defined to be adequately described using conventional quantitative expressions (Mendel and Wu 2010; Zadeh 1975). Using linguistic information instead of numerical information has gained significant attention (Dong et al. 2008; Massanet et al. 2014; Parreiras et al. 2010; Wu and Mendel 2010).

There is a series of steps involved in solving a MAGDM problem: identifying the problem, constructing the preferences, evaluating the alternatives, and determining the best alternatives (Pedrycz et al. 2011). From a normative and prescriptive decision analysis view, after the experts express their preferences, they apply two processes before a final solution is obtained: a consensus process and a selection process (Herrera-Viedma et al. 2005). GDM differs from individual decision making in many respects. One clearly observable difference is that a diversity of opinions exists in groups (Cai et al. 2012). Therefore, to reach consensus, negotiation and conflict management procedures are required. This process can be viewed as a dynamic and iterative group discussion process, in which the experts agree to change their preferences following advice given by a moderator. The moderator, who is in charge of supervising and moving the consensus process towards success, knows the status of the agreement at each step in the consensus process from the computation of the various consensus measures (Cabrerizo et al. 2010). A number of theoretical consensus models have been developed for various preference structures to conduct the consensus reaching process (see Aguarón et al. 2014; Alonso et al. 2013; Dong et al. 2010; Fu and Yang 2012; Palomares et al. 2013, 2014, 2014; Pérez et al. 2010; Xu and Wu 2013; among others).

Previous studies have done excellent work on consensus modeling. However, few papers have discussed the preference structures in a MAGDM setting (Parreiras et al. 2010, 2012; Roselló et al. 2014; Xu 2009; Xu and Wu 2011; Xu et al. 2014). Fu and Yang (2012) suggested a MAGDM group consensus model based on an evidential reasoning approach. Parreiras et al. (2010) proposed a flexible MAGDM consensus scheme under linguistic assessment. To maximize the soft consensus index, an optimization procedure that searches for the weight of each expert’s opinion was conducted. Parreiras et al. (2012) further studied three consensus schemes based on fuzzy models. Roselló et al. (2014) considered group consensus in a multi-granular linguistic environment. Xu (2009) investigated a MAGDM consensus problem in numerical settings, and developed a straightforward algorithm to reach a group consensus. Xu and Wu (2011) presented a discrete consensus support model to deal with MAGDM in numerical settings. Xu et al. (2014) proposed a consensus process under an uncertain linguistic setting. However, some of these methods can only be used in crisp cases. Although some of the papers have considered a linguistic setting, the consensus methods considered are relatively complex and lack easy feedback strategies. Different consensus measures and different feedback strategies may lead to different consensus processes and properties. As pointed out in Herrera et al. (2009), the search for a simple yet reasonable GDM process is ongoing. In this paper, a simple and straightforward continuous-type consensus process based on the 2-tuple linguistic representation model is proposed. The proposed consensus process is an essential part of the proposed MAGDM framework.

2.2 Lean performance assessment

It has been argued that manufacturing technologies and lean practices are distinct from each other. The former is more technical (hard) and refers to certain types of technologies such as hardware and computer programs, whereas the latter is more concerned with managerial practices, organizational infrastructure, and the behavioral (soft) aspects of the firms (Khanchanapong et al. 2014). Lean performance assessment is utilized to determine the effect of lean practices, and consists of a leanness measurement and an evaluation of the leanness level in a given organization. To conduct such assessments, organizations need to identify their goals and familiarize themselves with the criteria needed to reach those objectives. Then, they are able to make quality and quantity improvements in their existing procedures and develop criteria to reach the level needed to accomplish their objectives (Seyedhosseini et al. 2011).

There have been some studies which have significantly contributed to leanness measurements and assessment models. Doolen and Hacker (2005) reviewed the lean assessment tools and developed a survey instrument to assess the implementation of lean practices within an organization. Bayou and Korvin (2008) presented a fuzzy logic approach for measuring leanness and compared the production leanness of Ford Motor Company and General Motors Company. Vinodh and Chintha (2011) proposed a multi-grade fuzzy approach for assessing leanness, in which a leanness index is computed and divided into five grades. Vinodh and Vimal (2012) presented a conceptual model with 30 criteria for leanness assessment. Anvari et al. (2013) identified four lean attributes including lead time, defects, cost, and value in their study. By using fuzzy membership functions, the weight of lean attributes was obtained. Azevedo et al. (2012) proposed an Agilean index to assess the agility and leanness of the automotive supply chain. Shah and Ward (2007) developed 10 underlying lean production factors, in which there were three measures for supplier involvement, one measure for customer involvement, and the remaining six measures addressed the firm’s internal issues. Seyedhosseini et al. (2011) extracted 52 criteria from the various lean objectives outlined in previous research. In Khanchanapong et al.’s (2014) empirical study, lean practices were measured using five dimensions: production flow management, customer focus, process management, workforce management, and supplier management. Cil and Turkan (2013) identified 26 different lean practices and utilized an analytic network process (ANP) to determine the relative importance weights for each lean component.

The literature review indicates that many studies focus on leanness measurement. These studies differ greatly in the number and dimension of the lean criteria. However, only a few papers have dealt with the leanness assessment problem using a GDM approach. In Anvari et al. (2013) and Vinodh and Chintha (2011), although the authors considered a GDM setting, they did not use the linguistic approach and did not consider the consensus process. As mentioned earlier, since various departments are involved in the decision making process, a consensus process is needed to reconcile these different views and interests. In the following, a MAGDM framework that integrates a consensus process and an entropy method under a linguistic setting is presented to facilitate the lean practices evaluation process.

3 The 2-tuple fuzzy linguistic model

The fuzzy linguistic approach represents the qualitative aspects as linguistic values using linguistic variables (Zadeh 1975). The main purpose for using linguistic variables is that linguistic characterizations are closer to the way people express their ideas and make judgments.

Suppose that \(S=\{s_i \vert i =0, \ldots ,g\}=\{s_0, s_1, s_2, \ldots , s_g\}\) is the linguistic term set accompanying a pre-ordered structure such that \(s_{i_1}<s_{i_2}\) iff \(i_1<i_2\). Here, \(S\) is a definite and totally ordered discrete term set with an odd cardinality value, such as 7 and 9, where \(s_\alpha \) represents a possible value for a linguistic variable. Linguistic variable semantics are usually represented using fuzzy numbers. For example, the following semantics can be assigned to a set of seven terms using triangular fuzzy numbers (Herrera-Viedma et al. 2005) (see Fig. 1):

$$\begin{aligned} S&= \{N=\text {None}=(0,0,0.17),~{ VL }=\text {Very Low}=(0,0.17,0.33),\\&L=\text {Low}=(0.17,0.33,0.5),~M=\text {Medium}=(0.33,0.5,0.67),\\&H=\text {High}=(0.5,0.67,0.83),~{ VH }=\text {Very High}=(0.67,0.83,1),\\&P=\text {Perfect}=(0.83,1,1)\}. \end{aligned}$$
Fig. 1 Set of seven linguistic terms with their semantics
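To make the triangular semantics above concrete, the following is a minimal Python sketch of a membership function for a triangular fuzzy number. It assumes the usual reading of a triple \((a,b,c)\) as left foot, peak and right foot; the names `triangular_membership` and `TERMS` are hypothetical and introduced only for illustration.

```python
def triangular_membership(x, a, b, c):
    """Membership degree of x in the triangular fuzzy number (a, b, c).

    Assumes a <= b <= c with b the peak. When a == b or b == c the
    triangle is one-sided, as for the terms None and Perfect.
    """
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Semantics of the seven-term set listed in the text.
TERMS = {
    "N": (0.0, 0.0, 0.17),
    "VL": (0.0, 0.17, 0.33),
    "L": (0.17, 0.33, 0.5),
    "M": (0.33, 0.5, 0.67),
    "H": (0.5, 0.67, 0.83),
    "VH": (0.67, 0.83, 1.0),
    "P": (0.83, 1.0, 1.0),
}

# Degree to which the value 0.415 belongs to "Medium" (halfway up the rising edge).
degree_medium = triangular_membership(0.415, *TERMS["M"])
```

The early return for `x == b` avoids a division by zero for the two one-sided terms.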

The 2-tuple fuzzy linguistic representation model was introduced to conduct precise processes for computing with words (CWW) when the linguistic term sets are symmetrically and uniformly distributed, and to improve several aspects of the ordinal fuzzy linguistic approach (Herrera and Martínez 2000). This model has been widely used (Dong et al. 2009, 2013).

Definition 1

(Herrera and Martínez 2000) Let \(\beta \) be the result of an aggregation of the position indices of a set of labels assessed in a linguistic term set \(S=\{s_0,s_1,\ldots ,s_{g-1},s_g\}\), where \(g+1\) stands for the cardinality of \(S\), i.e., the result of a symbolic aggregation operation. Let \(i=round(\beta )\) and \(\alpha =\beta -i\) be two values such that \(i\in [0, g]\) and \(\alpha \in [-0.5,0.5)\). Then, \(\alpha \) is called a symbolic translation, with \(round\) being the usual round operation.

This model defines a set of transformation functions to manage the linguistic information expressed by the linguistic 2-tuples.

Definition 2

(Herrera and Martínez 2000) Let \(S\) be a linguistic term set and \(\beta \in [0, g]\) a value representing the result of a symbolic aggregation operation, then the 2-tuple that expresses the equivalent information to \(\beta \) is obtained with the following transformation:

$$\begin{aligned}&\triangle : ~[0, g]\rightarrow S\times [-0.5,0.5),\\&\triangle (\beta )=(s_i,\alpha )\;\;\text {with} \; i=round(\beta ) \;\; \text {and} \;\; \alpha =\beta -i, \end{aligned}$$

where \(round(\cdot )\) is the usual round operation, \(s_i\) has the closest index label to \(\beta \), and \(\alpha \) is the value of the symbolic translation. In addition, we have

$$\begin{aligned}&\triangle ^{-1}: ~S\times [-0.5,0.5)\rightarrow [0, g],\\&\triangle ^{-1}(s_i,\alpha )=i+\alpha =\beta . \end{aligned}$$
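As an illustration only, the two transformations in Definition 2 might be sketched in Python as follows. The sketch assumes that a 2-tuple \((s_i,\alpha )\) is stored as the pair `(i, alpha)`, and that the "usual round operation" rounds halves upwards so that \(\alpha \in [-0.5,0.5)\); Python's built-in `round` rounds halves to the nearest even integer, so `math.floor(beta + 0.5)` is used instead. The function names are hypothetical.

```python
import math


def delta(beta, g):
    """Map beta in [0, g] to the 2-tuple (i, alpha) with i = round(beta)."""
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    i = math.floor(beta + 0.5)   # round half up, so alpha stays in [-0.5, 0.5)
    return i, beta - i


def delta_inv(two_tuple):
    """Map the 2-tuple (i, alpha) back to beta = i + alpha."""
    i, alpha = two_tuple
    return i + alpha


g = 6                          # seven labels s_0, ..., s_6
i, alpha = delta(3.8, g)       # closest label is s_4, translation about -0.2
beta_back = delta_inv((i, alpha))
```

The round trip through `delta` and `delta_inv` returns the original value up to floating-point error, which is the sense in which no information is lost.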

Roughly speaking, the above linguistic representation model defines a function between the linguistic 2-tuples and the numerical values. The conversion of a linguistic term into a linguistic 2-tuple consists of adding a value 0 as a symbolic translation: \(s_i\in S\Longrightarrow (s_i,0)\). Herrera et al. (2000, 2009) stated that the linguistic domain in the 2-tuple model is managed in the same way as a continuous domain. The 2-tuples linguistic computational model has different techniques to manage the linguistic information (Herrera and Martínez 2000).

A 2-tuple comparison operator: The comparison of linguistic information represented by the 2-tuples is carried out according to an ordinary lexicographic order. Let \((s_k, \alpha _1)\) and \((s_l, \alpha _2)\) be two 2-tuples, with each one representing a counting of the information:

(a) if \(k<l\), then \((s_k, \alpha _1)\) is smaller than \((s_l,\alpha _2)\).

(b) if \(k=l\), then

  (1) if \(\alpha _1=\alpha _2\), then \((s_k, \alpha _1)\), \((s_l,\alpha _2)\) represent the same information.

  (2) if \(\alpha _1<\alpha _2\), then \((s_k, \alpha _1)\) is smaller than \((s_l,\alpha _2)\).

  (3) if \(\alpha _1>\alpha _2\), then \((s_k, \alpha _1)\) is bigger than \((s_l,\alpha _2)\).

A 2-tuple negation operator: This is defined as

$$\begin{aligned} Neg(s_i,\alpha )=\triangle (g-\triangle ^{-1}(s_i,\alpha )). \end{aligned}$$

2-tuple aggregation operators: Using the function \(\triangle \) and \(\triangle ^{-1}\), any aggregation operator can be easily extended to deal with the linguistic 2-tuples, such as an ordered weighted average operator, or a weighted average operator.
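The comparison and negation operators above can be sketched in Python under the assumption that a 2-tuple \((s_i,\alpha )\) is stored as the pair `(i, alpha)`. With that representation, Python's built-in tuple comparison is already lexicographic, so no dedicated comparison function is needed. The helper names are hypothetical.

```python
import math


def delta(beta, g):
    """Map beta in [0, g] to (i, alpha), rounding halves upwards."""
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    i = math.floor(beta + 0.5)
    return i, beta - i


def delta_inv(two_tuple):
    """Map (i, alpha) back to beta = i + alpha."""
    return two_tuple[0] + two_tuple[1]


def neg(two_tuple, g):
    """Negation operator: Neg(s_i, alpha) = delta(g - delta_inv(s_i, alpha))."""
    return delta(g - delta_inv(two_tuple), g)


g = 6
a = (4, -0.2)
b = (4, 0.1)
c = (3, 0.4)

# Tuples compare index first and symbolic translation second,
# which is the lexicographic order of cases (a) and (b).
c_smaller_than_a = c < a
a_smaller_than_b = a < b

negated = neg(a, g)            # 6 - 3.8 = 2.2, i.e. label s_2 with translation about 0.2
```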

In the following, the range for \(\triangle \) is denoted as \(\overline{S}\).

Definition 3

Let \(\{a_1,a_2,\ldots ,a_n \}\) where \(a_i\in \overline{S}\) be a set of variables to be aggregated. Let \(w=\{w_1,w_2,\ldots ,w_n \}\) be their associated weights where \(w_i\ge 0,\sum \nolimits _{i=1}^n{w_i}=1\). The linguistic weighted arithmetic averaging (LWAA) operator based on the 2-tuples is

$$\begin{aligned} { LWAA }_{2-tuple}(a_1,a_2,\ldots ,a_n)= \triangle \left( \sum \limits _{i=1}^n{\triangle ^{-1}(a_i)\cdot w_i}\right) . \end{aligned}$$
(1)

In Herrera and Martínez (2000), the \({ LWAA }_{2-tuple}\) operator is called the 2-tuple weighted average operator. For notation simplicity, in the sequel, \({ LWAA }_{2-tuple}\) is denoted by \({ LWAA }\).
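A minimal Python sketch of Eq. (1) is given below. It assumes that 2-tuples are stored as pairs `(i, alpha)` and that halves are rounded upwards; the function names and the sample labels and weights are hypothetical.

```python
import math


def delta(beta, g):
    """Map beta in [0, g] to (i, alpha), rounding halves upwards."""
    i = math.floor(beta + 0.5)
    return i, beta - i


def delta_inv(two_tuple):
    """Map (i, alpha) back to beta = i + alpha."""
    return two_tuple[0] + two_tuple[1]


def lwaa(two_tuples, weights, g):
    """Linguistic weighted arithmetic average of 2-tuples, Eq. (1)."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    beta = sum(delta_inv(t) * w for t, w in zip(two_tuples, weights))
    beta = min(max(beta, 0.0), float(g))   # guard against floating-point drift
    return delta(beta, g)


g = 6
labels = [(2, 0.0), (4, 0.0), (5, 0.0)]    # s_2, s_4, s_5
weights = [0.2, 0.3, 0.5]

# 0.2*2 + 0.3*4 + 0.5*5 = 4.1, i.e. label s_4 with translation about 0.1
result = lwaa(labels, weights, g)
```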

Based on the previous definitions, the multiplication of a 2-tuple \(b_\alpha =(s_\alpha ,x_\alpha )\) with a number \(\mu \) can be defined by

$$\begin{aligned} \mu b_\alpha =\triangle (\mu \triangle ^{-1}(b_\alpha )). \end{aligned}$$

Consider any two 2-tuples \(b_\alpha =(s_\alpha ,x_\alpha )\) and \(b_\beta =(s_\beta ,x_\beta )\), and let \(\mu ,\mu _1, \mu _2\in [0,1]\). The addition of two 2-tuples can be defined by

$$\begin{aligned} b_\alpha \oplus b_\beta =\triangle (\triangle ^{-1}(b_\alpha )+\triangle ^{-1}(b_\beta )). \end{aligned}$$

Further, we have

$$\begin{aligned} \mu _1 b_\alpha \oplus \mu _2 b_\beta =\triangle (\mu _1\triangle ^{-1}(b_\alpha )+\mu _2\triangle ^{-1}(b_\beta )). \end{aligned}$$
(2)

In this way, the result of the addition and scalar multiplication on the 2-tuples becomes a 2-tuple.
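As a worked illustration of Eq. (2), with values chosen here purely for the example, take \(g=6\), \(b_\alpha =(s_2,0.3)\), \(b_\beta =(s_5,-0.1)\), \(\mu _1=0.4\) and \(\mu _2=0.6\). Then

$$\begin{aligned} \mu _1 b_\alpha \oplus \mu _2 b_\beta&=\triangle (0.4\times 2.3+0.6\times 4.9)\\&=\triangle (0.92+2.94)=\triangle (3.86)=(s_4,-0.14). \end{aligned}$$

Because \(\mu _1+\mu _2=1\) in this example, the argument of \(\triangle \) is a convex combination of two values in \([0,g]\) and therefore stays in \([0,g]\).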

4 Consensus reaching model

The consensus process is an essential part of GDM; it is used to obtain the maximum degree of agreement between experts on the solution set of alternatives. Because few papers consider the consensus reaching process in the evaluation of lean practices, we devote a separate section to it. Based on the 2-tuple linguistic representation model, this section first defines a deviation measure and a consensus index, and then presents an algorithm to describe the consensus reaching process. Then, some properties of the proposed algorithm are given. In the last subsection, we discuss possible extensions of the proposed model.

Let \(M=\{1,2,\ldots ,m\}\), \(N=\{1,2,\ldots ,n\}\). Suppose there are \(n(n\ge 2)\) potential alternatives denoted by \(X=\{X_1 ,X_2 ,\ldots ,X_n \}\). Each alternative is evaluated with respect to a predefined attribute set \(C=\{C_1 ,C_2 ,\ldots ,C_m\}\). There are a group of experts \(E=\{e_1,e_2 ,\ldots ,e_t \}(t\ge 2)\). Assume \(\lambda =(\lambda _1 ,\lambda _2 ,\ldots ,\lambda _t)\) is the weight vector for the experts, where \(\lambda _k \in (0,1)\), \(k=1,2,\ldots , t\), \(\sum \nolimits _{k=1}^t {\lambda _k } =1\). Suppose that \(R_k=\left( r_{ij}^{(k)}\right) _{n\times m}\) is a linguistic decision matrix given by the expert \(e_k \in E\), where \(r_{ij}^{(k)}\in \overline{S}\) represents the performance of alternative \(X_i \) over the attribute \(C_j \in C\). The problem in this paper is concerned with the ranking of the alternatives or the selection of the most desirable alternative(s) using the linguistic decision matrices \(R_k\), \(k=1,2,\ldots ,t\).

4.1 Consensus index

Definition 4

(Wu and Xu 2012) Let \(a_\alpha =(s_\alpha ,x_\alpha )\) and \(a_\beta =(s_\beta ,x_\beta )\) be the two linguistic 2-tuples. The deviation measure between \(a_\alpha \) and \(a_\beta \) is defined by

$$\begin{aligned} d(a_\alpha ,a_\beta )=\frac{\left| {\triangle ^{-1}(s_\alpha ,x_\alpha )-\triangle ^{-1}(s_\beta ,x_\beta )} \right| }{g}. \end{aligned}$$
(3)

It is easy to verify that \(0\le d(a_\alpha ,a_\beta ) \le 1\).

Based on the deviation measure between the two linguistic 2-tuples, we introduce a similarity degree between the two linguistic decision matrices.

Definition 5

(Wu and Xu 2012) Let \(A=(a_{ij})_{n\times m}\) and \(B=(b_{ij})_{n\times m}\) be two linguistic decision matrices, where \(a_{ij}, b_{ij}\in \overline{S}\). Then the similarity degree between \(A\) and \(B\) is defined as

$$\begin{aligned} { SD }(A,B)=\sqrt{\frac{1}{nm}\sum \limits _{i=1}^{n} {\sum \limits _{j=1}^m {d^2(a_{ij},b_{ij})} }}. \end{aligned}$$
(4)

The similarity degree is used to measure the closeness of two experts’ preferences. Chiclana and Tapia Garcia (2013) conducted a comparative study of the effect of different similarity measures. The results demonstrated that the Euclidean distance functions helped the consensus process to converge faster than other distance functions. The Euclidean distance is one of the most widely used distance measures, so here the Euclidean distance is used to define the similarity degree of the preferences between any two experts in the group.
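Equations (3) and (4) can be sketched in Python as follows, assuming that each 2-tuple is stored as a pair `(i, alpha)` and each decision matrix as a list of rows. The function names and the sample matrices are hypothetical.

```python
import math


def deviation(a, b, g):
    """Eq. (3): normalized absolute difference between two 2-tuples."""
    return abs((a[0] + a[1]) - (b[0] + b[1])) / g


def similarity_degree(A, B, g):
    """Eq. (4): root mean square of the entrywise deviations of A and B."""
    n, m = len(A), len(A[0])
    total = sum(deviation(A[i][j], B[i][j], g) ** 2
                for i in range(n) for j in range(m))
    return math.sqrt(total / (n * m))


g = 6
A = [[(4, 0.0), (2, 0.0)]]     # one alternative, two attributes
B = [[(1, 0.0), (2, 0.0)]]

# Deviations are 3/6 = 0.5 and 0, so SD = sqrt((0.25 + 0) / 2).
sd = similarity_degree(A, B, g)
```

Under this definition a value of 0 means that the two matrices coincide, and larger values indicate preferences that lie further apart.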

Let \(R_1 ,R_2 ,\ldots , R_t \) be \(t\) linguistic decision matrices provided by \(t\) experts, where \(R_k =\left( r_{ij}^{(k)} \right) _{n\times m} \), \(r_{ij}^{(k)} \in \overline{S}\). Then the weighted combination \(R=\lambda _1 R_1 \oplus \lambda _2 R_2 \oplus \cdots \oplus \lambda _t R_t\) is the group linguistic decision matrix \(R=(r_{ij} )_{n\times m} \), where

$$\begin{aligned} r_{ij}={ LWAA }(r_{ij}^{(1)},r_{ij}^{(2)},\ldots ,r_{ij}^{(t)}) =\triangle \left( \sum \limits _{k=1}^t{\triangle ^{-1}\left( r_{ij}^{(k)}\right) \cdot \lambda _k}\right) . \end{aligned}$$
(5)

Definition 6

(Wu and Xu 2012) Let \(R_k=\left( r_{ij}^{(k)}\right) _{n\times m}\), \(k=1,2,\ldots ,t\) and \(R=(r_{ij})_{n\times m}\) be \(t\) linguistic decision matrices and the group linguistic decision matrix, respectively. Then, based on the similarity degree between the two linguistic decision matrices, the group consensus index for \({R}_k \) is defined by

$$\begin{aligned} { GCI }(R_k)=1-{ SD }(R_k,R)=1-\sqrt{\frac{1}{nm}\sum \limits _{i=1}^{n} {\sum \limits _{j=1}^m {d^2\left( r_{ij}^{(k)},r_{ij}\right) }}}. \end{aligned}$$
(6)

From Definition 6, it follows that \(0\le { GCI }(R_k ) \le 1\). Given a threshold value \(\overline{{ GCI }}\), if \({ GCI }(R_k )\ge \overline{{ GCI }}\), then \(R_k\) is a linguistic decision matrix with an acceptable consensus level. The value \(\overline{{ GCI }}\) can be determined in advance by the decision makers. If \({ GCI }(R_k )=1\), then the \(k\)th expert \(e_k\) achieves the maximum consensus level. In this case, the preferences for \(e_k\) are the same as the group preferences. Otherwise, the larger the value of \({ GCI }(R_k )\), the closer that expert is to the group.
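The following Python sketch combines Eq. (5) and Eq. (6) for a small hypothetical case with two equally weighted experts. It assumes 2-tuples stored as pairs `(i, alpha)`, matrices stored as lists of rows, and halves rounded upwards; all names and data are illustrative.

```python
import math


def delta(beta):
    """Map beta to (i, alpha), rounding halves upwards."""
    i = math.floor(beta + 0.5)
    return i, beta - i


def delta_inv(two_tuple):
    """Map (i, alpha) back to beta = i + alpha."""
    return two_tuple[0] + two_tuple[1]


def group_matrix(matrices, weights):
    """Eq. (5): entrywise LWAA of the individual decision matrices."""
    n, m = len(matrices[0]), len(matrices[0][0])
    return [[delta(sum(w * delta_inv(R[i][j]) for R, w in zip(matrices, weights)))
             for j in range(m)] for i in range(n)]


def gci(R_k, R, g):
    """Eq. (6): group consensus index of R_k with respect to the group matrix R."""
    n, m = len(R), len(R[0])
    total = sum(((delta_inv(R_k[i][j]) - delta_inv(R[i][j])) / g) ** 2
                for i in range(n) for j in range(m))
    return 1.0 - math.sqrt(total / (n * m))


g = 6
R1 = [[(2, 0.0), (5, 0.0)]]
R2 = [[(4, 0.0), (5, 0.0)]]
weights = [0.5, 0.5]

R = group_matrix([R1, R2], weights)
scores = [gci(R_k, R, g) for R_k in (R1, R2)]
acceptable = [score >= 0.95 for score in scores]   # threshold used in this paper
```

In this example both experts are one label away from the group on the first attribute, so neither reaches the threshold of 0.95 and a consensus round would be required.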

Remark 1

Depending on the actual situation, the experts establish the threshold \(\overline{{ GCI }}\) for the deviation degree between the individual linguistic decision matrix and the group linguistic decision matrix. In the literature, there is no uniform method for choosing the threshold values (Xu and Wu 2013). When the consequences of the decision to be made are considered important and have a significant influence on the related group, the consensus level required to make that decision may take a value as high as possible (Mata et al. 2009). At the other extreme, when it is urgent to obtain a solution to the problem, a minimum consensus value could be approved. In this paper, we set \(\overline{{ GCI }}=0.95\).

4.2 Consensus reaching process algorithm

As with other research (Herrera-Viedma et al. 2005; Parreiras et al. 2010; Xu et al. 2014), an implicit hypothesis in the consensus reaching process is that the experts are expected to effectively support the complete decision making process from problem formulation to solution implementation. The experts have bounded rationality and they express preferences which reflect their true ideas. In the decision process, they are ready to change their preferences according to the suggestions generated by some kind of consensus algorithm. It is also assumed that the experts provide their preferences using the same linguistic term set, although this assumption is not required for the proposed method.

Let \(R_1, R_2 ,\ldots ,R_t \) and \(R\) be \(t\) individual linguistic decision matrices and the group linguistic decision matrix, respectively. Without loss of generality, suppose that the preferences for \(e_p\) have the largest distance from the group preferences in this round. It is reasonable to assume that \(e_p\) is asked to adjust their preferences in the next round. In general, when some of the experts need to alter their preferences, they can do so freely. However, the effect of these preferences on the alternatives and the attributes are regarded as effective reassessments only when the consensus index is improved. It is useful to present a simple algorithm to guide this consensus process. The basic idea of the proposed consensus reaching process is that in each round, the group linguistic decision matrix is thought to be a good reference for the modification of the individual preferences. To reach a predefined consensus level, the following algorithm is designed.

Algorithm 1: (Wu and Xu 2012) Consensus reaching process

Input: Individual linguistic decision matrices \(R_1\), \(R_2, \ldots , R_t\), the weight vector of the experts \(\lambda =(\lambda _1 ,\lambda _2 ,\ldots ,\lambda _t)^T\), the predefined threshold \(\overline{{ GCI }}\), the maximum number of iterations \(h_{\max } \ge 1\) and the parameter \(0<\gamma <1\).

Output: Modified linguistic decision matrices \(\overline{R}_1\), \(\overline{R}_2\), \(\ldots \), \(\overline{R}_t\), \({ GCI }(\overline{R}_k)\), \(k=1,2,\ldots ,t\), and the number of iterations \(h\).

Step 1: Set \(h=0\) and \(R_{k,0} =\left( r_{ij,0}^{(k)} \right) _{n\times m} =\left( r_{ij}^{(k)} \right) _{n\times m} \).

Step 2: Calculate the group linguistic decision matrix \(R_{h}=(r_{ij,h} )_{n\times m}\) corresponding to \(R_{1,h},R_{2,h},\ldots ,R_{t,h}\), where

$$\begin{aligned} r_{ij,h} ={ LWAA }\left( r_{ij,h}^{(1)}, r_{ij,h}^{(2)},\ldots ,r_{ij,h}^{(t)}\right) . \end{aligned}$$

Step 3: Calculate the group consensus index \({ GCI }(R_{k,h})\), \(k=1,2,\ldots ,t\) by using Definition 6. If \({ GCI }(R_{k,h} )\ge \overline{{ GCI }} \), \(k=1,2,\ldots ,t\) or \(h\ge h_{\max }\), then go to Step 5; otherwise, go to the next step.

Step 4: Suppose that \({ GCI }(R_{p,h} )=\mathop {\min }\limits _k \{{ GCI }(R_{k,h} )\}\). Let \(R_{k,h+1} =(r_{ij,h+1}^{(k)} )_{n\times m} \), where

$$\begin{aligned} r_{ij,h+1}^{(k)} = {\left\{ \begin{array}{ll} \gamma r_{ij,h}^{(k)} \oplus (1-\gamma )r_{ij,h} &{} k=p\\ r_{ij,h}^{(k)} &{} k\ne p \end{array}\right. }. \end{aligned}$$
(7)

Set \(h=h+1\) and go to Step 2. Note that the computation for \(\gamma r_{ij,h}^{(k)} \oplus (1-\gamma )r_{ij,h}\) is given by

$$\begin{aligned} \gamma r_{ij,h}^{(k)} \oplus (1-\gamma )r_{ij,h}=\triangle \left( \gamma \triangle ^{-1}\left( r_{ij,h}^{(k)}\right) +(1-\gamma )\triangle ^{-1}(r_{ij,h})\right) . \end{aligned}$$

Step 5: Let \(\overline{R}_k =R_{k,h} \), for all \(k=1,2,\ldots ,t\). Output \(\overline{R}_1,\overline{R}_2,\ldots ,\overline{R}_t\), \({ GCI }(\overline{R}_k )\),  for all \(k=1,2,\ldots ,t\), and the number of iterations \(h\).

Step 6: End.
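The steps of Algorithm 1 can be sketched in Python as shown below. This is an illustration under stated assumptions rather than a reference implementation: every entry is stored as its numerical equivalent \(\triangle ^{-1}(r_{ij}^{(k)})\in [0,g]\), which is possible because \(\triangle \) and \(\triangle ^{-1}\) are inverse to each other; ties in Step 4 are broken by taking the first expert with the minimum index; and the function names, default parameter values and sample data are hypothetical.

```python
import math


def group_matrix(matrices, weights):
    """Step 2: weighted average of the experts' matrices, Eq. (5)."""
    n, m = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * R[i][j] for R, w in zip(matrices, weights))
             for j in range(m)] for i in range(n)]


def gci(R_k, R, g):
    """Group consensus index of R_k with respect to R, Eq. (6)."""
    n, m = len(R), len(R[0])
    total = sum(((R_k[i][j] - R[i][j]) / g) ** 2
                for i in range(n) for j in range(m))
    return 1.0 - math.sqrt(total / (n * m))


def consensus_process(matrices, weights, g, threshold=0.95, h_max=50, gamma=0.5):
    """Return the modified matrices, their GCI values and the iteration count."""
    current = [[row[:] for row in R] for R in matrices]   # Step 1, h = 0
    h = 0
    while True:
        R = group_matrix(current, weights)                # Step 2
        scores = [gci(R_k, R, g) for R_k in current]      # Step 3
        if min(scores) >= threshold or h >= h_max:
            return current, scores, h                     # Step 5
        p = scores.index(min(scores))                     # Step 4: first minimum
        current[p] = [[gamma * own + (1 - gamma) * grp    # Eq. (7)
                       for own, grp in zip(own_row, grp_row)]
                      for own_row, grp_row in zip(current[p], R)]
        h += 1


g = 6
experts = [
    [[5.0, 4.0], [2.0, 3.0]],
    [[4.0, 4.0], [3.0, 3.0]],
    [[1.0, 2.0], [5.0, 6.0]],
]
weights = [0.4, 0.3, 0.3]

final, scores, h = consensus_process(experts, weights, g)
```

The input matrices are copied in Step 1, so the experts' original preferences remain available for comparison with the modified ones.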

Remark 2

The meaning of the linguistic decision matrices used in this paper is the same as for the linguistic preference relations in Wu and Xu (2012). However, these two kinds of preference structures are used to solve different problems. The linguistic decision matrix is used to represent the information when experts give their preference for alternatives over some attributes. The linguistic preference relation is used to represent the information from the pairwise comparison between the alternatives or attributes. The consensus process developed in this paper is based on Wu and Xu (2012).

It is possible that in Step 4, two or more experts simultaneously take the minimum consensus index in one round. If this is the case, a random strategy could be used to choose the person who needs to change their preferences in this round. The algorithm provides an automatic feedback mechanism to guide experts in the sequential consensus process. Although the experts have the right to modify their preferences freely, they are strongly recommended to follow the feedback when changing their preferences.

Algorithm 1 is an iterative process. The parameter \(\gamma \) controls the preference modification degree for the experts. Different strategies could be used to choose \(\gamma \) according to different criteria. For example, one criterion could be keeping as much original preference information as possible. If this is the case, \(\gamma \) may take a value closer to 1.

A desirable property of the algorithm is that it can improve the consensus level of each individual in the group. When the individual \(e_p\) with the smallest \({ GCI }\) value implements the improving strategy, \(e_p\) obtains a better \({ GCI }\) value. To demonstrate the convergence of Algorithm 1, the following theorems are proposed. These two theorems can be proved in a similar way to those in Wu and Xu (2012). However, to make this paper self-contained, the proofs of the theorems are given in the Appendix.

Theorem 1

Let \(R_1, R_2,\ldots ,R_t \) and \(\lambda =(\lambda _1,\lambda _2,\ldots , \lambda _t )^T\) be \(t\) linguistic decision matrices and the weight vector of the experts respectively. Let \(R_{l,h}\) be the decision matrix sequences generated by Algorithm 1 for expert \(e_l\). In the \(h\)th iteration, suppose that the \(p\)th expert \(e_p\) has the minimum GCI value, then

$$\begin{aligned} { GCI }(R_{p,h+1})>{ GCI }(R_{p,h}). \end{aligned}$$
(8)

Theorem 1 guarantees that for expert \(e_p\), the consensus level of this round is better than that of the last round. As mentioned, parameter \(\gamma \) controls the modification degree in every round. At the same time \(\gamma \) influences the process convergence rate. In the following, we demonstrate how the overall situation in each round is improved.

Theorem 2

Let \(R_1 ,R_2 ,\ldots ,R_t\) and \(\lambda =(\lambda _1 ,\lambda _2 ,\ldots ,\lambda _t )^T\) be \(t\) linguistic decision matrices and the weight vector of the experts, respectively. Let \(R_{l,h}\) be the decision matrix sequences generated by Algorithm 1 for expert \(e_l\). Then, we have

$$\begin{aligned} \min \limits _l \{{ GCI }(R_{l,h+1} )\}> \min \limits _l \{{ GCI }(R_{l,h} )\}. \end{aligned}$$
(9)

Theorem 2 states that the overall consensus level of the group in this round is better than that of the last round. Generally, after a finite number of iterations, the group achieves the predefined consensus level. When \(h \rightarrow \infty \), it follows that \(SD(R_{k,h},R_h) \rightarrow 0\) and \({ GCI }(R_{k,h},R_h) \rightarrow 1\), for \(k=1,2,\ldots ,t\).

4.3 Discussion

There are a few ways to generalize this proposed consensus approach. Herrera-Viedma et al. (2014) provided an excellent review of soft consensus models in a fuzzy environment. According to the classifications proposed in Herrera-Viedma et al. (2014), the proposed approach belongs to the branch of soft coincidence. To compute the level of consensus achieved in each discussion round, the similarity between the experts’ preferences on the alternatives over the different attributes is measured. The consensus index takes a value in [0, 1], where a value close to 1 indicates a high consensus level and a value close to 0 indicates a low consensus level.

The proposed consensus reaching model has the potential to be extended to other linguistic representation models. The linguistic representation model in Herrera and Martínez (2012) has been widely used in decision analysis and fuzzy systems modeling. The basic idea of the fuzzy linguistic representation model with 2-tuples is that it defines two functions \(\triangle \) and \(\triangle ^{-1}\). These two functions transform numerical values into 2-tuples and vice versa without any loss of information. The Herrera and Martínez model aims to deal with uniformly and symmetrically distributed linguistic term sets. However, in decision making problems under uncertainty, the decision framework is often more complex, and multi-granular linguistic terms and unbalanced terms appear (Herrera et al. 2008; Herrera-Viedma et al. 2005). An unbalanced linguistic term set is shown in Fig. 2.

Fig. 2 Example of an unbalanced linguistic term set of nine labels
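The two transformation functions of the 2-tuple model mentioned above can be illustrated with a minimal Python sketch. It assumes the balanced Herrera and Martínez setting, in which a label \(s_i\) is identified with its index \(i\) and the symbolic translation lies in \([-0.5, 0.5)\). The function names `delta` and `delta_inv` are illustrative and do not come from the paper.

```python
import math
from typing import Tuple


def delta(beta: float) -> Tuple[int, float]:
    """Map a numerical value beta to a 2-tuple (i, alpha), read as (s_i, alpha)."""
    i = math.floor(beta + 0.5)  # index of the closest label; ties go to the upper label
    alpha = beta - i            # symbolic translation, lies in [-0.5, 0.5)
    return i, alpha


def delta_inv(i: int, alpha: float) -> float:
    """Map a 2-tuple (s_i, alpha) back to its numerical value."""
    return i + alpha


i, alpha = delta(4.94)
print(f"(s_{i}, {alpha:+.2f})")  # prints "(s_5, -0.06)"
print(delta_inv(i, alpha))       # recovers the original value up to rounding error
```

Because `delta_inv(*delta(beta))` returns `beta` up to floating-point error, the round trip illustrates the "no loss of information" property. Unbalanced or multi-granular term sets would need a different numerical scale and are not covered by this sketch.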

Wang and Hao (2006) presented a proportional 2-tuple fuzzy linguistic representation model. Dong et al. (2009) further proposed a generalized model which integrated the Herrera and Martínez model and the Wang and Hao model. It was found that the key aspect of the computational techniques based on linguistic 2-tuples is to develop a function (called a numerical scale) which can perform the transformations between the linguistic 2-tuples and the numerical values. The numerical scale establishes a one-to-one mapping between the linguistic information and the numerical values. However, to date, there has been no easy way to derive a suitable numerical scale NS for a given linguistic term set \(S\). Once the numerical scales for each expert are obtained, the proposed consensus reaching process can be used to achieve the consensus goal. Note that Dong et al. (2014) have provided a good way to transform multi-granular unbalanced linguistic preference relations into uniform balanced linguistic preference relations. The transformation function presented in Dong et al. (2014) can be used to extend the proposed approach to the multi-granular unbalanced linguistic environment.

As mentioned in the literature, many studies focus on other preference structures such as preference relations (Dong et al. 2010; Herrera-Viedma et al. 2005; Mata et al. 2009; Pérez et al. 2010; Wu and Xu 2012; Xu and Wu 2013). If preference relations are involved, other important aspects that appear in real applications should be considered. Pérez et al. (2014) considered heterogeneous GDM frameworks in the consensus process. Chiclana et al. (2009) suggested that cardinal consistency could be incorporated in the consensus model. Alonso et al. (2008) gave a consistency-based procedure to estimate missing pairwise preference values. A compromise direction is to combine the consensus processes for preference relations and for decision matrices in the MAGDM framework. This applies when the experts have preferences for both attributes and alternatives in a MADM setting.

5 The proposed framework for the evaluation of lean practices

In this section, the entropy method, which is used to determine the weights of attributes, is extended to the linguistic context. Then, a framework integrating the consensus reaching process and the entropy method is proposed to solve the MAGDM problem under a linguistic environment. This integrated MAGDM framework will be applied to a lean practices evaluation problem in the next section.

5.1 The extended entropy method

Many MADM methods require a computation of the relative importance of each attribute. The entropy method belongs to the class of objective weighting methods: it utilizes only the decision matrix data to determine the attribute weights. In the following, the entropy method is chosen to determine the attribute weights.

The basic idea of the entropy method was given in Hwang and Yoon (1981). The decision matrix for a set of alternatives contains a certain amount of information. The information contained in the attribute values for a given attribute \(C_j\) can be measured using the entropy value. An attribute does not play an important role when all alternatives have similar attribute values for that attribute. Further, if all attribute values are the same, such an attribute can be eliminated. The steps for the extended entropy method in a general linguistic setting can be described as follows.

Assume that the group linguistic decision matrix after implementing the consensus reaching process is still denoted as \(R=(r_{ij})_{n\times m}\), where \(r_{ij}\in \overline{S}\).

Let \(\xi =(\xi _1, \xi _2, \ldots , \xi _m)^T\) be the weight vector to be derived by the entropy method, where \(\xi _j \in (0,1)\), \(j\in M\), \(\sum \nolimits _{j=1}^m {\xi _j } =1\). Since all attribute values are given by the linguistic variables, they should be transformed to numerical values before the entropy computation. The outcomes of attribute \(C_j\), that is, the attribute values \(r_{ij}\), \(i=1,2,\ldots ,n\), \(j=1,2,\ldots ,m\) then can be defined as

$$\begin{aligned} q_{ij}=\frac{\triangle ^{-1}(r_{ij})}{\sum \nolimits _{i=1}^n\triangle ^{-1}(r_{ij})}. \end{aligned}$$
(10)

The entropy \(E_j\) for the attribute \(C_j\) is calculated as

$$\begin{aligned} E_j=-K\sum \limits _{i=1}^nq_{ij}\ln q_{ij},\quad j=1,2,\ldots ,m. \end{aligned}$$
(11)

where \(K\) represents a constant: \(K=1/\ln n\) which guarantees that \(0\le E_j\le 1\).
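To see why this choice of \(K\) bounds the entropy, recall that each term \(-q_{ij}\ln q_{ij}\) is nonnegative for \(q_{ij}\in [0,1]\) (with the usual convention \(0\ln 0=0\)), so \(E_j\ge 0\), with equality when a single alternative carries all the mass. The sum \(-\sum _i q_{ij}\ln q_{ij}\) is largest for the uniform distribution \(q_{ij}=1/n\), which gives

$$\begin{aligned} E_j^{\max }=-K\sum \limits _{i=1}^n\frac{1}{n}\ln \frac{1}{n}=K\ln n=1. \end{aligned}$$

Hence an attribute on which all alternatives are rated identically has \(E_j=1\).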

The degree of diversification \(div_j\) of the information provided by the outcomes of attribute \(C_j\) can be defined as

$$\begin{aligned} div_j=1-E_j,\quad j=1,2,\ldots ,m. \end{aligned}$$
(12)

The final step is the computation of the objective weight for attribute \(C_j\), which is obtained by

$$\begin{aligned} \xi _j=\frac{div_j}{\sum \nolimits _{j=1}^mdiv_j}=\frac{1-E_j}{\sum \nolimits _{j=1}^m(1-E_j)},\quad j=1,2,\ldots ,m. \end{aligned}$$
(13)
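The steps in Eqs. (10)–(13) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `entropy_weights` and the small matrix below are hypothetical and are not the data of the application example. The input is assumed to already hold the numerical values \(\triangle ^{-1}(r_{ij})\), every column is assumed to have a positive sum, and at least one attribute is assumed to be non-uniform.

```python
import math
from typing import List


def entropy_weights(values: List[List[float]]) -> List[float]:
    """Objective attribute weights from an n x m matrix.

    values[i][j] is assumed to be Delta^{-1}(r_ij) for alternative i, attribute j.
    """
    n, m = len(values), len(values[0])
    k = 1.0 / math.log(n)  # constant K = 1 / ln n
    div = []
    for j in range(m):
        column = [values[i][j] for i in range(n)]
        total = sum(column)
        q = [v / total for v in column]                      # Eq. (10)
        e_j = -k * sum(x * math.log(x) for x in q if x > 0)  # Eq. (11), with 0 ln 0 = 0
        div.append(1.0 - e_j)                                # Eq. (12)
    div_sum = sum(div)
    return [d / div_sum for d in div]                        # Eq. (13)


# Hypothetical 4 x 3 matrix of label indices (not taken from the paper's tables).
example = [
    [4.0, 5.0, 6.0],
    [6.0, 5.0, 6.0],
    [2.0, 5.0, 5.0],
    [7.0, 5.0, 6.0],
]
weights = entropy_weights(example)
print([round(w, 4) for w in weights])
```

In this hypothetical matrix the second attribute has identical values for all alternatives, so its weight is essentially zero, while the first attribute, whose values differ most, receives the largest weight.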

Remark 3

In a specific problem, the experts may have their own preferences for the attributes. They may construct preference relations in the form of pairwise comparison matrices, as in the Analytic Hierarchy Process (AHP). From the preference relations, the subjective weights for the attributes can be derived. Denote \(\theta =(\theta _1, \theta _2, \ldots , \theta _m)^T\), where \(\theta _j \in (0,1)\), \(j\in M\), \(\sum \nolimits _{j=1}^m{\theta _j } =1\) and \(w=(w_1, w_2, \ldots , w_m)^T\), where \(w_j \in (0,1)\), \(j\in M\), \(\sum \nolimits _{j=1}^m {w_j } =1\) as the subjective weight vector and the final weight vector, respectively. Using a linear combination, \(w_j\) can be expressed as

$$\begin{aligned} w_j=\eta \xi _j+(1-\eta )\theta _j,~j=1,2,\ldots ,m. \end{aligned}$$
(14)

where \(\eta \) is a balancing coefficient between the objective weights and the subjective weights.
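As a brief illustration of Eq. (14), the following Python sketch combines an objective and a subjective weight vector. The function name `combine_weights` and all numbers are hypothetical. The sketch also assumes \(\eta \in [0,1]\), which the text does not state explicitly but which keeps the result a valid weight vector.

```python
from typing import List


def combine_weights(xi: List[float], theta: List[float], eta: float) -> List[float]:
    """Linear combination of objective (xi) and subjective (theta) weights, Eq. (14)."""
    if not 0.0 <= eta <= 1.0:  # assumption: eta is restricted to [0, 1]
        raise ValueError("eta is assumed to lie in [0, 1]")
    if len(xi) != len(theta):
        raise ValueError("xi and theta must have the same length")
    return [eta * x + (1.0 - eta) * t for x, t in zip(xi, theta)]


# Hypothetical weights for three attributes.
xi = [0.5, 0.3, 0.2]     # objective weights, e.g. from the entropy method
theta = [0.2, 0.3, 0.5]  # subjective weights, e.g. from preference relations
w = combine_weights(xi, theta, eta=0.6)
print([round(v, 2) for v in w])  # prints [0.38, 0.3, 0.32]
```

Setting `eta=1.0` reproduces the purely objective entropy weights, and `eta=0.0` the purely subjective ones.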

5.2 A framework for the evaluation of lean practices

Based on the consensus reaching process and the entropy method, a general framework for the evaluation of lean practices is presented in Fig. 3.

Fig. 3 A MAGDM framework for the evaluation of lean practices

The framework provides a method to effectively tackle the MAGDM problem of lean practices assessment, in which the ratings of alternatives are represented by linguistic variables and the importance weights of attributes are unknown. In this framework, the consensus reaching process is introduced to reconcile the different preferences among the experts. The entropy method is introduced to obtain the importance weights of attributes, so the experts do not need to provide their preferences on the weights. The proposed MAGDM framework is straightforward and can be easily implemented on a computer.

The proposed framework for the evaluation of lean practices has several advantages. (1) Since leanness assessments involve qualitative attributes, linguistic variables are useful for the experts to express their preferences. (2) The framework employs the 2-tuple linguistic model to manage the linguistic information in the lean practices evaluation process due to its accuracy and simplicity. The 2-tuple linguistic model differs from some of the existing fuzzy approaches used in leanness assessment, such as the fuzzy logic approach (Vinodh and Vimal 2012), where the linguistic terms are transformed into fuzzy numbers. (3) The inclusion of a consensus reaching process in the proposed framework is necessary, as it takes into account the current level of agreement between experts. Considering the GDM setting of the lean practices evaluation, the consensus reaching process suggests a solution that is more acceptable to all the experts concerned in the decision making. (4) The proposed MAGDM framework is flexible and can therefore deal with complex lean practices evaluation problems. As discussed in Sect. 4.3, the consensus process can be generalized to other linguistic environments. Likewise, the weights of the experts could also be incorporated into the framework. Other extensions are possible by combining the methods applied in the lean practices evaluation.

6 Application example

In this section, the developed MAGDM framework is applied to a lean practices evaluation problem for a commercial tobacco company’s logistics distribution centers.

The essence of lean management is the creation of a culture that encourages learning and continuous process improvement through simplifying and standardizing the way work is performed (Womack and Jones 1996). From a website survey of small and medium-sized enterprises in the U.S., it was found that the primary reasons for implementing lean practices are mainly internal and include such aspects as cost reduction, increased profit margins, improved utilization of the plant/facility, and the maintenance of competitive position (Zhou 2012).

Although lean principles have been derived from Japanese manufacturing, they have also been applied in areas other than manufacturing. In China, an increasing number of companies and organizations have begun or are beginning to explore lean management methods to adapt rapidly to the changing socio-economic conditions and achieve a better share of the market. A tobacco company in Sichuan Province, China, realized the importance of lean thinking and two years ago initiated lean activities in its four logistics distribution centers. Logistics distribution centers form an important division in the tobacco industry. The primary functions of these centers are storage, sorting and delivery, amongst others. In the past two years, many lean activities such as work standardization, visual management, and total productive maintenance (TPM) have been implemented in these logistics distribution centers. The company wishes to choose an advanced logistics distribution center to provide a template for the other distribution centers in other tobacco companies in Sichuan Province. To this end, the company invited a group of experts \(E=\{e_1,e_2,e_3,e_4\}\) to form a project team and select the most suitable candidate center. Although there are both qualitative attributes and quantitative attributes, here, we only consider the qualitative attributes to demonstrate how the proposed framework can help the experts identify the best choice. After discussion, six important qualitative attributes were selected:

  (1) \(C_1 \): management level;

  (2) \(C_2 \): product quality;

  (3) \(C_3 \): customer service;

  (4) \(C_4 \): comprehensive staff abilities;

  (5) \(C_5 \): operational standardization;

  (6) \(C_6\): implementation effect of quality control (QC) activities.

Note that, as mentioned in Sect. 2.2, the number of lean attributes or criteria reported in the literature varies widely, ranging from several criteria to dozens of criteria (Seyedhosseini et al. 2011; Shah and Ward 2007; Vinodh and Vimal 2012). The differences stem from the different types of companies and the different levels of the evaluation hierarchies. For our case, the expert team considered 11 quantitative attributes and only 6 qualitative attributes. Because it was the first time that the company had conducted such evaluations, the decision makers agreed with the experts on the six attributes considered here. The set of attributes may be changed and extended in the future when the focus of the lean practices changes.

To select the best logistics distribution center, the proposed MAGDM framework was applied and the steps were as follows:

Step 1: Constructing the preferences.

Four distribution centers \(DC_1, DC_2, DC_3\), and \(DC_4\) were assessed using the following linguistic term set, which is uniformly and symmetrically distributed.

$$\begin{aligned} S&=\{s_{0}=\text {Extremely Poor},~s_{1}=\text {Very Poor},~s_{2}=\text {Poor},\\&\quad s_{3}=\text {Slightly Poor}, ~s_{4}=\text {Fair},~s_5=\text {Slightly Good},\\&\quad s_6=\text {Good},~s_7=\text {Very Good},~s_8=\text {Extremely Good}\}. \end{aligned}$$

The information given by the four experts was constructed using the linguistic decision matrices shown in Tables 1, 2, 3, 4.

Table 1 Linguistic decision matrix \(R_1\)
Table 2 Linguistic decision matrix \(R_2\)
Table 3 Linguistic decision matrix \(R_3\)
Table 4 Linguistic decision matrix \(R_4\)

Step 2: Achieving the predefined consensus level.

Without loss of generality, assume \(\lambda =(1/4,1/4,1/4,1/4)^T\) is the weight vector for the experts. The current group linguistic decision matrix is shown in Table 5. The current consensus indices for each expert were as follows:

$$\begin{aligned} { GCI }(R_1)&= 0.9089,\;\;{ GCI }(R_2)=0.9245,\\ { GCI }(R_3)&= 0.9163,\;\;{ GCI }(R_4)=0.9163. \end{aligned}$$
Table 5 Group linguistic decision matrix \(R\)

If \(\overline{{ GCI }}=0.9\), it can be seen that all linguistic decision matrices reach the predefined consensus level. However, the experts agreed to set a higher consensus level \(\overline{{ GCI }}=0.95\). Algorithm 1 was used to modify the original linguistic decision matrices. Setting \(\gamma =0.95\), the algorithm terminated after 22 iterations. Overall, \(e_1, e_2, e_3\) and \(e_4\) modified their preferences 7, 4, 5, and 5 times, respectively. The simulation results for each individual’s consensus index are shown in Fig. 4. The results are consistent with both Theorems 1 and 2. The final group consensus indices were:

$$\begin{aligned} { GCI }(R_1)&= 0.9535,\;\;{ GCI }(R_2 )=0.9518,\\ { GCI }(R_3)&= 0.9510,\;\; { GCI }(R_4)=0.9508. \end{aligned}$$
Fig. 4 The group consensus indices of Algorithm 1

All experts attained consensus indices above the predefined consensus level. The modified linguistic decision matrices are shown in Tables 6, 7, 8, 9. The final group linguistic decision matrix, \(R_{new}\), is shown in Table 10.

Table 6 Modified linguistic decision matrix \(R_1\)
Table 7 Modified linguistic decision matrix \(R_2\)
Table 8 Modified linguistic decision matrix \(R_3\)
Table 9 Modified linguistic decision matrix \(R_4\)
Table 10 Group linguistic decision matrix \(R_{new}\)

Step 3: Determining the attribute weights.

Based on \(R_{new}\), the extended entropy method was used to derive the importance weights of attributes. Following the steps in Sect. 5.1, we have

$$\begin{aligned} \xi =(0.4300,0.3138,0.0677,0.1506,0.0117,0.0261)^T. \end{aligned}$$

The importance of the six attributes from highest to lowest was \(C_1\), \(C_2\), \(C_4\), \(C_3\), \(C_6\), and \(C_5\). This was consistent with the group linguistic decision matrix \(R_{new}\), where the attribute values for \(C_1\), \(C_2\), \(C_4\) were quite different, while the attribute values for the remaining three attributes were quite similar.

Step 4: Obtaining the overall assessment value for each alternative.

By utilizing the LWAA operator, the overall assessment values for the four alternatives were

$$\begin{aligned} U_1&= (s_4,0.32),\;\;U_2=(s_5,-0.06),\\ U_3&= (s_5,0.14),\;\; U_4=(s_5,0.37). \end{aligned}$$

Step 5: Ranking the alternatives.

According to the value for \(U_i\), the ranking of the alternatives was \(X_4\succ X_3\succ X_2\succ X_1\). Thus the best alternative was \(X_4\).

That is, logistics distribution center \(DC_4\) should be selected as the representative department, as its lean practices experience can be generalized to other similar distribution centers.
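The ranking in Step 5 can be checked directly from the overall assessment values reported in Step 4. The short Python sketch below converts each 2-tuple to its numerical value with \(\triangle ^{-1}\) (label index plus symbolic translation) and sorts the alternatives. The dictionary keys and the helper name `delta_inv` are illustrative, and \(U_i\) is assumed to belong to distribution center \(DC_i\).

```python
# Overall assessment values U_i from Step 4, stored as (label index, symbolic translation).
overall = {
    "DC_1": (4, 0.32),
    "DC_2": (5, -0.06),
    "DC_3": (5, 0.14),
    "DC_4": (5, 0.37),
}


def delta_inv(two_tuple):
    """Numerical value of a 2-tuple (s_i, alpha): i + alpha."""
    i, alpha = two_tuple
    return i + alpha


# Sort alternatives by numerical value, best first.
ranking = sorted(overall, key=lambda name: delta_inv(overall[name]), reverse=True)
print(" > ".join(ranking))  # prints "DC_4 > DC_3 > DC_2 > DC_1"
```

The numerical values are 4.32, 4.94, 5.14, and 5.37, which reproduces the ordering stated above.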

7 Conclusion

Lean practices have been implemented in modern companies and organizations to maintain their competitive position. It is very important to evaluate the effectiveness of these lean practices. This paper presented a MAGDM framework under a linguistic setting to facilitate such an evaluation process. The main contributions of this study are as follows:

  (1) A framework which encompasses both a consensus reaching process and an attribute weight determination process has been proposed to solve the lean practices evaluation problem. It utilizes the 2-tuple fuzzy linguistic model to deal with the linguistic information in the evaluation process. The characteristics and advantages of the proposed framework have been discussed.

  (2) The proposed MAGDM framework was applied to a lean practices evaluation problem for a commercial tobacco company’s logistics distribution centers in Sichuan Province, China. The results demonstrated the viability of the proposed framework.

Although the proposed framework was illustrated by an enterprise lean practices evaluation, it can be easily applied to other decision problems. The proposed framework could be extended to support situations in which the preferences in the decision matrix have other forms, such as triangular fuzzy numbers, intuitionistic fuzzy numbers or hybrid certain and uncertain information. In this paper, the set of alternatives was assumed to be fixed. However, the set of alternatives may change over time (Pérez et al. 2010). Future research will explore the linguistic decision making consensus reaching process in a dynamic decision environment using interval type-2 fuzzy sets and look for new applications.