1 Introduction

Manufacturing companies implement innovative managerial practices, respond to their customers’ needs properly and in a timely manner, invest in advanced manufacturing technologies, and engage in performance improvement activities in order to achieve success in their markets (Kaplan 1983; Banker et al. 1993; IFAC 1998; Maksoud et al. 2005).

A key enabler for manufacturing companies to have success in their markets is to establish proper performance measurement systems. Moullin (2003) defines performance measurement as ‘evaluating how well organizations are managed and the value they deliver for customers and other stakeholders.’ Traditional performance measurement approaches commonly employ financial measures in their models (Babic and Plazibat 1998; Yurdakul and Ic 2005; Ertugrul and Karakasoglu 2009; Ic and Yurdakul 2010; Ic 2014). However, Striteska (2012) notes that ‘financial measures are not enough’ in the business environment today and there is a ‘growing emphasis laid on using non-financial criteria.’ Similarly, in Kagioglou et al. (2001), Kennerley and Neely (2002), Ittner et al. (2003) and Epstein and Manzoni (1997), financial measures are considered inadequate in dynamic business environments, and non-financial measures are recommended instead for assessment of company activities.

However, both non-financial and financial measures quantify the results of the companies’ past actions (Neely 2002; Moullin 2003). Accordingly, performance measurement models based on financial and non-financial measures only are not sufficient in evaluation of performance of manufacturing companies. They are found to be ‘undermining manufacturing competitiveness (Hayes and Abernathy 1980) through encouraging short-termism (Banks and Wheelwright 1979; Hayes and Garvin 1982), lacking strategic focus (Skinner 1974), encouraging local optimization (Hall 1983; Fry and Cox 1989) and encouraging minimization of variance rather than continuous improvement (Lynch and Cross 1991)’ (Melnyk et al. 2010).

Melnyk et al. (2010) state that various models such as ‘the S.M.A.R.T. Pyramid (Lynch and Cross 1991), the Results/Determinants Matrix (Fitzgerald et al. 1991) and the Balanced Scorecard (Kaplan and Norton 1992)’ eliminate the weaknesses of the traditional performance measurement models. This paper also aims to eliminate the weaknesses of the traditional performance measurement models in developing its own performance measurement model.

In this paper, the proposed performance measurement model presents best practices (sources of competitiveness) through a list of statements (“Appendix 1” provides a sample of statements). The statements provide a decision structure and prevent the random assignment of performance scores in critical activities. A manufacturing company can determine the level of similarity between its practices, actions and infrastructures and the list of statements, and assign a performance score accordingly for each critical activity.

In the paper, Sect. 2 introduces and describes the developed three-level performance measurement model, whereas Sect. 3 provides an illustrative example. Section 4 includes the final thoughts and conclusions.

2 Determination of the structure of the performance measurement model

The authors’ literature survey on performance measurement models reveals that a complete performance measurement model without any weaknesses is not available in the literature. Still, various studies provide essential characteristics of a performance measurement model such as:

  • Anderson et al. (1995), Flynn et al. (1995), Choi and Liker (1995), Forza and Flippini (1998), Dow et al. (1999), Samson and Terziovski (1999), Das et al. (2000), Cua et al. (2001), Matsui (2002), Kaynak (2003), Yeung et al. (2005), Parast et al. (2006), Ahmad and Zabri (2016), Toklu and Taşkın (2017), and Rawat et al. (2018) identify various practices that improve the performance of a manufacturing company. The practices include leadership commitment, human motivation and development, new product design, process management, quality information, and customer and supplier relations.

  • Braz et al. (2011) state that a performance measurement system should provide data not only for predicting future performance but also for benchmarking and reinforcing organizational strategies. In another study, Al-Tit (2017) suggests that a performance measurement model should also provide data on organizational culture. The findings of Al-Tit’s (2017) study indicate that organizational culture in a company has a stronger influence on performance than supply chain management.

  • Striteska (2011) emphasizes the importance of evaluating ‘how well companies are managed and the value they deliver for customers’ in the development of a performance measurement model. On the other hand, Böhm et al.’s (2017) study demonstrates that an increasing focus on services could lead to success for manufacturing companies.

  • Taylor and Taylor (2013), Esmaeel et al. (2018) and Sardana et al. (2016) point out the importance of inclusion of strategic objectives into a performance measurement model. In addition, Sangwa and Sangwan (2018) proposed an integrated performance measurement framework to measure the effect of lean implementation throughout all functions of an organization. In total, they identified 26 dimensions and 119 key performance indicators under seven categories (manufacturing process, new product development, human resource management, finance, administration, customer management and supplier management) in their study.

  • Rao et al. (2018) presented a simple profit-linked performance measurement model. The model can evaluate the performance of a company using three measures: profitability, productivity and price recovery.

  • Yeo and Grant (2017) proposed a decision tree approach to explore the effects of information and communication technologies (ICTs) on manufacturing companies’ sales performance.

The characteristics listed above indicate that a performance measurement model must have a multi-level framework in which strategic objectives should guide the performance scoring process. In addition, the model must measure companies’ achievement levels in activities, which are necessary for the companies to be successful in their markets against their competitors in the long term. Such a model is developed and provided in Fig. 1.

Fig. 1 The multi-level performance measurement model

2.1 The description of the developed performance measurement model

The developed performance measurement model uses a fuzzy version of the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to obtain overall performance scores of manufacturing companies. TOPSIS is a widely used multi-criteria decision-making (MCDM) technique because of its simplicity, ability to yield an indisputable ranking order and programmable nature (Chen and Hwang 1992; Yurdakul and Ic 2009a, b; Ic and Yurdakul 2010; Chamodrakas et al. 2011; Eraslan and İç 2011; Ic 2012; Kundu et al. 2014).

Regular applications of TOPSIS approach are limited to single-level decision hierarchies. Since this study proposes a multi-level decision hierarchy (Fig. 1), it is necessary to modify TOPSIS approach to obtain performance scores of manufacturing companies with respect to the overall goal. The modified TOPSIS approach calculates the weights of critical activities with respect to the overall goal by combining the weights of the four objectives with respect to the overall goal and the weights of the seventeen critical activities with respect to each objective. The modified TOPSIS approach requires formation of a decision matrix and a weight vector as inputs (see Fig. 1).

For the development of the decision matrix, performance scores of manufacturing companies at critical activities (\( \tilde{x}_{ij} \); i = 1,2,…,n (number of manufacturing companies); j = 1,2,…, 17 (number of critical activities)) are assigned and placed in matrix form as shown in Eq. (1). In the assignment of performance scores in the critical activities, a company can receive one of the four fuzzy scores given in the fourth column of Table 1 corresponding to its level of similarity with the list of statements.

$$ \tilde{D} = \left[ \begin{array}{cccc} \tilde{x}_{1,1} & \tilde{x}_{1,2} & \ldots & \tilde{x}_{1,17} \\ \tilde{x}_{2,1} & \tilde{x}_{2,2} & \ldots & \tilde{x}_{2,17} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{x}_{n,1} & \tilde{x}_{n,2} & \ldots & \tilde{x}_{n,17} \end{array} \right]. $$
(1)
Table 1 4-Point Fuzzy Scale for Assignment of Scores

The weights of the seventeen critical activities with respect to the overall goal (\( \tilde{w}_{j} \) is the fuzzy weight of critical activity j with respect to the overall goal) form the second input (the weight vector, \( \tilde{w} \)) for the modified TOPSIS approach. The weight vector is calculated using Eqs. (2-5). In the equations, \( \tilde{c}_{ij} \) and \( \tilde{g}_{i} \) are fuzzy weights of critical activity j with respect to the objective i and of objective i with respect to the overall goal, respectively.

$$ \tilde{C} = \left[ \begin{array}{cccc} \tilde{c}_{1,1} & \tilde{c}_{1,2} & \ldots & \tilde{c}_{1,17} \\ \tilde{c}_{2,1} & \tilde{c}_{2,2} & \ldots & \tilde{c}_{2,17} \\ \tilde{c}_{3,1} & \tilde{c}_{3,2} & \ldots & \tilde{c}_{3,17} \\ \tilde{c}_{4,1} & \tilde{c}_{4,2} & \ldots & \tilde{c}_{4,17} \end{array} \right] $$
(2)
$$ \tilde{c}_{ij} = \left( {a;b;c;d} \right) $$
(3)
$$ \tilde{G} = \left[ {\begin{array}{*{20}c} {\tilde{g}_{1} } \\ {\tilde{g}_{2} } \\ {\tilde{g}_{3} } \\ {\tilde{g}_{4} } \\ \end{array} } \right] $$
(4)
$$ \tilde{g}_{i} = \left( {k;l;m;n} \right) $$
$$ \begin{aligned} \tilde{W} &= \left[ \tilde{w}_{1} \; \vdots \; \tilde{w}_{2} \; \vdots \; \ldots \; \vdots \; \tilde{w}_{j} \; \vdots \; \ldots \; \vdots \; \tilde{w}_{17} \right] = \tilde{G}^{T} \times \tilde{C} \\ &= \left[ \sum\limits_{i = 1}^{4} \tilde{c}_{i,1} \tilde{g}_{i} \; \vdots \; \sum\limits_{i = 1}^{4} \tilde{c}_{i,2} \tilde{g}_{i} \; \vdots \; \cdots \; \vdots \; \sum\limits_{i = 1}^{4} \tilde{c}_{i,17} \tilde{g}_{i} \right] \end{aligned} $$
(5)
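
The aggregation in Eq. (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fuzzy numbers are stored as 4-tuples, the product of two non-negative trapezoidal fuzzy numbers is approximated vertex by vertex, and the toy values for two objectives and two activities are hypothetical rather than taken from the paper's tables.

```python
from typing import List, Tuple

# A trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d.
Trapezoid = Tuple[float, float, float, float]


def fuzzy_mul(x: Trapezoid, y: Trapezoid) -> Trapezoid:
    """Vertex-wise product: a common approximation for non-negative
    trapezoidal fuzzy numbers (an assumption, not stated in the paper)."""
    return tuple(p * q for p, q in zip(x, y))


def fuzzy_add(x: Trapezoid, y: Trapezoid) -> Trapezoid:
    """Vertex-wise sum of two trapezoidal fuzzy numbers."""
    return tuple(p + q for p, q in zip(x, y))


def weight_vector(C: List[List[Trapezoid]], G: List[Trapezoid]) -> List[Trapezoid]:
    """C[i][j]: weight of activity j under objective i.
    G[i]: weight of objective i under the overall goal.
    Returns w_j = sum_i c_ij * g_i for every activity j."""
    n_activities = len(C[0])
    W = []
    for j in range(n_activities):
        total: Trapezoid = (0.0, 0.0, 0.0, 0.0)
        for i, g in enumerate(G):
            total = fuzzy_add(total, fuzzy_mul(C[i][j], g))
        W.append(total)
    return W


# Hypothetical toy data: 2 objectives and 2 activities (the paper uses 4 and 17).
G = [(0.25, 0.5, 0.5, 0.75), (0.5, 0.5, 0.75, 1.0)]
C = [
    [(0.0, 0.25, 0.5, 0.75), (0.5, 0.5, 1.0, 1.0)],
    [(0.5, 0.75, 1.0, 1.0), (0.0, 0.0, 0.25, 0.5)],
]
W = weight_vector(C, G)
print(W)
```

With the full model, `C` would hold four rows of seventeen fuzzy weights and `G` four fuzzy weights, giving seventeen entries in `W`.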

When the decision matrix and the weight vector are available, the decision matrix is normalized and weighted. From the weighted normalized decision matrix, one can obtain ideal and negative-ideal solutions, distance measures and finally overall performance scores of the companies as explained in ‘Appendix 2.’

3 Illustrative example

In the illustrative example, overall performance scores of eight companies from the machinery and components sector (cluster) located in Ivedik Organized Industrial Zone, Ankara, Turkey, are calculated and compared. The selected eight companies manufacture and sell components, parts, modules and systems fitting the machineries and are denoted C1-C8 in the paper. The eight companies cooperate with not only national but also foreign companies as their suppliers. A general evaluation of the eight companies shows that the companies’ manufacturing facilities are highly automated and computerized and they use automated internet-based communication and ordering and payment systems with their customers and suppliers. The computer-aided engineering, design and manufacturing programs are very common in all companies. In terms of advanced management practices, the companies have established programs to improve the quality of their products and customer service and to reduce time delays in manufacturing and designing products.

3.1 Application of the modified fuzzy TOPSIS approach in calculation of the companies’ overall performance scores

The application of the modified fuzzy TOPSIS approach requires two inputs, namely the decision matrix and the fuzzy weight vector of the critical activities with respect to the overall goal. Tables 2, 3, 4 and 5 present the results of the calculation steps described in Sect. 2.1.

Table 2 The decision matrix
Table 3 Fuzzy weights of the critical activities against the four objectives
Table 4 Fuzzy weights of the four objectives with respect to the overall goal
Table 5 Weight vector (fuzzy weights of the critical activities with respect to the overall goal)

With the application of the modified fuzzy TOPSIS, the decision matrix is first normalized, weighted and defuzzified (Table 6). The last two rows of Table 6 (\( A^{*} \) and \( A^{-} \)) provide the best and worst values at the seventeen critical activities. The distances between each company’s performance scores and the best and worst values at the seventeen critical activities (\( d_{i}^{*} \) and \( d_{i}^{-} \)) are then computed (Table 7). Finally, the overall performance scores and rankings of the eight manufacturing companies are calculated from these distances.

Table 6 Defuzzified weighted and normalized decision matrix
Table 7 The overall performance scores of the eight manufacturing companies
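
The sequence of steps above (normalize, weight, defuzzify, find best and worst values, measure distances, score) can be sketched as follows. Because the exact formulas are given in the paper's Appendix 2 and are not reproduced here, every formula in this sketch is an assumption: normalization divides by the largest upper bound in each column, defuzzification averages the four vertices, distances are Euclidean, and the score is the usual TOPSIS closeness ratio. The fuzzy scale and the weights are hypothetical.

```python
import math
from typing import List, Tuple

Trapezoid = Tuple[float, float, float, float]


def fuzzy_topsis_scores(D: List[List[Trapezoid]], W: List[Trapezoid]) -> List[float]:
    """D[i][j]: fuzzy score of company i at activity j (assumed positive).
    W[j]: fuzzy weight of activity j. Returns one closeness score per company."""
    n, m = len(D), len(D[0])
    # Assumed normalization: divide by the largest upper bound per activity.
    col_max = [max(D[i][j][3] for i in range(n)) for j in range(m)]
    # Weight vertex by vertex, then defuzzify by averaging the four vertices.
    V = [
        [sum((D[i][j][k] / col_max[j]) * W[j][k] for k in range(4)) / 4.0
         for j in range(m)]
        for i in range(n)
    ]
    best = [max(V[i][j] for i in range(n)) for j in range(m)]   # A*
    worst = [min(V[i][j] for i in range(n)) for j in range(m)]  # A-
    scores = []
    for row in V:
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        total = d_best + d_worst
        # If all companies are identical, both distances are zero.
        scores.append(d_worst / total if total > 0 else 0.0)
    return scores


# Hypothetical two-level excerpt of a fuzzy scale (not the paper's Table 1).
LOW: Trapezoid = (0.0, 1.0, 2.0, 3.0)
HIGH: Trapezoid = (7.0, 8.0, 9.0, 10.0)

D = [[HIGH, HIGH], [LOW, LOW], [HIGH, LOW]]            # 3 companies, 2 activities
W = [(0.2, 0.3, 0.4, 0.5), (0.1, 0.2, 0.3, 0.4)]       # hypothetical weights
scores = fuzzy_topsis_scores(D, W)
print(scores)
```

In this toy case the first company matches the best value at every activity and the second matches the worst value at every activity, so they bound the score range, and the mixed third company falls in between.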

Figure 2 provides the application results. The average overall performance score is 0.604, and the highest ranked manufacturing company is C5 with the overall performance score of 0.6721. On the other hand, the lowest ranked manufacturing company is C4 with a score of 0.2495. With the results of the proposed model, the companies can compare their overall performance scores with the industry average, maximum and minimum scores. An overall performance score close to the industry minimum should increase the urgency and importance of performance improvement programs in the corresponding company (Yurdakul and Ic 2005).

Fig. 2 The illustration of the overall performance scores and rankings of the eight companies

While selecting critical activities for performance improvement programs, the companies can also use their performance scores in critical activities (Table 2) along with the critical activities’ weights with respect to the overall goal (Table 5). The relative importance of a critical activity is directly proportional to its weight value. Table 8 is prepared to present the information needed for performance improvement in a simplified form. The first column of Table 8 lists the critical activities in descending order of their weights. The critical activity with the highest weight value is A8, so the first row of Table 8 belongs to A8. On the other hand, the last row of Table 8 contains the least important critical activity (A16). The last column of the table contains the average performance scores of the listed seventeen critical activities. In selecting a certain number of critical activities from Table 8 for performance improvement, the companies can compare their performance scores with the average performance score in each activity, starting with the first row and moving down. A performance score lower than the industry average indicates a candidate activity for performance improvement. The total number of selected activities for performance improvement may change from one company to another depending on many factors such as allocated resources.

Table 8 Performance scores of the eight companies at the seventeen critical activities
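
The selection rule described above can be sketched as a short Python function. The function name, the optional `max_count` cap (standing in for resource limits) and all numeric values are hypothetical. Only the ordering of A8 as most important and A16 as least important follows the text.

```python
from typing import Dict, List, Optional


def improvement_candidates(
    weights: Dict[str, float],
    company_scores: Dict[str, float],
    average_scores: Dict[str, float],
    max_count: Optional[int] = None,
) -> List[str]:
    """Walk the activities in descending order of weight and keep those
    where the company scores below the industry average."""
    ordered = sorted(weights, key=weights.get, reverse=True)
    below = [a for a in ordered if company_scores[a] < average_scores[a]]
    return below if max_count is None else below[:max_count]


# Hypothetical defuzzified values for three of the seventeen activities.
weights = {"A16": 0.02, "A8": 0.09, "A3": 0.07}
company_scores = {"A8": 0.40, "A3": 0.75, "A16": 0.30}
average_scores = {"A8": 0.55, "A3": 0.60, "A16": 0.45}

candidates = improvement_candidates(weights, company_scores, average_scores)
print(candidates)
```

Passing `max_count=1` would keep only the most heavily weighted candidate, which mirrors a company that can fund a single improvement program.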

3.2 Checking the accuracy of the proposed performance measurement model’s ranking results

The proposed performance measurement model’s ranking results are compared with the ones obtained by XBANK credit experts and by the COSEL model developed in Ic and Yurdakul’s (2010) paper to check the accuracy of the proposed model. The credit experts use private and public information on the companies and industries in making their credit decisions. They may use various models and rules for their decisions, but personal and institutional expertise gained by making similar credit decisions and observing their results over the years is still the most important tool in their credit decisions. On the other hand, COSEL uses seven financial ratios, namely current ratio, liquidity ratio, ratio of total liabilities to total assets, ratio of bank debt to equity, ratio of operating profits to net sales, ratio of net earnings to net sales and ratio of net earnings to equity. In the application of COSEL, the user enters the weights for the seven ratios. The sum of the products of the seven financial ratios’ values and their weights provides the overall performance score for a company.

For comparison of the ranking results, the statistical significance of the differences between the rankings obtained by the proposed model and the ones obtained by the bank credit experts and COSEL (Ic and Yurdakul 2010) is determined using Spearman’s rank-correlation test (Table 9). Spearman’s rank-correlation test is recommended when ‘the actual values of paired data are substituted with the ranks which the values occupy in the respective samples’ (Parkan and Wu 1999). In the application of Spearman’s test, to test the null hypothesis (H0: There is no similarity between the two rankings), a test statistic, Z, is calculated using Eqs. (6) and (7) and compared with the critical Z-value for a pre-determined level of significance α. In this study, α = 0.05 is selected, which corresponds to a critical Z-value of 1.645. If the test statistic computed by Eq. (7) exceeds 1.645, the null hypothesis is rejected and it is concluded that ‘H1: The two rankings are similar’ is true. In Eqs. (6) and (7), \( d_{i} \) is the ranking difference of company i, \( n_{mc} \) is the number of companies to be compared, and \( r_{s} \) is Spearman’s rank-correlation coefficient (Ic and Yurdakul 2010).

$$ r_{s} = 1 - \frac{{6 \times \mathop \sum \nolimits_{i = 1}^{{n_{mc} }} \left( {d_{i} } \right)^{2} }}{{n_{mc} \times \left( {n_{mc}^{2} - 1} \right)}} $$
(6)
$$ Z = r_{s} \sqrt {n_{mc} - 1} $$
(7)
Table 9 Ranking results of the three approaches and calculation of the significance of the differences

Table 9 presents the test results for the three approaches. The last row in Table 9 shows the test statistics (Z-values) computed from Spearman’s rank-correlation coefficients for the two comparisons. Both Z-values, 1.858 and 2.047, are higher than 1.645, so the null hypothesis is rejected in both cases, which implies that the differences between the rankings are statistically insignificant. Based on the test results, it can be concluded that the rankings obtained by the proposed model are statistically similar to the rankings obtained using credit expert opinions and COSEL.
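
Eqs. (6) and (7) are straightforward to compute directly. The sketch below implements them as written, assuming rankings without ties. The function name and the two example rankings are hypothetical and are not the rankings in Table 9.

```python
import math
from typing import Sequence, Tuple


def spearman_z(rank_a: Sequence[int], rank_b: Sequence[int]) -> Tuple[float, float]:
    """Return (r_s, Z) for two rankings of the same companies.
    r_s follows Eq. (6) and Z = r_s * sqrt(n_mc - 1) follows Eq. (7)."""
    if len(rank_a) != len(rank_b):
        raise ValueError("both rankings must cover the same companies")
    n_mc = len(rank_a)
    if n_mc < 2:
        raise ValueError("at least two companies are required")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    r_s = 1.0 - 6.0 * sum_d2 / (n_mc * (n_mc ** 2 - 1))
    z = r_s * math.sqrt(n_mc - 1)
    return r_s, z


# Hypothetical rankings of eight companies by two approaches.
model_rank = [1, 2, 3, 4, 5, 6, 7, 8]
expert_rank = [2, 1, 4, 3, 6, 5, 8, 7]

r_s, z = spearman_z(model_rank, expert_rank)
similar = z > 1.645  # critical Z-value at alpha = 0.05, as used in the text
print(round(r_s, 4), round(z, 4), similar)
```

For eight companies the largest attainable statistic is the square root of 7, about 2.646, which is reached only when the two rankings are identical.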

4 Conclusions

The developed performance measurement model not only provides a ranking of a company within its industry, but also compares its performance with its competitors’ performances in critical activities. It also provides a structure to develop performance improvement programs to close the performance gap between the company and higher-ranked companies.

However, the proposed model in this paper is limited to determining performance scores in a specific industry. To apply it in another industry, the user has to search for and gather all necessary information about the new industry and modify the list of statements and the objectives of the decision hierarchy accordingly. As a further study, one may try to determine a robust decision hierarchy. Such a study requires the development of a common list of critical activities and objectives. Another further study can use other fuzzy logic techniques, such as hesitant fuzzy or intuitionistic fuzzy approaches, for modeling the uncertainties in the decision environment. As the level of uncertainty in the decision environment increases, other fuzzy logic techniques may provide more realistic results.