1 Preface

Amid the current trend of globalization, corporate operations face a rapidly changing environment. To survive, an enterprise has to adapt to that environment, change the direction of its business operations, and promote operational reform and innovation. During the first oil crisis, industries around the world were affected: global demand was weak and production costs rose sharply, and the economic recession brought Japan to almost zero growth. Even now, many enterprises remain in a precarious state. One notable exception is the Toyota company of Japan, which was almost unaffected by the adverse global environment; although its profit fell slightly, it remained large compared with that of other enterprises. Toyota’s management has therefore become a paradigm of scientific management recognized around the world. Enterprises worldwide have been drawn to the Toyota Production System (TPS) and have studied it to find ways to introduce it effectively into their own operations. The empirical study of Womack and Jones [1] pointed out that the key to its success is so-called lean production. Today, many large enterprises around the world have developed, from TPS, lean production or service systems suited to their own companies, and many researchers have studied lean production and service systems [2–4].

The Department of Business Administration of China University of Technology received support for the project “The research and training center of lean enterprise” from the “Union project of academy and research institute” program of the Department of Industrial Technology, Ministry of Economic Affairs (MOEA), Taiwan, to establish a Toyota lean service simulation laboratory. In the laboratory, the preparation of Taiwan’s delicious foods, shown in Fig. 1, is used as an example for lean production and service simulation experiments. The experiment is divided into three stages. In the first stage, the employees prepare the foods under chaotic conditions. In the second stage, the preparation flow is improved. In the third stage, batch material preparation and information propagation are improved so that the cost of preparation is reduced and the value of the information is enhanced. In this study, data mining techniques are used to analyze the data collected in the laboratory in order to investigate whether adopting lean production and service enhances overall production and service performance. The characteristics of the employees in the laboratory are also investigated. Finally, lean service performance (LSP) and Employee Characteristic Analysis (ECA) models are set up as a reference for enterprises seeking to enhance production and service performance and human resource utilization.

Fig. 1 Taiwan’s delicious foods as prepared by laboratory

This article is organized as follows. The first section introduces the research motive and objective. The second section introduces the data mining techniques adopted and the related literature. The third section presents the sample data and the empirical analysis. The fourth section gives the conclusions and suggestions.

2 Research method

2.1 Kmedoid clustering

Kmedoid is a more recent clustering method. The main difference between Kmeans and Kmedoid lies in how the cluster centers are calculated: in Kmedoid, the new cluster center is the data point nearest to the mean of the points in the cluster.

2.1.1 Kmedoid algorithm

Given the data set X, choose the number of clusters c such that 1 < c < N. Initialize with random cluster centers chosen from the data set X. Repeat for l = 1, 2, …

Step 1: Compute the distances

$$ D_{ik}^{2} = (X_{k} - V_{i})^{T} (X_{k} - V_{i}),\quad 1 \le i \le c,\; 1 \le k \le N $$
(1)

Step 2: Assign each point to the cluster whose center is at the minimal distance.

Step 3: Calculate the provisional ("fake") cluster centers as the mean of the N_i points assigned to cluster i

$$ V_{i}^{(l)*} = \frac{\sum\nolimits_{j = 1}^{N_{i}} X_{j}}{N_{i}} $$
(2)

Step 4: Choose the data point nearest to each provisional center as the new cluster center

$$ D_{ik}^{2*} = (X_{k} - V_{i}^{(l)*})^{T} (X_{k} - V_{i}^{(l)*}) $$
(3)

and

$$ X_{i}^{*} = \arg \min_{X_{k}} D_{ik}^{2*};\quad V_{i}^{(l)} = X_{i}^{*} $$
(4)

Until the cluster centers no longer change, that is,

$$ \max_{1 \le i \le c} \left\| V_{i}^{(l)} - V_{i}^{(l - 1)} \right\| = 0 $$
(5)

End: Calculate the partition matrix.
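The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the toolbox implementation used in the study: the function names, the optional `init` argument, the iteration cap, and the choice to search for the nearest point only among a cluster's own members are all assumptions made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points, as in Eq. (1)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmedoid(points, c, init=None, max_iter=100, seed=0):
    """Cluster `points` into c groups; every center is an actual data point."""
    rng = random.Random(seed)
    # Initial centers: given explicitly, or drawn at random from the data set
    centers = list(init) if init is not None else rng.sample(points, c)
    labels = []
    for _ in range(max_iter):
        # Steps 1-2: assign each point to its nearest center
        labels = [min(range(c), key=lambda i: sq_dist(p, centers[i]))
                  for p in points]
        new_centers = []
        for i in range(c):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if not members:                  # empty cluster: keep old center
                new_centers.append(centers[i])
                continue
            # Step 3: provisional ("fake") center = mean of the members
            mean = tuple(sum(col) / len(members) for col in zip(*members))
            # Step 4: snap to the nearest data point (searched among the
            # cluster's own members here, which is an assumption)
            new_centers.append(min(members, key=lambda p: sq_dist(p, mean)))
        if new_centers == centers:           # Eq. (5): centers unchanged
            break
        centers = new_centers
    return centers, labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmedoid(pts, 2, init=[(0, 0), (0, 1)])
print(centers, labels)
```

Even with both initial centers placed in the same group, this toy data set converges to one center per group. Like Kmeans, the procedure can still stop at a poor local solution for other starting points, so several random initializations are usually worth trying.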

2.2 Grey relational analysis

Professor Deng [5] proposed grey theory, which has since been applied successfully in many fields. Grey theory is particularly suitable for forecasting and analysis: it makes full use of limited data and information in order to predict future data. Under an uncertain system model and incomplete information, grey theory performs relational analysis and model construction for the system, and the state of the system is then investigated and understood through forecasting and decision-making methods.

Grey relational analysis (GRA), a part of grey theory, is a method for measuring the relation among discrete data sequences, and it is widely applied in performance assessment [6–8]. It quantitatively compares and analyzes how the factors of a grey system develop, and it evaluates the degree of relation among factors from the similarity or dissimilarity of their development trends. The relations among the factors of the system are thus made explicit, one by one, to serve as useful information and a reliable basis for system decision making, forecasting, and control. Such an analysis clarifies the grey relations among the factors of a grey system and quantifies the trend of change in the system's development.

The steps in the algorithm of Grey relational analysis are as follows:

Step 1: From the original decision-making matrix D, find the standard sequence A0 and the inspected sequences Ai. The standard sequence is the set formed by the ideal target values of all the influential factors, A0 = (x01, x02, …, x0j, …, x0n), which contains n terms indexed by j = 1, 2, …, n. The performance values of each inspected sequence are Ai = (xi1, xi2, …, xij, …, xin), where i = 1, 2, …, m.

Step 2: Normalize the data of original decision making matrix D.

Step 3: Calculate the grey relational distance Δ0ij, which measures the difference between each normalized value and the corresponding normalized reference value.

$$ \Delta_{0ij} = \left| x_{0j}^{*} - x_{ij}^{*} \right| $$
(6)

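Step 4: Calculate the grey relational coefficient γ0ij, where Δmin and Δmax are the smallest and largest grey relational distances over all i and j, and ζ is the distinguishing coefficient.

$$ \gamma_{0ij} = \frac{\Delta_{\min} + \zeta \Delta_{\max}}{\Delta_{0ij} + \zeta \Delta_{\max}} $$
(7)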

Step 5: Calculate the grey relational grade Γ0i, where ωj is the weight of factor j, with the following formula:

$$ \Gamma_{0i} = \sum\limits_{j = 1}^{n} \left[ \omega_{j} \times \gamma_{0ij} \right] $$
(8)

Step 6: Rank the inspected sequences by grey relational grade to obtain the performance ranking.
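The six steps can be illustrated with a short Python sketch. Several details are assumptions made for the example rather than settings reported in this article: the smaller-the-better min-max normalization in Step 2, the distinguishing coefficient ζ = 0.5, equal weights ωj = 1/n, and the function name itself.

```python
def grey_relational_grades(matrix, weights=None, zeta=0.5):
    """Return one grey relational grade per row (inspected sequence).

    Every column is treated as a smaller-the-better factor (assumption).
    """
    n = len(matrix[0])
    weights = weights or [1.0 / n] * n        # equal weights (assumption)
    cols = list(zip(*matrix))
    lo = [min(col) for col in cols]
    hi = [max(col) for col in cols]
    # Step 2: normalize so that 1 is the best (smallest) observed value
    norm = [[(hi[j] - row[j]) / (hi[j] - lo[j]) if hi[j] != lo[j] else 1.0
             for j in range(n)] for row in matrix]
    # Step 1: the standard sequence is the ideal value of every factor,
    # which equals 1 after this normalization
    ref = [1.0] * n
    # Step 3: grey relational distances, Eq. (6)
    delta = [[abs(ref[j] - row[j]) for j in range(n)] for row in norm]
    d_min = min(min(row) for row in delta)
    d_max = max(max(row) for row in delta)
    if d_max == 0:                            # all sequences are identical
        return [1.0] * len(matrix)
    # Step 4: grey relational coefficients, Eq. (7)
    gamma = [[(d_min + zeta * d_max) / (d + zeta * d_max) for d in row]
             for row in delta]
    # Step 5: grey relational grades, Eq. (8)
    return [sum(w * g for w, g in zip(weights, row)) for row in gamma]

data = [[1, 2], [2, 4], [3, 6]]               # three records, two factors
grades = grey_relational_grades(data)
# Step 6: rank the records, best grade first
ranking = sorted(range(len(grades)), key=lambda i: grades[i], reverse=True)
print(grades, ranking)
```

In this toy matrix the first record has the smallest value in every column, so it coincides with the standard sequence and receives the highest grade.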

2.3 Genetic Programming

Genetic Programming (GP) is an algorithm developed by Professor Koza [9, 10] of Stanford University, USA, following the genetic algorithm of Holland [11]. Like the genetic algorithm, it has chromosomes, a fitness function, and reproduction, crossover, and mutation mechanisms. The difference is that Genetic Programming replaces the binary genes (0 and 1) of the chromosome with a syntax tree. Each individual in the population therefore represents a computer program, and these programs, like genes, can evolve through natural selection into an optimal program.

Genetic Programming originates from the genetic algorithm. A genetic algorithm usually represents a chromosome as a fixed-length binary string, whereas Genetic Programming represents it as a tree whose size, shape, and structure can vary widely, with different trees representing different formulas, as shown in Fig. 2.

Fig. 2 Syntax tree of Genetic Programming

Here, the plus and minus signs are internal nodes, and the remaining terminal nodes (X, Y, and 5) are elements of the terminal set defined by the problem. The formula corresponding to the syntax tree is (X − 5) + Y.
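As a small illustration of how such a tree encodes a formula, the sketch below stores the tree for (X − 5) + Y as nested tuples and evaluates it recursively. The tuple layout and the names `FUNCTIONS`, `TREE`, and `evaluate` are choices made for this example only.

```python
import operator

# Internal nodes are (function_name, left_child, right_child);
# leaves are variable names (str) or numeric constants.
FUNCTIONS = {"plus": operator.add, "minus": operator.sub}

TREE = ("plus", ("minus", "X", 5), "Y")       # (X - 5) + Y

def evaluate(node, env):
    """Evaluate a syntax tree given a dict of variable values."""
    if isinstance(node, tuple):               # internal node: apply function
        name, left, right = node
        return FUNCTIONS[name](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):                 # terminal: variable lookup
        return env[node]
    return node                               # terminal: constant

print(evaluate(TREE, {"X": 8, "Y": 2}))       # (8 - 5) + 2
```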

In addition, for model construction, a genetic algorithm searches for the optimal parameters of a known model structure, whereas Genetic Programming constructs the model structure together with its parameters. The basic components of Genetic Programming include:

  1. Terminal set: the set of values fed into Genetic Programming, which can be constants or variables. Because they sit at the leaves at the ends of the branches of the tree, they are called the terminal set.

  2. Function set: arithmetic operations, logical operations, or user-defined functions.

  3. Fitness function: a formula designed to measure how good a chromosome is. Applying the fitness function to a chromosome yields a value; the closer this value is to the target value, the higher the probability that the chromosome is selected.

  4. Control parameters: population size, crossover probability, mutation probability, and so on.

  5. Termination condition: a fixed number of generations or an error tolerance.

The main operators of Genetic Programming include:

  1. Selection: Following Darwin's theory of natural selection, selection and elimination are performed before crossover and mutation.

  2. Reproduction: Excellent chromosomes are copied intact from the parent generation to the next generation.

  3. Crossover: Crossover passes the excellent genes in the chromosomes of the parent generation on to the next generation. Two parents are selected at random, a node in each is selected at random as the crossover point, and the parts below the crossover points are exchanged to produce two different offspring, as shown in Fig. 3.

    Fig. 3 Crossover illustration

  4. Mutation: Mutation prevents chromosomes from falling into a local optimum and can also prevent unlimited growth in the number of tree layers. In the usual operation, a node is selected at random as the mutation point, and an operator selected at random from the function set or terminal set replaces the original gene. Under certain conditions, however, chromosomes tend to grow as the generations increase. To prevent the tree from expanding, mutation can instead select a gene in the tree at random, remove it, replace it with the gene next to it, and delete all the arguments of the removed gene, as shown in Fig. 4.

    Fig. 4 Mutation illustration

In Genetic Programming, after selection, reproduction, crossover, and mutation, the chromosomes of the offspring generation replace those of the parent generation, and the process repeats until the termination condition is met.
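The crossover and ordinary mutation operators described above can be sketched on tuple-based syntax trees as follows. This is an illustrative sketch under stated assumptions, not GPLAB's implementation: the helper names (`paths`, `get`, `put`, `crossover`, `mutate`, `size`), the example function and terminal sets, and the uniform choice of nodes are all hypothetical, and the depth-limiting variant of mutation is not shown.

```python
import random

# A tree is a terminal (str or number) or a tuple (function_name, child, child).
FUNCTION_SET = ["plus", "minus", "times"]     # example function set
TERMINAL_SET = ["X", "Y", 5]                  # example terminal set

def paths(tree, prefix=()):
    """List the path (tuple of child indices) to every node in the tree."""
    found = [prefix]
    if isinstance(tree, tuple):
        for k, child in enumerate(tree[1:], start=1):
            found += paths(child, prefix + (k,))
    return found

def get(tree, path):
    """Return the subtree located at `path`."""
    for k in path:
        tree = tree[k]
    return tree

def put(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    k = path[0]
    return tree[:k] + (put(tree[k], path[1:], new),) + tree[k + 1:]

def crossover(mother, father, rng):
    """Swap randomly chosen subtrees of two parents, giving two offspring."""
    p1, p2 = rng.choice(paths(mother)), rng.choice(paths(father))
    return (put(mother, p1, get(father, p2)),
            put(father, p2, get(mother, p1)))

def mutate(tree, rng):
    """Replace the symbol at one random node by a symbol of the same kind."""
    p = rng.choice(paths(tree))
    node = get(tree, p)
    if isinstance(node, tuple):               # function node: keep children
        return put(tree, p, (rng.choice(FUNCTION_SET),) + node[1:])
    return put(tree, p, rng.choice(TERMINAL_SET))

def size(tree):
    """Number of nodes in the tree."""
    return len(paths(tree))

rng = random.Random(1)
parent_a = ("plus", ("minus", "X", 5), "Y")
parent_b = ("times", "X", "Y")
child_a, child_b = crossover(parent_a, parent_b, rng)
print(child_a, child_b, mutate(parent_a, rng))
```

Because crossover only exchanges subtrees, the two offspring together always contain exactly as many nodes as the two parents, and this point mutation leaves the tree size unchanged.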

In this study, the GPLAB toolbox for Matlab is adopted for model construction. During model construction, GPLAB dynamically searches the related Genetic Programming parameters (for example, crossover probability, mutation probability, tree depth, and number of nodes) and displays the evolution over generations in a chart until the parameters are optimized. Genetic Programming has been applied in many fields, including finance and banking [12, 13], management [14], and engineering [15], with good results.

3 Empirical study

3.1 Sample data and variable

This study was carried out in the Toyota lean service simulation laboratory at China University of Technology, and two sets of lean service experimental data were collected from March 23, 2010, to April 1, 2010. There are 40 records from the first stage, 40 from the second stage, and 20 from the third stage, for a total of 100 records. The factors recorded are the time from arrival at the restaurant to the completion of ordering (X1), the time from the completion of ordering to the arrival of the ordered dishes (X2), the time from the completion of ordering to leaving the table for payment after eating (X3), the time from the completion of ordering to leaving after payment (X4), the number of wrong dishes delivered (X5), and the number of extra dishes delivered (X6). The first-stage and second-stage data are combined to form the first set (80 records), and the second-stage and third-stage data are combined to form the second set (60 records). This makes it possible to investigate lean production and service performance and employee characteristics between the two stages within each set. The descriptive statistics of the two sets are shown in Table 1.

Table 1 Descriptive statistics for the two sets of data

3.2 The performance analysis of lean service simulation experiment

In this study, following Pan [16], all the variables in the two sets of data are used as indexes for assessing lean production and service performance, and for every index a smaller value is better. The assessment indexes are analyzed using the Grey relational analysis proposed by Professor Deng [5], and the grey relational Matlab toolbox developed by Wen et al. [17] is used to compute the Grey Relational Grade for ranking. The top 40 rankings in the first set and the top 20 rankings in the second set are regarded as good performance, and the bottom 40 rankings in each set as bad performance. These labels are then compared with the earlier stage (expected to perform worse) and the later stage (expected to perform better) within each set to judge the performance of the two sets of experimental data. The result of the Grey relational analysis is shown in Fig. 5. The minimal value of each factor is used as the standard sequence: the bold black line at the bottom represents the standard sequence, and the fine lines represent the inspected sequences, each with six nodes for the six factors of that record. The closer an inspected sequence is to the standard sequence, the better the lean production and service performance of that record. The analysis found that among the latter 40 records of the first set (the second stage), only five show bad performance, so 35 of 40, or 87.5%, show good performance and the second-stage experiment improves on the first. Among the latter 20 records of the second set (the third stage), only six show bad performance, so 14 of 20, or 70%, show good performance and the third stage improves on the second. From these results, it can be judged that lean production and service can indeed enhance overall service performance.

Next, the experimental data are used to set up a lean service performance (LSP) model as a reference for related enterprises, which only need to substitute the related factors into the model to judge whether lean production and service should be adopted to enhance their performance. The bottom 40 rankings of the Grey relational analysis results of each set represent bad performance (coded 1), and the top 40 rankings in the first set and the top 20 rankings in the second set represent good performance (coded 0). This code is used as the dependent variable and the six factors as independent variables to build three lean service performance models: Genetic Programming, Back-Propagation Network (BPN), and logistic regression (LR).
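The coding of the dependent variable and the two percentages quoted above can be reproduced with a few lines of Python. The function names are hypothetical and the grades in the first call are made-up illustration values; only the counts (5 bad of 40, 6 bad of 20) come from the text.

```python
def label_by_rank(grades, n_bad):
    """Code the n_bad lowest grey relational grades as 1 (bad), the rest as 0."""
    order = sorted(range(len(grades)), key=lambda i: grades[i])  # worst first
    labels = [0] * len(grades)
    for i in order[:n_bad]:
        labels[i] = 1
    return labels

def good_share(labels):
    """Fraction of records coded 0 (good performance)."""
    return labels.count(0) / len(labels)

# Illustration with invented grades: the two lowest grades are coded 1
print(label_by_rank([0.9, 0.2, 0.5, 0.7], n_bad=2))

# Second stage within the first set: 5 bad records out of 40
print(good_share([1] * 5 + [0] * 35))         # 35/40
# Third stage within the second set: 6 bad records out of 20
print(good_share([1] * 6 + [0] * 14))         # 14/20
```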

Fig. 5 The linear sequence chart of GRA

The Genetic Programming model is constructed first. In the model architecture, the node functions are plus, mydivide, square, times, minus, and mylog. The initial evolution parameters are 100 generations and 30 chromosomes, and the termination condition is the number of generations. The first and second sets of data are each divided into four groups, with 20 records per group in the first set and 15 per group in the second set. Three groups are used as training data to construct the model and one group as test data to check model stability, and cross-validation is carried out. Figure 6 shows the output of the Genetic Programming models constructed from the first three groups of the first and second sets as training data. Figure 6a shows the dynamically adjusted tree depth and number of nodes in each generation. Figure 6b shows the Pareto front, that is, the best fitness found for each tree size. Figure 6c shows the syntax trees of Genetic Programming, which can be written as the following two symbolic regression formulas.
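The four-group cross-validation described here can be sketched as below. The sketch assumes that the groups are consecutive blocks of records in their stored order, which the text does not state, and `four_fold_splits` is a hypothetical helper name.

```python
def four_fold_splits(n_records, k=4):
    """Yield (train_indices, test_indices) for k consecutive equal groups.

    Assumes n_records is divisible by k, as with 80 and 60 records here.
    """
    fold = n_records // k
    for f in range(k):
        test = list(range(f * fold, (f + 1) * fold))
        held_out = set(test)
        train = [i for i in range(n_records) if i not in held_out]
        yield train, test

for n in (80, 60):                            # first set, second set
    for train, test in four_fold_splits(n):
        print(n, len(train), len(test), test[0], test[-1])
```

The last split of each set trains on the first three groups and tests on the fourth, which matches the configuration reported for Fig. 6.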

Fig. 6 The output charts of LSP model from GP

The regression formula for the first set is as follows:

plus(times(X3,plus(X3,times(X5,plus(plus(mydivide(mydivide(minus(X1,minus(plus(times(X5,plus(plus(X3,times(X6,X3)),X3)),plus(plus(X5,minus(X6,X3)),X5)),X3)),times(X5,plus(plus(X3,times(plus(X3,X3),times(X3,plus(X3,X5)))),plus(X3,times(X5,X3))))),minus(plus(plus(X3,X4),X3),plus(X3,times(X5,X3)))),times(mydivide(minus(X1,minus(X6,times(X5,plus(X3,X5)))),X2),plus(X3,X3))),X1)))),times(mydivide(mylog(X4),mylog(X3)),plus(X3,times(X5,X5)))).

The regression formula for the second set is as follows:

plus(X3,mydivide(X3,square(plus(X2,mydivide(plus(times(X4,mylog(X3)),mydivide(plus(X3,mydivide(X2,square(plus(X1,mydivide(plus(times(X4,mylog(X3)),mydivide(X2,minus(square(X2),X6))),mydivide(X4,X4)))))),X4)),X4))))).
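Because these formulas are written in prefix call notation, they can be evaluated directly once the six node functions are defined. The sketch below is illustrative only. It assumes that `mydivide` returns its numerator when the denominator is zero and that `mylog` returns the logarithm of the absolute value (and 0 for an input of 0); these protected definitions should be checked against the GPLAB version in use. `evaluate_formula` is a hypothetical helper, and `TAIL` is the final sub-expression of the first formula above.

```python
import math

def plus(a, b): return a + b
def minus(a, b): return a - b
def times(a, b): return a * b
def square(a): return a * a

def mydivide(a, b):
    """Protected division (assumed behaviour): numerator if dividing by 0."""
    return a if b == 0 else a / b

def mylog(a):
    """Protected logarithm (assumed behaviour): log|a|, and 0 when a is 0."""
    return 0.0 if a == 0 else math.log(abs(a))

PRIMITIVES = {"plus": plus, "minus": minus, "times": times,
              "square": square, "mydivide": mydivide, "mylog": mylog}

def evaluate_formula(expr, record):
    """Evaluate a prefix-notation formula for one record of X1..X6 values."""
    namespace = {"__builtins__": {}}          # expose only the primitives
    namespace.update(PRIMITIVES)
    namespace.update(record)
    return eval(expr, namespace)

TAIL = "times(mydivide(mylog(X4),mylog(X3)),plus(X3,times(X5,X5)))"
print(evaluate_formula(TAIL, {"X3": 2.0, "X4": 2.0, "X5": 3.0}))
```

Passing a full formula string together with a record containing all six factors gives the model output, which can then be thresholded into the 0/1 performance classes.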

Next, a back-propagation artificial neural network is adopted to construct the lean service performance model. For the network architecture, this study follows Yeh [18]: for general problems one hidden layer can be adopted, and the number of neurons in the hidden layer can be set to the number of input-layer neurons (6) plus the number of output-layer neurons (1), divided by 2 and rounded to 4, which gives a better forecast result. The network architecture is shown in Fig. 7a.

Fig. 7 The network structure charts of LSP model from BPN
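The hidden-layer sizing rule quoted from Yeh [18] is simple arithmetic, sketched below for the six-input LSP network and for the four-input ECA network used in Section 3.3. The function name is hypothetical. Half values are rounded up explicitly because Python's built-in `round` rounds halves to the nearest even integer.

```python
import math

def hidden_neurons(n_inputs, n_outputs):
    """(inputs + outputs) / 2, with halves rounded up."""
    return math.floor((n_inputs + n_outputs) / 2 + 0.5)

print(hidden_neurons(6, 1))                   # LSP model: 3.5 rounds to 4
print(hidden_neurons(4, 1))                   # ECA model: 2.5 rounds to 3
print(round(2.5))                             # built-in round gives 2
```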

Finally, logistic regression, a traditional statistical model, is adopted to construct the lean service performance model so that the three data mining techniques can be compared.

3.3 The Employee Characteristic Analysis of lean service simulation experiment

In this study, the factors X1, X2, X3, and X4 in the two sets of experimental data are used as Employee Characteristic Analysis (ECA) indexes. X1, the time from arrival at the restaurant to the completion of ordering, indicates whether the employee's attitude in serving the customer is proactive enough. X2, X3, and X4 are the time from the completion of ordering to the arrival of the ordered dishes, the time from the completion of ordering to leaving the table for payment after eating, and the time from the completion of ordering to leaving after payment; these factors reflect whether the employees are familiar with preparing the dishes and whether the dishes are served efficiently. The two sets of data are analyzed with Kmeans and Kmedoid hard clustering, which divides the data into two clusters. Following Liu and Xu [19], Fuzzy Sammon Mapping, a nonlinear mapping, is then adopted to map the clustering result from the multidimensional space onto a two-dimensional plane so that the result can be displayed clearly, as shown in Fig. 8.

Fig. 8 Kmeans and Kmedoid clustering analysis result

A comparison of the two sets of data shows that one group has a better service attitude and good dish-serving efficiency, while the other has a worse service attitude and lacks dish-serving efficiency. As for the clustering indexes, in the first set the Kmeans Partition Index (PI) is 1.7732 and the Separation Index (SI) is 0.0222, much higher than the Kmedoid PI of 0.0025 and SI of 0.000003. In the second set, the Kmeans PI is 1.0448 and the SI is 0.01742, much higher than the Kmedoid PI of 0.0029 and SI of 0.000005. For both PI and SI, smaller is better. The figure also shows that the Kmedoid clustering separates the two groups clearly, with an obvious gap between them, so the Kmedoid clustering result is used for the subsequent analysis. In the first set, 50 records are classified as better service attitude with good dish-serving efficiency (coded 0) and 30 as bad service attitude with bad dish-serving efficiency (coded 1). In the second set, 42 records are classified as better service attitude with good dish-serving efficiency (coded 0) and 18 as bad service attitude with worse dish-serving efficiency (coded 1).

Next, the clustering result is used as the dependent variable and the factors X1, X2, X3, and X4 as independent variables to construct three Employee Characteristic Analysis models (Genetic Programming, Back-Propagation Network, and logistic regression) as a reference for enterprises, which can then monitor the production and service performance of employees at any time. As in the previous section, the first and second sets are each divided into four groups, with 20 records per group in the first set and 15 per group in the second set, and cross-validation is carried out with three groups as training data to construct the model and one group as test data to check model stability. The Genetic Programming parameters are the same as in the previous section. Figure 9 shows the output of the Genetic Programming models constructed from the first three groups of the first and second sets as training data. Figure 9a shows the dynamically adjusted tree depth and number of nodes in each generation. Figure 9b shows the Pareto front, that is, the best fitness found for each tree size. Figure 9c shows the syntax trees of Genetic Programming, which can be written as the following two symbolic regression formulas.

Fig. 9 The output charts of ECA model from GP

The regression formula for the first set is as follows:

mylog(minus(X4,times(plus(plus(minus(square(times(X6,plus(X4,X4))),X4),times(square(mylog(minus(X4,times(plus(X4,times(X3,minus(X5,X1))),times(mylog(X4),mylog(X2)))))),X1)),mylog(X3)),times(mylog(X4),mylog(X2))))).

The regression formula for the second set is as follows:

minus(times(X2,mylog(times(mylog(plus(X2,minus(plus(X6,mylog(mylog(times(plus(X4,plus(X6,X2)),plus(X6,plus(X6,X2)))))),square(plus(X2,X4))))),mylog(mydivide(X4,mylog(X5)))))),minus(X2,square(square(X6)))).

Next, a back-propagation artificial neural network is adopted to construct the Employee Characteristic Analysis model. One hidden layer is again adopted, and the number of neurons in the hidden layer is set to the number of input-layer neurons (4) plus the number of output-layer neurons (1), divided by 2 and rounded to 3. The network architecture is shown in Fig. 7b. Finally, logistic regression, a traditional statistical model, is adopted to construct the Employee Characteristic Analysis model so that the advantages and disadvantages of the three data mining techniques can be compared.

3.4 Summary and comparison of the classification and forecast capabilities of the three models

In this study, the outputs of the Genetic Programming model and the back-propagation artificial neural network model are classified as 0 when the value is smaller than 0.5 and as 1 when it is larger than 0.5, so that the forecast capabilities of the three models can be compared. After cross-validation of the four groups of data with the results generated by the three models, ROC curves are plotted, as shown in Fig. 10.

Fig. 10 The ROC curve of classification forecast results for LSP and ECA model

Figure 10 shows the ROC curves of the classification forecasts from the cross-validation of the four groups of the two sets of data for the lean service performance model and the Employee Characteristic Analysis model. Bradley [20] pointed out that the larger the area above the reference line and underneath the curve, the more accurate the classification capability of the model. The figure clearly shows that, for both the lean service performance model and the Employee Characteristic Analysis model, the Genetic Programming model has the best classification capability.

Table 2 gives the output of the ROC curve analysis. Sensitivity (Sen) is the percentage of records with a real value of 1 that are forecast as 1, and specificity (Spe) is the percentage of records with a real value of 0 that are forecast as 0. Moreover, Hand and Till [21] pointed out that Gini Index = 2 × AUC − 1, where AUC is the area under the curve. For all of these indexes, larger is better. In the table, the sensitivity, specificity, AUC, and Gini Index of the Genetic Programming model are all higher than those of the other models, for both the lean service performance model and the Employee Characteristic Analysis model. Hence, it has very good alarming and detection capabilities.
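The indexes reported in Table 2 can be computed from model outputs as sketched below. The data are invented for illustration, the function names are hypothetical, an output of exactly 0.5 is assigned to class 0 (the text leaves that case open), and the AUC is computed with the rank-based pairwise formulation rather than by integrating a plotted curve.

```python
def classify(score, threshold=0.5):
    """Class 1 if the model output exceeds the threshold, otherwise class 0."""
    return 1 if score > threshold else 0

def sensitivity_specificity(actual, predicted):
    """Share of real 1s forecast as 1, and share of real 0s forecast as 0."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp / actual.count(1), tn / actual.count(0)

def auc(actual, scores):
    """Probability that a random class-1 record outscores a random class-0
    record, with ties counted as one half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(auc_value):
    """Gini Index = 2 x AUC - 1."""
    return 2 * auc_value - 1

actual = [0, 0, 1, 1]                         # invented example labels
scores = [0.1, 0.6, 0.4, 0.9]                 # invented model outputs
predicted = [classify(s) for s in scores]
sen, spe = sensitivity_specificity(actual, predicted)
area = auc(actual, scores)
print(sen, spe, area, gini(area))
```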

Table 2 Empirical results of the ROC curve analysis

4 Conclusions and suggestions

The main contribution of this study is to investigate whether the experimental data obtained from the Toyota lean service simulation experiment at China University of Technology show improved production and service performance. Grey relational analysis found that the lean production process (the second stage) can indeed enhance overall production and service performance. In addition, this article investigates employee characteristics during the production process. The clustering analysis found that the employees can be divided into two groups: one with a good service attitude and good dish-serving efficiency, and one with a worse service attitude and without dish-serving efficiency. Finally, this research sets up a lean service performance model and an Employee Characteristic Analysis model as a reference for enterprises seeking to enhance production and service performance and human resource utilization.

In addition, this article uses Grey relational analysis to analyze the performance of lean production and service. For future work, it is suggested that other models (for example, Data Envelopment Analysis) be adopted to further investigate production and service performance.