1 Introduction

Variability is one of the most troublesome factors in every process. It arises from common (random) causes and special causes. Common causes are inherent in the process. They are usually accepted, as their identification and elimination often entail significant costs. Special causes, however, can and must be eliminated [4] because their impact on the process results can be significant. The sources of special causes of process variability can usually be identified (assigned). They are either continuously present in the process (e.g. gradual wear of a tool’s blade) or occur suddenly (e.g. the breaking of a blade).

A widely recognized tool for monitoring process variability is the process control chart (PCC). The concept was proposed by Walter Shewhart in the 1920s [4].

Implementation of a PCC is usually based on the assumption that, at any given moment, the process follows a normal distribution with constant mean (µ) and standard deviation (σ). If this assumption ceases to hold over time, it is inferred that special causes of variation have occurred in the process, and these causes must be eliminated [4].

In its basic form, a PCC enables monitoring of the statistics (e.g. mean value or standard deviation) of selected process features (usually properties or qualities of machined products or processed materials). Conclusions concerning process stability are drawn on the basis of the probability of occurrence of certain values of these statistics [4]. Once the process is recognized as running out of control, a set of actions aimed at restoring it is undertaken. These actions consist of stopping the process, analyzing possible causes of instability, and correcting the process. The use of a PCC for process stability monitoring rests on the condition that the process capability, measured by the so-called process capability indices (Cp, Cpk), is sufficient. For practical reasons, a PCC analysis involves many predefined patterns that indicate the loss of process stability (Fig. 1), for example: one point beyond the control limits (LCL or UCL); seven consecutive points forming an increasing or decreasing trend; the so-called mixtures, i.e. the occurrence of groups of three consecutive observations above and then below the center line; and many more [4, 17].
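Two of the patterns listed above (a point beyond the control limits, and seven consecutive points forming a trend) can be checked mechanically. The following is a minimal sketch, not the procedure used in the study: the function names are hypothetical, the control limits are assumed to be given, and a "trend" is assumed to mean strictly increasing or strictly decreasing values.

```python
from typing import List, Sequence


def beyond_limits(points: Sequence[float], lcl: float, ucl: float) -> List[int]:
    """Return indices of points lying below LCL or above UCL."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


def monotone_runs(points: Sequence[float], length: int = 7) -> List[int]:
    """Return start indices of windows of `length` consecutive points
    that are strictly increasing or strictly decreasing (assumed
    interpretation of the 'trend' pattern)."""
    starts = []
    for s in range(len(points) - length + 1):
        window = points[s:s + length]
        diffs = [b - a for a, b in zip(window, window[1:])]
        if all(d > 0 for d in diffs) or all(d < 0 for d in diffs):
            starts.append(s)
    return starts


if __name__ == "__main__":
    # Illustrative values only; not data from the study.
    chart = [10.0, 10.1, 9.9, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 12.5]
    print("beyond limits at:", beyond_limits(chart, lcl=8.0, ucl=12.0))
    print("trend windows start at:", monotone_runs(chart, length=7))
```

Rules of this kind are simple to state, but each additional pattern (mixtures, 2 out of 3 in the warning zone) needs its own hand-written test, which is one motivation for the classifier-based approach discussed later.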

Fig. 1. The idea of a statistical model of the process (author’s research)

In most cases, patterns on a PCC are continuously observed by the employee who operates the machine. This employee, the “human operator”, also decides on the actions to be undertaken if the process loses its stability. Given that the operator’s main job is to operate the machine, and that the operator’s statistical knowledge is often insufficient [4], it can be expected that these decisions will not always be adequate.

Although Walter Shewhart published the theoretical basis of the PCC almost one hundred years ago, research into expanding the field of its applications is still being conducted. Messaoud [10], for example, applied a PCC to monitor the process of drilling in order to discover unacceptable vibrations. Zhu and Lin [21] used Shewhart’s control charts to monitor the density of wooden slabs. In 2011, Holmes and Mergen presented a case where a PCC was applied to an automatically controlled process. Their study was conducted on data from a process of brake drum production [5]. Chang studied the application of a PCC to the operation of curing high-pressure hoses. He showed that the application of a PCC made it possible to significantly reduce energy consumption [2]. Moore and Murphy analyzed the processes of surgical instrument production. These processes were characterized by frequent changes in machine settings and rebooting of machines due to the short-run type of production. They proposed and applied a short-run PCC [11]. Tangjitsitcharoen and Boranintr applied a PCC to monitor the surface roughness of a product processed through rolling with the use of a cutting force ratio. For the purpose of the statistical control of the process, they applied an I-MR (individual observations – moving range) PCC [18].

Many researchers attempt to improve the methods of PCC analysis based on the assumption that sets of points on a PCC can be viewed as “images” of a process and may be interpreted on the basis of a set of previously developed patterns, without the need to apply knowledge of the assumptions concerning probability distributions and statistics. Here, the term “image” means a time series formed by subsequent points on a chart, obtained by measuring values or statistics for a given process. It may be called an image of a process in a specified time frame. With this approach, PCC analysis becomes a classification task. It involves assigning a set of points to a certain group of patterns using an appropriate classifier and, on the basis of the cases that occur, assigning potential corrective measures. The results of this classification serve as decision-making support for the operator and can be very helpful in everyday work.
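The notion of an “image” as a fixed-length segment of the chart can be illustrated with a short sketch. It assumes that images are taken as overlapping windows over the charted series; the source does not state how the windows were cut, so this choice and the function name are illustrative only.

```python
from typing import List, Sequence


def extract_images(series: Sequence[float], length: int) -> List[List[float]]:
    """Cut a charted time series into every window of `length`
    consecutive points; each window is one 'image' to classify.
    Overlapping windows are an assumption of this sketch."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [list(series[i:i + length])
            for i in range(len(series) - length + 1)]


if __name__ == "__main__":
    # Illustrative values only; not data from the study.
    chart = [10.0, 10.2, 9.8, 10.1, 10.4, 10.6, 10.9]
    for image in extract_images(chart, length=5):
        print(image)
```

Each returned window would then be passed to a classifier, which assigns it to one of the predefined pattern classes or to the "no pattern" case.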

The relevant literature offers a number of classification methods, for example: classification through the induction of decision trees, Bayes classifiers, metaheuristics (e.g. genetic algorithms), rough sets, the k-nearest neighbor algorithm, and many other statistical methods that are constantly being developed [6, 15, 20]. Lesany et al., for instance, analyzed the methods of classification of control chart patterns in detail [7].

2 Process Control Charts Analysis with the Use of Artificial Neural Networks

Among classification methods, artificial neural networks (ANNs) have a special place. Networks are not given explicit algorithms; they learn from examples. Therefore, they are highly useful in solving problems related to the classification of objects. Several selected applications of ANNs to classification in the broadly understood area of production engineering are presented below.

In research carried out by Kujawińska and Hamrol, the best results in the application of ANNs to control chart analysis were obtained with the use of MLP (multilayer perceptron) networks [3]. Lu et al. used an ANN to monitor the state of the process and eliminate disturbances that occur during production [9]. Yu and Xi observed that neural networks are characterized by significant resistance to missing data and other disturbances. In their research they applied self-organizing maps for the control of process quality [19]. In 2013, Lonkwic published the results of research into the extent to which the neural network method is capable of identifying the dimensional part in the process of profile bending. The results were satisfactory and no worse than in the case of the traditional approach [8]. Lesany et al. applied learning vector quantization and MLP neural networks to study atypical patterns. They emphasize that correct and quick recognition and analysis of unnatural patterns on Shewhart’s control charts is important and, unfortunately, not always (or almost never) possible when using classical solutions [7]. In 2014, Rojek studied different types of neural networks for load prediction and failure location within a water-supply network [12, 13] and for a system supporting the design of a technological process [14]; other authors have reported related applications [1, 16].

However, the application of ANNs is not free of difficulties. Above all, the choice of parameters such as the number of neurons in particular network layers and the learning parameters should be mentioned. Another weakness of ANNs is that the results of the classification ultimately also depend on the variables used for the construction of models and on the length of the images. The latter significantly influences the effectiveness of recognition. As image length increases, a tendency of MLP networks toward faulty classification of symptoms was observed [3].

The aim of this paper is to study the usability of various pattern recognition methods based on artificial neural networks for decision-making support in process control chart analysis. Their effectiveness in detecting process instability is compared with the effectiveness of a human operator and of widely used commercial statistical software.

In the first stage of the investigation, three different types of networks were examined with respect to their usability for PCC analysis: MLP, RBF (Radial Basis Function) and Kohonen (SOFM, Self-Organizing Feature Map). MLP networks are considered the most universal and widespread networks for various technical problems. RBF networks have some advantages over MLP networks: they make it possible to model any nonlinear function. In Kohonen networks, training is organized by means of competitive self-organization. The results of the first stage made it possible to choose the best settings of the neural networks and use them as the input to the second stage of the research: a comparison of their effectiveness with the effectiveness of the human operator and commercial statistical software. Data from the grinding of roller bearings was used in the first stage, and data from grinding together with superfinishing of a TV screen surface in the second stage of the study.

Over the course of the first stage of research, the number of neurons in the hidden layer and the number of learning cycles were also varied. After each complete learning cycle (known as an epoch), the classification error for that epoch was calculated, and the whole cycle was repeated until this error fell below an acceptable level. The number of neurons in the hidden layer was selected experimentally: from 5 to 30 for the MLP network and from 5 to 60 for the RBF network. The number of learning cycles ranged from 10 to 150. In the case of the SOFM network, the topology of the network was also varied (the topology is the structure of the network; neurons are connected to adjacent neurons by neighborhood relations, and typical topologies are rectangular, hexagonal, or in the form of open or closed chains). Networks with topologies of 4 × 4, 10 × 10, 15 × 15 and 20 × 20 were used.
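The ranges given above define a search space of network settings. The sketch below enumerates such a space. The step sizes (5 neurons, 10 learning cycles) are assumptions made purely for illustration, since the source reports only the ranges, and the dictionary keys are hypothetical names.

```python
from itertools import product

# Ranges from the text; the step sizes are assumed, not reported.
MLP_NEURONS = range(5, 31, 5)      # 5 .. 30 hidden neurons
RBF_NEURONS = range(5, 61, 5)      # 5 .. 60 hidden neurons
EPOCHS = range(10, 151, 10)        # 10 .. 150 learning cycles
SOFM_TOPOLOGIES = [(4, 4), (10, 10), (15, 15), (20, 20)]


def candidate_settings():
    """List every combination of settings in the assumed grid."""
    settings = []
    for hidden, epochs in product(MLP_NEURONS, EPOCHS):
        settings.append({"type": "MLP", "hidden": hidden, "epochs": epochs})
    for hidden, epochs in product(RBF_NEURONS, EPOCHS):
        settings.append({"type": "RBF", "hidden": hidden, "epochs": epochs})
    for topology, epochs in product(SOFM_TOPOLOGIES, EPOCHS):
        settings.append({"type": "SOFM", "topology": topology, "epochs": epochs})
    return settings


if __name__ == "__main__":
    grid = candidate_settings()
    print(len(grid), "candidate configurations")
    print(grid[0])
```

Each configuration would be trained and scored in turn; the source describes selecting the settings experimentally rather than by an exhaustive grid, so this enumeration is only one possible way to organize the experiments.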

The MLP network was trained with the BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm; for example, “BFGS 130” means that the optimum solution was obtained in 130 steps. The RBF network was trained in two stages with the RBFT (Radial Basis Function Training) algorithm. In the first stage, radial functions were distributed using only the input variables from the data. In the second stage, the weights connecting the radial functions with the output neurons were determined. The SOFM network was trained with Kohonen’s method, which involves assigning nodes to a layer of radial neurons.

For training, 504 samples of length 5 or 10 elements were collected. The set consisted of typical patterns defined for PCCs. It was divided into a learning file (75% of records), a test file (15% of records) and a validation file (10% of records). The learning sets included classes of previously defined patterns. The results of training, testing and validation are presented in Table 1.
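The 75/15/10 division described above can be sketched as follows. Random shuffling with a fixed seed and rounding of the subset sizes are assumptions of this sketch; the source does not say how records were assigned to the three files.

```python
import random
from typing import List, Sequence, Tuple


def split_samples(samples: Sequence, train_frac: float = 0.75,
                  test_frac: float = 0.15,
                  seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle the samples and split them into learning, test and
    validation subsets; the validation subset takes the remainder."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle (assumed)
    n_train = round(train_frac * len(shuffled))
    n_test = round(test_frac * len(shuffled))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


if __name__ == "__main__":
    # 504 placeholder sample identifiers, matching the size in the text.
    learning, testing, validating = split_samples(range(504))
    print(len(learning), len(testing), len(validating))
```

With 504 samples the percentages do not divide evenly (15% is 75.6 records), so the exact subset sizes depend on the rounding convention chosen.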

Table 1. Results of evaluation of training, testing, validation errors and classification effectiveness for various types of networks in grinding process (authors’ research).

In the training and testing stage, the number of layers was selected along with the number of neurons so as to minimize the network’s error. This task was carried out by means of a training algorithm. The training process was at the same time the process of fitting the parameters of the model represented by the network to the available training and testing data. The network’s error was determined with an error function chosen by the researcher, which aggregates the differences between the network’s outputs and the target values for the cases presented to the network. The testing error was used to evaluate the accuracy of the neural network model during training, on the basis of data from the test sample that were not used in the first (training) stage. At the network testing stage, the network parameters may be changed once more. The validation error evaluates the neural network model on the basis of data not used during the training and testing of the network. The experiments compared the training error, the testing error and the validation error. An RMS error was used in the first experiments; entropy (single and multiple) was used next. Table 1 also includes the network evaluation expressed as the so-called classification quality (ANN quality), defined as the ratio of the number of correct indications of the network to the number of all possible indications, expressed as a percentage.
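The error measures mentioned above can be written down in a few lines. This is a sketch under stated assumptions: “entropy (single and multiple)” is taken to mean binary and multi-class cross-entropy, which is an interpretation rather than something the source confirms, and all function names are hypothetical.

```python
import math
from typing import Sequence


def rms_error(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Root-mean-square difference between targets and network outputs."""
    squared = [(t - o) ** 2 for t, o in zip(targets, outputs)]
    return math.sqrt(sum(squared) / len(squared))


def binary_cross_entropy(targets: Sequence[int],
                         probs: Sequence[float]) -> float:
    """Mean cross-entropy for 0/1 targets ('single' entropy, assumed)."""
    terms = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
             for t, p in zip(targets, probs)]
    return sum(terms) / len(terms)


def multiclass_cross_entropy(labels: Sequence[int],
                             probs: Sequence[Sequence[float]]) -> float:
    """Mean negative log-probability of the true class
    ('multiple' entropy, assumed)."""
    terms = [-math.log(p[label]) for label, p in zip(labels, probs)]
    return sum(terms) / len(terms)


def classification_quality(true_labels: Sequence,
                           predicted: Sequence) -> float:
    """Correct indications divided by all indications, in percent."""
    correct = sum(t == p for t, p in zip(true_labels, predicted))
    return 100.0 * correct / len(true_labels)
```

The same functions can be applied separately to the learning, test and validation subsets to obtain the three errors that are compared in the experiments.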

An analysis of the MLP, RBF and SOFM neural network models shows that the selected networks (Table 1, bold) achieve high classification quality, reaching as much as 100%. This result was obtained by changing the parameters of the network structure at the training stage. In all types of networks, the first structural parameter concerned the number of inputs. The constructed models had 5 and 10 inputs. In the case of the MLP network, a greater number of inputs resulted in changes to the remaining parameters. In order to obtain 100% classification quality, a much greater number of learning cycles was required in the BFGS algorithm (37 for the MLP 5-22-1 network, 52 for MLP 10-12-1 and 48 for MLP 10-25-1), which extended the network’s learning time. Another parameter, the number of neurons in the hidden layer of the network, also influenced both the classification quality and the learning time. In order to improve the quality of training and shorten the training process, the neuron activation function in the hidden and output layers was changed. Such networks cope well with tasks related to signal classification but require significant learning time and skill in selecting an appropriate structure. The results of the research with RBF and SOFM networks showed very good quality of pattern recognition. Another important feature was their speed of learning, especially in the case of RBF networks.

It is worth mentioning that the number of accurate recognitions was increased in the course of training by adding neurons to the hidden layer, not by adding learning cycles.

3 Pattern Recognition on Control Charts – Comparative Study

Classification effectiveness of ANNs was compared with the effectiveness of the human operator and a commercial statistical module. The comparison was made for two industrial processes: grinding of roller bearings and the superfinishing of a TV screen surface. The study consisted of four stages:

  1. Collecting data from the grinding and superfinishing processes; analysis of the process stability and the relevant sets of data by an expert.

  2. Defining a set of patterns of length 5 or 10 points (observations). For both processes, a set of 7 patterns was predefined (see Fig. 1): upper shift, lower shift, increasing run, decreasing run, increasing trend, decreasing trend, and groups of 2 out of 3.

  3. Recognizing patterns by a human operator, a chosen commercial statistical application (Statistica 12.0 [17]), ANNs used in previous research and the new ANNs.

  4. Comparing the recognition effectiveness of all three approaches.

From each of the processes, 504 samples of length 5 or 10 elements were collected. The learning sets included classes of previously defined patterns.

The measure of comparison (MP) is the quotient of the number of correct recognitions and the number of all occurrences of patterns in the analyzed set, expressed as a percentage. The value of the measure ranges from 0 to 100%. It takes its maximum value when all the patterns included in the analyzed set are correctly recognized.

In the research, the so-called false classification error (MB) was also observed, i.e. an error involving the indication of a pattern in a case where there was none. This measure was defined as the quotient of the number of erroneous recognitions and the number of all indications. It takes its minimum value when the method does not interfere with the random process (no false signals). In order to compare the effectiveness of recognition for particular methods, the values of both measures were analyzed simultaneously. In the study, a method of classification was assumed to be highly effective if the value of the measure of correct classification (MP) is larger than 95% and, at the same time, the value of the measure of erroneous indications (MB) does not exceed 1%. Tables 2 and 3 present the results of the classification verification with respect to the type of classifier and the kind of technological process. The tables show only the value of the classification quality (the measure of correct answers), because the value of the measure of false classifications was below 1% and its significance for the effectiveness of recognition was assumed to be negligible. The results of the effectiveness of ANNs presented in the study were juxtaposed with the results of pattern recognition by the Statistica module and the classification quality of a human operator. The verification of the effectiveness of the classification of PCC images confirmed the assumption that the weak link in the analysis of control charts is the human operator, who achieved the smallest percentage of accurate recognitions. The effectiveness of the human operator in the classification of patterns for both processes equaled, on average, 84%. A decrease in the classification quality may be observed for patterns with a greater number of points. The human operator copes well with patterns of the trend, run or shift type. Exceptions that deviate significantly from this rule are patterns such as mixtures and those with 2 out of 3 points located close to the control lines (in the so-called warning zone).
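The two measures and the effectiveness criterion defined above (MP larger than 95% and MB not exceeding 1%) reduce to simple ratios. The sketch below uses hypothetical function names, and the counts in the usage example are illustrative only, not results from the study.

```python
def mp(correct_recognitions: int, pattern_occurrences: int) -> float:
    """Measure of correct classification, in percent: correct
    recognitions divided by all pattern occurrences in the set."""
    return 100.0 * correct_recognitions / pattern_occurrences


def mb(false_indications: int, all_indications: int) -> float:
    """Measure of erroneous indications, in percent: patterns indicated
    where none existed, divided by all indications made."""
    return 100.0 * false_indications / all_indications


def highly_effective(mp_value: float, mb_value: float) -> bool:
    """Criterion from the text: MP above 95% and MB at most 1%."""
    return mp_value > 95.0 and mb_value <= 1.0


if __name__ == "__main__":
    # Illustrative counts only; not taken from Tables 2 or 3.
    mp_value = mp(correct_recognitions=480, pattern_occurrences=500)
    mb_value = mb(false_indications=4, all_indications=484)
    print(f"MP = {mp_value:.1f}%, MB = {mb_value:.2f}%")
    print("highly effective:", highly_effective(mp_value, mb_value))
```

Evaluating both measures together matters because a classifier that signals a pattern on every image would score a high MP while producing an unacceptable number of false alarms.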

Table 2. Comparison of classification effectiveness between a human operator, Statistica and ANN for images of length 5 (own study).

Table 3. Comparison of classification effectiveness between a human operator, Statistica and ANN for images of length 10 (own study).

As mentioned before, 504 samples of length 5 or 10 elements were collected for training. The set consisted of typical patterns defined for PCCs. It was divided into a learning file (75% of records), a test file (15% of records) and a validation file (10% of records). The learning sets included classes of previously defined patterns. The PCC analysis module in the Statistica software gave 95% accurate indications. The Statistica software coped well with patterns such as increasing and decreasing trends, increasing and decreasing runs, and 2 out of 3 in zone A. The effectiveness of recognition decreased for unconventional patterns characteristic of both operations, mainly in the case of mixtures. The best recognition results were obtained for classifiers based on ANNs. The percentage of correct recognitions with the use of an ANN in the process of grinding was 100% in the case of images of length 5. Slightly worse classification quality was achieved for the process of superfinishing and for images with a greater number of points. Nevertheless, the average level of correct indications was higher than the classification quality of the human operator and the Statistica module.

4 Conclusions

ANNs achieve very good results in pattern recognition on control charts. They produced significantly more accurate recognitions than a human operator or the Statistica software.

Based on the authors’ research and literature review, a comparison of the three discussed methods was conducted. The results are presented in Table 4.

Table 4. A comparison of the three studied methods of pattern recognition on PCC.

All the discussed methods have their pros and cons. The choice of method for a specific application should be based on technical, quality and economic conditions.