1 Introduction

Machine learning models are increasingly used to replace or supplement human decision-making in tasks that require some kind of prediction. The more that complex and dynamic systems rely on artificial intelligence classification methods, the greater the need for tools that explain how those methods operate. Explanation can be aided by relating machine learning classification decisions to the parameters used by other systems for evaluating the same process.

One of the methods developed in recent years to explain the operation of machine learning systems is to compute Shapley values for the individual features used in a model. However, the method is not unambiguous, and researchers treat it as a basis for interpretation rather than as a direct indication of why objects were classified in a particular way [13].

Interpreting the results of algorithms based on Shapley values requires the cooperation of machine learning specialists and subject matter experts. The latter often maintain systems using methods based on the idea of control charts. Predictive methods for assessing system dynamics have been known for a long time, an example being the work of the American engineer Walter Shewhart in the first half of the 20th century. Most specialists in industrial process maintenance are familiar with this type of description of system dynamics. Methods for the statistical evaluation of variability are still being developed, often in association with ML [1, 12, 19]. The development of ML algorithms allows their use in applications related to predictive maintenance or the forecasting of potential errors [5]. Explainability methods are a step toward increasing users’ awareness of the model’s reasoning behind its predictions. This makes it possible to optimize the effectiveness of the model, but also to detect undesirable model responses, thereby increasing the model’s credibility [18, 23].

Interpreting the output of algorithms that explain artificial intelligence decisions in industrial processes requires the collaboration of experts in two areas: machine learning and the industrial process being analyzed. As we show further on, the same holds for any type of problem that has been classified by artificial intelligence methods and whose classification we wish to justify. Supporting the explanation of phenomena with visual methods makes reasoning easier, so both machine learning and quality control have developed visual conventions for graphs and relationships that facilitate understanding of the problem [20].

The aim of this article is to present the Shap-Enhanced Control Charts (SECC) method, which combines information on exceedances of system parameter limits with visualizations of the SHapley Additive exPlanations (SHAP) model proposed by Scott Lundberg [15]. The questions that experts can answer using the combined data plots are:

  • whether overruns of system parameter limits translate into the relevance of a parameter value in machine learning classification

  • whether, when no limit is given, one can be proposed based on machine learning data

  • whether there are patterns of relationships between parameter values and value relevance in machine learning.

The method is implemented in an application, available at https://iwonagg-decisionsupportstreamlit-main-59xn5u.streamlit.app/, built with Streamlit, an open-source platform for creating visualizations and dashboards in pure Python. All visualizations presented in this paper were generated with this app. Moreover, because the user can interact with the data, the app provides tools for finding relationships and correlations between data obtained in the machine learning process and other data characterizing the product, which can be useful for further analysis.

The article is organized as follows. Section 2 presents related works. Section 3 describes the method. Section 4 presents use cases based on the two datasets described in Sect. 4.1.

2 Related Works

The scientific literature reports several works in the field of conformance checking [7, 17]. Typically, the term conformance checking refers to the comparison of observed behaviors, recorded as an event log, with a process model. At first, most conformance checking techniques were based on procedural models [14]. In recent years, an increasing number of researchers have focused on conformance checking with respect to declarative models, based on reactive business rules [4]. Conformance checking is also one of the goals of eXplainable Artificial Intelligence (XAI) methods, because their task is to link the model output with known, interpretable information. One of the motivations for our research is to evaluate explainability results using transfer learning from expert knowledge as a basis for conformance checking.

XAI is a dynamically evolving part of the AI field, focusing on approaches that provide transparent and insightful explanations for decisions made by black-box models. We can distinguish between model-agnostic and model-specific methods. The former can be used to estimate the impact of features regardless of the type of model and its construction. These include LIME, SHAP and Anchors [16, 21]. Model-specific methods are less versatile because they are tied to a model class, e.g. saliency maps [11] for gradient-based models. Our experiments are based on the SHAP method.

The second motivation for our study is to expand knowledge of the dataset using XAI indicators, in particular to identify additional value limits related to the significance of a feature’s impact on the model decision. A number of solutions use explainable clustering as one of the tools in data mining analysis, such as KnAC or CLAMP [2, 3]. Rule-based explainers such as Anchors and LUX [10] can be considered knowledge generation methods, because they extract interpretable knowledge from the black box in the form of a set of rules.

In our work, we extend current state-of-the-art methods by combining conformance checking of the model against expert knowledge with the generation of additional knowledge from data based on XAI. At the same time, we demonstrate practical applications of the developed approach. For this purpose, we use domain knowledge limits for indicators describing medical condition, as well as technological specifications as a basis for assessing sensor measurements gathered during the manufacturing process. We then associate information of different origins to interpret the discovered relations and present them in the form of interactive visualizations.

3 Shap-Enhanced Control Charts (SECC)

The method draws on two fields of data analysis: the analysis of the importance of a given parameter in ML classification, using SHAP values, and the analysis of the impact of exceeding the limits defined in a control chart system.

3.1 Basis of SHAP Calculation and Visualisations

For each dataset on which we perform classification, we determine the set of features that are relevant to the learning model. SHAP algorithms allow us to determine, for each feature, its relevance for a positive or a negative classification. Red and blue are usually used to visualise these two directions.

SHAP values can be used for many types of analysis. One of the most important possibilities is to show, for every element rejected in the process of classification, the impact of every parameter taken into consideration by the machine learning model. Figure 1 shows an example of such a single explanation. The problem is that the interpretation of SHAP values is not obvious and sometimes requires domain background [6, 9]. Without expert background it is difficult to determine whether high SHAP values are related to a particular part of the distribution, or are perhaps proportional to the values of individual parameters. This is important because if high explainability significance for a parameter coincides with high values of this parameter, and the model's predictions recognize defective products, then SHAP can be taken to indicate the causes of the defect.

Fig. 1. SHAP bar chart example. The color indicates a negative or positive impact on the classification, and the width of the bar indicates the relevance of the parameter. (Color figure online)
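To make the per-feature contributions discussed above more concrete, the following minimal sketch computes exact Shapley values for a tiny model by enumerating all feature coalitions. It is an illustration only, not the SHAP library and not the implementation used in the application: the function `shapley_values`, the model `toy_model`, the feature names and the all-zero baseline are hypothetical, and replacing absent features by a single baseline value is a simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Assumption of this sketch: a feature that is absent from a coalition
    is replaced by its baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classical Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical defect score with one interaction term."""
    thickness, temperature, width = z
    return 2.0 * thickness + 0.5 * temperature + thickness * width


x = [1.0, 4.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["thickness", "temperature", "width"], phi):
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that bar charts such as Fig. 1 rely on. The interaction term is split equally between the two features involved.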

3.2 Basis of Control Chart Usage

In quality control, a frequently used analysis is based on the Shewhart chart, which shows the variation of values over time together with limit exceedances. The operator focuses on a single parameter and observes the variation of its value over time, and is particularly interested in exceedances of the values defined as lower and upper limits, i.e. alert values for the assessment of system performance. Figure 2 shows an example of a control chart used in a Statistical Quality Control system.

The upper and lower limits can be defined on the basis of external expertise or, when the parameter is normally distributed, derived from the distribution of the parameter in the data set. Factors causing nonconformities in the process under investigation appear on the control chart as:

  • points not falling within the designated range (outside the control lines)

  • clear sequences of consecutive points lying above or below the line of average values

  • clear sequences of consecutive points that are increasing or decreasing.
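The checks listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the application: the limits are taken as the mean plus or minus three sample standard deviations of a reference sample (a common Shewhart convention), a "clear sequence" is assumed to mean at least seven consecutive points, and all function names and data are hypothetical.

```python
from statistics import mean, stdev


def control_limits(reference, k=3.0):
    """Return (lower, center, upper) as mean -/+ k sample standard deviations."""
    center = mean(reference)
    spread = stdev(reference)
    return center - k * spread, center, center + k * spread


def out_of_limits(values, lower, upper):
    """Indices of points outside the control lines."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def longest_run(flags):
    """Length of the longest run of consecutive True flags."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def run_signals(values, center, run_length=7):
    """Flag sequences of at least run_length consecutive points (assumed rule)."""
    steps = list(zip(values, values[1:]))
    return {
        "above_center": longest_run(v > center for v in values) >= run_length,
        "below_center": longest_run(v < center for v in values) >= run_length,
        # run_length points in a row form run_length - 1 consecutive steps
        "increasing": longest_run(b > a for a, b in steps) >= run_length - 1,
        "decreasing": longest_run(b < a for a, b in steps) >= run_length - 1,
    }


# Hypothetical in-control reference sample and newly observed values
reference = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
observed = [10.02, 10.05, 10.1, 10.15, 10.2, 10.25, 10.3, 10.6, 9.9]

lower, center, upper = control_limits(reference)
outside = out_of_limits(observed, lower, upper)
signals = run_signals(observed, center)
print(outside, signals)
```

When limits come from external expertise, `control_limits` can be skipped and the expert values passed directly to `out_of_limits`.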

Fig. 2. Example of a control chart. The upper chart presents values of the chosen parameter, the lower chart presents the standard deviation. The blue line represents the values of the parameter, and the red lines represent the upper and lower limits of this parameter. The corresponding histogram is given on the left. The detailed description of the chart and its source dataset are presented in [22]. (Color figure online)

3.3 Combining the SHAP Charts and Values and Limits Plots

We assume that the parameter analysed by the operator as relevant has also been taken into account in the machine learning process. To carry out an analysis that combines data from the two analytical processes, we combine the information from the control chart for one parameter with the SHAP values for this parameter (see Fig. 3).

Fig. 3. Example of a control chart combined with the SHAP values for the parameter shown in the plot. The blue line represents values of the parameter, the yellow lines represent the limits, and the background shows the SHAP values for the parameter. (Color figure online)

Fig. 4. Zoomed-in fragment of the control chart from Fig. 3, combined with the SHAP values for the parameter shown in the plot. The blue line represents values of the parameter, the yellow lines represent the limits, and the background shows the SHAP values for the parameter. (Color figure online)

An analysis based on limit exceedances needs the limit value, the parameter value and the SHAP value together. In the data slice shown in Fig. 3 we can see two overruns associated with different SHAP values (zoomed in Fig. 4), but a single overrun is hard to interpret unless we can put it in context. Visual patterns can support the reasoning. When the data are ordered by increasing SHAP value, we can look for dependencies between parameter values and SHAP values in the context of the limits. A pattern similar to the one shown in Fig. 5A is an argument for such a connection, whereas a pattern similar to the one shown in Fig. 5B argues against it. Examples of real data charts matching these patterns are shown in Sect. 4.2.
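The combination of limit, parameter value and SHAP value can also be summarised numerically. The sketch below is a hypothetical illustration rather than part of the SECC application: it joins the three pieces of information per sample, orders the samples by SHAP value, and compares how often the limit is exceeded among the samples with the highest SHAP values and among the rest. The glucose readings and SHAP values are invented for the example, and the helper names and the `top_fraction` parameter are assumptions.

```python
def combine(values, shap_values, upper=None, lower=None):
    """Join parameter values, SHAP values and limit exceedance per sample."""
    rows = []
    for index, (value, shap) in enumerate(zip(values, shap_values)):
        too_high = upper is not None and value > upper
        too_low = lower is not None and value < lower
        rows.append({"index": index, "value": value, "shap": shap,
                     "exceeds": too_high or too_low})
    return rows


def exceedance_rate(rows):
    """Share of rows that exceed a limit (0.0 for an empty group)."""
    return sum(r["exceeds"] for r in rows) / len(rows) if rows else 0.0


def rates_by_shap(rows, top_fraction=0.5):
    """Exceedance rate for the highest-SHAP rows and for the remaining rows."""
    ordered = sorted(rows, key=lambda r: r["shap"])
    cut = max(1, round(len(ordered) * top_fraction))
    return exceedance_rate(ordered[-cut:]), exceedance_rate(ordered[:-cut])


# Hypothetical readings and SHAP values; 140 is used as the upper limit
glucose = [95, 110, 150, 165, 120, 180, 100, 145]
shap_glucose = [-0.20, -0.10, 0.25, 0.40, -0.05, 0.55, -0.15, 0.20]

rows = combine(glucose, shap_glucose, upper=140)
top_rate, rest_rate = rates_by_shap(rows)
print(top_rate, rest_rate)
```

A large gap between the two rates corresponds to the kind of pattern sketched in Fig. 5A, while similar rates correspond to Fig. 5B. Such a summary complements visual inspection of the chart and does not replace it.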

When analyzing the relationship between limits and SHAP values, three cases can be considered:

  • High SHAP importance coincides with the exceedance of a limit set by experts.

  • There is no connection between SHAP importance and the exceedance of a limit set by experts. It is worth noting that this does not mean the limit was defined wrongly. The control chart parameter limits may cover a wider range of problems than those considered for classification, and their origin may be related to other aspects of the process described by the model.

  • No limit has been set by an expert, but the analysis of the relationship between SHAP importance and parameter values allows a threshold of significance to be proposed on the basis of the visual pattern shown in the chart.

Fig. 5. Visual pattern of combined data where there is a connection between SHAP values and the limit (on the left) and a pattern where no such connection is visible (on the right).

4 Use Cases

4.1 Datasets Used

The use of the method is presented on two datasets of different origins and with different features, chosen to demonstrate typical characteristics of data collected for a quality control process and of data collected for classification by machine learning methods. One of them contains data on the steel rolling process in a steel mill. The data collected from the sensors are the physical parameters of the rolled object (thickness, width, cross-section, etc.) and the parameters of the production process (temperature, etc.). The objects analysed are steel coils. The machine learning classification problem is to distinguish between good and bad coils (those rejected during the complaints process). The industrial process parameters shown in the line plot can also be used to maintain the production line and avoid technical problems that lead to an increased number of bad coils. Exceedances of limits indicate a possible need for repair.

The first dataset (A) we used to conduct the analysis was obtained from a steel plant, specifically from the Hot Rolling Process. The data was gathered via sensors positioned along the production line. After consulting with experts from the Quality Department, we conducted a feasibility study and finalized the selection of relevant parameters. The evaluation of steel strip quality is based on the feature set employed in our research, which corresponds to the Statistical Process Control system currently implemented by the company [8].

In order to ascertain the significance of variations between distributions, we trained a machine learning model to categorize products as either “good” or “bad”. However, this classification is not the primary objective of this study. The objective is the visualisation of the results in order to enhance the interpretation of XAI. For that reason, we evaluate the importance of features for the model’s decision using SHAP values.

The second dataset (B) contains medical data on patients diagnosed with diabetes. The machine learning classification problem is to predict a diagnosis of diabetes. The parameters shown in the line plot can be used to indicate the need for in-depth diagnostics, to make sure that parameter overruns are not a sign of the disease process.

Fig. 6. Combined chart of SHAP values and limit (140 mg/dl) exceedance for the glucose parameter, sorted by the SHAP value of glucose (on the left) and by the value of glucose (on the right).

4.2 Connection Between Limits and SHAP Values

As mentioned before, comparing the control chart with SHAP value diagrams makes it easy to see whether limit exceedances are correlated with SHAP values. We show examples of the three cases:

  • overlap between the experts’ limit and the relevance of SHAP values

  • lack of connection between the experts’ limit and the relevance of SHAP values

  • proposal of a threshold of significance on the basis of the visual pattern shown in the chart.

The first situation can be represented by the glucose parameter from dataset B. The expert-based limit for the accepted value of this parameter is 140 mg/dl. As we can see in Fig. 6, there is a strong correlation between exceeding the upper limit and the importance of the SHAP value of the glucose parameter.

The second situation is illustrated by the BMI parameter from the same dataset and by the temperature of the steel coils from the other dataset. As we can see in Figs. 7 and 8, there is no unambiguous connection between the relevance of the SHAP value and exceeding the limit. In the second case in particular, we can see that none of the overruns were classified negatively.

Fig. 7. Combined chart of SHAP values and limit exceedance for the BMI parameter from dataset B, sorted by the SHAP value of BMI (on the left), and for the temperature parameter from dataset A, sorted by the SHAP value of temperature (on the right).

Fig. 8. Combined chart of SHAP values and limit exceedance for the BMI parameter from dataset B (on the left) and the temperature parameter from dataset A (on the right). Data filtered by the upper limit. The result of classification can be seen below: negative in navy blue and positive in gray. (Color figure online)
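The filtering step behind Fig. 8, keeping only the samples above the upper limit and inspecting how they were classified, can be sketched as follows. The readings, the limit of 900 and the predicted labels are hypothetical values chosen for illustration, and `classify_overruns` is an assumed helper name, not a function of the application.

```python
from collections import Counter


def classify_overruns(values, predictions, upper):
    """Count predicted classes among the samples that exceed the upper limit."""
    return Counter(p for v, p in zip(values, predictions) if v > upper)


# Hypothetical temperature readings, upper limit and predicted classes
temperature = [870, 905, 880, 912, 860, 920]
predicted = ["positive", "positive", "negative",
             "positive", "negative", "positive"]

counts = classify_overruns(temperature, predicted, upper=900)
print(dict(counts))
```

If one class is missing from the counts, as in this invented example, exceeding the limit alone does not account for that classification outcome, which matches the second case described in the text.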

Finally, let us consider the most interesting, third case, in which there is no expert-based threshold, but such a line can be proposed on the basis of pattern analysis in the combined chart of SHAP importance and the control chart. One of the parameters taken into account in the diabetes dataset is the number of pregnancies. No “limit” on pregnancies was defined in the context of the risk of diabetes, but consider Fig. 9. We can place the threshold line above 6 pregnancies, because higher values of the pregnancies parameter always have high importance in the SHAP results. After setting 7 pregnancies as the limit, we can filter the cases (Fig. 9).

Fig. 9. Combined chart of SHAP values and limit exceedance for the number of pregnancies parameter, sorted by the SHAP value of pregnancies (on the left) and by the number of pregnancies (on the right). The threshold value proposed on the basis of the visualisation is 7.
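The visual reasoning used to propose a threshold can be approximated by a simple rule. The sketch below is an assumption-laden illustration, not the procedure of the application, where the threshold is read from the chart by an expert. It looks for the lowest parameter value above which every sample has a SHAP value greater than `min_shap`, requiring at least `min_support` samples. The function name, both parameters and the data are hypothetical, and the data were constructed so that the rule returns the threshold of 7 mentioned in the text.

```python
def propose_upper_threshold(values, shap_values, min_shap=0.0, min_support=3):
    """Lowest value t such that all samples with value >= t have SHAP > min_shap.

    Returns None when no candidate has at least min_support samples.
    """
    candidate = None
    # Walk from the highest observed value downward and stop at the first
    # value whose upper tail contains a sample with a low SHAP value.
    for t in sorted(set(values), reverse=True):
        tail = [s for v, s in zip(values, shap_values) if v >= t]
        if not all(s > min_shap for s in tail):
            break
        if len(tail) >= min_support:
            candidate = t
    return candidate


# Hypothetical numbers of pregnancies with invented SHAP values
pregnancies = [0, 1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10]
shap_pregnancies = [-0.10, -0.08, 0.02, -0.05, 0.03, -0.02,
                    0.04, -0.01, 0.12, 0.15, 0.20, 0.22]

threshold = propose_upper_threshold(pregnancies, shap_pregnancies, min_shap=0.05)
print(threshold)
```

A threshold found this way is only a proposal: whether it is meaningful as a limit remains a question for the domain expert.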

5 Conclusions

The combination of SHAP values and control chart parameters, especially limit exceedances, is not broadly described in the literature, although the separate use of both approaches is fundamental for the analysis of industrial processes. The combined approach presented in the SECC method, which connects the information taken from SHAP calculations with control chart exceedances, makes it possible to reason beyond the scope of either approach alone.

This approach is dedicated to practical industrial analysis, where control charts are in everyday use. The customizable and flexible application for comparing control chart visualizations with SHAP values and additional information can be used as a tool for reasoning based on the dependencies that appear in the graphical visualization. Some examples of conclusions drawn with the application were shown. In the future, other types of anomalies may be extracted from the data through visual pattern recognition and then described and interpreted in the context of the dataset.