
1 Introduction

Heating, Ventilation and Air Conditioning (HVAC) systems account for about 50% of the energy consumed in commercial buildings to maintain indoor comfort [21]. Moreover, almost 15% of the energy used in buildings is wasted because of faults (such as control faults and sensor faults), which occur frequently in HVAC systems due to a lack of proper maintenance [20, 28]. The Air Handling Unit (AHU) is a key component of an HVAC system. Faults in the heat recycler of an AHU often go unnoticed for long periods of time until performance deteriorates enough to trigger comfort complaints or equipment failure. Various fault detection and diagnosis techniques have been developed to benefit building owners by reducing energy consumption, improving maintenance and increasing the effective utilization of energy. Heat recycler faults can be detected by comparing data measured under normal working conditions with the abnormal data measured during heat recycler failure. Most fault detection methods use historical data as the training dataset for building machine learning models such as support vector machines, neural networks or decision trees. Abnormal data recorded during heat recycler failure is then identified as a class separate from the normal working class using classification algorithms [9, 27].

Fig. 1. The explanation framework of XAI [16]

We have entered a new era of artificial intelligence in which the core technology is machine learning, but machine learning models are non-intuitive, opaque and difficult to understand. The effectiveness of these models is therefore limited by their inability to explain their behaviour. To overcome this, it is important for machine learning models to provide a human-understandable explanation of their rationale. Such an explanation can then be used by analysts to evaluate whether a decision meets the required rational reasoning and does not conflict with legal norms. But what is meant by an explanation? It is a reason or justification given for some action. The overall idea is captured by the framework shown in Fig. 1, where an XAI system consists of two modules: an explanation model and an explanation interface. The explanation model takes the input and produces a recommendation, decision or action based on some machine learning model. The explanation interface then justifies the decision made by the machine learning model, i.e. why the machine behaved in the particular way that led it to reach that decision. The user can thus make a decision based on the explanation provided by the interface. Figure 2 shows an example of how explainable artificial intelligence helps a user by explaining the decisions of a learning model.

Fig. 2. An example depicting an instance of XAI [16]

2 Previous Work

Explainable artificial intelligence is receiving a lot of attention nowadays. Machine learning algorithms have been used for cancer detection, but these models do not explain the assessments they make; humans cannot trust such models because they do not understand the reasoning behind their assessments [17]. Van Lent et al. [25] added an explanation capability to the Full Spectrum Command training system developed by academic researchers and commercial game developers. Sneh et al. [24] used explainable artificial intelligence in intelligent robotic systems to categorize different types of errors. The errors were divided into five categories using machine learning techniques, which by themselves provide no explanations; XAI was then used to provide information and explanations about the occurrence of these errors for three different machine learning models. Ten Zeldam et al. [31] proposed a Failure Diagnosis Explainability (FDE) technique for detecting incorrect or incomplete repair cards in aviation maintenance, which can otherwise result in failures; the technique provides interpretability and transparency for the failure-diagnosis model. FDE checks whether an assessed diagnosis can explain a newly detected failure: if the failure matches the expected output of that diagnosis it is accepted, and if it is dissimilar, that diagnosis is unlikely to be the real one.

A number of fault detection tools have recently emerged from research. These tools generally take the form of stand-alone software products that either process trend data offline or provide online analysis for the building control system. Different data-driven methods have been developed and used for detecting AHU faults such as coil fouling, control valve faults and sensor bias. Yan et al. [30] proposed an unsupervised method for detecting faults in an AHU using cluster analysis. First, PCA is used to reduce the dimensionality of the collected historical data, and then spatially separated data groups (clusters representing faults) are identified with a clustering algorithm. The proposed system was tested on simulated data and was able to detect single and multiple faults in the AHU.

Lee et al. [18] detected faults in the AHU cooling coil subsystem with an Artificial Neural Network (ANN) trained by backpropagation, based on dominant residual signatures. Wang et al. [26] presented a PCA-based method for the detection and diagnosis of sensor failures, in which AHU faults were isolated with a Q-contribution plot and the squared prediction error was used as the fault detection index. Likewise, PCA combined with Joint Angle Analysis (JAA) was proposed by Du et al. [10] for diagnosing drifting and fixed biases of sensors in Variable Air Volume (VAV) systems. A method for detecting drifting sensor biases in air handling units was proposed by Du et al. [11], employing neural networks together with wavelet analysis. Zhu et al. [32] introduced a sensor failure detection system based on a regression neural network: a three-level wavelet analysis decomposes the measured sensor data, the fractal dimensions of each frequency band are extracted to characterize sensor failures, and a neural network is trained on these features to diagnose failures. A semi-supervised method for the detection and diagnosis of air handling unit faults was proposed by Yan et al. [29], in which a small number of faulty training samples yielded performance comparable to classic supervised FDD methods. Madhikermi et al. [19] presented a heat recovery failure detection method for AHUs using logistic regression and PCA; the method is based on process history and utilizes the nominal efficiency of the AHU for fault detection.

Fig. 3. The schematic diagram of Heat Recycler Unit

3 Theoretical Background

3.1 Heat Recycler Unit

A typical AHU with a balanced air ventilation system, as shown in Fig. 3, includes the HRU, supply fan, extract fan, air filters, controllers, and sensors. The system circulates fresh air from outside into the building using two fans (supply side and extract side) and two ducts (fresh air supply and exhaust vents). Fresh air supply and exhaust vents can be installed in every room, but typically the system is designed to supply fresh air to bedrooms and living rooms, where occupants spend most of their time. A filter removes dust and pollen from the outside air before it is pushed into the house. The system also extracts air from rooms where moisture and pollutants are most often generated (e.g. kitchen and bathroom). One of the major components of the AHU is the HRU, which is used to reduce energy consumption. The principle behind the HRU is to extract heat from the air leaving the house (before it is removed as waste air) and use it to heat the fresh air entering the house; the HRU is thus a fundamental component of the AHU that recycles extracted heat. The main controllers in the system are the supply air temperature controller, which adjusts the temperature of the supply air entering the house, and the HRU output, which controls the heat recovery rate. To measure the efficiency of the HRU, five temperature sensors are installed in the AHU, measuring the temperature of the circulating air at different parts of the AHU (detailed in Table 1). In addition to the sensor data, the HRU control state, supply fan speed, and extract fan speed can be collected from the system.

Table 1. Dataset description of Air handling unit sensors

3.2 Support Vector Machine

SVM is a supervised machine learning approach used for both classification and regression problems, although it is most often used to solve classification problems. In this technique, each data item is plotted as a point in an n-dimensional space (where n is the number of features), with each coordinate given by a feature value. A hyperplane is then found that separates the two classes. In a linear SVM, the hyperplane is learned with the help of linear algebra. The prediction for a new input in a linear SVM is computed from the dot products between the input (x) and each support vector (xi), as given in [1]:

$$\begin{aligned} f(x) = B_0 + \sum _{i} a_i \, (x \cdot x_i) \end{aligned}$$
(1)

This equation involves calculating the inner products of a new input vector (x) with all support vectors in the training data. The coefficients \(B_0\) and \(a_i\) (one for each input) are estimated from the training data by the learning algorithm. The SVM model used in the proposed methodology is depicted in Fig. 4, which shows the two classes of this binary classification problem: the normal cases and the fault cases with no heat recovery.
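The following minimal sketch (not the authors' code) illustrates Eq. (1) with scikit-learn: the linear-SVM decision value is an intercept plus a weighted sum of inner products between a new input and the support vectors. The synthetic data and feature count are purely illustrative.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # 5 features, mirroring the AHU dataset
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # toy binary labels

model = SVC(kernel="linear").fit(X, y)

x_new = rng.normal(size=5)
# Eq. (1): f(x) = B_0 + sum_i a_i * <x, x_i>, where the a_i are stored in
# dual_coef_ and the support vectors x_i in support_vectors_.
f_manual = model.intercept_[0] + model.dual_coef_[0] @ (model.support_vectors_ @ x_new)
f_sklearn = model.decision_function(x_new.reshape(1, -1))[0]
print(np.isclose(f_manual, f_sklearn))         # True: both compute the same decision value
```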

Fig. 4. SVM model used in proposed methodology

3.3 Neural Networks

Neural networks are general function approximators, which makes them applicable to almost any machine learning problem in which a complex mapping from the input to the output space has to be learned. Artificial Neural Networks (ANNs) are computer-based algorithms modelled on the behaviour and structure of the neurons in the human brain, used to learn and categorize complex patterns. In an artificial neural network, pattern recognition emerges from adjusting the parameters through a process of error minimization, i.e. learning from experience. Neural networks can be calibrated using different types of input data, and the output can be categorized into any number of categories. An activation function can be used to restrict the output value, squashing it into a particular range that depends on the type of activation function used.

Table 2. The activation functions used in neural networks
Fig. 5. Neural network model for proposed methodology

Table 2 lists the most common activation functions used in neural networks, where the value of the sigmoid ranges from 0 to 1, tanh from −1 to 1, and ReLU from 0 to +∞. Figure 5 depicts the neural network used in the proposed model: it takes the five features of the dataset as input, maps them through the hidden layers, and finally produces an output classifying the instance as a fault or not.
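A minimal sketch of the activation functions in Table 2 and of a network with the shape of Fig. 5 (five inputs, a hidden layer, a binary output). The hidden-layer size and the choice of activation are illustrative assumptions, not taken from the paper, and the data is a synthetic stand-in.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def sigmoid(z): return 1 / (1 + np.exp(-z))     # range (0, 1)
def tanh(z):    return np.tanh(z)               # range (-1, 1)
def relu(z):    return np.maximum(0, z)         # range [0, +inf)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                   # five temperature features (toy values)
y = (X.sum(axis=1) > 0).astype(int)             # toy stand-in for Normal / No Heat Recovery

nnet = MLPClassifier(hidden_layer_sizes=(8,),   # one small hidden layer (assumed size)
                     activation="relu",
                     max_iter=2000, random_state=0).fit(X, y)
print(nnet.predict_proba(X[:2]))                # class probabilities for two sample rows
```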

3.4 Explainable Artificial Intelligence

Although there is an increasing amount of work on interpretable and transparent machine learning algorithms, it is mostly intended for technical users; explanations for the end-user have been neglected in many usable and practical applications. Many researchers have applied explanation frameworks to the decisions made by models in order to understand the actions performed by a machine. Several existing surveys provide an entry point to the key aspects of XAI research [6]. Anjomshoae et al. [8] give a systematic literature review of work on inter-agent explainability. The classification of problems related to explanation and black-box models is addressed in a survey by Guidotti et al. [15], which helps researchers find the most useful proposals. Machine learning models can be considered reliable, but they lack explainability. Contextual Importance and Utility plays a significant role in explaining machine learning models by providing rules for their explanation [13]. Framling et al. provide black-box explanations for neural networks with the help of contextual importance and utility [12, 14].

There are many methods for providing explanations, for example LIME (Local Interpretable Model-Agnostic Explanations) [3], CIU (Contextual Importance and Utility) [13], ELI5 [2], Skater [5], and SHAP (SHapley Additive exPlanations) [4]. Most of them are extensions of LIME, the original framework and approach proposed for model interpretation. These model interpretation techniques provide local explanations of model predictions, attributions of prediction values with Shapley values, interpretable surrogate tree-based models, and much more. Contextual Importance (CI) and Contextual Utility (CU) explain prediction results without transforming the model into an interpretable one; they are numerical values, represented visually and in natural language form, that present explanations for individual instances [13]. CIU has been used by Anjomshoae et al. [7] to explain the classification and prediction results of machine learning models on the Iris and car pricing datasets, where the authors used CIU to justify the decisions made by the models. The prediction results are explained by this method without the model being transformed into an interpretable one, and it provides explanations for linear as well as non-linear models, demonstrating the flexibility of the method.

Fig. 6. Heat recovery failure detection methodology

4 Methodology

The proposed methodology takes into account the fact that, due to the high number of dimensions, detecting failure cases (due to HRU failure) among the normal ones is a tedious task. The HRU's nominal efficiency (\(\mu _{nom}\)) is a function of the AHU's air temperatures, as given in Eq. 2 [23]. The real dataset collected from the AHU contains 26700 instances covering both states, “Normal” and “No Heat Recovery”. There are two class labels: “Normal” with 18882 instances and “No Heat Recovery” with 7818 instances. Since the HRU output is set to “max” (i.e. it is a constant parameter) and the HRU nominal efficiency is a function of the air temperatures associated with the AHU (as shown in Eq. 2), the analysis focuses on temperature differences as the key indicator, and these dimensions are combined to measure the performance of the HRU.

$$\begin{aligned} \mu _{nom} = \frac{T_{ext}-T_{wst}}{T_{ext}-T_{frs}} \end{aligned}$$
(2)
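A minimal sketch of Eq. (2): the HRU nominal efficiency computed from the extract (\(T_{ext}\)), waste (\(T_{wst}\)) and fresh (\(T_{frs}\)) air temperatures. The example temperatures are illustrative, not taken from the dataset.

```python
def nominal_efficiency(t_ext: float, t_wst: float, t_frs: float) -> float:
    """mu_nom = (T_ext - T_wst) / (T_ext - T_frs), Eq. (2)."""
    return (t_ext - t_wst) / (t_ext - t_frs)

# e.g. extract air at 21 C cooled to 5 C waste air, with 0 C fresh air outside
print(nominal_efficiency(21.0, 5.0, 0.0))   # ~0.76, i.e. about 76% of the heat is recovered
```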

The methodology for detecting heat recycler unit failure is depicted in Fig. 6. It starts with the input data containing 5 features and 1 binary class label (“No Heat Recovery” or “Normal”). The input data is divided in a 70:30 ratio into training and testing datasets respectively. The training dataset is used to train two models, a neural network (nnet) and a Support Vector Machine (SVM), individually with 10-fold cross validation. After both models have been trained on the training dataset, they are used to classify the testing dataset. Finally, the decisions made by both models are justified with the help of Explainable Artificial Intelligence (XAI): Local Interpretable Model-Agnostic Explanations (LIME) is used to explain both models on 6 random instances of the test data, helping to justify the decisions made by the neural network and the SVM.
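A minimal sketch of the pipeline in Fig. 6 (not the authors' code): a 70:30 split, 10-fold cross-validated training of an SVM and a neural network, and LIME explanations for 6 random test instances. The scikit-learn / lime stack and the synthetic data with placeholder sensor names stand in for the actual dataset and tooling.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in for the AHU data: five temperature features, one binary label.
rng = np.random.default_rng(0)
feature_names = ["T1", "T2", "T3", "T4", "T5"]            # placeholder sensor names
X = pd.DataFrame(rng.normal(size=(1000, 5)), columns=feature_names)
y = (X["T1"] - X["T2"] > 0).astype(int)                   # 1 = "No Heat Recovery" (toy rule)

# 70:30 split into training and testing data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, stratify=y, random_state=0)

models = {"svm": SVC(probability=True), "nnet": MLPClassifier(max_iter=2000, random_state=0)}
for name, model in models.items():
    cv_acc = cross_val_score(model, X_tr, y_tr, cv=10)    # 10-fold cross validation
    model.fit(X_tr, y_tr)
    print(name, cv_acc.mean(), model.score(X_te, y_te))

# LIME explanations for 6 random test instances
explainer = LimeTabularExplainer(X_tr.values, feature_names=feature_names,
                                 class_names=["Normal", "No Heat Recovery"],
                                 discretize_continuous=True)
for i in rng.choice(len(X_te), size=6, replace=False):
    exp = explainer.explain_instance(X_te.values[i], models["nnet"].predict_proba,
                                     num_features=5)
    print(exp.as_list())                                   # discretized feature ranges and weights
```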

5 Result Analysis

The performance of the proposed methodology has been tested with the two trained models, a neural network and a support vector machine. The test dataset is given to both trained models to obtain various performance metrics such as accuracy, sensitivity, specificity, precision, recall, the confusion matrix and the ROC curve. Table 3 compares the results obtained from both models, where the neural network outperforms the SVM: the neural network has a sensitivity of 0.91 and a specificity of 1 with an accuracy of 0.97, whereas the SVM has an accuracy of 0.96 with sensitivity and specificity values of 0.99 and 0.95 respectively.

Table 3. Performance comparison of neural networks and SVM
Table 4. Confusion matrix for nnet model
Table 5. Confusion matrix for SVM model
Fig. 7. ROC curve for NNet model

Fig. 8. ROC curve for SVM model

The confusion matrices obtained for the neural network and the SVM are given in Tables 4 and 5 respectively. Here, the positive class is ‘No Heat Recovery’, i.e. a failure of the HRU, and the negative class is ‘Normal’. Table 4 shows that the neural network model produces 2322 True Positives (TP), 0 False Positives (FP), 255 False Negatives (FN) and 5463 True Negatives (TN). Similarly, Table 5 shows that the SVM model produces 2364 True Positives (TP), 255 False Positives (FP), 21 False Negatives (FN) and 5370 True Negatives (TN). The ROC (Receiver Operating Characteristic) curve is one of the most important evaluation metrics for assessing the performance of any classification model: it plots the true positive rate (sensitivity) as a function of the false positive rate (100 − specificity) for different cut-off points of a parameter. The ROC curve for the neural network is depicted in Fig. 7 and for the SVM in Fig. 8.
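A minimal sketch that recomputes the main Table 3 metrics from the confusion matrices reported in Tables 4 and 5, with “No Heat Recovery” as the positive class.

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / (tp + fp + fn + tn)}

print("nnet:", metrics(tp=2322, fp=0,   fn=255, tn=5463))  # close to 0.91 / 1.00 / 0.97 in Table 3
print("svm: ", metrics(tp=2364, fp=255, fn=21,  tn=5370))  # close to 0.99 / 0.95 / 0.96 in Table 3
```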

5.1 Explanations Using LIME

Most of the machine learning models used for classification or prediction are black boxes, yet it is vital to understand the rationale behind the predictions made by these models, since this is of great benefit to decision makers when deciding whether or not to trust a model. Figure 9 depicts an example for the case study considered in this paper, predicting the failure of the heat recovery unit. The explainer explains the predictions made by the model by highlighting the causes or features that were critical to the model's decision. However, the model may make prediction mistakes that are too hard to accept; understanding the model's predictions is therefore an important tool for assessing its trustworthiness, since human intuition is hard to capture in evaluation metrics. Figure 10 illustrates the pick-up step, in which convincing predictions are selected to be explained to the human for decision making.

Fig. 9. Explaining individual predictions to a human decision-maker

Local Interpretable Model-Agnostic Explanations (LIME) has been used to produce explanations of the models that decision makers can use to justify the models' behaviour. The overall objective of LIME is to identify an interpretable model, over an interpretable representation, that fits the classifier locally. The explanation is generated by approximating the underlying model with an interpretable model learned on perturbations of the original instance. The main idea behind LIME is that it is easier to approximate a black-box model locally with a simple model (in the neighbourhood of the instance) than to approximate it on a global scale. This is achieved by weighting the perturbed instances by their similarity to the case we wish to explain. Since the explanations should be model agnostic, LIME can be used to explain a myriad of classifiers (such as neural networks, support vector machines and random forests) in the text as well as the image domain [22].

The predictions made by both models are then justified with the help of explainable artificial intelligence. Local Interpretable Model-Agnostic Explanations (LIME) has been used to provide explanations of both models for 6 random instances of test data. The explanations for the neural network and the SVM are shown in Figs. 11 and 12 respectively. “Supports” means that the presence of that feature increases the probability that the particular instance belongs to that particular class/label, while “Contradicts” means that the presence of that feature decreases this probability. “Explanation fit” refers to the \(R^2\) of the model that is fitted locally to explain the variance in the neighbourhood of the examined case.

Fig. 10. Explaining the model to a human decision maker [22]

Fig. 11. Explainability of NNet model

Fig. 12. Explainability of SVM

Numerical features are discretized internally by LIME. For instance, in Fig. 11, for case no. 7637, the continuous feature HREG_T_WST is discretized by creating a new variable (HREG_T_WST \(\le 7.1\)) which is true when HREG_T_WST is lower than or equal to 7.1. When this variable is true, the estimate for case 7637 is driven approximately 0.34 higher than the average predicted probability in the whole sample. Similarly, the continuous variable HREG_T_SPLY is discretized into a new variable (\(12.9<\) HREG_T_SPLY \(\le 16.7\)), which drives the estimate for case 7637 approximately 0.45 lower than the average predicted probability in the whole sample, and so on. Adding all the contributions to the average performance gives the final estimate. The explanation also indicates the class to which that particular instance belongs and how the contributions of all variables led to that decision. The second model, the SVM, can be explained in the same way from Fig. 12.
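A purely illustrative sketch of how LIME's local estimate is assembled: the average (intercept) prediction plus the per-feature contributions. Only the two contributions quoted in the text for case 7637 are taken from the paper; the intercept and the remaining contribution are hypothetical placeholders.

```python
average_prediction = 0.50                       # hypothetical intercept of the local model
contributions = {
    "HREG_T_WST <= 7.1":          +0.34,        # quoted in the text for case 7637
    "12.9 < HREG_T_SPLY <= 16.7": -0.45,        # quoted in the text for case 7637
    "other discretized features": +0.00,        # placeholder for the remaining features
}
# The final estimate for the examined case is the average plus all contributions.
print(average_prediction + sum(contributions.values()))
```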

6 Conclusion

Detecting heat recycler faults in an Air Handling Unit (AHU) is a tedious task because the reason for a failure is mostly unknown and unique. The key requirement of such systems is the early diagnosis of these faults for economic and functional efficiency. A real dataset from the Heat Recycler Unit of an AHU has been used for making predictions. Two machine learning models, a Support Vector Machine and a neural network, have been used individually for classification to detect the faults in the AHU. Further, explainable artificial intelligence has been used to explain the behaviour of both models, i.e. to justify the recommendations or decisions made by the learning models. Local Interpretable Model-Agnostic Explanations (LIME) has been used to explain both chosen models for 6 random instances of test data. LIME has proved an adequate tool for building the trust of machine learning experts and is a good addition to their tool belts. As future work, we would like to compare the explanation results obtained by LIME with Contextual Importance (CI) and Contextual Utility (CU) to study how these two methods behave differently in providing explanations.