1 Introduction

The cooling load prediction process mainly comprises data preprocessing and optimization of prediction models. Data preprocessing usually involves removing outliers, filling missing values (for time-series models), eliminating redundant variables, and normalizing the data. In terms of modeling mechanism, prediction models can be divided into physical models, data-driven models, and semi-physical models (Xiao et al. 2022). Data-driven models rely less on expert knowledge, which makes the modeling process simpler (Yao and Shekhar 2021); common examples include regression analysis (e.g., multiple linear regression, MLR, and multiple nonlinear regression, MNR), artificial neural networks (ANN), support vector regression, and decision tree regression (Chen et al. 2022; Zhang et al. 2021). Most existing studies have focused on model optimization and comparison while neglecting the preliminary work, i.e., data preprocessing. For data-driven models, data quality determines the upper limit of model performance (Xiao et al. 2022). Therefore, in addition to applying the typical data preprocessing methods, more attention should be paid to the features of the data.

In addition, a model's input data may follow different statistical distributions, i.e., exhibit varying load patterns, making it difficult to guarantee prediction accuracy if a single model is used for data with different load patterns (Chen et al. 2022). For this reason, some researchers have introduced unsupervised clustering into cooling load prediction, expressing different types of data with different models and thereby improving prediction accuracy. For instance, Zhang et al. (2019) used K-means clustering to classify the model's input data and the K-nearest-neighbor method to determine the training class for each prediction; their case study showed that this approach could improve the cooling load prediction accuracy of a factory workshop by 10%. Ding et al. (2018) used K-means and hierarchical methods to cluster the input variables of ANN and support vector regression models, improving cooling load prediction for office buildings. Ko et al. (2017) applied a clustering technique to enhance the prediction accuracy of a regression model. However, most of these clustering methods did not consider the influence of the input variables on the cooling load: each input variable was placed in the clustering space with equal weight, so the variables' differing effects on the cooling load were not reflected in the clustering. In reality, different input variables contribute differently to the cooling load; treating them all equally degrades the clustering quality and yields little improvement in prediction accuracy. To address the unsatisfactory accuracy of existing methods, this paper proposes a novel weight clustering-based pattern recognition method for improving the reliability of building cooling load prediction.

2 Pattern Recognition Method Based on Weight Clustering

The cooling load pattern recognition method is shown in Fig. 26.1. The original data are first preprocessed and randomly divided into a 70% training set and a 30% validation set. Second, the training set is divided into K classes using K-means clustering based on the weights of the input variables affecting the output cooling load, and the center point of each class is obtained. Subsequently, the distances between predicted sample j in the validation set and the center of each training class are compared, and the training class i with the smallest distance is selected as the training set for the current model (i.e., the MLR, MNR, or ANN). Finally, the trained model is used to predict the corresponding sample j of the validation set. This clustering-based training method effectively improves the match between the training samples and the predicted sample, and thus yields better prediction accuracy.

Fig. 26.1
The framework of the training pattern recognition for cooling load prediction
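To make the workflow concrete, the following Python sketch (using NumPy and any scikit-learn-style regressor) strings the steps of Fig. 26.1 together. It is a minimal illustration, not the authors' implementation (which was in MATLAB); the helpers `variable_weights` and `weighted_kmeans` are sketched in Sects. 4.1 and 4.2 below, and `make_model` stands for any fit/predict regressor factory, e.g. `lambda: LinearRegression()`.

```python
import numpy as np

def predict_with_pattern_recognition(X, y, K, make_model, seed=0):
    """Sketch of the Fig. 26.1 workflow on preprocessed data X, y."""
    # 1. Random 70/30 split into training and validation sets
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(0.7 * len(X))
    tr, va = idx[:n_train], idx[n_train:]

    # 2. Weight-based K-means on the training set (Sects. 4.1-4.2)
    w = variable_weights(X[tr], y[tr])             # |Pearson| weights
    labels, centers = weighted_kmeans(X[tr], K, w)

    # 3. Train one sub-model per training class
    models = [make_model().fit(X[tr][labels == i], y[tr][labels == i])
              for i in range(K)]

    # 4. Route each validation sample to the class with the minimum
    #    weighted center distance (Eq. (26.8)) and predict with the
    #    matching sub-model
    y_pred = np.empty(len(va))
    for j, xj in enumerate(X[va]):
        i = int((((centers - xj) ** 2) * w).sum(axis=1).argmin())
        y_pred[j] = models[i].predict(xj[None, :])[0]
    return y_pred, y[va]
```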

3 Data Preprocessing

3.1 Outlier Detection

Assuming that the input variables of the model obey a normal distribution, the 3σ criterion (Fan et al. 2021) can be used to judge whether a sample is an outlier. Equation (26.1) gives the mean value of variable x, Eq. (26.2) gives the standard deviation σ, and Eq. (26.3) is the criterion for flagging sample xi as an outlier. Note that removed outliers should be filled in, especially for time-series models; missing values can be filled by linear interpolation.

$$\overline{x} = \frac{1}{N}\sum_{i = 1}^{N} x_{i}$$
(26.1)
$$\sigma = \sqrt{\frac{1}{N}\sum_{i = 1}^{N} \left( x_{i} - \overline{x} \right)^{2}}$$
(26.2)
$$\left| x_{i} - \overline{x} \right| > 3\sigma$$
(26.3)
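A minimal NumPy sketch of Eqs. (26.1)–(26.3), with the flagged points filled by linear interpolation as suggested above (the function name and the in-place filling strategy are illustrative assumptions):

```python
import numpy as np

def remove_outliers_3sigma(x):
    """Flag samples with |x_i - mean| > 3*sigma (Eq. (26.3)) and fill
    them by linear interpolation, as needed for time-series models."""
    x = np.asarray(x, dtype=float)
    mean, sigma = x.mean(), x.std()           # Eqs. (26.1) and (26.2)
    outlier = np.abs(x - mean) > 3 * sigma    # Eq. (26.3)
    filled = x.copy()
    filled[outlier] = np.interp(np.flatnonzero(outlier),
                                np.flatnonzero(~outlier), x[~outlier])
    return filled, outlier
```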

3.2 Pearson Correlation Analysis

To reduce model complexity and avoid unnecessary investment in redundant measurement points, the Pearson correlation analysis method (Ding et al. 2018) is used to reduce the input dimensions of the models. In Eq. (26.4), ρxy denotes the strength of the linear correlation between variables x and y. Of any pair of highly correlated variables, one should be removed.

$$\rho_{xy} = \frac{\sum_{i = 1}^{N} \left( x_{i} - \overline{x} \right)\left( y_{i} - \overline{y} \right)}{\sqrt{\sum_{i = 1}^{N} \left( x_{i} - \overline{x} \right)^{2} \sum_{i = 1}^{N} \left( y_{i} - \overline{y} \right)^{2}}}$$
(26.4)
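As an illustrative sketch, redundant inputs can be screened with the pairwise correlation matrix; the 0.9 cutoff below is an assumed threshold, not a value from the paper:

```python
import numpy as np

def drop_redundant(X, names, threshold=0.9):
    """Keep one representative of each group of highly correlated
    variables. X: (n_samples, n_vars); names: variable names."""
    corr = np.corrcoef(X, rowvar=False)   # pairwise rho_xy, Eq. (26.4)
    keep = []
    for j in range(X.shape[1]):
        # keep column j only if it is not strongly correlated
        # with any column already kept
        if all(abs(corr[j, k]) < threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]
```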

3.3 Data Normalization

The variables are normalized to x′ as shown in Eq. (26.5) to prevent numerical problems (Fan et al. 2019), where xmin and xmax represent the minimum and maximum values of variable x, respectively. Note that the cooling loads predicted by the models must be inverse-normalized back to physical units.

$$x^{\prime} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
(26.5)
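A minimal sketch of Eq. (26.5) and its inverse; the inverse transform is what maps the models' normalized load predictions back to physical units:

```python
import numpy as np

def minmax_scale(x, x_min, x_max):
    """Eq. (26.5): map each variable to [0, 1]."""
    return (x - x_min) / (x_max - x_min)

def minmax_inverse(x_scaled, x_min, x_max):
    """Inverse of Eq. (26.5): recover the predicted cooling load
    in its original units."""
    return x_scaled * (x_max - x_min) + x_min

# x_min and x_max are taken per column from the training data, e.g.:
# x_min, x_max = X_train.min(axis=0), X_train.max(axis=0)
```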

4 Clustering of Training Data

To cluster the training data, the weights of the variables affecting the cooling load must first be obtained; these weights are then introduced into the K-means clustering space.

4.1 Determination of Input Variables’ Weights

Figure 26.2 shows the normalized two-dimensional attribute space of the cooling loads, under the assumption that the cooling load is affected only by the outdoor dry-bulb temperature T and the outdoor relative humidity H. In Fig. 26.2, cooling loads CL1, CL2, and CL3 are equidistant from the predicted cooling load CL0, while CL3 is closer to CL0 in outdoor dry-bulb temperature and CL2 is closer to CL0 in outdoor relative humidity. Which cooling load point should then be chosen to estimate CL0? Sensitivity analysis of the cooling load shows that the outdoor dry-bulb temperature has a greater impact on the cooling load than the outdoor relative humidity. Therefore, from a probabilistic perspective, using CL3 to estimate CL0 is preferable. In this paper, the absolute values of the Pearson correlation coefficients are chosen as the variables' weights.

Fig. 26.2
Normalized dimensionless attribute space (a two-dimensional example)
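A sketch of the weight computation described above: each input variable's weight is the absolute Pearson correlation between that variable and the cooling load, computed on the training set (the function name is assumed here):

```python
import numpy as np

def variable_weights(X, y):
    """W_x(l) = |rho(x_l, cooling load)| for each input dimension l."""
    return np.array([abs(np.corrcoef(X[:, l], y)[0, 1])
                     for l in range(X.shape[1])])
```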

4.2 K-means Algorithm for Classifying Training Data

Equation (26.6) is the objective function of the weight-based K-means algorithm, which minimizes the weighted distance E between the class centers u and the class samples x. Equation (26.7) gives the solution for the class center point u. Through repeated iterations, clustering is considered complete when the positions of the class centers no longer change. K-means clustering is a classical unsupervised learning method and is described in detail in Refs. (Zhang et al. 2019; Ding et al. 2018).

$$\mathop{\arg\min}\limits_{u} E = \sum_{i = 1}^{K} \sum_{x \in C_{i}} \left\| x - u_{i} \right\|_{2}^{2} \cdot W_{x}$$
(26.6)
$$u_{i} = \frac{1}{\left| C_{i} \right|}\sum_{x \in C_{i}} x$$
(26.7)

where Ci denotes the ith class of training samples, K denotes the total number of clusters, and Wx denotes the weight vector of the input variables, composed of the Pearson correlation coefficients described above.
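A compact Lloyd-style sketch of the weight-based K-means in Eqs. (26.6)–(26.7). The random initialization and iteration cap are implementation choices of this sketch, and empty clusters are not handled:

```python
import numpy as np

def weighted_kmeans(X, K, w, max_iter=100, seed=0):
    """K-means whose assignment step uses the per-dimension weights w
    in the distance (Eq. (26.6)); centers are plain class means
    (Eq. (26.7))."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(max_iter):
        # weighted squared distance of every sample to every center
        d = (((X[:, None, :] - centers[None, :, :]) ** 2) * w).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([X[labels == i].mean(axis=0)
                                for i in range(K)])
        if np.allclose(new_centers, centers):  # centers stopped moving
            break
        centers = new_centers
    return labels, centers
```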

4.3 Pattern Recognition of Validation Data

After the training samples have been divided into classes by the K-means algorithm, the distance dij between predicted sample j and the center point of each training class i is computed, and the training class with the minimum dij is selected as the training set for the current model to predict sample j. The selection of training class i is given by Eq. (26.8), where m denotes the number of dimensions of sample x. For a more intuitive depiction, see Fig. 26.1.

$$\mathop{\arg\min}\limits_{i} d_{ij} = \sum_{l = 1}^{m} \left( x_{j,l} - u_{i,l} \right)^{2} W_{x}\left( l \right)$$
(26.8)
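Equation (26.8) reduces to a weighted nearest-center lookup; a minimal sketch (function name assumed):

```python
import numpy as np

def nearest_class(x_j, centers, w):
    """Index i of the training class whose center is closest to the
    validation sample x_j under the weighted distance of Eq. (26.8)."""
    d = ((x_j - centers) ** 2 * w).sum(axis=1)
    return int(d.argmin())
```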

5 Case Study

EnergyPlus was first used to simulate the energy consumption of a typical office building in Guangzhou to obtain the raw data for training and validating the models above. MATLAB was then used to implement the methods of this paper.

5.1 Cooling Load Data of a Typical Office Building

A typical office building energy model in Guangzhou, derived from Ref. (Lv et al. 2019), was used to obtain the training and validation data for the models. The hourly cooling load data (1188 samples) of the building during the cooling season (May to September) were obtained with EnergyPlus.

The correlation ρ among the three variables of indoor occupancy, indoor lighting power, and indoor equipment power was found to be very high, so occupancy, which is the easiest to obtain, was selected to represent the other two variables. The input and output variables of the models are shown in Table 26.1. The MLR, MNR, and ANN were selected as the prediction models.

Table 26.1 Variables of the prediction models

5.2 Predictive Performance of Models Based on the Load Pattern Recognition

Figure 26.3a shows how the prediction performance of the MLR varies with the clustering number K. The clustering-based pattern recognition method markedly improved the model performance: compared with the non-clustering method, the MAPE decreased by 35% on average. Introducing the variables' weights further improved the clustering, reducing the MAPE by an additional 6% on average. Figure 26.3a also indicates that as the clustering number increased, model performance first improved and then declined. With a clustering number of 7, the model achieved its best prediction accuracy (MAPE = 4.53%); beyond 9, performance began to deteriorate. A clustering number between 3 and 7 is therefore recommended for the MLR.

Fig. 26.3
MAPEs varying with clustering number K: a MLR; b MNR; c ANN

Figure 26.3b presents the variation of the MNR's prediction performance with the clustering number K. The clustering-based method again markedly improved model performance: compared with the non-clustering model, the MAPE decreased by 36% on average, and introducing the weights reduced it by a further 8% on average. As with the MLR, performance first improved and then declined as the clustering number increased; the best accuracy (MAPE = 4.57%) was reached at a clustering number of 7, and performance deteriorated beyond 9. A clustering number between 3 and 7 is recommended for the MNR.

Figure 26.3c shows the variation of the ANN's prediction performance with the clustering number K. The clustering-based method improved model performance, with the MAPE decreasing by 15% on average compared with the non-clustering case, and the introduction of the weights reduced it by a further 3% on average. Figure 26.3c reveals no obvious trend for the ANN as the clustering number increased; performance deteriorated when the clustering number exceeded 6. Overall, a clustering number between 2 and 6 is recommended for the ANN.

In summary, the clustering-based pattern recognition method substantially improves the prediction performance of the MLR, MNR, and ANN, and introducing the variables' weights into the clustering improves prediction accuracy further. Compared with the ANN, the MLR and MNR show a larger accuracy improvement and more stable predictions as the clustering number varies. With a clustering number of about 4, the robustness of all three models can be ensured.

5.3 Explanations of the Results

(1) The clustering-based pattern recognition method greatly improved the models' prediction accuracy mainly because unsupervised learning was used to cluster the training samples, so that samples within the same class were more similar. After the spatial attributes of the predicted sample were identified, the training class with the greatest correlation was selected to train the sub-model, so the sub-model's parameters better matched the current predicted sample. (2) Introducing the variables' weights into the clustering process further improved model performance because the weight-based clustering achieves a better clustering effect, which in turn yields higher prediction accuracy. (3) In the case study, under the non-clustering condition, the ANN achieved higher prediction accuracy than the MLR and MNR owing to its stronger nonlinear fitting ability. After clustering, however, the prediction accuracies of the MLR and MNR were much higher than that of the ANN (their MAPEs were lower than the ANN's by 14% on average). This indicates that the clustering-based pattern recognition method is better suited to low-complexity models: clustering divides the total training samples, reducing the number of training samples in each class, and with few training samples the ANN generalizes poorly, so clustering benefits the ANN less than the MLR and MNR.

6 Conclusion

To address the unsatisfactory prediction accuracy of existing methods, this paper proposed a novel weight clustering-based pattern recognition method for improving building cooling load prediction accuracy. The case study showed that the method significantly improved the MLR, MNR, and ANN, with MAPEs decreasing by 35%, 36%, and 15% on average, respectively. Compared with non-weighted clustering, introducing the weights further improved the models' prediction accuracy, reducing the MAPEs by an additional 6%, 8%, and 3% on average, respectively. With a clustering number of about 4, the models showed more stable prediction performance. Future research will investigate combining this method with real-time optimization and online feedback calibration.