1 Introduction

Energy use in the residential sector grows every year, and determining the minimum energy requirements of residential buildings is an important engineering problem. In many countries, the energy consumption of buildings makes up a significant part of the total energy used (Rafe Biswas et al. 2016; Amasyali and El-Gohary 2018). This sector is not studied as well as the industrial and transportation sectors because of a lack of financial incentive (Amasyali and El-Gohary 2018; Tsanas and Xifara 2012). The energy consumption of residential buildings is sometimes regulated through the selection of heating and cooling systems. Optimal control of energy utilization is important for decreasing energy waste and its adverse impact on the environment. The energy consumption in residential buildings depends on a set of factors. Location and human occupancy are important factors. The characteristics of buildings, such as the number and sizes of rooms, the size of the house, the size and number of windows, and the construction materials used (such as wood, concrete, plastic or stone), also affect energy utilisation (Tsanas and Xifara 2012; Perez-Lombard et al. 2008). In addition, the income of residents, the cost of energy, the frequency of using cooling, heating and other electrical appliances, the usage of hot water and indoor lighting are parameters affecting the energy consumption in buildings.

Recently, different approaches have been used for the prediction of energy demand in buildings. These are methods based on physical modeling and data-driven methods (Rafe Biswas et al. 2016; Amasyali and El-Gohary 2018). Physical models are based on the analysis of thermodynamic rules for detailed energy modeling. Various software programs have been developed using physical modeling. These are DOE-2 (Sullivan and Winkelmann 1999), EnergyPlus (Crawley et al. 2001), ESP-r (Strachan et al. 2008), DeST (Yan et al. 2008) and TRNSYS (Fumo 2014). Using these software tools, designers were able to estimate the impact of design alternatives on energy consumption. However, these simulation systems have a complex structure and are time-consuming. During modeling, physical principles, detailed building information, weather details and resident behavior should be considered for proper development of the model. Acquiring these parameters is difficult, and sometimes the predicted outputs of these models are far from actual values (Amasyali and El-Gohary 2018; Fumo 2014). Because of the progress of intelligent systems and computer technology, data-driven models are widely used for energy prediction in practice. These models are based on machine learning methods and the historical data of energy utilization in the building. Different machine learning-based models have been extensively used for energy modeling. These models were mainly based on ANNs (Rafe Biswas et al. 2016; Kumar et al. 2013; Sholahudin et al. 2016; Zhang and Haghighat 2010; Deb et al. 2016; Chae et al. 2016), SVM (Li et al. 2009; Zhang et al. 2016) and SVR (Hong 2009). The papers (Rafe Biswas et al. 2016; Kumar et al. 2013) integrated statistics and artificial neural networks (ANNs) for analysis and prediction of energy utilization in buildings.
These machine learning methods are a reliable and effective approach for solving prediction problems because of their learning ability, which allows them to handle nonlinear processes, and their generalization property. Deb et al. (2016) used an ANN and a Bayesian regularization learning algorithm for forecasting cooling loads of buildings. Ensemble empirical mode decomposition adaptive noise and support vector regression with a quantum-based dragonfly algorithm (Zhang and Hong 2019), empirical mode decomposition with a support vector regression model (Fan et al. 2020a), and a regression model based on hybrid empirical mode decomposition and support vector regression with a back propagation neural network (Hong and Fan 2019) were developed for electricity load forecasting. Fan et al. (2020b) designed a hybrid model using machine learning techniques for predicting energy consumption. The paper (Kumar et al. 2021) presented different machine learning models such as SVM, ANN and random forest for solving the forecasting problem. The authors find that the random forest was the most suitable prediction model. The authors also presented the importance of the Internet of Things (IoT) (Plageras et al. 2017; Gupta and Quamara 2018; Esposito et al. 2021) to support and implement efficient solutions for Smart Cities. The authors proposed methods for collecting and managing sensors’ data in a smart building that operates in an IoT environment. Hybrid models that integrate different machine learning techniques were developed to improve the performance of the prediction systems (Dong et al. 2016; Abizada and Abiyeva 2018; Gao et al. 2019; Abiyev 2009). The effects of various parameters of residential buildings on energy consumption were considered in (Tsanas and Xifara 2012; Gao et al. 2019). Cooling and heating loads were evaluated using various variables of buildings.
Different research works have investigated the relationships between the building’s parameters and the energy consumption related to cooling and heating loads. Improving the performance characteristics of the designed models is the primary concern of most research works.

As mentioned above, the factors affecting energy utilization in buildings are diverse. These are the physical characteristics of buildings, the weather conditions of the region, the social and demographic situation of residents, etc. The influences of these factors on the energy load of buildings are imprecise and uncertain. Therefore, it is difficult to evaluate accurately the energy performance and energy load of a building based on uncertain information. One effective approach to these problems is the use of fuzzy set theory, which handles the uncertainties and imprecise information in designed models. Fuzzy systems (FSs) have been widely used in modeling different industrial and nonindustrial problems characterized by uncertainty. However, in some practical applications, type-1 fuzzy sets are unable to handle high levels of uncertainty, imprecision and vagueness. Some research studies have shown that type-2 fuzzy sets are a valuable approach for handling uncertainty in such problems (Zadeh 1975; Mendel 2017). If the extracted data are noisy and the knowledge extracted from the experts carries uncertainties, then type-2 fuzzy sets are an effective approach for modeling such uncertainties. Because the membership functions (MFs) of type-2 fuzzy sets are themselves fuzzy, they can model high levels of uncertainty, outperform their type-1 counterparts in handling it, and consequently minimize the effects of uncertainties in the rules.

Type-2 fuzzy sets were introduced by Zadeh (1975) and subsequently developed by Mendel and his co-authors (Mendel 2017). These systems have been widely used for solving many practical problems, such as time-series forecasting (Karnik and Mendel 1999; Abiyev 2010), channel equalization (Liang and Mendel 2000; Abiyev et al. 2011), robot control (Hagras 2004; Liu et al. 2007), dynamic plant identification and control (Castillo and Melin 2008; Biglarbegian et al. 2010; Abiyev and Kaynak 2010; Lin et al. 2015), time-series applications (Baklouti et al. 2018), servo system control (Kayacan et al. 2011), credit rating (Abiyev 2014) and gene expression (Shukla and Muhuri 2019).

The integration of ANNs and FSs allows us to automate the construction of the fuzzy rule-based models, decrease their design time and develop trainable fuzzy systems (Abiyev and Kaynak 2010; Lin et al. 2015; Baklouti et al. 2018; Kayacan et al. 2011; Abiyev 2014; Stojčić et al. 2019; Vilela et al. 2019). Neural networks (NNs) have been applied for designing type-2 fuzzy systems for dynamic plant identification and control (Abiyev and Kaynak 2010; Kayacan et al. 2011) and time-series predictions (Baklouti et al. 2018). The integration of FSs and NNs has been used for energy load prediction (Abiyev 2009, 2010). Fuzzy set theory is more robust in tolerating imprecision, vague, noisy or missing information. The use of wavelet functions in NN structure allows us to model local detail of nonlinear processes (Zekri et al. 2008; Zhang et al. 1995; Thuillard 2001). Fuzzy wavelet neural network-based systems were developed for dynamic plant control (Abiyev et al. 2013), and function learning (Ho et al. 2001). These research studies demonstrated good performances of designed models. However, most of these research studies are devoted to the solution of particular problems and sometimes did not adequately describe some special characteristics such as uncertainty, high nonlinearity of considered problems. To improve the performances of constructed models in the paper, we propose the design of a hybrid system based on type-2 fuzzy sets, neural networks and wavelet technology for modeling energy performances of residential buildings. The integration of these paradigms in system design offsets the demerits of one paradigm by the merits of another. Wavelet functions allow us to analyse non-stationary signals and discover their local details. Neural networks have self-learning characteristics that increase the accuracy of the model. The localization properties of wavelets used in NN structures allow modeling of the local detail of nonlinear processes. 
Fuzzy logic allows us to reduce the complexity of the data and to handle uncertainty and imprecision. The combination of fuzzy logic and wavelet neural networks automate the construction of the fuzzy rule-based models, decrease their design time and develop trainable fuzzy systems that can describe nonlinear systems characterized with uncertainties. At the same time, this hybrid structure allows approximation of complex functions more effectively. In the paper, these methodologies are integrated into the T2FWNN structure for modeling the energy performances of residential buildings. The contribution of the paper is summarized as follows: The structure of a multi-input and multi-output T2FWNN model that integrates type-2 fuzzy sets, neural networks and wavelet technology is proposed for modeling energy performance of residential buildings, in particular, prediction of cooling and heating loads of buildings; The learning algorithm of T2FWNN is designed using cross-validation technique, fuzzy clustering and gradient descent algorithm with an adaptive learning rate; using T2FWNN the energy consumption prediction model is designed for residential buildings in North Cyprus. Comparative results are provided to demonstrate the effectiveness of the designed T2FWNN models used for modeling energy performances and predicting the energy consumption of residential buildings.

The paper includes five sections. Section 2 proposes the T2FWNN model used for the estimation of energy performance. Section 3 presents the update algorithms used for adjusting the parameters of the T2FWNN. Section 4 presents a simulation of the T2FWNN system and comparative results of different models. Section 5 presents the conclusions of the paper.

2 T2FWNN model for estimation energy performance of buildings

In this section, the structure of the T2FWNN used for the estimation of the energy performance of residential buildings is presented. The designed system is based on type-2 fuzzy If–Then rules of TSK type. The antecedent parts of the considered rules utilize type-2 fuzzy sets, and the consequent parts utilize wavelet functions. Because of the localization properties of wavelet functions, the constructed model can capture local features of nonlinear processes, which increases the computational power of the T2FWNN model. This property reduces the number of rules required compared with a type-2 TSK system that uses linear functions in the consequent parts. The If–Then rules used in the paper are given as

$$ {\text{If}}\,x_{1} \,{\text{is}}\,\tilde{A}_{j1} \,{\text{and}}\, \ldots \,{\text{and}}\,x_{m} \,{\text{is}}\,\tilde{A}_{jm} \,{\text{Then}}\,y_{1} \,{\text{is}}\;w_{jk} \sum\limits_{i = 1}^{m} {(1 - z_{ij}^{2} )e^{{ - \frac{{z_{ij}^{2} }}{2}}} } $$
(1)

where x1, x2,…, xm are the input variables, y1, y2,…, yr are the output variables, \(\tilde{A}_{ji}\) are type-2 fuzzy sets, and wij (i = 1,…, m and j = 1,…, r) are coefficients. m and r are the numbers of inputs and fuzzy rules, respectively. \(z_{ij} = (x_{i} - b_{ij} )/a_{ij}\), where \(a_{ij}\) and \(b_{ij}\) are dilation and translation coefficients. As shown, the antecedent part of the rules includes type-2 fuzzy sets, and the consequent part includes Mexican Hat wavelet functions. The wavelet functions used in the rule base with different dilation and translation parameters allow capturing various essential features and behaviors of the nonlinear model. By learning these parameters and finding their optimal values, the proper rule base of T2FWNN can be designed.

The structure of the multi-input and multi-output T2FWNN used for estimation of energy performance is given in Fig. 1. The proposed T2FWNN model uses eight input and two output variables. The T2FWNN is based on the fuzzy rules of (1) and has six layers. The first layer distributes the input signals. The second layer represents type-2 fuzzy sets described by Gaussian MFs. A Gaussian MF is described by two parameters (center and width). Since a high number of parameters increases the learning time, we selected the Gaussian MF because it has fewer parameters. The uncertainties can be represented with the mean or standard deviation (STD) of membership functions. Figure 2a and b present Gaussian type-2 MFs with uncertain mean and uncertain STD, respectively. In the paper, MFs with uncertain mean and fixed STD (Fig. 2a) are used in the second layer. The Gaussian MFs μ1j(xi) used in this paper are described as

$$ \mu 1_{j} (x_{i} ) = e^{{ - \frac{{(x_{i} - c_{ij} )^{2} }}{{\sigma_{ij}^{2} }}}} $$
(2)

where \(c_{ij}\) and \(\sigma_{ij}\) are the center and width of the Gaussian MFs, respectively, x is the input vector, and the uncertain mean is c ∈ [c1, c2].
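To make the interval form of (2) concrete, the following is a minimal Python sketch of a Gaussian type-2 MF with uncertain mean and fixed width. The construction of the lower and upper bounds (the envelope of all Gaussians whose mean lies in [c1, c2]) is the standard one for interval type-2 sets and is an assumption here, since the text does not spell it out; the function names `gauss` and `interval_mf` are hypothetical.

```python
import math

def gauss(x, c, sigma):
    """Gaussian MF in the form of Eq. (2): exp(-(x - c)^2 / sigma^2)."""
    return math.exp(-((x - c) ** 2) / sigma ** 2)

def interval_mf(x, c1, c2, sigma):
    """Return (lower, upper) membership degrees for a mean in [c1, c2].

    Assumed construction (not stated in the text): the upper bound is the
    envelope of all Gaussians with mean in [c1, c2]; the lower bound is the
    Gaussian whose mean is farther from x.
    """
    if x < c1:
        upper = gauss(x, c1, sigma)
    elif x > c2:
        upper = gauss(x, c2, sigma)
    else:
        upper = 1.0  # x lies inside the interval of possible means
    midpoint = (c1 + c2) / 2.0
    lower = gauss(x, c2, sigma) if x <= midpoint else gauss(x, c1, sigma)
    return lower, upper

if __name__ == "__main__":
    print(interval_mf(0.5, 0.4, 0.6, 0.2))
    print(interval_mf(0.1, 0.4, 0.6, 0.2))
```

The gap between the two returned values is the footprint of uncertainty at x; it shrinks to zero when c1 = c2, which recovers an ordinary type-1 Gaussian MF.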

Fig. 1 T2FWNN model

Fig. 2 Gaussian type-2 fuzzy set: a uncertain mean, b uncertain STD

In the output of the nodes of the second layer, the type-2 membership degrees are calculated using (2). All the operations are implemented using interval type-2 fuzzy sets. Using (2) upper and lower MFs are derived in the second layer

$$ \mu_{{\tilde{A}_{k}^{j} }} (x_{k} ) = \left[ {\underline{\mu }_{{\tilde{A}_{k}^{j} }} (x_{k} ),\overline{\mu }_{{\tilde{A}_{k}^{j} }} (x_{k} )} \right] = \left[ {\underline{\mu }^{j} ,\overline{\mu }^{j} } \right] $$
(3)

here \(\underline{\mu }^{j}\) and \(\overline{\mu }^{j}\) are the lower and upper membership functions, respectively. The third layer is the rule layer. Here R1, R2,…, Rr represent the type-2 rules, and the number of rules is equal to the number of nodes. The t-norm min operation is used to determine outputs of the third layer

$$ \begin{gathered} \underline{f}_{j} = \min (\underline{\mu }_{{\tilde{A}_{1} }} (x_{1} ),\underline{\mu }_{{\tilde{A}_{2} }} (x_{2} ),...,\underline{\mu }_{{\tilde{A}_{m} }} (x_{m} )); \, \hfill \\ \overline{f}_{j} = \min (\overline{\mu }_{{\tilde{A}_{1} }} (x_{1} ),\overline{\mu }_{{\tilde{A}_{2} }} (x_{2} ),...,\overline{\mu }_{{\tilde{A}_{m} }} (x_{m} )) \hfill \\ \end{gathered} $$
(4)

here \(\underline{f}_{j}\) and \(\overline{f}_{j}\) are the lower and upper membership degrees of the j-th rule obtained using the t-norm min operation. The fourth layer is a consequent layer that calculates the outputs of wavelet functions. The number of wavelet functions is equal to the number of rules r. The outputs of the fourth layer are the products of the outputs of the third layer and of the wavelet networks that include the wavelet functions of the fourth layer. Here the contributions of wavelet functions to the output of the rules are determined. The wavelet functions are calculated as

$$ \Psi_{j} (z) = \sum\limits_{i = 1}^{m} {\left| {a_{ij} } \right|^{{ - \frac{1}{2}}} (1 - z_{ij}^{2} )e^{{ - \frac{{z_{ij}^{2} }}{2}}} } ;\quad y_{j} = w_{j} \Psi_{j} \left( z \right) $$
(5)

where \({\text{z}}_{ij} = \frac{{x_{i} - b_{ij} }}{{a_{ij} }}\), i = 1,…, m, j = 1,…, r. In the next fifth layer, the obtained output signals of the fourth layer are multiplied by the wjk weight coefficients. This operation allows scaling the output signals into the desired range. Then, the defuzzification and type-reduction are applied to find the network’s outputs (Biglarbegian et al. 2010; Abiyev and Kaynak 2010; Begian et al. 2008).

$$ u_{k} = \frac{{p\sum\limits_{j = 1}^{r} {\underline{f}_{j} y_{jk} } }}{{\sum\limits_{j = 1}^{r} {\underline{f}_{j} } }} + \frac{{q\sum\limits_{j = 1}^{r} {\overline{f}_{j} y_{jk} } }}{{\sum\limits_{j = 1}^{r} {\overline{f}_{j} } }} $$
(6)

Here yjk = wjkΨj(z), uk (k = 1,…, n) are the outputs of T2FWNN, r is the number of active rules, and n is the number of outputs. p and q are the parameters used for weighting the shares of the lower and upper levels of each rule; they adjust the lower or the upper portions according to the certainty level of the system.
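The computations of layers three to five in (4)–(6) can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the membership intervals are passed in as precomputed (lower, upper) pairs, and at least one rule is assumed to fire so that the denominators in (6) are nonzero.

```python
import math

def firing_interval(memberships):
    """Eq. (4): min t-norm over the (lower, upper) degrees of one rule."""
    lower = min(lo for lo, _ in memberships)
    upper = min(up for _, up in memberships)
    return lower, upper

def wavelet_sum(x, a, b):
    """Eq. (5): Psi_j(z) for one rule; x, a, b are length-m sequences."""
    total = 0.0
    for xi, ai, bi in zip(x, a, b):
        z = (xi - bi) / ai
        total += abs(ai) ** -0.5 * (1.0 - z * z) * math.exp(-z * z / 2.0)
    return total

def t2fwnn_output(f_lower, f_upper, psi, w, p=0.5, q=0.5):
    """Eq. (6): network outputs u_k.

    f_lower, f_upper: length-r firing bounds; psi: length-r wavelet outputs;
    w: r x n nested list with w[j][k] = w_jk. Assumes sum(f_lower) > 0.
    """
    n = len(w[0])
    sum_lo, sum_up = sum(f_lower), sum(f_upper)
    outputs = []
    for k in range(n):
        y_k = [w_j[k] * psi_j for w_j, psi_j in zip(w, psi)]  # y_jk
        lo = sum(f * y for f, y in zip(f_lower, y_k)) / sum_lo
        up = sum(f * y for f, y in zip(f_upper, y_k)) / sum_up
        outputs.append(p * lo + q * up)
    return outputs

if __name__ == "__main__":
    f = [firing_interval([(0.2, 0.9), (0.5, 0.6)]),
         firing_interval([(0.3, 0.7), (0.4, 0.8)])]
    psi = [wavelet_sum([0.5, 0.1], [1.0, 2.0], [0.4, 0.0]),
           wavelet_sum([0.5, 0.1], [0.5, 1.5], [0.6, 0.2])]
    print(t2fwnn_output([lo for lo, _ in f], [up for _, up in f],
                        psi, [[1.0, 2.0], [0.5, -1.0]]))
```

With p = q = 0.5 the output is the plain average of the lower and upper weighted means; moving p and q away from 0.5 shifts the result toward one bound.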

For the design of the T2FWNN model, the learning of the network parameters c1ij, c2ij, σij in the second layer, wjk, aij, bij (i = 1,…, m, j = 1,…, r, k = 1,…, n) in the fourth layer, and also the p and q parameters is carried out. In the next section, the learning of the parameters is presented.

3 Parameters’ learning

The development of the T2FWNN model consists of finding proper values of the parameters cij, σij, aij, bij, and wjk of the If–Then rules (1). The consequent parts of the If–Then rules define the behavior of the system in certain regions determined by the antecedent parts. A learning algorithm can be applied to find appropriate values of the parameters of the If–Then rules. Various techniques have recently been applied for this purpose. These are clustering, gradient algorithms, the least-squares method (LSM) and genetic algorithms (Abiyev and Kaynak 2010; Lin et al. 2015; Baklouti et al. 2018; Kayacan et al. 2011; Abiyev 2014; Abiyev et al. 2013). In this paper, the parameter update of T2FWNN is implemented using type-2 fuzzy clustering and gradient descent algorithms.

The learning of T2FWNN starts by determining the parameters of the antecedent part of the rules, which are the centers and widths of the MFs. For this purpose, type-2 fuzzy clustering is applied; it is presented in more detail in (Abiyev et al. 2011; Kayacan et al. 2011). As a result of clustering, the cij parameters corresponding to the cluster centers of fuzzy regions are obtained. These centers correspond to the centers of the MFs. Using the distances between the centers of membership functions, the widths σij are calculated.

After finding the parameters of the MFs, the parameters of the wavelet networks are determined. For this purpose, the cross-validation with the gradient descent algorithm is applied for parameter learning. At first, the initial values of the T2FWNN parameters are generated randomly. During initialization, the parameters of Gaussian functions and wavelet functions are generated using the change interval of input parameters. This approach allows fast training of T2FWNN parameters. The parameter update is carried out using the network output errors. The output error is calculated as

$$ E = \frac{1}{2}\sum\limits_{k = 1}^{n} {(u_{k}^{d} - u_{k} )^{2} } $$
(7)

where \(u_{k}^{d} \,{\text{and}}\,u_{k}\) are desired and current output signals, respectively. \(w_{jk} ,\,{\text{a}}_{ij} ,\,{\text{b}}_{ij}\), cij, and σij (i = 1,…, m, j = 1, …, r, k = 1, …, n) parameters of T2FWNN are adjusted as

$$ \begin{gathered} w_{jk} (t + 1) = w_{jk} (t) - \gamma \frac{\partial E}{{\partial w_{jk} }} + \lambda (w_{jk} (t) - w_{jk} (t - 1)); \, \hfill \\ {\text{a}}_{ij} (t + 1) = {\text{a}}_{ij} (t) - \gamma \frac{\partial E}{{\partial {\text{a}}_{ij} }} + \lambda ({\text{a}}_{ij} (t) - {\text{a}}_{ij} (t - 1));\;b_{ij} (t + 1) = b_{ij} (t) - \gamma \frac{\partial E}{{\partial b_{ij} }} + \lambda (b_{ij} (t) - b_{ij} (t - 1)); \hfill \\ \end{gathered} $$
(8)
$$ \begin{gathered} {\text{c1}}_{ij} (t + 1) = {\text{c1}}_{ij} (t) - \gamma \frac{\partial E}{{\partial {\text{c1}}_{ij} }} + \lambda ({\text{c1}}_{ij} (t) - {\text{c1}}_{ij} (t - 1));\;{\text{c2}}_{ij} (t + 1) = {\text{c2}}_{ij} (t) - \gamma \frac{\partial E}{{\partial {\text{c2}}_{ij} }} + \lambda ({\text{c2}}_{ij} (t) - {\text{c2}}_{ij} (t - 1)); \, \hfill \\ \sigma_{ij} (t + 1) = \sigma_{ij} (t) - \gamma \frac{\partial E}{{\partial \sigma_{ij} }} + \lambda (\sigma_{ij} (t) - \sigma_{ij} (t - 1)); \, \hfill \\ i = 1, \ldots ,m;\;j = 1, \ldots ,r;\;k = 1, \ldots ,n. \hfill \\ \end{gathered} $$
(9)

where γ is the learning rate, λ is the momentum. m, r and n are the numbers of input, hidden and output neurons of T2FWNN, respectively.
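Every update in (8) and (9) has the same shape: a gradient step plus a momentum term built from the previous change of the parameter. The sketch below applies that rule to a single scalar parameter on a toy quadratic error, purely to illustrate the mechanics; the toy objective, the step sizes and the function names are assumptions and are not taken from the paper.

```python
def momentum_step(theta, theta_prev, grad, gamma, lam):
    """theta(t+1) = theta(t) - gamma * dE/dtheta + lam * (theta(t) - theta(t-1))."""
    return theta - gamma * grad + lam * (theta - theta_prev)

def minimize_quadratic(target=3.0, gamma=0.1, lam=0.5, steps=200):
    """Toy example: minimize E = 0.5 * (theta - target)^2 with the rule above."""
    theta_prev = theta = 0.0
    for _ in range(steps):
        grad = theta - target  # dE/dtheta for the toy error
        # The right-hand side is evaluated with the old values before assignment.
        theta, theta_prev = momentum_step(theta, theta_prev, grad, gamma, lam), theta
    return theta

if __name__ == "__main__":
    print(minimize_quadratic())
```

In the network itself the same step would be applied to each of wjk, aij, bij, c1ij, c2ij and σij, with the gradient supplied by the corresponding derivative of E.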

The derivatives in (8) are computed as

$$ \frac{\partial E(t)}{{\partial w_{jk} }} = \frac{\partial E(t)}{{\partial u_{k} (t)}}\frac{{\partial u_{k} (t)}}{{\partial y_{jk} (t)}}\frac{{\partial y_{jk} (t)}}{{\partial w_{jk} }} = (u_{k} - u_{k}^{d} ) \cdot \psi_{j} {\text{(z)}} \cdot \left( {\frac{{p \cdot \underline{f}_{j} }}{{\sum\limits_{j = 1}^{r} {\underline{f}_{j} } }} + \frac{{q \cdot \overline{f}_{j} }}{{\sum\limits_{j = 1}^{r} {\overline{f}_{j} } }}} \right) $$
(10)
$$ \begin{gathered} \frac{\partial E(t)}{{\partial a_{ij} }} = \frac{\partial E(t)}{{\partial u_{k} (t)}}\frac{{\partial u_{k} (t)}}{{\partial y_{jk} (t)}}\frac{{\partial y_{jk} (t)}}{{\partial \psi_{j} (t)}}\frac{{\partial \psi_{j} (t)}}{{\partial z_{ij} (t)}}\frac{{\partial z_{ij} (t)}}{{\partial a_{ij} }} = \hfill \\ (u_{k} - u_{k}^{d} ) \cdot w_{jk} \cdot \left( {\frac{{p \cdot \underline{f}_{j} }}{{\sum\limits_{j = 1}^{r} {\underline{f}_{j} } }} + \frac{{q \cdot \overline{f}_{j} }}{{\sum\limits_{j = 1}^{r} {\overline{f}_{j} } }}} \right) \cdot \frac{{(3.5z_{ij}^{2} - z_{ij}^{4} - 0.5)e^{{ - \frac{{z_{ij}^{2} }}{2}}} }}{{\sqrt {a_{ij}^{3} } }} \hfill \\ \end{gathered} $$
(11)
$$ \begin{gathered} \frac{\partial E(t)}{{\partial b_{ij} }} = \frac{\partial E(t)}{{\partial u_{k} (t)}}\frac{{\partial u_{k} (t)}}{{\partial y_{jk} (t)}}\frac{{\partial y_{jk} (t)}}{{\partial \psi_{j} (t)}}\frac{{\partial \psi_{j} (t)}}{{\partial z_{ij} (t)}}\frac{{\partial z_{ij} (t)}}{{\partial b_{ij} }} = \hfill \\ (u_{k} - u_{k}^{d} ) \cdot w_{jk} \cdot \left( {\frac{{p \cdot \underline{f}_{j} }}{{\sum\limits_{j = 1}^{r} {\underline{f}_{j} } }} + \frac{{q \cdot \overline{f}_{j} }}{{\sum\limits_{j = 1}^{r} {\overline{f}_{j} } }}} \right) \cdot \frac{{(3z_{ij} - z_{ij}^{3} )e^{{ - \frac{{z_{ij}^{2} }}{2}}} }}{{\sqrt {a_{ij}^{3} } }} \hfill \\ \end{gathered} $$
(12)
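The wavelet factors in (11) and (12), that is, the derivatives of a single term of (5) with respect to the dilation a and the translation b, can be checked numerically. The sketch below compares the closed-form expressions with central finite differences for one input and one rule. It assumes a > 0 (so that |a| = a and the square root of a^3 is real); the test point and the helper names are arbitrary choices made for illustration.

```python
import math

def psi(x, a, b):
    """One term of Eq. (5): |a|^(-1/2) * (1 - z^2) * exp(-z^2 / 2), z = (x - b) / a."""
    z = (x - b) / a
    return abs(a) ** -0.5 * (1.0 - z * z) * math.exp(-z * z / 2.0)

def dpsi_da(x, a, b):
    """Wavelet factor of Eq. (11), valid for a > 0."""
    z = (x - b) / a
    return (3.5 * z ** 2 - z ** 4 - 0.5) * math.exp(-z * z / 2.0) / math.sqrt(a ** 3)

def dpsi_db(x, a, b):
    """Wavelet factor of Eq. (12), valid for a > 0."""
    z = (x - b) / a
    return (3.0 * z - z ** 3) * math.exp(-z * z / 2.0) / math.sqrt(a ** 3)

def central_diff(f, v, h=1e-6):
    """Second-order finite-difference approximation of f'(v)."""
    return (f(v + h) - f(v - h)) / (2.0 * h)

if __name__ == "__main__":
    x, a, b = 0.7, 1.3, 0.2
    print(dpsi_da(x, a, b), central_diff(lambda v: psi(x, v, b), a))
    print(dpsi_db(x, a, b), central_diff(lambda v: psi(x, a, v), b))
```

A check of this kind is a cheap way to catch sign or index slips before the analytic gradients are used in training.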

The gradient algorithm can be applied for learning of c1ij, c2ij, σij parameters. During the learning of the network, the parameters p and q that are used for weighting lower and upper levels of the output signal are adjusted. The update is started from the initial value of 0.5 as

$$ p(t + 1) = p(t) - \gamma^{p} (t)\frac{\partial E(t)}{{\partial p(t)}};\quad q(t + 1) = q(t) - \gamma^{q} (t)\frac{\partial E(t)}{{\partial q(t)}}; $$
(13)

Using (8)–(13), the parameters of T2FWNN are updated.

We use adaptive learning in order to speed up the learning process and guarantee convergence. The learning rate is adjusted according to the increase or decrease of root mean square of error R(t).

$$ R\left( t \right) = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {u_{i}^{d} \left( t \right) - u_{i} \left( t \right)} \right)^{2} } ; \quad {\text{decay}}\left( t \right) = \frac{{R\left( t \right) - R\left( {t - 1} \right)}}{R\left( t \right)} $$
(14)

The learning rate γ(t) is adjusted by the following formula, where t is the current epoch number.

$$ \gamma \left( t \right) = \begin{cases} \gamma \left( {t - 1} \right) \cdot 1.001, & {\text{if decay}}\left( t \right) < 0 \\ \gamma \left( {t - 1} \right)/1.01, & {\text{otherwise}} \end{cases} $$
(15)
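Rules (14) and (15) translate directly into a small helper. The sketch below is only an illustration of that rule; the function name and the example RMSE values are hypothetical, and the current RMSE is assumed to be nonzero.

```python
def adapt_learning_rate(gamma_prev, rmse_now, rmse_prev):
    """Eqs. (14)-(15): grow the rate slightly while the RMSE falls, shrink it otherwise."""
    decay = (rmse_now - rmse_prev) / rmse_now  # assumes rmse_now != 0
    if decay < 0:
        return gamma_prev * 1.001
    return gamma_prev / 1.01

if __name__ == "__main__":
    gamma = 0.1
    rmse_history = [2.0, 1.8, 1.7, 1.75, 1.6]  # made-up values for illustration
    for prev, now in zip(rmse_history, rmse_history[1:]):
        gamma = adapt_learning_rate(gamma, now, prev)
        print(round(gamma, 6))
```

The asymmetry is deliberate in the formula: the rate grows by 0.1% after an improving epoch but drops by about 1% after a worsening one, so a single bad epoch undoes roughly ten good ones.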

The adaptive adjustment helps to stabilize and speed up the learning process.

The design stages of the T2FWNN for estimation of energy performances of residential buildings are presented below

figure d

4 Simulations

The above described T2FWNN and its training algorithm are used for the estimation of the energy performance of residential buildings. We used two kinds of statistical data. At first, using the proposed T2FWNN model, we modeled the association strength of input and output variables using the real energy efficiency data set taken from the UCI machine learning data repository. In the second simulation, we designed the energy prediction model using the statistical data of energy utilization in residential buildings in Northern Cyprus. In the first simulation, the data set uses eight input and two output variables. Heating and cooling loads of buildings are the predicted output variables. Relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area and glazing area distribution are the input variables. 768 instances representing different buildings are used for modeling input–output relationship. The fragment of the energy efficiency data set is given in Table 1. Here X variables denote input and Y variables denote output signals.

Table 1 Energy efficiency data set

The training set of the T2FWNN-based system consists of the values of eight input and two output variables. Using the input and output variables, the architecture of the T2FWNN is constructed. The number of fuzzy rules, which are the hidden neurons, is set by the programmer. We used different numbers of hidden neurons (rules) in the T2FWNN model in order to obtain the required accuracy. The learning of the system was accomplished using the cross-validation technique. Using cross-validation, two independent data sets were generated: training and evaluation. During the training of the T2FWNN model, K-fold cross-validation was used for separating the data set into training and testing sets. Here the original data samples are randomly partitioned into k groups of equal size. A single group is used as a validation group for testing, and the remaining k-1 groups are used for training. In the paper, the k value was taken as 10, so the cross-validation is repeated 10 times (the number of folds). The training is continued for k epochs set by the programmer. In each epoch, one group of data is used for evaluation of the model, and the remaining part is used for training. In each epoch, the testing data are moved to the next group so that all groups are tested. The final accuracy is determined by averaging the accuracies obtained from the folds.
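The fold rotation described above can be sketched with the standard library alone. This is a minimal illustration, not the authors' code: the function name and the random seed are hypothetical, and because 768 is not divisible by 10 the groups here differ in size by at most one sample rather than being exactly equal.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each group serves as the test set once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)        # random partition of the samples
    folds = [idx[i::k] for i in range(k)]   # k groups of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test

if __name__ == "__main__":
    for fold_no, (train, test) in enumerate(k_fold_indices(768, k=10), start=1):
        print(fold_no, len(train), len(test))
```

Averaging the per-fold error over the ten (train, test) pairs gives the final figure reported for the model.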

Figures 3 and 4 depict the scatter plots of each input variable against each output variable, namely the heating and cooling loads. These plots demonstrate that the associations between input and output variables are highly complex. T2FWNN is one of the effective approaches for representing such relationships. In the paper, the relationships between input parameters and output variables are modeled using T2FWNN. The considered T2FWNN model has eight inputs and two outputs.

Fig. 3 Relationship between input variables and output heating load

Fig. 4 Relationship between input variables and output cooling load

During the modeling, we recorded the results obtained for the training, evaluation and testing modes. For measuring the performance of the system, the mean square error (MSE) and the root mean square error (RMSE) were used.

$$ M = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {Y_{i}^{d} - Y_{i} } \right)^{2} ;\quad R = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {Y_{i}^{d} - Y_{i} } \right)^{2} } $$
(16)

Here N is the number of samples (data items), M is MSE, and R is RMSE. For the test subset N = 768; for the training subset N = 768*K fold, where K fold is the number of folds. In the paper, tenfold cross-validation is used, therefore K fold = 10.
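The two measures in (16) are straightforward to compute. The following sketch uses only the standard library; the function names and the sample values are illustrative and are not data from the paper.

```python
import math

def mse(desired, predicted):
    """Mean square error M of Eq. (16)."""
    return sum((d - y) ** 2 for d, y in zip(desired, predicted)) / len(desired)

def rmse(desired, predicted):
    """Root mean square error R of Eq. (16)."""
    return math.sqrt(mse(desired, predicted))

if __name__ == "__main__":
    y_desired = [15.5, 20.8, 21.4, 32.3]    # made-up load values
    y_predicted = [15.9, 20.1, 22.0, 31.8]
    print(mse(y_desired, y_predicted), rmse(y_desired, y_predicted))
```

RMSE is expressed in the same units as the loads themselves, which makes it easier to read than MSE when comparing models.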

During the updating of parameters, a type-2 fuzzy clustering algorithm is applied to partition the input space and find the cluster centers, which are the centers of the MFs. Using the distances between cluster centers, the widths of the membership functions are computed. These two parameters are the parameters of the second layer of T2FWNN (Fig. 1). After determining the centers and widths of the membership functions in the second layer of the network, a gradient descent algorithm is applied in order to determine the parameters of the wavelet functions in the fourth layer of T2FWNN. Learning is carried out using a cross-validation approach. Initially, the parameters of the wavelet functions are generated randomly in the interval [f, h], where f is the corresponding minimum and h the corresponding maximum value of the input data set. The parameters are then adjusted using the error of the network output and the learning formulas given in Sect. 3. Formula (16) is used to evaluate the performance of the simulated T2FWNN models. The plot of RMSE obtained from the training is depicted in Fig. 5. The simulation is carried out using 8 and 16 hidden neurons. The training is continued for 200 epochs. For each set of obtained clusters and wavelet parameters, the results of the simulation are given in Table 2, which lists the simulation results of the T2FWNN system using 8 and 16 hidden neurons. Using 16 hidden neurons, the RMSE value was 1.582 for the training data and 1.614 for the evaluation data. After learning, the RMSE value for the testing data was 1.599.

Fig. 5 Plot of RMSE

Table 2 T2FWNN simulation results

The simulation results of T2FWNN models using 8 and 16 rules are depicted in Table 2 correspondingly. For each output variables, the values of RMSE are fixed for training, evaluation and testing stages. Comparative results have been provided in order to show the efficiency of the designed system. For this purpose, the simulation results of the T2FWNN model were compared with the simulation results of the ANN-based and fuzzy neural network (FNN)-based models. Table 3 presented the values of RMSE averaged over ten simulations. As shown from Tables 2 and 3, the T2FWNN model more accurately describes heating load than cooling load.

Table 3 MSE values for each output obtained with 16 rules

The comparative results of different machine learning models used for estimation of energy performances of buildings are presented in Table 4. The used models are an iteratively reweighted least-squares (IRLS), random forest (RF), radial basis function (RBF) network, alternating model tree (AMT), lazy K-star, neural networks (NN), fuzzy neural networks (FNN) and T2FWNN. For comparative purpose, we used the mean squared error and root mean squared error. As it was shown, the simulation results obtained with the T2FWNN model are better than the ones obtained from other models. The wavelet functions used in the T2FWNN model allow us to catch the local details of the input–output relationships. The proposed model can be a practical solution for modeling the energy performance of buildings. At the same time, RF and RBF models have good performance also. The comparisons demonstrate the efficiency of the application of T2FWNN-based energy consumption model in real life.

Table 4 Comparison of different models

The correlation between the actual cooling and heating loads and the predicted model outputs is shown in Fig. 6. Here the R2 measure, which represents the strength of the relationship between actual and predicted values, is calculated. The R2 value was 0.9995 for the heating load and 0.975 for the cooling load. As the figures show, the obtained model accurately describes the input–output relationships. Tables 2, 3 and 4 show that the RMSE value obtained for the heating load is lower than the one obtained for the cooling load. Likewise, the R2 values in Fig. 6 show that the relationship is stronger for the heating load than for the cooling load. The analysis shows that the constructed model describes the associations between building parameters and the heating load more accurately than those between building parameters and the cooling load. Table 4 gives the performance of the other models for comparison. The comparative results show that the T2FWNN is the best method among the presented approaches and most accurately describes the association between the building’s parameters and energy utilization.

Fig. 6 Chart of the T2FWNN results for (a) heating and (b) cooling load estimation
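The two accuracy measures used in this comparison can be sketched in a few lines of Python. The sketch assumes that formula (16) is the standard root mean squared error and that R2 is the coefficient of determination; neither definition is restated in this section, so both are assumptions. The function names rmse and r_squared are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared_error / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (not the heating or cooling load data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

A lower RMSE and an R2 closer to 1 both indicate a closer fit, which is why the two measures agree in ranking the heating load model above the cooling load model.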

In the second simulation, the T2FWNN was applied to the prediction of energy consumption in residential buildings in Northern Cyprus. Monthly data characterizing the energy consumption of residential buildings between 2004 and 2020 were used for modeling. Prediction of energy consumption plays an important role in enhancing the energy efficiency of buildings. Northern Cyprus does not produce petroleum or gas and imports them from abroad. In Northern Cyprus, residential buildings use more than 30% of the total energy, and the problem is to meet the energy demand of customers. In summer, a substantial part of the energy is used for cooling houses; in winter, a substantial part is used for heating. The rest of the energy is used for lighting, cooking and home appliances. The overall electricity consumption data for residential buildings in Northern Cyprus were obtained from KibTek Corporation. Energy consumption prediction allows advance planning of energy production, thereby preventing unexpected energy shortages. The data characterize the monthly energy consumption of residential buildings. We organized the data set into input and output training pairs. First, we considered a 3-step-ahead prediction of energy consumption. We used the input data x(t-9), x(t-8), x(t-3), x(t-2) and x(t) to predict the signal x(t+3). Using a five-dimensional input vector and a one-dimensional output vector, the training and testing sets were generated. These data sets are used in designing the T2FWNN prediction model. The structure of the network is selected according to the five-dimensional input vector and one-dimensional output vector. The number of rules in the second hidden layer is selected by the programmer. The T2FWNN system was simulated using different numbers of rules.
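The construction of the input and output pairs can be sketched as follows. This is a minimal illustration that assumes the monthly series is held in a plain Python list; the helper name make_lagged_pairs and the toy series are hypothetical, and only the lags and the prediction horizon are taken from the text.

```python
def make_lagged_pairs(series, lags, horizon):
    """Build (inputs, target) pairs from a time series.

    For each valid time index t, the inputs are x(t - lag) for every lag
    in `lags`, and the target is x(t + horizon).
    """
    max_lag = max(lags)
    pairs = []
    # t must leave room for the largest lag behind and the horizon ahead.
    for t in range(max_lag, len(series) - horizon):
        inputs = [series[t - lag] for lag in lags]
        pairs.append((inputs, series[t + horizon]))
    return pairs


# Toy series standing in for the monthly consumption data.
series = list(range(20))

# Inputs x(t-9), x(t-8), x(t-3), x(t-2), x(t); target x(t+3).
pairs = make_lagged_pairs(series, lags=[9, 8, 3, 2, 0], horizon=3)
print(pairs[0])
```

Each pair holds a five-dimensional input vector and a scalar target, matching the network structure described above. Under the same assumptions, the one-step-ahead case described later would use a different lag list and horizon=1.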
During training, cross-validation was applied. The training was run for 200 epochs. Figure 7 depicts the plot of RMSE values obtained for the training data. With 10 hidden neurons, the training and validation errors were 0.052754 and 0.052858, respectively. After training, the system was tested, and the test error was 0.051163. Figure 8 depicts the plots of actual and 3-step-ahead predicted values of energy consumption. The prediction error is plotted in Fig. 9. Comparative results of different models are provided in order to show the efficiency of the proposed T2FWNN system. Table 5 gives the comparative results of the different prediction models.

Fig. 7 The plot of RMSE obtained during training

Fig. 8 The plots of actual and 3-step ahead predicted values of energy consumption

Fig. 9 The plot of prediction error

Table 5 Comparative results of different models for 3-step-ahead prediction

In the next simulation, we performed one-step-ahead prediction using the T2FWNN. The signals x(t-11), x(t-5), x(t-4), x(t-1) and x(t) were used as inputs to the system, and the predicted output value was x(t+1). The RMSE values for the training, validation and test data were 0.058245, 0.059007 and 0.057475, respectively. Figure 10 depicts the plots of actual and one-step-ahead predicted values of energy consumption. Figure 11 depicts the plot of errors obtained during training. To show the efficiency of the constructed system, we compared the T2FWNN results with those of other machine learning techniques, such as ANN- and FNN-based models. Comparative results of the models are given in Table 6. The RMSE value was used to measure the performance of the models. As shown in the table, the T2FWNN model performs better than the neural network (NN) and type-1 fuzzy neural network (Fuzzy NN) based models.

Fig. 10 The plots of actual and one-step-ahead predicted values of energy consumption

Fig. 11 The plot of errors

Table 6 Comparative results of different models

In this paper, the T2FWNN model is proposed for predicting energy consumption in residential buildings. The data set is used to predict the future value of the total energy used in the buildings. The simulation results obtained indicate the efficiency of the proposed T2FWNN model in energy prediction.

5 Conclusions

In this paper, a novel T2FWNN model is proposed for estimating the energy performance of residential buildings. The T2FWNN was designed by integrating fuzzy clustering, a gradient descent algorithm and a cross-validation technique. The T2FWNN was proposed for solving two problems: the first is determining the associations between building parameters and energy consumption for accurate prediction of the buildings’ energy load, and the second is predicting energy consumption in residential buildings in Northern Cyprus. The T2FWNN prediction models were developed using statistical data and training algorithms. A special adaptive learning algorithm was developed to stabilize and speed up the training of the T2FWNN model. After the design of the prediction models, comparative results were provided to show the efficiency of the designed T2FWNN models. The simulation results demonstrate that the T2FWNN achieves better accuracy than the existing models used for modeling the energy performance of residential buildings, as presented in Table 4. The designed T2FWNN structure was also applied to the prediction of energy consumption of residential buildings in Northern Cyprus. Experimental results indicate that the T2FWNN prediction model outperforms neural network (NN) and type-1 fuzzy neural network-based models in terms of the prediction accuracy indices MSE and RMSE, as presented in Table 6. Comparative results with different machine learning models demonstrated the efficiency of the proposed T2FWNN model in predicting the energy consumption of residential buildings. Future research will focus on improving the learning algorithm of the T2FWNN model and applying the presented model to other prediction and classification problems in engineering.