Introduction

Time series data are a collection of observations recorded at equal time intervals [1, 2]. In recent years, research on time series modeling has grown manifold. As a result of extensive research, changes in the behavior of time series data can be predicted [3, 4]. A time series comprises the following components: (1) Trend: the chief component, which is the outcome of the long-term movement of various factors [1, 5]. When a time series shows a steady upward or downward movement over a long duration of time, we call it a trend. (2) Cyclic: the component that occurs when the time series shows increases and decreases over an uneven period. This component usually stretches over long intervals. Cyclical variation is observed in most financial and economic time series. (3) Seasonal: the component in which the time series is influenced by periodic factors that repeat regularly at a fixed interval of time, i.e., weekly, monthly, or quarterly. It arises from many kinds of factors such as traditional events, weather, and climate, e.g., the sale of tea during winter, the sale of ice cream during summer, or a Durga Puja sale. (4) Irregular: the unpredictable component, consisting of random variations that can be caused by factors such as war, earthquake, or flood.
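As an illustration, the components above can be separated with a simple additive decomposition. The sketch below is illustrative only: the synthetic monthly series, the period of 12, and the moving-average detrending are assumptions for demonstration, not part of this paper's method.

```python
import numpy as np

# Decompose a synthetic monthly series into trend, seasonal, and
# irregular components (additive model, period = 12).
rng = np.random.default_rng(0)
n, period = 120, 12
t = np.arange(n)
series = 0.5 * t + 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 1, n)

# Trend: moving average over one full season.
kernel = np.ones(period) / period
trend = np.convolve(series, kernel, mode="same")

# Seasonal: average detrended value at each position within the period.
detrended = series - trend
seasonal = np.array([detrended[i::period].mean() for i in range(period)])

# Irregular: what remains after removing trend and seasonality.
irregular = detrended - np.tile(seasonal, n // period)
```

The same idea underlies standard decomposition routines; cyclic variation, which has no fixed period, would require a longer series and band-pass filtering to isolate.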

The stock market is a public marketplace that exists to issue, buy, and sell stocks [1, 6, 7]. A stock is a partial ownership in a company that entitles the holder to a share of its profit. History shows that stock prices and the prices of other assets strongly influence economic activity: when stock prices rise, the economy tends to rise as well [1].

Another instance of a time series dataset is currency exchange data. It is a multivariable nonlinear system. Due to the erratic nature of the exchange rate market, forecasting it has been an extremely challenging task. Researchers have therefore developed various neural network techniques to model such multivariable nonlinear systems. Neural network techniques can adapt extensively and learn from past data [8].

In a complex commercial building, electricity consumption is inherently nonlinear and transient in nature. Medium- to long-term forecasts of hourly electricity usage in residential and commercial buildings are needed to support demand response strategies, operational decision making, and the installation of distributed generation systems. With progress in smart metering, forecasting of sub-meter usage at the household level is also becoming widespread in demand response programs and smart building control [9].

In time series, the training dataset and the newly produced data are not from the same distribution: real-world time series data are non-stationary, and the statistical properties of the distribution shift as new data enter. The only remedy is to retrain the model every time new data come in. This differs from continuous learning, where an already trained model is updated whenever new data arrive; here, a new model is retrained every time a new forecast is generated. With the growth of business, time series forecasting becomes harder as the amount of data increases. Nowadays, the stock market is built by combining various technologies like machine learning, big data, and expert systems that communicate with each other to reach accurate decisions. The connectivity of users in the global environment of the internet has led to decreased stability, susceptibility to customer sentiment, and proneness to malicious attacks. Hence, many hybrid approaches, such as combinations of statistical and machine learning methods, have been developed that perform better in stock market prediction [10].

DBN is a generative model that creates samples according to the features the model learns at the time of training [11]. DBN is a type of deep learning algorithm. It is an effective algorithm that helps solve problems like slow learning and the phenomenon of overfitting [12]. The features extracted by a DBN have higher separability and robustness, which brings higher classification accuracy and improves classifier performance [13].

In this paper, DBN is used for the analysis of three time series datasets: stock price data, exchange rate data, and electricity consumption data. First, the data are pre-processed to remove all missing values. Then, the data are normalized using the min–max normalization method. Finally, the processed data are fed to the DBN, where classification is executed. Performance is evaluated with the help of Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and time of execution. The results are then compared with other machine learning algorithms like Particle Swarm Optimization (PSO) and Local Linear Wavelet Neural Network (LLWNN).

This paper is organized as follows: after the introduction to the problem, "Related Work" section discusses the related work. "Deep Belief Network (DBN)" section discusses DBN. Other predictive models are discussed in "Predictive Models" section. Findings and analysis of DBN as well as the other predictive models are discussed in "Finding and Analysis" section. Finally, the conclusion is drawn in "Conclusion" section.

Related Work

A deep Long Short-Term Memory Neural Network (LSTMNN) was proposed with an embedded layer and an automatic encoder for predicting the stock market. The experimental result shows that LSTM with an embedded layer is better, with 57.2% accuracy [14]. The performances of the two models were verified using Sinopec and the Shanghai A-share composite index; the Shanghai A-share composite index shows better predictive accuracy [14]. A Convolutional Neural Network algorithm was developed to predict a stock price dataset, i.e., historical Nifty data from January 5, 2015, to December 27, 2019. Here, 8 regression and 8 classification methods were used, of which CNN shows the best result with a Root Mean Square Error (RMSE) of 1.09. The results show that neural network-based models are highly capable of extracting and learning the features of the training dataset. Moreover, multivariate analysis enables higher accuracy than univariate models [15].

An adaptive prediction model was developed with a knowledge-guided artificial neural network (KGANN) to predict the exchange rate efficiently [16]. The new method has two parallel systems: the first is an LMS-trained adaptive linear combiner, and the second is a Functional Link Artificial Neural Network (FLANN) model, together delivering a more precise exchange rate than that predicted by the LMS or FLANN model independently. KGANN thus provides a more effective model for predicting the exchange rate than LMS and FLANN individually [16]. MLFNN and NARX with 0.001 Mean Square Error (MSE) show better forecasting efficiency than other techniques according to the MSE plot [17]. An Improved Shuffled Frog Leaping-based Learning Strategy (ISFL) was integrated with a Pi-Sigma Neural Network (PSNN) to predict the exchange rate of the US dollar against three different currencies, the Canadian Dollar (CAD), Swiss Franc (CHF), and Japanese Yen (JPY), from Jan 2014 to Nov 2017. The result shows that the Pi-Sigma ISFL with USD/CAD achieves about 0.0197 RMSE, which is much better than other techniques. The performance of the proposed model could be tested on other time series data, and a hybrid robust learning method could be developed for PSNN [18].

A Recurrent Neural Network (RNN) algorithm was developed to support demand response strategies, decision making about processes, and the connection of distributed generation systems [19]. The outcomes were compared with a 3-layered Multi-Layered Perceptron (MLP): the RNN shows better performance on an hourly basis, but on a yearly basis the MLP outperforms the RNN. This model helps to obtain surrogates for missing transient variables that affect the load profile in commercial structures [19]. A Cost-Effective Firefly-based Algorithm (CEFA) was developed to initialize the population, encode the problem, and evaluate fitness in order to provide optimized, cost-effective workflow execution within a time limit. CEFA is compared with different algorithms like IC-PCP, RWO, PSO, and robustness-cost-time, and uses the CloudSim tool for simulation. This work can be extended to a real cloud environment [20]. A Soft-Margin Complex Polyhedron Classifier (SM-CPC) was developed to provide better accuracies than well-known classifiers. Such classifiers are helpful in providing noise tolerance and in handling instances where two classes share common examples, but they are not suitable for high-dimensional and large-scale classification problems [21].

Deep Belief Network (DBN)

DBN is a class of unsupervised network built by stacking Restricted Boltzmann Machines (RBMs) [22, 23]. An RBM consists of an observable layer (input layer) and a hidden layer. The conditional probabilities can be expressed as:

$$P\left(H/V\right)=\frac{P(H,V)}{P(V)}$$
(1)
$$P\left(V/H\right)= \frac{P(V,H)}{P(H)}$$
(2)

P(H/V) and P(V/H) are conditional probabilities, where H = hidden vector and V = input vector

Energy-based RBM model:

$$P\left(H,V\right)=P\left(V,H\right)=\frac{1}{Z}{e}^{-E(V,H)}$$
(3)

E(V, H) is the energy of hidden and visible unit joint configuration.

$$Z={\sum }_{V,H}{e}^{-E(V,H)}$$
(4)

RBMs training process:

Step 1 Initialize

The weights are initialized with a normal distribution for each input data \({X}_{t},t\in \left[1, Z\right], V={X}_{t}\)

Step 2 \(P\left(H/V\right)\), the probability of hidden layer is computed

$$P\left(H/V\right)=\sigma (wV)$$

Step 3 \(P\left(V/H\right)\), the probability of reconstructed input layer is computed

$$P\left(V/H\right)=\sigma (wH)$$

Step 4 Find \(\Delta w\), the reconstruction error

$$\Delta w={<VH>}_{\mathrm{data}}-{<VH>}_{\mathrm{reconstruction}}$$

Step 5 Calculate E(V, H), energy function and update ‘w’, weights

$$w\leftarrow w+\varepsilon \left(\Delta w\right)$$
$$E\left(V,H\right)=-{\sum }_{i}{V}_{i}{B}_{i}-{\sum }_{i}{H}_{j}{C}_{j}-{\sum }_{i,j}{V}_{i}{H}_{j}{w}_{ij}$$

where i = 1……..n, j = 1……..m, \({B}_{i}\) and \({C}_{j}\) are bias.

Step 6 Steps 2 to 5 are repeated until \(E\left(V, H\right)\) converges.
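The loop in Steps 2–5 amounts to one-step contrastive divergence. Below is a minimal sketch of these steps; the biases \({B}_{i}\) and \({C}_{j}\) are omitted for brevity, and the layer sizes, learning rate, and iteration count are illustrative assumptions rather than values from this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_visible, n_hidden, lr = 6, 4, 0.1
w = rng.normal(0, 0.01, size=(n_hidden, n_visible))   # Step 1: initialize w

V = rng.integers(0, 2, size=n_visible).astype(float)  # one input sample X_t
for _ in range(100):
    p_h = sigmoid(w @ V)                  # Step 2: P(H|V) = sigma(wV)
    H = (rng.random(n_hidden) < p_h).astype(float)
    p_v = sigmoid(w.T @ H)                # Step 3: P(V|H) = sigma(wH)
    p_h_recon = sigmoid(w @ p_v)
    # Step 4: delta_w = <VH>_data - <VH>_reconstruction
    dw = np.outer(p_h, V) - np.outer(p_h_recon, p_v)
    w += lr * dw                          # Step 5: w <- w + eps * delta_w
```

In practice, the update is averaged over mini-batches of samples rather than a single vector, and the energy E(V, H) is monitored for convergence.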

Fine Tuning

Step 1 Train the 1st RBM with the data X.

Step 2 Fix 'w', \({B}_{i}\), and \({C}_{j}\) of the 1st RBM. The states of the hidden units are used as visible data for the 2nd RBM.

Step 3 After training, the 2nd RBM is stacked on top of the 1st RBM.

Step 4 Repeat steps 2 and 3 for the desired number of layers, each time propagating upward either the mean values or samples.

Step 5 Fine-tune all parameters of this model with respect to a proxy for the DBN log-likelihood.
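The greedy layer-wise stacking of Steps 1–4 can be sketched as follows. This is an illustration only: the CD-1 helper, the layer sizes, and the epoch count are assumptions, and the supervised fine-tuning of Step 5 is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, lr=0.1, epochs=50, seed=0):
    """One-step contrastive-divergence training of a single RBM (sketch)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0, 0.01, size=(data.shape[1], n_hidden))
    for _ in range(epochs):
        p_h = sigmoid(data @ w)
        p_v = sigmoid(p_h @ w.T)             # reconstruction of visible units
        p_h2 = sigmoid(p_v @ w)
        w += lr * (data.T @ p_h - p_v.T @ p_h2) / len(data)
    return w

# Greedy layer-wise stacking: the hidden activations of each trained RBM
# become the visible data for the next one.
rng = np.random.default_rng(2)
X = rng.integers(0, 2, size=(32, 8)).astype(float)
layer_sizes, weights, layer_input = [6, 4], [], X
for n_hidden in layer_sizes:
    w = train_rbm(layer_input, n_hidden)
    weights.append(w)
    layer_input = sigmoid(layer_input @ w)   # propagate mean values upward
```

After stacking, the collected weights initialize a feed-forward network that Step 5 fine-tunes with a supervised objective.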

As the name suggests, a DNN is a feed-forward neural network with many layers, whereas a DBN is a generative probabilistic model that has undirected connections between its top layers, as in an RBM [24].

Advantages of DBN over other networks are: it has a higher modeling capacity per parameter, and it has a well-organized training process that combines unsupervised learning for feature discovery with a subsequent stage of supervised learning that fine-tunes the features to improve discrimination [22, 25].

Predictive Models

Particle Swarm Optimization (PSO)

In this algorithm, each particle in the population repeatedly adapts toward previously successful regions. Velocity update and position update are the two main operators. In each generation, every particle is accelerated toward its personal best position and the global best position. Each particle's velocity is computed, and the particles then move according to the following equations [26]:

Velocity is determined as:

$${x}_{ij}=p{x}_{ij}+{k}_{1}{\times r}_{1}\left({T}_{i}-{b}_{ij}\right)+{k}_{2}\times {r}_{2}\left({G}_{\mathrm{best}}-{b}_{ij}\right)$$
(5)

Each particle location is updated as:

$${b}_{ij}={b}_{ij}+{x}_{ij}$$
(6)

where x = velocity of the particle, b = position of the particle, \({T}_{i}\) = local best value, \({G}_{\mathrm{best}}\) = global best value, \(p\) = inertia weight, k1 & k2 = acceleration constants, \({r}_{1}\) & \({r}_{2}\) = random variables between 0 and 1.


PSO algorithm [27]:

  1. Particles are randomly initialized.

  2. The fitness value of each particle is computed with the objective function and recorded as Pbest.

  3. The position and speed of every particle are updated:

  4. the velocity of each particle is updated with Eq. (5), and

  5. the location of each particle is updated with Eq. (6).

  6. Gbest and Pbest are updated until the stopping criterion is met.
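The steps above can be sketched as follows, minimizing a simple sphere function as the objective. The swarm size, inertia weight p, and acceleration constants k1, k2 are illustrative assumptions, not settings from this paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n_particles, dim = 20, 2
p, k1, k2 = 0.7, 1.5, 1.5                     # inertia, acceleration constants
fitness = lambda b: np.sum(b**2, axis=1)      # objective to minimize

b = rng.uniform(-5, 5, (n_particles, dim))    # positions (step 1)
x = np.zeros((n_particles, dim))              # velocities
T = b.copy()                                  # personal bests (Pbest)
T_val = fitness(T)
G = T[T_val.argmin()].copy()                  # global best (Gbest)

for _ in range(200):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    x = p * x + k1 * r1 * (T - b) + k2 * r2 * (G - b)   # Eq. (5)
    b = b + x                                           # Eq. (6)
    val = fitness(b)
    better = val < T_val                                # update Pbest
    T[better], T_val[better] = b[better], val[better]
    G = T[T_val.argmin()].copy()                        # update Gbest
```

After the loop, G should lie close to the optimum at the origin.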

Local Linear Wavelet Neural Network (LLWNN)

Wavelet Neural Network (WNN) delivers improved learning effectiveness and structural transparency compared to a multilayer perceptron (Fig. 1). A limitation of WNN is that, when it is used for multidimensional problems, many hidden units are needed [28, 29]. LLWNN was therefore developed as a modification of WNN. Figure 2 shows the architecture of LLWNN. The output of the output layer is determined as:

$$y=\sum_{k=1}^{N}\left({w}_{k0}+{w}_{k1}{t}_{1}+\cdots +{w}_{kn}{t}_{n}\right){\varphi }_{k}\left(t\right)$$
(7)

\(y= \sum_{k=1}^{N}\left({w}_{k0}+{w}_{k1}{t}_{1}+\cdots +{w}_{kn}{t}_{n}\right){\left|{p}_{k}\right|}^{-\frac{1}{2}}\varphi \left(\frac{t-{q}_{k}}{{p}_{k}}\right)\) where t = [t1, t2,…, tn].

Fig. 1
figure 1

A local linear wavelet neural network

Fig. 2
figure 2

Flowchart of DEEPNN model

In LLWNN, piecewise constant weight is used, i.e., a linear model is presented as

$${v}_{k}={w}_{k0}+{w}_{k1}{t}_{1}+\cdots +{w}_{kn}{t}_{n}$$
(8)

The linear models \({v}_{k}\) (k = 1, 2, …, N) are weighted by the wavelet activation functions \({\varphi }_{k}\left(t\right)\) (k = 1, 2, …, N), which are localized. The translation, dilation, and local linear parameters are initialized randomly at the start and then optimized by the RLS (Recursive Least Squares) algorithm.
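Equations (7) and (8) can be sketched as a forward pass. This is an illustrative sketch: the Mexican-hat mother wavelet and the product form of the multidimensional wavelet are assumptions, since the text does not fix them, and the parameters are random rather than RLS-trained.

```python
import numpy as np

def mexican_hat(z):
    """An assumed mother wavelet (Mexican hat)."""
    return (1 - z**2) * np.exp(-z**2 / 2)

def llwnn_output(t, W, p, q):
    """t: input vector (n,); W: (N, n+1) local linear weights w_k0..w_kn;
    p, q: (N,) dilation and translation parameters (one scalar per unit)."""
    y = 0.0
    for k in range(W.shape[0]):
        v_k = W[k, 0] + W[k, 1:] @ t           # Eq. (8): local linear model
        z = (t - q[k]) / p[k]                  # translated and dilated input
        phi_k = np.abs(p[k]) ** -0.5 * np.prod(mexican_hat(z))
        y += v_k * phi_k                       # Eq. (7): weighted sum
    return y

rng = np.random.default_rng(4)
n, N = 3, 5
y = llwnn_output(rng.normal(size=n), rng.normal(size=(N, n + 1)),
                 rng.uniform(0.5, 2.0, N), rng.normal(size=N))
```

The key design point is that the constant output weight of a WNN hidden unit is replaced by the local linear model v_k, so far fewer wavelet units are needed for multidimensional inputs.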

Finding and Analysis

Different types of time series datasets are used to check the performance of the DBN algorithm. First, the features are selected from the training data. Then, the relevant features are provided to the multilayer perceptron, where the DBN updates the parameters. The performance of the classifier is validated on three datasets: 70% of the data are taken for training and 30% for testing. Data pre-processing is carried out, in which the missing values are discarded before the data are checked with the model. After that, the data are normalized by min–max normalization. The min–max normalization method is specified as:

$$X_{\mathrm{normalized}}=\frac{X_{\mathrm{original}}-X_{\mathrm{minimum}}}{X_{\mathrm{maximum}}-X_{\mathrm{minimum}}}$$
(9)
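Equation (9), applied column-wise, can be sketched as follows; the sample values are illustrative only.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column of the raw data into [0, 1] per Eq. (9)."""
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

# Two illustrative features with very different scales.
data = np.array([[100.0, 5.0],
                 [150.0, 10.0],
                 [200.0, 7.5]])
normalized = min_max_normalize(data)
```

Normalizing per column removes the measurement units, so features on different scales contribute comparably to training.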

A summary of the complete work is provided in Fig. 3 and described as:

Fig. 3
figure 3

Real versus forecast during training for DBN in stock market

Step 1 Loading the time series data, i.e., Currency Exchange rate, Household Electricity Consumption, and Yahoo Inc. datasets.

Step 2 Identification of the attributes and features.

Step 3 Data pre-processing is applied, in which the raw data are transformed into an understandable format. The original data may contain some errors and are mostly inconsistent and incomplete.

Step 4 Classification is started after the processed data are fed to the DBN.

Step 5 Calculation of classification accuracy is executed using different metrics like RMSE, MAPE, and time of execution.

Step 6 The classification accuracy of the other techniques is computed as well and then compared with that of DBN.

Step 7 The confusion matrix of DBN is constructed for Currency Exchange rate, Household Electricity Consumption, and Yahoo Inc. datasets.

Step 8 End the process.

The DBN performance is compared with different methods like PSO and LLWNN. The DBN performance is evaluated by three parameters:

  1. Root Mean Square Error (RMSE). The RMSE formula is given below:

    $${\mathrm{RMSE}}=\sqrt{\frac{\sum_{n=1}^{N}{({Y}_{n}-{T}_{n})}^{2}}{N}}$$
    (10)

    where \({Y}_{n}\) = predicted output, \({T}_{n}\) = actual output (target), N = total data sample size.

  2. The Mean Absolute Percentage Error (MAPE) formula is as follows:

    $${\mathrm{MAPE}}=\frac{1}{n}\sum_{t=1}^{n}\left|\frac{{A}_{t}-{F}_{t}}{{A}_{t}}\right|$$
    (11)

    where \({A}_{t}\) is the real value and \({F}_{t}\) is the forecast value.

  3. Time of Execution.
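Equations (10) and (11) can be written directly as code. Python is used here for illustration, although the paper's simulations were run in MATLAB; the sample values are made up.

```python
import numpy as np

def rmse(y_pred, t_actual):
    """Eq. (10): root mean square error between prediction and target."""
    y_pred, t_actual = np.asarray(y_pred), np.asarray(t_actual)
    return np.sqrt(np.mean((y_pred - t_actual) ** 2))

def mape(a_actual, f_forecast):
    """Eq. (11): mean absolute percentage error (as a fraction)."""
    a_actual, f_forecast = np.asarray(a_actual), np.asarray(f_forecast)
    return np.mean(np.abs((a_actual - f_forecast) / a_actual))

actual = np.array([100.0, 200.0, 400.0])
forecast = np.array([110.0, 190.0, 400.0])
```

Note that MAPE is undefined when any actual value is zero, which is one reason RMSE is reported alongside it.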

The MATLAB software is used here to obtain the simulation results. The input variables are normalized to free them from measurement units.

Specifics of Datasets

  (i) The stock market data for Yahoo Inc. [30] are used for analysis. The dataset consists of 7 columns and 1500 samples. The Yahoo data are taken from 1st January 2007 to 1st January 2011. The data up to 1st October 2009 are taken as training data, and the rest as testing data.

  (ii) The currency exchange rate from INR to USD. The dataset consists of 2430 rows and 6 attributes [31]. The period from May 2010 to 2018 is considered as training data, and May 2019 for testing. The currency exchange dataset displays the exchange rate of the US dollar to Indian National Rupees (INR). The dataset has 1500 samples and 7 columns, namely date, price, open, high, low, volume, and exchange %, of which price is taken as the target value. December 31st, 2019 to January 31st, 2018 is taken as the training period, and the rest as testing.

  (iii) The energy consumption data [32] are recorded at 10-min intervals for 4.5 months. A ZigBee wireless sensor network monitors the temperature and humidity conditions. The energy usage of the preceding month is used for training, and the subsequent day's energy usage as testing data. The household electricity data record the electricity consumption value at 10-min intervals. We have taken the data of each day and computed the mean of each day's data (around 397 samples). In that way, we have taken data from 16/12/2006 at 5.24 PM to 02/03/2008 at 06.23 AM (1500 samples). The data up to 18 September 2007, 10.02 AM, are taken as training data, and the remainder as testing data. The column Global active power is taken as the target value [33].

Results and Discussion

The superiority of the DBN is verified by the figures and tables described below:

Table 1 discusses parameter settings of deep belief networks.

Table 1 Parameter settings

Table 2 titles the details of diverse kinds of time series datasets.

Table 2 Specifics of datasets

Table 3 represents the stock market details during testing. PSO shows a MAPE of 0.8638 with 0.05621 RMSE in 0.6532 s, while DBN achieves a MAPE of 0.7713 with 0.003214 RMSE in 0.8643 s, which makes it the better alternative. LLWNN shows an RMSE of 0.09762 with 0.9011 MAPE within 0.9542 s, which is worth mentioning.

Table 3 Details of stock market during testing

Table 4 represents the exchange rate details during testing. Here, LLWNN shows a MAPE of 0.6909 and 0.0879 RMSE in 0.9073 s, both higher than DBN's 0.6593 MAPE and 0.005421 RMSE in 0.7659 s. PSO shows an RMSE of 0.0953 in 0.8703 s with a MAPE of 0.7430.

Table 4 Specifics of exchange rate through testing

Table 5 represents the household electricity consumption details during testing. Here, DBN shows a better RMSE of 0.0008906 in 0.7642 s with 0.7802 MAPE than PSO, which has 0.005971 RMSE in 0.6091 s. The results of LLWNN are worth mentioning, with 0.009851 RMSE in 0.8903 s and 0.8979 MAPE.

Table 5 Details of household electricity consumption during testing

Table 6 depicts the average distance during testing between the actual and forecasted prices for the stock market. It is apparent from the table that DBN has the lowest difference between the real and estimated price, i.e., 0.001. Next to DBN, PSO performs better than LLWNN, with a distance of 0.002; the LLWNN model shows a distance of 0.003.

Table 6 Performance comparison between different models (average distance during testing between forecasted and actual price) for the stock market

Table 7 describes the distance between the actual and forecasted prices for the exchange rate. The DBN model provides the best result, with a distance of 0.007. The PSO and LLWNN models provide distances of 0.008 and 0.09, respectively.

Table 7 Performance comparison between different models (average distance during testing between forecasted and actual price) for the exchange rate

Table 8 discusses the distance between actual and forecasted values for household electricity consumption. The DBN model provides a distance of 0.004, whereas the PSO and LLWNN models provide distances of 0.007 and 0.006, respectively.

Table 8 Performance comparison between different models (average distance during testing between forecasted and actual price) for household electricity consumption

Table 9 depicts the external validation results, where DBN is compared with additional techniques. It shows better RMSE than the other methods.

Table 9 External validation results

Figures 3 and 4 depict the assessment between the real and predicted prices for the stock market using DBN for training and testing.

Fig. 4
figure 4

Real versus forecast during testing for DBN in stock market

Figures 5 and 6 depict the assessment between the real and predicted prices for the stock market using PSO for training and testing.

Fig. 5
figure 5

Real versus forecast during training for PSO stock market

Fig. 6
figure 6

Real versus forecast during testing for PSO in stock market

Figures 7 and 8 depict the assessment between the real and predicted prices for the stock market using LLWNN for training and testing.

Fig. 7
figure 7

Real versus forecast during training for LLWNN in stock market

Fig. 8
figure 8

Real versus forecast during testing for LLWNN in stock market

Figures 9 and 10 depict the assessment between the real and predicted prices for the exchange rate using DBN for training and testing.

Fig. 9
figure 9

Real versus forecast during training for DBN in exchange rate

Fig. 10
figure 10

Real versus forecast during testing for DBN in exchange rate

Figures 11 and 12 depict the assessment between the real and predicted prices for the exchange rate using PSO for training and testing.

Fig. 11
figure 11

Real versus forecast during training for PSO in exchange rate dataset

Fig. 12
figure 12

Real versus forecast during testing for PSO in exchange rate

Figures 13 and 14 depict the comparison between the actual and forecasted prices for the exchange rate using LLWNN for training and testing.

Fig. 13
figure 13

Real versus forecast during training for LLWNN in exchange rate

Fig. 14
figure 14

Real versus forecast during testing for LLWNN in exchange rate

Figures 15 and 16 depict the comparison between the actual and forecasted prices for electricity consumption using DBN for training and testing.

Fig. 15
figure 15

Real versus forecast during training for DBN in electricity consumption

Fig. 16
figure 16

Real versus forecast during testing for DBN in electricity consumption

Figures 17 and 18 depict the comparison between the actual and forecasted price for electricity consumption using PSO for training and testing.

Fig. 17
figure 17

Real versus forecast during training for PSO in electricity consumption

Fig. 18
figure 18

Real versus forecast during testing for PSO in electricity consumption

Figures 19 and 20 depict the comparison between the actual and forecasted price for electricity consumption using LLWNN for training and testing.

Fig. 19
figure 19

Real versus forecast during training for LLWNN in electricity consumption

Fig. 20
figure 20

Real versus forecast during testing for LLWNN in electricity consumption

Figures 21, 22, and 23 present the comparison of MSE Convergence using different methods for the Stock market, Exchange rate, and Electricity consumption, respectively. It is evident from the graphs that DBN converges faster than other methods in all the datasets.

Fig. 21
figure 21

MSE convergence graph of stock market

Fig. 22
figure 22

MSE convergence graph of exchange rate

Fig. 23
figure 23

MSE convergence graph of electricity consumption

Conclusion

Analyzing the trends of time series data remains a challenging task even after decades of extensive research. As time series data do not follow any particular pattern, it proves difficult for researchers to analyze them and use them conveniently. This paper presents a state-of-the-art predictive model using DBN to analyze time series data. The data are also analyzed by PSO and LLWNN. The models are assessed using RMSE and MAPE. The DBN-led predictive model provides RMSE values of 0.0032, 0.0054, and 0.0089 for the stock market, exchange rate, and electricity consumption data, respectively. The corresponding MAPE values are 0.7713, 0.6593, and 0.7802.

In the future, more complex structures can be designed with the help of DBN. It may be applied using different feature selection algorithms for large datasets.