
1 Introduction

The International Energy Agency (IEA) has estimated that global energy consumption will surge by 53% by the year 2035 [1]. Energy is viewed as the largest industry across the globe [2]. The consumption of energy involves all sectors of society, including information and communication technology. Shojafar et al. [3] have argued that real-time processing and energy efficiency are hot topics in the management of information and communication technology platforms. Currently, one of the most widely discussed topics in the science and technology community is big data. Big data has potential applications in all sectors of society, such as climate science, economics, health and social science [4]. The data collected from various sources in society is growing exponentially and is estimated to grow from 4.4 ZB (trillion gigabytes) in 2013 to 44 ZB by 2020 [5]. A diverse range of fields, including natural language processing, medical science, security and business management, depends heavily on knowledge discovery through big data analytics. The effective and efficient processing of big data requires computer systems [6] such as Hadoop, which offers the MapReduce framework for parallel computation [7].

The transfer of large volumes of data between users, service providers and data centres requires a high bandwidth connection. This consumes large amounts of energy, more than simply processing and storing the big data within cloud-based data centres, and therefore results in high carbon dioxide emissions. The transfer of big datasets to remote data centres consumes a significant quantity of power [8], and these carbon dioxide emissions contribute to global warming [9]. Optimising the energy consumed by data transmission requires the network to reduce redundant and duplicate traffic [10].

Big data analytics applications for future generations of parallel and distributed systems pose a major challenge. These applications generate datasets in repositories that exceed exabytes, and the size of these datasets is rapidly increasing. These datasets and their associated applications pose a challenge to both software techniques and software development [11]. The task of analysis frequently has strict targets, and one of the major issues for applications in this field is the quality of data. Most of the emerging applications, data-driven models and techniques capable of operating at large scales are not yet widely known [12]. In real-time systems, energy consumption is increasing; big data methodologies can therefore be applied to handle these operations [13]. Significant developments in big data have arisen from various research communities, for example data mining and learning algorithms from the artificial intelligence research community [4].

Big data offers important opportunities for organisations that can analyse it and gain critical intelligence for effective decisions [14]. Within various industries, data processing and analysis play a significant role, especially where mathematical models fail to capture nonlinear dynamics and their various uncertainties. Computational intelligent algorithms such as fuzzy logic, artificial neural networks and evolutionary algorithms have demonstrated their ability to deal effectively with data modelling, and research on these algorithms has attracted unprecedented attention. As reported in the literature, computational intelligent algorithms have successfully solved real-world problems, including control engineering, the modelling of unknown nonlinear dynamics using artificial neural networks and the implementation of controllers using adaptive neuro-fuzzy inference systems. Despite these overwhelming successes, the research community still faces obstacles in the processing of industrial data, such as feature extraction from large-scale measurements that are distributed in nature, machine learning algorithms for highly robust data modelling and signal processing [15].

The world is experiencing a data revolution in terms of gleaning knowledge from big data. Computational intelligent algorithms are among the mainstream tools of big data analytics: computational intelligence has played an important role in artificial intelligence, which focuses on the design of algorithms. Such algorithms can be used to analyse huge amounts of structured and unstructured data, helping to discover approximate solutions to many complex problems [16]. Meanwhile, the use of virtualised clouds is currently under consideration in big data analytics, based on new machine learning theories and artificial intelligence. It is now common for intuitive physical interpretations to shape the use of machine learning and artificial intelligence. It is therefore important that a suitable knowledge interpretation is provided, in order to make sound decisions based on the intelligence derived from machine learning and artificial intelligence [17].

In this chapter, we propose a theoretical framework for big data analytics, based on computational intelligent algorithms with the potential to reduce energy consumption and improve performance. It is necessary to explore the theoretical aspects of energy issues in big data analytics in relation to computational intelligent algorithms; it is critical to explore the theoretical aspects of big data in view of the fact that this can point the way towards effective and efficient applications [4].

The remaining sections of this chapter are organised as follows: Section 2 introduces the computational intelligent algorithms; Sect. 3 presents a discussion of big data analytics and the energy consumption of cluster systems; Sect. 4 discusses big data analytics and computational intelligent algorithms; Sect. 5 describes the issue of energy consumption in the application of computational intelligent algorithms in big data analytics; Sect. 6 describes the proposed framework for big data analytics based on computational intelligent algorithms; and Sect. 7 presents the conclusions.

2 Computational Intelligent Algorithms

Computational intelligence is a name recently given to the branch of artificial intelligence that deals with sub-symbolic techniques. It describes techniques that mainly focus on strategies and results. Computational intelligent algorithms include sub-disciplines that deal with adaptive and intelligent systems such as evolutionary computing, artificial neural networks, fuzzy systems, artificial immune systems, particle swarm optimisation (PSO) and ant colony optimisation [18, 19]. The primary source of inspiration for these intelligent systems is nature; most of these algorithms are inspired by the characteristics of biological systems and are referred to as biology-inspired algorithms [20].

Solving real-world problems generally involves challenging and NP-hard problems which require optimisation techniques, with no guarantee of obtaining an optimal solution. There are no effective and efficient algorithms for all NP-hard problems; therefore, experimentation with various optimisation algorithms is required to find the one that produces the best solution. Many computational intelligent algorithms, such as PSO, cuckoo search and the firefly algorithm, have been introduced to deal with the challenges of optimisation problems [20]. Computational intelligent algorithms have become widespread, and this has significantly expanded the literature [21].

Recently, a new nature-inspired computational intelligent algorithm has been added to the literature almost every month, and there are likely more than 200 such algorithms in the literature. As these algorithms have flooded the literature, many researchers have found that newly created algorithms are often existing algorithms disguised as new ones [22]. Figure 1 illustrates the number of computational intelligent algorithms introduced into the literature per year; in 2009, the literature witnessed a drastic influx of such algorithms. Figure 2 presents the classification of nature-inspired computational intelligent algorithms, based on the classification proposed by Fister et al. [20].

Fig. 1 Number of computational intelligent algorithms in the literature per year

Fig. 2 Classification of intelligent algorithms

2.1 Characteristics of Computational Intelligent Algorithms

In general, computational intelligent algorithms aim to generate a new solution which is superior to the existing one, ideally with minimal effort [21]. We now examine their major characteristics: exploitation and exploration, parameter tuning/control, diversity and adaptation, and algorithm parameters.

Exploitation and Exploration

Exploitation is a local search process that uses information obtained from the problem to generate new solutions which are superior to the existing ones. The major strength of exploitation is its high convergence rate; however, it risks becoming stuck in local minima. Conversely, exploration is a global search process which allows computational intelligent algorithms to explore far regions of the larger search space efficiently; it is able to generate solutions with sufficient diversity that lie far from the existing solutions. Exploration has a lower propensity to become stuck in local minima, but it has a slow convergence rate and involves a high computational cost. Good performance requires a balance between exploitation and exploration: high exploitation and low exploration lead to faster convergence, but the possibility of finding a true global solution is low, while low exploitation and high exploration can lead to a meandering search path with a slow convergence rate [21].
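As a concrete illustration of this balance, the following minimal sketch shows the velocity update of particle swarm optimisation, where the inertia weight governs the trade-off between exploration and exploitation; the parameter values and function names are illustrative only and are not taken from the cited works.

```python
import numpy as np

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One velocity/position update of particle swarm optimisation.

    A large inertia weight w favours exploration (particles keep moving
    through distant regions of the search space); a small w favours
    exploitation (particles are pulled towards known good solutions).
    """
    r1 = np.random.rand(*positions.shape)
    r2 = np.random.rand(*positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (pbest - positions)    # pull towards personal best (exploitation)
                  + c2 * r2 * (gbest - positions))   # pull towards global best (exploitation)
    return positions + velocities, velocities
```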

Parameter Control/Tuning

Parameter values obtained through parameter tuning remain fixed during iterations, whereas under parameter control the parameters of an algorithm are varied as it runs. Parameter control aims to give the algorithm the best convergence rate for better performance; parameter tuning is carried out to find the optimal parameter settings for running the algorithm, so that it can solve a broader array of problems. There is currently no systematic and efficient method of tuning to obtain optimal parameter settings; this is often achieved through extensive parameter studies [23].
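The following minimal sketch contrasts the two notions on a simple benchmark: tuning fixes the mutation step size for a whole run and keeps the best fixed value, whereas control varies the step size within a single run. The (1+1) evolution strategy, the sphere function and all parameter values are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

def sphere(x):
    return float(np.sum(x ** 2))       # benchmark objective to be minimised

def one_plus_one_es(sigma_start, sigma_end, iters=200, dim=5, seed=0):
    """A (1+1) evolution strategy whose step size is interpolated linearly
    from sigma_start to sigma_end over the run."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=dim)
    fx = sphere(x)
    for t in range(iters):
        sigma = sigma_start + (sigma_end - sigma_start) * t / (iters - 1)
        y = x + sigma * rng.normal(size=dim)
        if sphere(y) < fx:
            x, fx = y, sphere(y)
    return fx

# Parameter tuning: sigma stays fixed for a whole run; the best fixed value is kept.
tuned = {s: one_plus_one_es(s, s) for s in (0.01, 0.1, 0.5)}
best_fixed_sigma = min(tuned, key=tuned.get)

# Parameter control: sigma is varied during the run (large early, small late).
controlled_result = one_plus_one_es(0.5, 0.01)
```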

Diversity and Adaptation

Computational intelligent algorithms exhibit both diversity and adaptation, which are evident in the balance between exploitation and exploration; indeed, the way an algorithm balances the two is its key form of adaptation. Solution representation is one source of diversity: solutions in genetic algorithms are represented either in binary or in real-number form, whereas swarm intelligence-based algorithms generally use real numbers. The population size of an algorithm can be either fixed or varying, and a varying population size is a typical example of adaptation.

Algorithm Parameters

Computational intelligent algorithms are constructed from parameters and algorithm operators. In genetic algorithms, for example, the operators are crossover, mutation and selection, with crossover used to create new solutions [21]. As an example of the differences between computational intelligent algorithms in terms of parameters, strengths, weaknesses, generation of new solutions and solution representation, Table 1 presents five well-established algorithms from the literature.

Table 1 Major differences between five well-established computational intelligent algorithms
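For readers unfamiliar with the genetic algorithm operators named above, the following minimal sketch shows selection, crossover and mutation for binary-encoded solutions; the specific operator variants (tournament selection, one-point crossover, bit-flip mutation) and parameter values are illustrative assumptions.

```python
import random

def tournament_selection(population, fitness, k=3):
    """Pick the fittest of k randomly chosen individuals (minimisation)."""
    contenders = random.sample(range(len(population)), k)
    return population[min(contenders, key=lambda i: fitness[i])]

def one_point_crossover(parent_a, parent_b):
    """Create a new solution by splicing two binary-encoded parents."""
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def bit_flip_mutation(chromosome, rate=0.01):
    """Flip each bit with a small probability to maintain diversity."""
    return [1 - gene if random.random() < rate else gene for gene in chromosome]
```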

3 Big Data Analytics and Energy Consumption by Cluster Computing Systems

3.1 Big Data Analytics Platforms

The unprecedented accumulation of data in the information technology world has given rise to the concept of big data. Volumes are extremely large, with petabytes (PB) and even zettabytes (ZB) of data handled by organisations. The velocity and time-based variability of this data involve high speeds. The formats in which this data is created and stored are inconsistent, even when the data originates from the same source and/or is generated by the same user. The veracity of the data, as opposed to the noise inherent in it, is of the highest concern. Alongside these challenges, big data offers high value when properly stored and analysed [29, 30]. Internet companies handle large volumes of Internet requests from their users using big data analytical platforms running on clusters of commodity hardware; Facebook and Walmart are two good examples [31].

Due to the above-mentioned characteristics, the storage and analytics of big data require large hardware resources. For example, in order to store 1 PB of data on a cluster built from 6 TB hard disks, 163 hard-disk units are required; assuming each node in the cluster can host five hard-disk units, a 100-node cluster is required. In addition, when analysing this 1 PB of data, each hard disk must be accessed for IO, depending on the platform used for big data analysis, because big data analytics platforms use the sequential access method by default. The most popular big data analysis framework, Hadoop, uses a full scan (sequential access) of the targeted data by default [32]. Each IO operation on a hard disk adds to the electricity consumed in powering and cooling the cluster. The scenario described above gives a wider picture of how much energy is consumed by companies that deal with big data, and explains why power and energy are always a first-order priority in the design of computing systems infrastructure [31].
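The following back-of-the-envelope check of the figures above is a sketch only: it assumes decimal units (1 PB = 1000 TB) and an HDFS-style three-way replication factor, neither of which is stated in the text, so the disk count differs slightly from the 163 units quoted from [32].

```python
DATA_TB = 1000           # 1 PB expressed in terabytes (decimal convention)
DISK_TB = 6              # capacity of one hard disk
DISKS_PER_NODE = 5
REPLICATION = 3          # assumed HDFS-style replication; not stated in the text

disks_raw = -(-DATA_TB // DISK_TB)                        # ~167 disks without replication
disks_replicated = -(-DATA_TB * REPLICATION // DISK_TB)   # 500 disks with 3x replication
nodes = -(-disks_replicated // DISKS_PER_NODE)            # 100 nodes at 5 disks per node

print(disks_raw, disks_replicated, nodes)                 # 167 500 100
```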

Initially, information technology companies used clusters of commodity hardware and networking to avoid the high costs of hardware. These companies see this as a better option for providing the infrastructure necessary to accumulate and process large amounts of data than the expense of purchasing and maintaining supercomputers and mainframe computers that would achieve the same purpose. However, the amount of power and energy consumed by these clusters, particularly during big data storage and analysis, is growing at an alarming rate. The cost of the energy used by servers over their lifetime is expected to supersede the cost of the hardware itself, if current trends continue unchecked [31]. For example, Yahoo has installed a Hadoop cluster of over 2000 servers, Facebook's cluster has more than 600 servers and General Electric has deployed Hadoop on a cluster of 1700 servers. Energy is a crucial issue in view of these massive deployments of Hadoop over thousands of cluster nodes and has influenced the cost of exploiting cluster systems; the cost of energy consumption for cluster systems was already high in 2007 [33]. These examples show the massive deployment of Hadoop over clusters of hundreds or thousands of servers. In addition, the deployment and operation of Hadoop, the hardware required to build the cluster systems and the energy required to run them incur very high costs [31].

3.2 Energy Consumption Over Big Data Platforms

Energy consumption in big data platforms is related to several factors, such as physical resources and computing resources. Big data platforms have the ability to model, organise, store and process large amounts of data. With the development of information technology platforms and the massive generation of data worldwide, big data technologies have become the battlefield of information technology service providers in terms of performance and cost, and the research community has started to focus on energy consumption in big data platforms [34]. With the fast development of the global economy, energy consumption will keep increasing in the coming years.

Figures 3 and 4 are based on the data presented in [35, 36]. Figure 3 shows the estimated carbon dioxide emissions of information and communication technology, of which information technology servers account for almost 6% per year, while Fig. 4 depicts the distribution of energy consumption in data centres. The consumption of energy in big data platforms can be expressed as follows:

Fig. 3 Estimated information and communication technology carbon dioxide emissions

Fig. 4 Energy consumption distribution of data centres

$$E_{\text{c}} = C_{\text{r}} + P_{\text{r}}$$
(1)

where \(E_{\text{c}}\) denotes the energy consumption, \(C_{\text{r}}\) represents the computing resources and \(P_{\text{r}}\) denotes the physical resources. According to [37], the energy consumption of computing resources accounts for about 50% of the total energy consumption, as shown in Fig. 4. The percentages of energy consumed by server computation, communication equipment and storage devices are depicted in Fig. 4.

From Fig. 4, it is clear that the data servers consume a large part of the energy used by data centres. This amount grows exponentially when large datasets are processed, as is the case for big data platforms. Reducing energy consumption is therefore a key issue for sustainable big data platforms.

3.3 Metrics Used for Measuring Power in Big Data Platforms

The management of energy consumption can be formulated as a multi-objective optimisation problem, in which several performance and energy metrics are used to measure performance [38,39,40,41,42]. It is important to highlight that two objectives are mostly considered in the literature: the minimisation of the peak power consumption of the data platforms and the maximisation of energy efficiency.

In fact, limiting peak power consumption is crucial to maintaining the reliability of big data platforms, avoiding system overheating and preventing power capacity overloads. It has been shown that reducing power consumption is strongly correlated with the cost of power provisioning [43]. Energy efficiency can be expressed as follows:

$${\text{Energy}}\;{\text{efficiency}} = \frac{{{\text{Computing}}\;{\text{performance}}}}{{{\text{Total}}\;{\text{energy}}\;{\text{consumed}}}}$$
(2)

This metric represents the main focus of energy management of data centres and processing systems. From the perspective of power management of data centres, energy consumption control is viewed as a result of [39]:

Power Usage Effectiveness

Facility efficiency is the ratio of the total amount of energy used by a data centre facility to the energy delivered to computing equipment.

Server Power Usage Effectiveness

Server power conversion efficiency is the ratio of the total server input power to the useful power consumed by the electronic components directly involved in the computation.

Server’s Architectural Efficiency

A server's architectural efficiency is the ratio of a computing performance metric to the total amount of energy used by its electronic components.
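The three ratios above can be computed as follows; this is a minimal sketch with purely illustrative figures, not measurements from any of the cited studies.

```python
def power_usage_effectiveness(facility_energy_kwh, it_equipment_energy_kwh):
    """PUE: total facility energy over the energy delivered to computing equipment."""
    return facility_energy_kwh / it_equipment_energy_kwh

def server_power_usage_effectiveness(server_input_power_w, useful_component_power_w):
    """SPUE: total server input power over the power used by components doing computation."""
    return server_input_power_w / useful_component_power_w

def energy_efficiency(computing_performance, total_energy_consumed_j):
    """Eq. (2): computing performance (e.g. operations completed) per joule consumed."""
    return computing_performance / total_energy_consumed_j

# Illustrative figures only
print(power_usage_effectiveness(1_800_000, 1_000_000))    # PUE = 1.8
print(server_power_usage_effectiveness(400, 320))         # SPUE = 1.25
print(energy_efficiency(2.5e15, 3.6e6))                   # operations per joule
```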

4 Computational Intelligent Algorithms and Big Data Analytics

In today's world, almost everything is online, and organisations analyse big data to gain the knowledge needed to improve their services [44]. Big datasets are beyond the scope of relational or object-oriented databases, and traditional computer applications and ordinary computers cannot handle the analytics involved. These big datasets require very large parallel processing power, from clusters of computers, for analysis. The processing of big data is generally based on nonlinear systems, and actions are not predictable in some cases [45].

To discover the knowledge required for decision making, data mining algorithms are applied to the datasets extracted from data sources. In recent years, much attention has been given to data mining, probably due to the popularity of big data concepts [44]. Big data analytics involves modelling, analysis and interpretation [46]. It has been shown that computational intelligent algorithms can be applied to solve big data problems effectively from the perspective of hardware and software design [29].

The application of computational intelligent algorithms in big data analytics is severely limited, however, since the very nature of big data makes it difficult for current intelligent algorithms to analyse [47]. The proposed basic framework for big data analytics in relation to data mining is shown in Fig. 5; these data mining algorithms also include the computational intelligent algorithms discussed in Sect. 2 [48].

Fig. 5 Three primary layers of big data analytics [49]

The commonly accepted framework for big data analytics is shown in Fig. 5. It comprises three layers [50] as follows:

  i. Data access and computing,
  ii. Data privacy and
  iii. Domain knowledge and data mining algorithms.

The core of Fig. 5 is the data mining platform, which is responsible for data access and computing. With the increasing accumulation of high volumes of data, the distributed storage of large-scale data must be considered during computation: analytics and processing tasks are partitioned into sub-tasks executed in parallel on a large number of computing nodes. The role of the middle layer is to connect the outer and inner layers. The inner layer contains the data mining technology and provides a platform for the execution of the data-related activities in the middle layer, such as information sharing, privacy protection and the acquisition of knowledge from specific domains and applications.

Information sharing is a concern of the whole framework, including processing and big data analytics in the smart grid. The outer layer of Fig. 5 represents the data fusion technology necessary for preprocessing heterogeneous, uncertain, incomplete and multi-source data; complex and dynamic data is extracted after this preprocessing phase. Subsequently, pervasive smart-grid global knowledge can be obtained through local learning and model fusion [31]. Of the decision tree, ridge regression and support vector machine algorithms, the decision tree has been found to be the most effective for managing energy big data; however, when efficiency is the priority, for example in real-time applications, ridge regression is the better choice of the three [51].

Deep learning is a subfield of machine learning that has the potential to solve a range of complex problems within mobile big data analytics, including classification and regression. Mobile big data samples can be modelled using deep learning, in which networks of neurons and synapses are trained on mobile big data samples to learn hierarchical features.

The application of deep learning within mobile big data offers several advantages: a high level of accuracy, which is a priority in mobile systems; support for multimodal deep learning; the generation of intrinsic features, which is necessary in mobile big data analytics; and the ability to learn from unlabelled mobile data, which reduces the effort required for data labelling. However, the large number of deep model parameters and the large size of mobile big datasets mean that deep learning is slow and computationally expensive [52].

More recently, deep learning has become a common technique in big data analytics, especially in the retrieval of images with a high level of accuracy [53]. Supervised deep learning and unsupervised deep learning are the two types of deep learning discussed in the literature [54]. A mobile embedded system with a battery of limited capacity requires an energy efficiency of hundreds of giga floating-point operations per second per watt; this allows mobile embedded systems to achieve both the required portability and performance [55].

5 Energy Consumption in the Application of Computational Intelligent Algorithms in Big Data Analytics

In the design of computing systems, energy efficiency is one of the most significant issues to be considered. Unfortunately, however, the end of Moore's law has imposed a limit on further improvements in energy efficiency. Recently, the use of physical memristors has shown that it is possible to build integrated hardware for artificial neural networks, which can heavily influence energy efficiency and improve performance [10, 55]. The artificial neural network is one of the more powerful algorithms in computational intelligence and has received unprecedented attention from researchers; it is believed to constitute one of the major breakthroughs in artificial intelligence. Hu et al. [56] used memristors in the design of a power-efficient neuromorphic framework for approximate computation with programmability and computational generality. This design was motivated by the theory of artificial neural networks, which shows that multilayer neural networks are universal approximators, and by their wide range of applications in signal processing, pattern recognition, computer vision and natural language processing. A neuromorphic computing architecture with a tolerance for uncertain computation can generate significant gains in performance and energy efficiency. Wang et al. [55] have found that large-scale artificial neural networks constitute one of the most mainstream algorithms in big data analytics. Two phases are involved in the processing of big data using large-scale artificial neural networks: a training phase and an operational phase. The training phase requires a very large amount of computing power, and energy efficiency is one of the primary considerations in the operational phase. For example, training on a ~100 MB dataset may demand >100 TOP of computation capability with ~40 GB/s IO and SRAM data bandwidth; a 3.4 GHz CPU therefore requires >10 h of learning time for a ~100 K input-vector dataset and ~1 s for recognition, which is far from real-time processing [57].

The use of computational intelligent algorithms in big data analytics requires high bandwidth interconnection networks with low latency and low power consumption, which are essential for data and storage systems [58]. For example, Wang et al. [55] presented a promising, ultrahigh energy-efficient implementation that takes advantage of emerging memristor techniques alongside the computing power of GPUs for big data analytics; the results showed a high speedup compared with the basic CPU implementation [55]. Big data analytics using artificial neural networks poses the challenge of how to achieve better training within a lower convergence time and with lower energy consumption [55]. Another computational intelligent algorithm related to energy itself is the deep belief network, which consists of a stack of restricted Boltzmann machines and is based on an energy model and certain stochastic methods. Each node in a restricted Boltzmann machine generates binary values, and between nodes there are symmetric links (weights) that can take negative or positive values. The two types of node in a restricted Boltzmann machine are visible and hidden nodes. The state of a restricted Boltzmann machine is associated with its energy; a higher energy gives a lower probability of node activation [59].
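The energy-based formulation mentioned above can be written down concretely. The following minimal NumPy sketch computes the standard restricted Boltzmann machine energy E(v, h) = -a'v - b'h - v'Wh and the hidden-node activation probabilities; the dimensions, bias vectors and weight scale are illustrative assumptions.

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy of a visible/hidden configuration: E(v, h) = -a'v - b'h - v'Wh.
    Lower energy corresponds to a higher probability of the configuration."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def hidden_activation_prob(v, W, b):
    """Probability that each hidden node switches on, given the visible nodes."""
    return 1.0 / (1.0 + np.exp(-(b + v @ W)))

rng = np.random.default_rng(0)
v = rng.integers(0, 2, size=6)            # binary visible nodes
W = rng.normal(scale=0.1, size=(6, 4))    # symmetric links (weights), positive or negative
a, b = np.zeros(6), np.zeros(4)           # visible and hidden biases
h = (hidden_activation_prob(v, W, b) > rng.random(4)).astype(int)
print(rbm_energy(v, h, W, a, b))
```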

6 A Proposed Framework for Big Data Analytics Using Computational Intelligent Algorithms

Previously, several big data analytics techniques required investment in computer hardware and software in order to cope with computations on energy data. This can degrade the performance of systems used in highly computational environments, which need to process massive amounts of energy data. However, cloud computing and software as a service (SaaS) have now made on-premise solutions unnecessary. In addition, fog computing now allows much of the analytics to be moved to the grid edge, further supporting the implementation of forecasting and optimisation programmes in real time and at large scales. Two primary big data intelligence applications in the energy domain are load modelling and forecasting, which are necessary to understand the behaviour of the system, achieve efficient energy management and enable generic load forecasting [17].

One of the forces driving the adoption of big data analytics is the development of smart grids, although the data generated by these smart grids is growing in size and becoming difficult to process. Advancements in big data and cloud computing technologies are therefore crucial for a sustainable energy system. Figure 6 shows the processing of large-scale data using big data analytics techniques such as computational intelligent algorithms.

Fig. 6 Large-scale data processing using big data analytics tools such as computational intelligent algorithms

The framework proposes three steps in big data analytics. The first is to collect the required data from sources such as smart industry, smart grid and smart home applications. These data can then be stored in databases and servers using cloud storage. A parallel algorithm for optimal power flow based on MapReduce, proposed by Liang et al. [60], is to be applied for power flow calculation in the smart grid, since this reduces computational complexity. Lastly, the stored data can be processed using computational intelligent algorithms in order to gain insights from the big data.
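To illustrate the MapReduce style of processing referred to above, the following single-machine sketch aggregates hypothetical smart-meter readings per meter; it stands in for a real Hadoop job, and the meter identifiers and readings are invented for illustration.

```python
from collections import defaultdict

# Hypothetical smart-meter readings: (meter_id, kWh consumed in one interval)
readings = [("m1", 0.42), ("m2", 1.10), ("m1", 0.38), ("m3", 0.75), ("m2", 0.95)]

def map_phase(records):
    """Emit (key, value) pairs, one per reading, as a Hadoop mapper would."""
    for meter_id, kwh in records:
        yield meter_id, kwh

def reduce_phase(pairs):
    """Sum the values grouped by key, as a Hadoop reducer would."""
    totals = defaultdict(float)
    for meter_id, kwh in pairs:
        totals[meter_id] += kwh
    return dict(totals)

print(reduce_phase(map_phase(readings)))   # totals per meter, e.g. m1 ~= 0.80 kWh
```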

The application of computational intelligent algorithms consumes a large amount of energy during the training phase, as discussed earlier. The proposed framework involves the application of an energy-efficient emerging memristor, which is embedded with an artificial neural network for big data analytics. New and unexpected challenges have been created for the research community, since the current theories and techniques cannot handle big data analytics; extension and upgrading of the existing techniques and theories are therefore required [4]. As a result, there is no need to propose additional computational intelligent algorithms in the literature (see Sect. 2 for a justification); attention should instead be focused on modifying the existing ones, since these all have limitations that must be addressed before they can work on big data. The existing computational intelligent algorithms should be modified to handle big data analytics and to require low energy consumption during the training phase. When processing big data for analytics using a computational intelligent algorithm, a hybrid storage device which combines hard disks and solid-state disks, as proposed by Polato et al. [61], should be used for data storage, as this reduces energy consumption and enhances performance. In addition, storing only a proportion of the data on solid-state disks has the potential to significantly save energy and speed up convergence. A typical example of the very high volumes of data generated by smart meters, with the potential to be analysed by computational intelligent algorithms, is the 200 TB of data generated in Jiangsu, China, where data generation is increasing by 90 GB each day; these data were generated by over 1.81 million acquisition terminals, 1.54 million concentrated meter-reading terminals and 38 million smart meters [62]. Modern data centres should give priority to massive parallel processing in order to enhance computing speeds and reduce energy consumption; in turn, large amounts of data can be moved between the various virtual machines [63]. This framework suggests the use of the following techniques to cushion the high cost of energy consumption by distributed cluster systems:

Proportioning the Use of Power

The power consumed by hardware components can be proportioned to optimise the operational efficiency of a particular server. Proportioning can also be based on a capped budget, within which the server can be underprovisioned, in order to be more efficient, or oversubscribed, to operate at high load levels in carrying out its task [31, 62, 64].

Performance Improvement

High energy consumption can also be reduced by improving the performance of big data analytical platforms. One example of such improvements is the indexing approach, which works alongside the main analytical processes and prevents a full scan of the input data; the index restricts the range of input data processed to only that required to carry out the task. Another approach in this category is the scheduling of jobs with data locality taken into account [65].
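The following minimal sketch contrasts a default full scan with an index-based range lookup over timestamped records; the record layout and the use of a sorted in-memory key list as the "index" are simplifying assumptions, standing in for the index structures used by real analytics platforms.

```python
import bisect

# Records sorted by timestamp, with a sorted key list acting as a simple index.
records = [(ts, f"event-{ts}") for ts in range(0, 1_000_000, 10)]
index = [ts for ts, _ in records]

def full_scan(lo, hi):
    """Baseline: touch every record, as a default full (sequential) scan would."""
    return [r for r in records if lo <= r[0] < hi]

def indexed_scan(lo, hi):
    """Use the index to read only the slice of records the task actually needs."""
    start, stop = bisect.bisect_left(index, lo), bisect.bisect_left(index, hi)
    return records[start:stop]

assert full_scan(5_000, 5_200) == indexed_scan(5_000, 5_200)
```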

Iqbal et al. [66] have pointed out the potential application areas of computational intelligent algorithms within big data analytics, which include personalised health services, biometrics and surveillance, transportation, data visualisation and interpretation, business and governance, sentiment analysis, models for population displacement, effective computation, fault detection and manufacturing.

7 Conclusions

This chapter has presented a theoretical perspective on energy issues within big data analytics, as related to computational intelligent algorithms. Theoretical issues of energy consumption in big data analytics are described in terms of computational intelligent algorithms. It is found that the high energy consumption of big data analytics using computational intelligent algorithms occurs mostly during the training phase of big data processing. We propose a theoretical framework for big data analytics based on computational intelligent algorithms, with the potential for low energy consumption and improved performance. The theoretical study presented in this chapter may guide researchers in applying computational intelligent algorithms efficiently and effectively in big data analytics, with the possibility of consuming less energy and improving performance. Future research should focus on the application in big data analytics of the deep belief network, which consists of a stack of restricted Boltzmann machines.