1 Introduction

The current decade is distinguished by an explosion of information that is generated, transferred, and stored over vast and complex networks [1]. Advances in information technology are creating a sea change in everyday life. A majority of public and private sectors [2] across different industries are using digital devices and procedures to provide their clients with high-quality and reliable services. This widespread usage, ranging from healthcare [3] and transport systems [4] to smart grids [5] and military services [6], has resulted in an enormous volume of data being generated and processed. The importance and sensitivity of such big data have turned it into a highly valuable target for cybercriminals.

The privacy of big data has acquired new urgency due to the different issues linked to it [7, 8]. Keeping pace with data growth while preserving the confidentiality, integrity, and availability of data processing techniques is a challenging issue [9] that should be addressed. Moreover, investigating big data on cloud-based platforms to identify and recover traces of criminal activities for forensic purposes is a time-consuming process [10] that demands novel approaches. Besides, big data storage, processing, sharing, and management are crucial procedures [11] that should be carefully tuned, because they may increase the attack surface for malicious activities and data leakage.

On the other hand, big data provides exceptional opportunities to leverage the high volume of data. A growing volume of data is expected to yield in-depth knowledge about the domain that the data describes. Consequently, extracting such knowledge from big data paves the way for proposing robust mechanisms for protecting data and securing information technology networks [12]. Besides, big data storage and the recovery mechanisms designed for it provide forensic investigators with more pieces of evidence, which lead them to quick and accurate decision making [13].

2 Book Outline

This handbook presents state-of-the-art advances in big data and privacy from both academia and industry. The remainder of the book is structured as follows. The second chapter [14] reviews security challenges and concerns related to critical infrastructure, together with methods that use artificial intelligence to protect such infrastructure. In the third chapter [15], the authors survey new concepts, methodologies, and applications for achieving full autonomy in Industry 4.0. In the fourth chapter [16], Moghadam et al. propose a privacy protection key agreement protocol for the smart grid based on energy consumption controllers (ECC).

The fifth chapter [17] reviews the application of machine learning to the Internet of Things and discusses the associated challenges and issues. In the sixth chapter [18], Peters et al. apply different machine learning methods to an Internet of Things malware dataset, compare their performance, and discuss the results. In the seventh chapter, Singh et al. [19] survey the latest artificial intelligence based research and methodologies for measuring and managing industrial cyber threat risks, as well as the security metrics that have been identified as a barrier to implementing these methodologies.

The eighth chapter [20] discusses traditional machine learning based threat detection techniques for network security that cannot cope with huge amounts of data, with the aim of providing the knowledge needed to design and choose such techniques more effectively. In the next chapter, Sharma et al. [21] propose a multi-level network security and privacy evaluation scheme to assess the security of cyber-physical systems. Chapter 10 [22] is dedicated to machine learning approaches for cyber-physical system anomaly detection; through a case study, the authors demonstrate the effectiveness of machine learning techniques for classifying False Data Injection attacks. Chapter 11 [23] briefly introduces renewable energy resources as well as the different aspects of, and relations between, security and big data for power systems that use such resources. In the subsequent chapter, Cabello et al. [24] describe the importance of using cyber-physical systems and big data in the healthcare sector. Chapter 13 [25] proposes a deep learning approach for abnormality detection that preserves privacy on a medical image dataset.

To provide clear insight into research on the security of smart farming, Nakhodchi et al. [26] in the fourteenth chapter present a bibliometric analysis that comprehensively assesses the security and privacy of smart agriculture systems and the related literature. In the next chapter, Amrollahi et al. [27] highlight the impact of big data and privacy in financial systems and survey work related to FinTech banking cyber security concerns and detection methods. Chapter 16 [28] proposes a hybrid deep generative metric learning approach for intrusion detection to protect critical infrastructures. Nassiri et al. [29] present a method that combines static and dynamic machine learning based malware detection methods and experimentally demonstrate its performance. In the subsequent chapter [30], BehradFar et al. introduce a machine learning algorithm that applies two-layer feature selection to obtain the optimum set of features for Remote Access Trojan (RAT) detection and achieves high detection performance. In the last chapter [31], Azmoodeh et al. propose an active spectral clustering method to tackle the problem of massive data in botnet detection research; the method requires only a minimal number of similarity computations between network nodes to identify botnets.