1 Introduction

The exponential growth in technology has fueled the rise of complex computing applications churning out reams of data and information, which in turn need to be processed using high-performance computing solutions, stored in mammoth data centers and managed through refined data governance. Such applications span a broad range of areas and disciplines, and this spread is accelerating at a phenomenal rate. The world around us offers endless possibilities for monitoring and gathering data. Our cities, our homes and even our own bodies are saturated with technology for monitoring, collating and analyzing data. From a vision of smart cities (Townsend 2014), in which even the control of home heating is managed through analytical decisions (O'Dwyer et al. 2016), through to effective control of power generation (BritishGas 2017), it is clear how such analysis opens up the potential for affecting power consumption and ultimately impacting the global fuel crisis.

Through the development of powerful technology such as smart phones, wearable tech and sensors, we are now generating huge amounts of personal data on our daily lives, behavior, health and well-being. We are currently amidst a self-quantification era in which we wear sensors that report back on activity, behavior and well-being. From a non-clinical aspect, this enables the tracking of fitness and personal goals, with the added dimension of social support through disseminating our personal metric data across social media communities. The trend is moving to a more biological level, in that we are prepared to share biosignals such as Electroencephalography (EEG) recordings (Terrell 2015) and even our own DNA (AncestryDNA 2016; 23andMe 2015) in the concerted goal of furthering ourselves and medical science.

Continuing the discussion in the medical domain, a further source of high-volume heterogeneous data is digital records. This encompassing term spans far beyond text-based information to mammoth digital files of x-ray images, Magnetic Resonance Imaging (MRI) scans, EEG recordings and possibly exome or genome sequences. The image processing required for digital capture needs to be of significant quality so as not to lose vital information from the record. Furthermore, the methods needed to analyze and quantify what the images are showing point to a necessity for high-performance computing solutions (Wang et al. 2010).

The result of such generation of huge volumes of data is referred to as big data. However, it is not only the sheer quantity of data created that defines big data; there are also the four ‘V’s’ (Hashem et al. 2015) that are its recognized characteristics:

  1. Volume: refers to the sheer amount of data coming from multiple resources.

  2. Variety: refers to the heterogeneous nature of the data, that is, data of different types coming from different collection mechanisms, such as sensors, physiological recordings, speech, video, text and social networks, to name just a few. In addition to the sheer amount of data, a major hurdle is handling the diversity in data formats and whether the data is structured or unstructured.

  3. Velocity: refers to the speed at which the data is created and transferred.

  4. Value: refers to the benefit of meeting such a challenge: the potential that, by gathering such a diverse and large set of data, previously hidden trends and patterns can emerge through analysis.

Big data opens up a range of challenges at every stage of data handling, processing and analysis (Chen et al. 2014). The computational challenges are extreme, and as such a range of solutions exists, each platform heralding scalability and performance advantages. In this chapter a high-level review is given of the range of common applications in which big data now features. The overview provides some insight into different solutions, with examples of how the computational challenges have been met in these applications. A summary is provided of high-performance platforms, ranging from multiple-CPU setups, GPUs and FPGAs to cloud solutions. The chapter concludes with a discussion of custom hardware solutions versus scalable on-demand cloud computing solutions, asking whether cloud computing holds all the cards. A peek into current technology trends suggests that custom devices may be the support engine for computational enhancements for the cloud, while providing customers with the scalable and on-demand service that they require.

2 Applications

The range of applications involving big data is comprehensive and diverse, playing a role in personalized medicine, genomics and self-quantification through to the monitoring of financial markets and transactions. Smart cities and the Internet of Things (IOT) create a wealth of recordable data from devices in homes through to entire cities. This section provides a high-level overview of some of the current big data challenges.

2.1 Genomics and Proteomics

In the last decade there has been a seismic shift in the technological advances for sequencing DNA. Frederick Sanger developed the Sanger sequencing approach in the mid-1970s, later automated using capillary electrophoresis, and for decades this approach was the technique employed. It is expensive and slow, limiting the opportunities for use. However, recent technological advances in sequencing have made it possible to sequence a whole human genome using a single instrument in 26 h (Miller et al. 2015). The enabler for this has been the development of High-Throughput Sequencing (HTS), which provides massively parallel sequencing power at an accelerated rate yet with significant cost reductions (Baker 2010; O'Driscoll et al. 2013).

The reduction in costs has made HTS technologies much more accessible to labs and has facilitated their use in a broad range of applications and experimentation, including diagnostic testing for hereditary disorders, high-throughput polymorphism detection, comparative genomics, transcriptome analysis and therapeutic decision-making for somatic cancers (Van Dijk et al. 2014). A review and comparison of sequencing technologies can be found in Metzker (2009) and Loman et al. (2012).

However, HTS generates enormous datasets, with the possibility of producing > 100 gigabases (Gb) of reads in a day (Naccache et al. 2014). For this reason, coupled with the challenges of integrating heterogeneous datasets, HTS data can be characterized as big data, and as such it presents a significant computational challenge. High-performance, cloud and grid computing have become ubiquitous in the processing and analysis of HTS data (Lightbody et al. 2016), which is generated at ever increasing momentum. As the technologies continue to develop, sequencing could become a routine facet of personalized medicine (Erlich 2015).

2.2 Digital Pathology

Traditional microscopy involves the analysis of a sample, for example a biopsy on a glass slide, using a microscope. The domain of virtual microscopy has moved from the viewing of glass slides to the viewing of diagnostic-quality digital images using specialized software. These slides can be viewed online through a browser or, as recently demonstrated, via a mobile device: the computational power of mobile devices now provides a cost-effective, mobile-phone-based multimodal microscopy tool that combines molecular assays and portable optical imaging, enabling on-site diagnostics (Kuhnemund et al. 2017). Where more extensive computational power is required, some service providers have opted for cloud-based virtual microscopy solutions, which offer the promise of in-depth image processing of tissue samples (Wang et al. 2010).

The drive towards personalized medicine has led to a deluge of personal data from heterogeneous sources. This big data challenge is discussed by Li et al. (2016), in which they highlight that “integrative analysis of this rich clinical, pathological, molecular and imaging data represents one of the greatest bottlenecks in biomarker discovery research in cancer and other diseases”. They have developed a framework, Pathology Integromics in Cancer (PICan), to accelerate and support data collation and analysis. This framework connects the tissue analysis to other genomic information, enabling a full and comprehensive understanding to be attained.

2.3 Self-Quantification

We are in an era in which society is ‘comfortable’ with every aspect of our behavior and person being monitored and analyzed. Part of this has been the birth of the Quantified Self (QS) movement, in which a person collates data on their daily life and physiology. It is described as “self-knowledge through numbers”.

The goal of such monitoring is often self-improvement, whether to encourage more physical activity or to improve lifestyle choices (Almalki et al. 2013). Alternatively, it could come from the belief that by gathering enough data from enough people, trends in the data can be found. This offers the opportunity to impact society’s health and well-being, and not just benefit the individual.

The advances in personal devices such as smart phones and sensor technology have promoted the gathering of vast resources of personal data, which can fall into the category of big data due to the sheer amount of data, its heterogeneous nature and the speed at which it needs to be processed and managed.

An emerging addition to the QS movement is the collection and analysis of the electrical activity of the brain. Measured using EEG, brain functions such as sensory, motor and cognitive processes can be evaluated and classified. With advancements in electronics, wearable sensors, algorithms and software development kits, there has been a shift towards exploring other possible applications in which EEG can play its part. One organization has developed a neuroscience platform to encourage users to perform “routine brain health monitoring”. With many users sharing their EEG, it is envisaged that it may be possible to derive critical insight into brain health and disease.

As QS applications evolve, it is expected that advanced machine learning and pattern recognition techniques will be involved in the analysis of data coming from multiple heterogeneous sources such as wearable electronics, biosensors, mobile phones, genomic data, and cloud-based services (Swan 2013).

2.4 Surveillance

Surveillance, and specifically video surveillance, is becoming ubiquitous in a number of situations for the monitoring of activity. With threats of terrorism, crime events, traffic incidents and governance demands, we have seen a rise of surveillance across global cities. Alongside this increase, we have seen progress in research in the area of computer vision, whereby surveillance videos can be processed and understood automatically, supporting key tasks such as people segmentation, the tracking of moving entities and the classification of human activities. Big data and the four ‘V’s’ are relevant to the surveillance domain due to the scope and volume of video data captured (Xu et al. 2016). The British Security Industry Association has estimated that there are between 4 and 5.9 million cameras in the UK, and a single camera can capture up to 48 GB of high-definition video a day. This results in issues ranging from local storage through to the fusion of data from multiple video streams which may differ in format. These issues complicate video analytics, which in turn has an impact upon applications such as terrorism prediction and governance. To address such needs, research has been performed in the area, including the study by Xu et al. (2015), in which a semantic-based model called Video Structural Description was proposed to represent and organize video resources (Najafabadi et al. 2015).
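To put those figures in context, a rough back-of-envelope calculation, using only the camera counts and per-camera daily capture cited above, shows why storage alone is a big data problem:

```python
# Rough estimate of daily UK CCTV volume, based on the figures cited
# above: 4-5.9 million cameras, up to 48 GB of HD video per camera per day.
CAMERAS_LOW, CAMERAS_HIGH = 4_000_000, 5_900_000
GB_PER_CAMERA_PER_DAY = 48  # upper bound for high-definition capture

low_pb = CAMERAS_LOW * GB_PER_CAMERA_PER_DAY / 1_000_000    # GB -> PB
high_pb = CAMERAS_HIGH * GB_PER_CAMERA_PER_DAY / 1_000_000

print(f"Estimated daily volume: {low_pb:.0f}-{high_pb:.0f} PB")
# Estimated daily volume: 192-283 PB
```

Even if only a fraction of cameras record continuously at that rate, the aggregate volume rules out naive centralized storage.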

Another application in the area is the work of Krizhevsky et al. (2012), in which deep convolutional neural networks were applied to classify the 1.2 million images in the ImageNet dataset, achieving top-1 and top-5 error rates of 37.5% and 17.0% respectively, outperforming state-of-the-art classifiers. To speed up the process and improve efficiency, the convolution operations were implemented on GPUs.
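The core operation in such networks is the 2D convolution, in which every output element is an independent weighted sum over a small image window; this independence is what makes the workload so amenable to GPU parallelization. A minimal NumPy sketch (single channel, stride 1, valid padding; strictly the cross-correlation variant used by deep learning libraries) illustrates the operation:

```python
import numpy as np

def conv2d(image, kernel):
    """Naive single-channel 2D convolution (valid padding, stride 1).

    Each output cell is an independent dot product over an image window,
    so all cells can in principle be computed in parallel on a GPU.
    """
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(32, 32)
kernel = np.array([[1.0, 0.0, -1.0]] * 3)  # simple horizontal edge detector
print(conv2d(image, kernel).shape)  # (30, 30)
```

A trained network stacks many such filters across layers; a GPU implementation performs these dot products across many cores simultaneously.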

2.5 Internet-of-Things

IOT has been defined by the radio frequency identification group as “the worldwide network of interconnected objects uniquely addressable based on standard communications protocols” (Gubbi et al. 2013). These objects, such as sensors, can be embedded in various devices across diverse domains such as healthcare, the environment and astronomy, and are continually collecting and communicating data. These data are often semi-structured and require processing and analysis to provide useful information (Riggins and Wamba 2015).

An example of IOT and big data analytics is urban planning and smart cities (Kitchin 2014). A smart city can consist of devices built into the urban environment, such as utility, communication and transport systems. These devices can be used in real-time to monitor and regulate city flows and processes. The integration and analysis of the data produced from these devices could provide an improved understanding of the city that enhances efficiency and sustainability (Hancke et al. 2013), and further models and predicts urban processes for future urban development (Batty et al. 2012). Examples of platforms supporting the IOT within a smart city include ThingSpeak, which provides a cloud-based platform where sensor data can be uploaded and analyzed using MATLAB, and ioBridge, which provides a hardware solution to connect to the cloud, with developed Application Programming Interfaces (APIs) to allow integration with other web services. Multi-nationals such as HP and IBM are also investing in projects such as CeNSE and Smarter Planet, respectively. CeNSE is deploying a vast number of sensors for a range of applications, from monitoring the use and location of hospital equipment to tracking traffic flow, gathering and transmitting the data to computing engines for analysis in real-time.
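As a flavor of how simple the device side of such a platform can be, the sketch below posts a single sensor reading to ThingSpeak's public HTTP update endpoint using the Python requests library; the write API key and the temperature value are placeholders.

```python
import requests  # third-party HTTP client (pip install requests)

# Post one sensor reading to ThingSpeak's update endpoint.
# "YOUR_WRITE_API_KEY" is a placeholder for a channel's write key.
resp = requests.post(
    "https://api.thingspeak.com/update",
    data={"api_key": "YOUR_WRITE_API_KEY", "field1": 23.4},  # e.g. temperature in C
    timeout=10,
)
print(resp.text)  # the new entry's ID, or "0" if the update was rejected
```

The analytics then happen server-side, where the accumulated channel data can be processed, for example with the MATLAB integration noted above.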

2.6 Finance

Financial institutions are adopting a data-driven approach with the aim of improving their performance, their service and, as underlined by the financial crash of 2008, their management of risk (Fan et al. 2014). Financial data can be in structured or semi-structured form; such data includes stock prices, derivative trades, transaction records and high-frequency trades (HFT). A study by Seddon and Currie (2017) proposed a model for applying big data analytics in HFT. HFT uses algorithmic software, built upon advanced technological infrastructure, to perform trades with a focus on speed, processing and leveraging vast amounts of financial data (Aldridge 2009). The study analyzed big data and its impact upon financial markets. An important discussion, applicable to all application areas, is data security and privacy. With high volumes of data used in analysis, questions need to be addressed around data security protection, intellectual property protection, personal privacy protection, commercial secrets and financial information protection (Chen and Zhang 2014).

3 Computational Challenges

At the heart of many of the computationally intense applications lie pattern matching and machine learning techniques:

  • Machine learning

  • Deep learning

  • Pattern matching

  • Image/video/audio processing

  • Sentiment analysis

  • Natural language processing

Recent advances in high-performance computing have encouraged the field of deep learning to move out of the research laboratory and become a commercial opportunity. Deep learning, driven by research centers and initiatives such as the Google Brain project, is projected to become a multi-billion pound industry by 2024 (Tractica 2015; PR Newswire 2016), finding potential enterprise applications in areas of finance, advertisement, automotive, medical and other end-user applications. An enabler for this projected growth is the research and development of infrastructures, software and hardware technologies optimized for deep learning solutions.

4 High-Performance Computing Solutions

This section provides a background on different approaches. It should be noted that different application domains will have varied computational demands (Singh and Reddy 2014). The sections below discuss high-performance computing solutions ranging in computational performance.

4.1 Graphics Processing Units (GPU) Computing

Graphics processing units, as the name suggests, are custom devices consisting of many processing cores or co-processors tailored to the vast computational and memory requirements of graphics rendering and image processing. They enable highly mathematical and computationally intense functions to be performed at an accelerated rate thanks to the parallel computational units at the heart of their structure. The ability to offload computation best suited to parallel operations, while maintaining a great level of flexibility and scalability, is a leading benefit of GPU-based computing over sequential CPU-based computing (Blayney et al. 2015; Melanakos 2008; Fan et al. 2004). However, the scale of the benefits depends strongly on the nature of the computations.

The application and use of GPUs has gone far beyond computer graphics and gaming, although expansion in these markets has certainly reduced the cost of GPUs, making them a more affordable and thus widespread technology (Fan et al. 2004). The terms General-Purpose computation on Graphics Processing Units (GPGPU) and GPU Computing have arisen, signifying that the processors have a broad range of potential applications.

NVIDIA is a market-leading GPU producer, providing a range of GPU processors, boards and platforms. The power of their GPUs can be harnessed through NVIDIA’s own Compute Unified Device Architecture (CUDA) parallel computing platform. This technology has been used in a range of applications spanning gaming, mobile and personal computers through to high-performance computing and deep learning. For example, in bioinformatics a large number of CUDA-based tools have been developed for accelerating sequence processing and analysis (Klus et al. 2012; Liu et al. 2012, 2013). Although GPU computing is a promising direction for bioinformatics, memory handling and slow data exchange between CPU and GPU can still cause challenges (Starostenkov 2013).
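The offload pattern, and the host-device transfer cost just mentioned, can be illustrated with a minimal sketch. Here the CuPy library stands in for hand-written CUDA (an assumption for illustration; the CUDA tools cited above are custom implementations), and the explicit copies to and from the device are exactly the exchange overhead that can dominate for small workloads.

```python
import numpy as np
import cupy as cp  # assumes a CUDA-capable GPU with CuPy installed

a_host = np.random.rand(4096, 4096).astype(np.float32)
b_host = np.random.rand(4096, 4096).astype(np.float32)

# Host -> device copies over the PCIe bus: this is the CPU/GPU data
# exchange whose cost is noted above.
a_dev = cp.asarray(a_host)
b_dev = cp.asarray(b_host)

c_dev = a_dev @ b_dev        # the multiply itself runs on the GPU cores
c_host = cp.asnumpy(c_dev)   # device -> host copy of the result
print(c_host.shape)          # (4096, 4096)
```

The offload only pays off when the parallel speed-up outweighs the two transfers, which is why GPU tools batch as much work on-device as possible.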

In the area of deep learning, NVIDIA sees a market in extending its capabilities to accelerating Artificial Intelligence (AI) algorithms (Azoff 2015) in industries such as automotive, internet, healthcare, government, finance and others. They are clearly positioning themselves for the expected growth in the big data market.

4.2 Field Programmable Gate Arrays

FPGAs are integrated circuits which enable a level of programmability. Their structure consists of an array of programmable logic blocks containing computational units, memory and interconnections that can be configured after manufacture. They sit between highly programmable digital signal processing chips and custom-designed Application-Specific Integrated Circuits (ASICs), providing a balance of flexibility with parallel, custom-designed operations. They offer an experimentation and development platform on which to design and refine solutions, yet they also provide enterprise solutions for applications in which a certain degree of reconfiguration may be required. However, unlike with CPUs and GPUs, this reconfiguration cannot be done entirely on the fly and requires a level of reprogramming of the device. FPGAs show their advantages when there is a large number of repetitive operations suited to parallel implementation, for example in image processing, pattern matching or routing algorithms. In such cases FPGAs can be orders of magnitude faster than other platforms. The content below provides an overview of some examples of FPGAs in use.

FPGAs can offer possible solutions to computational challenges in bioinformatics and molecular biology (Ramdas and Egan 2005). A major computational challenge in genomics is sequence alignment. The Smith–Waterman algorithm is a database search algorithm suited to protein sequence alignment. However, it is computationally intensive, and its complexity increases quadratically with the size of the sequences being compared. Dydel and Bala (2004) present an implementation of it on FPGA. Tan et al. (2016) also present an FPGA-based co-processor to speed up short read mapping in HTS, reporting a throughput of 947 Gbp per day while providing better power efficiency.
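The quadratic cost is easy to see in a direct implementation: scoring two sequences means filling a dynamic-programming matrix of (|a|+1) x (|b|+1) cells. The sketch below is a minimal, score-only Python version with a linear gap penalty; the scoring values are illustrative rather than taken from any of the cited implementations.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Naive Smith-Waterman local alignment score (linear gap penalty).

    Fills an (len(a)+1) x (len(b)+1) matrix, hence the quadratic cost.
    FPGA designs exploit the fact that all cells on an anti-diagonal
    depend only on earlier diagonals and can be computed in parallel.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            score = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                        # local alignment floor
                          H[i - 1][j - 1] + score,  # match/mismatch
                          H[i - 1][j] + gap,        # gap in b
                          H[i][j - 1] + gap)        # gap in a
            best = max(best, H[i][j])
    return best

print(smith_waterman("GATTACA", "GCATGCT"))  # small illustrative pair
```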

Another aspect that can benefit from computational enhancement is the image processing component of genomic microarrays. Here sequencing is not being performed; instead, genetic markers are sought that respond to known chemical interactions, leading to a change in color in the array depending on the level of expression. Rodellar et al. (2007) present such a device, tailored to be portable so as to make it applicable in regions remote from core healthcare provision. An implementation of the CAST algorithm, used for detecting low-complexity regions in protein sequences, is described by Papadopoulos et al. (2012); significant computational speed-ups, in the region of 100×, were observed. These examples are not in themselves related to big data; however, they have relevance in the context of personalized medicine, in which such data can routinely form part of a heterogeneous patient dataset.

4.3 Cloud Computing Platforms

The National Institute of Standards and Technology (NIST) defines Cloud computing as “a pay-per-use model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction”.

Foster et al. (2001) pioneered the idea of Grid computing, which constitutes large-scale distributed resource sharing under specified rules among users and/or organizations. The idea was built on other known technologies of the time, such as distributed computing. Grid computing proved to be useful in many scenarios, especially for large-scale scientific computations (Di et al. 2012).

The concept of ‘Clouds’ as a similar yet distinct form of distributed computing was popularized by Amazon in 2006. Armbrust et al. (2010) compare Cloud computing to other similar computing concepts in their work. They claim that although Grid computing offers protocols to share distributed resources, Cloud computing has advanced beyond it by offering “a software environment that grew beyond its community” (referring to the high-performance computing community).

Cloud computing has become a strong industry, enabling a range of different services to be deployed, typically under a pay-per-use cost model, providing scalability in computing performance, storage and applications. The expandability and sheer flexibility of these services can provide a cost-effective option for organizations for which the cost of developing and maintaining in-house solutions does not make business sense. Furthermore, cloud services can provide tools such as project and data management tools to aid collaboration, provision of security and regulation of access to shared data, and analytical resources for the visualization and understanding of datasets.

Cloud services fall under three different categories depending on the extent of the service provided:

  • Infrastructure as a Service (IaaS) – Providing access to the core computing and storage infrastructure.

  • Platform as a Service (PaaS) – Users can develop or build upon libraries and existing core platforms, and these solutions run on the cloud infrastructure.

  • Software as a Service (SaaS) – Users access applications that form part of the cloud infrastructure.

Some of the first adopters of big data in cloud computing were users that deployed NoSQL and Hadoop clusters in highly scalable and elastic computing environments provided by vendors such as Google, Microsoft and Amazon. An overview of the key market players is given below.

4.3.1 Amazon Web Services

Amazon Web Services (AWS) is the strongest competitor in cloud services (Leong et al. 2016), having entered the market in 2006 and offering a range of relatively cost-effective solutions. The Amazon Elastic Compute Cloud (EC2) provides a scalable IaaS cloud service, offering users a simple interface to the underlying computing infrastructure. PaaS services are also supported. AWS has added Amazon EC2 Elastic GPUs to its provision, allowing performance enhancements.
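As a flavor of the 'simple interface' model, the sketch below launches a single EC2 instance using boto3, the official AWS SDK for Python; the AMI ID is a placeholder, and the region and instance type are illustrative choices.

```python
import boto3  # official AWS SDK for Python

ec2 = boto3.client("ec2", region_name="eu-west-1")  # illustrative region

# Launch one on-demand instance. "ami-xxxxxxxx" is a placeholder image ID;
# a GPU-backed instance type is chosen to echo the Elastic GPU discussion.
response = ec2.run_instances(
    ImageId="ami-xxxxxxxx",
    InstanceType="p2.xlarge",
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```

Scaling up is then a matter of raising MaxCount or attaching auto-scaling policies, which is precisely the elasticity that makes IaaS attractive for bursty big data workloads.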

4.3.2 Microsoft Azure

Microsoft Azure provides both PaaS and, more recently, IaaS services. The Azure platform offers functionality to integrate models, analyze data and visualize results at scale. The Microsoft Azure model has been described in Gannon et al. (2014) as “layers of services for building large scale web-based applications”. These layers communicate across various levels, including the hardware level, utilizing data centers worldwide for computation and content delivery. The ‘fabric controller’ acts as the kernel of the Azure operating system, performing tasks such as monitoring and managing the virtual machines and hardware resources that make up the Azure system.

4.4 Deep Learning Libraries

Machine learning and, in particular, deep learning have become of immediate interest to companies and researchers alike. Such technology is finding its way into a range of products, from speech recognition, image processing and search optimization through to any application where there is a need to understand behavior, images, speech or sentiment. TensorFlow and other such systems can be a great enabler in developing such features.

TensorFlow is an open source machine learning infrastructure originating from Google as part of their Google Brain project, started in 2011. It formed part of Google’s Machine Intelligence research organization, with its focus on machine learning and in particular deep neural networks. A key feature of TensorFlow is its sheer scalability and flexibility: it facilitates the distribution of computations over a range of devices and platforms, from mobile devices and desktops through to large-scale infrastructures consisting of hundreds of machines or thousands of GPU devices (Abadi et al. 2015). More recently it has been incorporated within the Amazon EC2 provision, as part of the Deep Learning Amazon Machine Image (AMI), where it is just one of a suite of deep learning libraries included (see Table 9.1).
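The device flexibility is visible even in a toy example. The sketch below uses the graph-and-session API of the TensorFlow releases contemporary with this chapter (the 1.x style) to pin a matrix multiplication to a GPU, falling back to another device if none is available; the matrix sizes are arbitrary.

```python
import tensorflow as tf  # 1.x-era graph-and-session API

# Build a small graph and request placement on the first GPU.
with tf.device("/gpu:0"):
    a = tf.random_normal([2048, 2048])
    b = tf.random_normal([2048, 2048])
    c = tf.matmul(a, b)

# allow_soft_placement lets TensorFlow fall back to the CPU if no GPU
# is present, illustrating the device flexibility described above.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(c).shape)  # (2048, 2048)
```

The same graph style extends to multiple workers via device annotations naming remote machines, which is the mechanism behind the multi-machine scale cited by Abadi et al. (2015).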

Table 9.1 Deep learning libraries

5 The Role for Custom Hardware

Do we need to look at big data at the micro level or at the macro level? For example, genetic sequencing, particularly as part of next generation sequencing, requires a substantial computational overhead for the alignment of the short reads coming from the initial sample analysis. From this alignment, the DNA sequence of smaller exome components can then be used to determine conditions and states of disease. At the opposite end are huge genomic datasets across thousands of people, ranging in phenotype and in genomic markers such as exome sequences. Gathering such huge expanses of genetic data and combining it with other associated information offers huge opportunities in disease stratification, biomarker discovery and drug development (Raghupathi and Raghupathi 2014). This is clearly big data at the macro level. So the question arises: would the same high-performance computing solution suit both applications? This particular example is further complicated by the size of even a single DNA sequence; uploading a file of such size to a cloud-based system in itself presents challenges. Techniques have been developed to ease the storage of such genetic information. One particular approach uses compression algorithms to find an efficient method of representing the data (Qiao et al. 2012). Such a method needs to be loss-less, fast and effective.
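As a toy illustration of loss-less genomic compression (not the method of Qiao et al. (2012), which is far more sophisticated), the sketch below packs each base into 2 bits and then applies a general-purpose compressor, verifying that the round trip reproduces the input exactly:

```python
import zlib

CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASE = {v: k for k, v in CODE.items()}

def pack(seq):
    """2-bit pack a DNA string (A/C/G/T only), then deflate the bytes."""
    bits = 0
    for ch in seq:
        bits = (bits << 2) | CODE[ch]
    raw = bits.to_bytes((2 * len(seq) + 7) // 8, "big")
    return len(seq), zlib.compress(raw)

def unpack(n, blob):
    """Exact inverse of pack(); the round trip must be loss-less."""
    bits = int.from_bytes(zlib.decompress(blob), "big")
    return "".join(BASE[(bits >> (2 * (n - 1 - i))) & 0b11] for i in range(n))

seq = "ACGTACGGTTCA" * 500
n, blob = pack(seq)
assert unpack(n, blob) == seq  # loss-less round trip
print(len(seq), "bases ->", len(blob), "compressed bytes")
```

Real tools add reference-based encoding and handling of quality scores, but the constraint is the same: no information may be lost, because downstream analysis depends on it.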

Another consideration could be the need for secure solutions which keep data local, although cloud services such as AWS take great measures to keep their services secure. Establishing a custom system incurs a significant investment and maintenance overhead and would be difficult to scale up. However, big data computations pose an ever increasing challenge in meeting performance needs. In particular, deep learning is an area of machine learning showing great commercial prospects. The next sections look at some of the deep learning solutions available.

5.1 Deep Learning

TensorFlow and other deep learning libraries (Table 9.1) combined with cloud services provide a platform to develop and create deep learning solutions, leading on to commercial opportunities. However, despite the great flexibility and scalability advantages of such systems, might a hardware-based approach provide a better solution? This of course depends strongly on the application at hand and its limitations and challenges. Nevertheless, deep learning is a component of machine learning with great commercial interest. fpgaConvNet (Venieris et al. 2016) is a framework for mapping convolutional neural networks, a form of deep learning, onto FPGAs. The authors address the computational issues presented by convolutional networks, in particular the classification computation overhead and the rapid scaling in complexity. CNNLab (Zhu et al. 2016) is another parallel framework for deep learning neural networks, one that distributes computation to both GPUs and FPGAs. Microsoft Azure has also incorporated FPGAs within its cloud platform (Feldman 2016), and Woods and Alonso (2011) have developed an FPGA-based framework for analytics on high-rate data streams. The next section looks further at enhancing cloud performance by incorporating custom hardware provision.

5.2 ASIC Enhanced Cloud Platforms

Nervana has developed a platform for deep learning that is powered by a custom ASIC engine accessed through a cloud platform. They state that their cloud solution enables industry-commercialized deep learning solutions, and they describe the platform as a full-stack solution for “AI on demand”, optimized at each level.

Nervana Neon is an open source, Python-based, scalable deep learning library. The Nervana Engine is custom ASIC hardware optimized for machine learning, and in particular deep learning. It promotes high-speed data access with high-bandwidth memory, reaching speeds of 8 Terabits per second for memory access. Additionally, the on-chip memory is large (32 GB) to meet the demanding storage requirements of machine learning. The core computational power is achieved through a sea of multipliers supported by local memory, without a reliance on cache memory. Nervana have paid great attention to data transfer across the chip, including communication pipelines tailored for machine learning operations. One key aspect is a design that allows ASICs to be interconnected directly, without reliance on Peripheral Component Interconnect Express (PCIe) buses, which cause data-flow bottlenecks. The Nervana Engine is set to be released in 2017 and hopes to establish a place among the top deep learning technologies (Schneider 2017).

5.3 ASIC Deep Learning Processors

However, Nervana is not alone in this market; others are also providing custom machine learning processing engines.

One of the most interesting areas in the development of on-chip processing is based on the operation of the human brain, in devices termed neuromorphic chips. In this field, Spiking Neural Networks (SNN) are used to perform the computations. The SpiNNaker project is one example (Sugiarto et al. 2016) and forms part of the Human Brain Project. The Darwin Neural Processing Unit is another exciting example of an ASIC co-processor based on SNN (Shen et al. 2016). Through the very nature of how SNN operate, they may lend themselves closely to machine learning and therefore show great promise in this area (Elton 2016).

6 Discussion

Big data and its analysis have the potential to provide insight into many diverse domains. The wealth of data collected at such a vast scale has led to the need for computationally intensive solutions to find the useful information hidden in the chaos. The applications for such analysis are far reaching, from surveillance, finance, IOT and smart cities through to personalized health. Potential applications include clinical decision support systems and personalized medicine for healthcare, distribution and logistics optimization for retail, and supply chain planning for manufacturing (Sagiroglu and Sinanc 2013). However, even within each example, applications will have different needs in terms of data growth, infrastructure and governance, along with integration, velocity, variety, compliance and data visualization (Intel 2012). A number of challenges still need to be addressed, such as handling structured and unstructured data in real or near-real time at volumes for which traditional data storage and analysis approaches are not applicable (Zikopoulos and Eaton 2012). Furthermore, as big data analytics becomes mainstream, important issues such as data governance, guaranteeing privacy, safeguarding security, increased network bottlenecks, training of skilled data science professionals, development of compression technologies and establishing standards will require urgent attention (Intel 2012).

Big data analytics and applications are still in their early stages; however, the continued improvement of technologies and platforms such as Hadoop, Spark and NoSQL, coupled with the development of new analytical algorithms and infrastructure, will contribute towards the maturing of the field. Companies such as Nervana are developing custom hardware to work in tandem with their cloud platforms to accelerate deep learning. This is one field in which hardware developers can create impact for cloud computing infrastructure and big data analytics. Recently, Microsoft (Feldman 2016) announced the inclusion of Altera FPGAs within its Azure cloud service, with the promise of creating an AI supercomputer. Microsoft does not currently plan to use the FPGAs for training neural networks, relying on GPUs for offline training; at present, they see FPGAs as providing effective acceleration for evaluating already trained neural networks.

Qualcomm recognizes that its consumers require on-device solutions that do not rely fully on cloud services; its machine learning platform is implemented on the Snapdragon Neural Processing Engine. This example highlights that data analytics is a challenge that may not always be resolved through scalable cloud services: as applications require more computationally intensive data analytics, some of the workload may need to be shared between on-device and cloud-based services. Other companies are also active in this area (Table 9.2), and seemingly there is a strong market for this level of on-device processing. Furthermore, there have been exciting advances in the area of neuromorphic chips for machine learning. It will be interesting to see how this technology impacts the deep learning market.

Table 9.2 Deep learning ASIC processors

Clearly, each computational solution offers unique opportunities for overcoming the challenges of big data. FPGA and ASIC solutions can provide computational benefits under certain conditions and, as demonstrated by companies such as Microsoft and Nervana, they can form a key part of a high-performance cloud platform. Conversely, they play an important role in on-device big data analytics, with companies such as Qualcomm and Intel investing heavily in developing the next generation of AI chips. In each example the solutions have been tailored for the ever growing market of big data and deep learning. Meeting these challenges will have great impact on future applications, with advances in healthcare, smart cities, security and the automotive industry, among others, forming part of our daily lives.