1 Introduction

The first idea leading to the Internet of things concept was expressed as “computers everywhere”, formulated by Ken Sakamura at the University of Tokyo in 1984 [1], and later referred to as “ubiquitous computing” by Mark Weiser in 1988 (Xerox PARC) [2]. However, it was Kevin Ashton who, in 1999, first coined the term “Internet of things” (IoT) in the context of supply chain management [3]. In 2001 the IoT concept was further developed by David Brock in an article of the Auto-ID Center of the Massachusetts Institute of Technology [4]. Since then, a large number of researchers have followed and developed this idea, embodied in a wide variety of scientific articles, books and conferences. In all of them, the vision of integrating intelligence into the objects around us persists. This is achieved by providing objects with sensing, acting, storage, and processing capabilities, but above all by providing them with interconnectivity via the Internet, all in order to provide services to different users. It is precisely in the users that the main difference between the IoT and the traditional Internet lies: while in the traditional Internet users are people connected via a PC, laptop or mobile device (a smartphone or tablet, for example), in the IoT users can be other objects or “smart things” that require some service or interaction. This interaction can take place even with additional sources of information, such as social networks. Moreover, the IoT concept involves some very important activities that are frequently forgotten: the analysis, storage and utilisation of the data collected by the different devices/objects. This opens up the possibility of developing a myriad of applications, for example: goods and people tracking, smart houses, smart cities, remote command and control, location-based services, remote patient monitoring, and environmental monitoring, to name just a few.

In this context, the IoT represents the radical evolution of the traditional Internet into an interconnected network of “smart objects” that not only collect data from the environment (sensing) and interact with the physical world (actuation, command and control), but also use the Internet to provide services for information transfer, analytics, applications, and communications. Furthermore, as Gubbi et al. [5] described in their definition, the IoT is also the “Interconnection of sensing and actuating devices providing the ability to share information across platforms through a unified framework, developing a common operating picture for enabling innovative applications. This is achieved by seamless ubiquitous sensing, data analytics and information representation with cloud computing as the unifying framework.”

Generally, interoperability between connected devices is the responsibility of a single vendor, which ensures it either by using proprietary interfaces and protocols, by installing add-on software clients on devices, or by using a gateway device [6]. However, in the future, devices can be expected to gain sufficient intelligence to interoperate directly, without the need for dedicated gateways [7].

Across the IoT, devices generate data that are sent to the main application, where they are forwarded, consumed and used. Depending on the device, the network and power consumption constraints, data can be sent in real time or in batches [8]. Due to the heterogeneity of the data sources, the nature of the data generated and the heterogeneity of the data processing servers, an IoT system can also be viewed as a computational grid consisting of a very large number of devices generating data for processing and a large number of resources capable of processing such data [9]. For smart devices and sensors, each event that they perceive or register can and will create data. These data can then be sent over the network back to the central application. At this point, it must be decided which standard the data will be created in and how they will be sent over the network. For delivering these data, Message Queue Telemetry Transport (MQTT) [10], Hypertext Transfer Protocol (HTTP) [11], and Constrained Application Protocol (CoAP) [12] are the most commonly used standard protocols [8].
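Regardless of the protocol chosen, the decision between immediate and batched delivery can be sketched in a few lines. The following Python fragment is an illustrative sketch only; the `TelemetryBuffer` class and its parameters are hypothetical and not part of any of the cited protocols. The injected `send` callable would wrap an MQTT publish, an HTTP POST or a CoAP request in a real deployment.

```python
import json
import time


class TelemetryBuffer:
    """Buffers sensor readings and decides between batched and immediate delivery.

    `send` is any callable that transports a JSON payload; it is injected so
    that the delivery policy stays protocol-agnostic (MQTT, HTTP or CoAP).
    """

    def __init__(self, send, batch_size=10):
        self.send = send
        self.batch_size = batch_size
        self.pending = []

    def record(self, device_id, value, urgent=False):
        reading = {"device": device_id, "value": value, "ts": time.time()}
        if urgent:
            # Latency-critical events bypass the batch and go out immediately.
            self.send(json.dumps(reading))
            return
        self.pending.append(reading)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Ship the whole batch as one payload, then clear the buffer.
        if self.pending:
            self.send(json.dumps(self.pending))
            self.pending = []
```

A constrained device would call `record` for every sample and `flush` opportunistically, e.g. when the radio wakes up.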

For years, cloud computing has been the base technology of choice for many enterprise architectures, and companies have decided to move all their data, computation and processing from “on-premise” infrastructure to the cloud itself. The cloud seemingly offers infinite storage space and scaling for computation; from a company’s point of view, all of these features can be configured to adjust automatically. The result can be less time spent handling “on-premise” infrastructure and less money invested [13]. Cloud computing enables convenient, on-demand network access to a shared pool of configurable computing resources, such as networks, servers, storage, applications, and services, that can be rapidly provisioned and released with minimal management effort or service provider interaction [14]. However, for many operations, a cloud-only model is not necessarily the perfect fit. The cost of cloud storage is dropping, but transmitting and storing massive amounts of data in the cloud for analysis quickly becomes prohibitively expensive. Usually, all data are transmitted at their current fidelity, with no ability to sift out which data are of the highest business value and which data can be disregarded. Cloud-based IoT platforms also require a constant network connection, making them less than ideal for companies with remote operations, which cannot afford to cease operations when their connection goes down [15]. As a consequence, a new set of paradigms for IoT data handling has been proposed.

In this scenario, this chapter is developed around two main goals. The first is to provide a review of the main concepts and paradigms related to data and process management in the IoT, aiming to offer an articulated view of an environment currently plagued by multiple definitions and general concepts. Thus, three important approaches, cloud, fog and edge computing, are presented, together with a discussion of their main characteristics, advantages, disadvantages and mutual differences. The second goal is to provide an overview of distributed data processing approaches along the cloud, fog and edge paradigms, concentrating the discussion on developments proposed for optimising resources and bandwidth usage, and on best-fit approaches that take advantage of the characteristics each computing layer can provide. Finally, the main issues, challenges and opportunities of distributed computing and data processing in the IoT are discussed.

1.1 The Need for Distributed Computing

In 2003 Eric Schmidt, Executive Chairman of Google, claimed that up to that year 5 exabytes of data had been generated since the origin of humanity [16]. Nowadays, it is claimed that this much data is generated every 2 days, and this rate is only increasing [16]. Furthermore, with the advent of the IoT, Cisco estimated that the number of connected objects would reach ∼50 billion in 2020 [17]. This means that the data deluge problem is going to worsen. The question, then, is whether the available technology will evolve fast enough to deal with this forthcoming extraordinary production of data. On the one hand, the evolution of compute and storage technologies is governed by Moore’s Law, which stipulates that they double in capability/capacity every 18 months. On the other hand, Nielsen’s Law projects that Internet bandwidth doubles every 24 months [18]. Comparing the two laws shows that bandwidth grows more slowly than computing power. Hence, this anticipates a future IoT where data will be produced at rates that far outpace the network’s ability to backhaul the information from the network edge, where it is produced by the billions of connected things, to the cloud, where it will need to be processed and probably stored. Although Moore’s Law contributes partly to the data problem, the same law can provide the solution, which consists in augmenting the network itself with compute and storage capabilities at its edge. This means bringing cloud computing closer to the data sources, allowing the network to perform processing, analysis and storage of data in lieu of blindly pushing all data up to the cloud.
As Bayer and Wetterwald [19] describe, this is not just about the aggregation or concatenation of sensed physical data (as a gateway would do), but really about distributing intelligence, where effective real-time and deterministic processing is needed to implement a functionality, instead of leaving all the processing and analytics to be performed on the cloud side.
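The widening gap between the two laws can be illustrated with a quick back-of-the-envelope calculation. The function below is a trivial sketch; the 12-year horizon is an arbitrary choice for illustration.

```python
def growth_factor(years, doubling_period_months):
    """Capability multiplier after `years`, doubling every `doubling_period_months` months."""
    return 2 ** (years * 12 / doubling_period_months)


# Over a 12-year horizon:
compute = growth_factor(12, 18)    # Moore's Law: 2^8 = 256x
bandwidth = growth_factor(12, 24)  # Nielsen's Law: 2^6 = 64x
gap = compute / bandwidth          # compute outpaces bandwidth by 4x
```

In other words, after 12 years compute capability grows 256-fold while bandwidth grows only 64-fold, a 4x relative shortfall — which is exactly why pushing every raw sample over the network becomes untenable.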

Among the several reasons that make distributed intelligence a necessity for the IoT, scalability, network resource preservation, closed-loop control, resilience and clustering are the main aspects [19]. In the case of scalability, the centralised approach is not sufficient to handle an increasing volume of end devices and their geographical specificities. Regarding network resource preservation, distributed processing helps relieve the constraints on the network by sending to the cloud or operation centre only the necessary information and by doing most of the data processing at the remote site, much closer to the data’s source. In applications where low latency is critical, such as real-time systems for closed-loop control, the large delays found in many multi-hop networks and overloaded cloud server farms prove unacceptable, and the local, high-performance nature of distributed intelligence can minimise latency and timing jitter. Only local processing can satisfy the most stringent requirements, very often combined with advanced networking technologies such as deterministic networking. Resilience is of utmost importance for critical processes that must run even if communication with the operation centre is not effective; in this case, an architecture based on distributed processing is the only valid solution. Finally, moving from individual devices to clusters helps to manage many units as a single one.
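As a concrete illustration of network resource preservation, an edge node can reduce a window of raw samples to a compact summary and forward raw values only for anomalous readings. The sketch below is purely illustrative; the function name, the statistics chosen and the threshold are hypothetical.

```python
def summarise_window(readings, threshold):
    """Reduce a window of raw samples to the statistics the cloud actually
    needs, keeping raw fidelity only for anomalous samples."""
    anomalies = [r for r in readings if r > threshold]
    return {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "max": max(readings),
        "anomalies": anomalies,  # raw values kept only where they matter
    }


# Four uplink messages collapse into one summary; only the 35.0 outlier
# retains raw fidelity.
window = [21.0, 21.5, 22.0, 35.0]
summary = summarise_window(window, threshold=30.0)
```

Sending one summary per window instead of every sample is exactly the kind of local processing that relieves the backhaul described above.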

In this context, as new technologies and services become available, it is important to consider that system deployments are in most cases application-defined. Therefore, in the following paragraphs the main IoT data and process management paradigms are described, followed by some use cases.

2 Cloud, Fog and Edge Computing Definitions

As the IoT paradigm matures, core concepts continue to evolve, providing better insights into real-world IoT implementations. However, a clear consensus still does not exist in some areas, as many current developments are vendor-specific solutions. Thus, in this section the cloud, edge and fog computing concepts are defined as the result of a review of different authors’ proposals.

Before proceeding, it is important to note that we have observed a loose use of the term “edge” among definitions, sometimes making it difficult to differentiate between concepts. Hence, we propose the terms Internet edge and IoT edge to differentiate the devices at the edge of the Internet network from those at the edge of an IoT system. On the one hand, the Internet edge is the network infrastructure that provides connectivity to the Internet and acts as the enterprise’s gateway to the rest of cyberspace. As the gateway to the Internet, the Internet edge infrastructure plays a critical role in supporting the services and activities that are fundamental to the operation of the modern enterprise [20]. Some common Internet edge elements are edge routers, switches and firewalls. On the other hand, the edge of the IoT includes those system end-points that interact with and communicate real-time data from smart products and services; examples of such devices include a wide array of sensors, actuators, and smart objects [21]. For example, in an IoT patient health monitoring system, the edge of the IoT would be the body area sensor network acquiring the patient’s data, such as body temperature, respiration rate, heartbeat and body movement, while the edge of the Internet would be the Internet-connected gateway which concentrates all the data and sends them to the cloud.

2.1 Cloud Computing

A commonly accepted definition of cloud computing was provided by the National Institute of Standards and Technology (NIST) in 2011 [22] as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” In this regard, as described in [23], cloud computing refers both to the applications delivered as services over the Internet and to the hardware and systems software in the data centres that provide those services. The services themselves have long been referred to as Software as a Service (SaaS). The data centre hardware and software is what is called a cloud. When a cloud is made available in a pay-as-you-go manner to the public, it is referred to as a public cloud; the service being sold is utility computing. The term private cloud refers to the internal data centres of a business or other organisation that are not made available to the public. Thus, cloud computing is the sum of SaaS and utility computing, but does not normally include private clouds. Cloud computing offers the following characteristics [24]: (a) Elastic resources, which scale up or down quickly and easily to meet demand; (b) Metered service, so that users pay only for what they use; and (c) Self-service access to all the IT resources a user needs.

Another useful definition of cloud computing is the one provided by Vaquero et al. [25]: clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically reconfigured to adjust to a variable load (scale), allowing also for an optimum resource utilization. This pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the Infrastructure Provider by means of customized Service-Level Agreements (SLAs).

Therefore, one of the key characteristics of the cloud computing model is the notion of resource pooling, where workloads associated with multiple users (or tenants) are typically collocated on the same set of physical resources. Hence, essential to cloud computing is the use of network and compute virtualisation technologies. Cloud computing provides elastic scalability characteristics, where the amount of resources can be grown or diminished based on user demand [16], with minimal management effort or service provider interaction [22].
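Elastic scalability of this kind is typically driven by a simple control rule that maps observed load to a desired pool size. The fragment below is a toy sketch of such a rule; the function and its 60% utilisation target are illustrative assumptions, not any provider’s actual policy.

```python
import math


def target_instances(current, load_per_instance, target_load=0.6,
                     min_instances=1, max_instances=20):
    """Toy elastic-scaling rule: resize the pool so that the average load
    per instance approaches `target_load` (a 60% utilisation goal here)."""
    total_load = current * load_per_instance
    desired = math.ceil(total_load / target_load)
    # Clamp to the pool limits so we never scale to zero or beyond capacity.
    return max(min_instances, min(max_instances, desired))
```

A monitoring loop would evaluate this rule periodically and provision or release instances to match, which is precisely the “grown or diminished based on user demand” behaviour described above.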

Depending on the type of provided capability, there are three cloud service delivery models [24]:

  • Infrastructure as a Service (IaaS). Provides companies with computing resources including servers, networking, storage, and data center space on a pay-per-use basis and as an Internet-based service.

  • Platform as a Service (PaaS). Provides a cloud-based environment with everything required to support the complete lifecycle of building and delivering web-based (cloud) applications, without the cost and complexity of buying and managing the underlying hardware, software, provisioning, and hosting (users do not install any of these platforms or support tools on their local machines).

  • Software as a Service (SaaS). Cloud-based applications run on distant computers “in the cloud” that are owned and operated by others and that connect to users’ computers via the Internet and, usually, a web browser.

The PaaS model may be hosted on top of the IaaS model or directly on the cloud infrastructure, while the SaaS model may be hosted on top of PaaS, IaaS, or directly on the cloud infrastructure, see Fig. 1.

The cloud computing model is well suited to small and medium businesses because it helps them adopt IT without upfront investments in infrastructure, software licenses and other relevant requirements.

Fig. 1 Cloud service delivery models

Despite the maturity that cloud computing has reached through the years, and the many potential benefits and revenues that could be gained from its adoption, the cloud computing model still has several open issues. Fog computing and edge computing are new paradigms that have emerged to address some of these issues; they are defined in the following sections, together with some technological approaches for cloud computing that support them.

2.1.1 Mobile Cloud Computing

As mobile phones and tablets get “smarter,” their usage and preference over traditional desktops and laptops has increased dramatically. At the same time, the availability of a huge number of intelligent mobile applications has attracted more people to smart mobile devices. Some of these applications, such as speech recognition, image processing, video analysis, and augmented reality, are computing-intensive, and their implementation on portable devices is still impractical due to the mobile devices’ resource limitations. In contrast, the high-rate and highly reliable air interface allows the computing services of mobile devices to run at remote cloud data centres. Hence, the combination of mobile computing with cloud computing has resulted in the emergence of what is called mobile cloud computing (MCC) technology. In MCC, computing- and communication-intensive application workloads, as well as storage, are moved from the mobile device to powerful, centralised computing platforms located in clouds [26]. These centralised applications are then accessed over wireless connections based on a thin native client or web browser on the mobile devices. This alleviates the resource constraints and battery life shortage of mobile devices.

The general architecture of MCC is represented in Fig. 2. As can be seen in the figure, mobile devices are connected to the mobile networks via base stations (e.g., base transceiver station, access point, or satellite) that establish and control the connections and functional interfaces between the networks and the mobile devices. Mobile users’ requests and information (e.g., ID and location) are transmitted to the central processors connected to the servers providing mobile network services. Mobile network operators provide services to mobile users, such as authentication, authorisation, and accounting, based on subscribers’ data stored in databases. After that, the subscribers’ requests are delivered to a cloud through the Internet. In the cloud, cloud controllers process the requests to provide mobile users with the corresponding cloud services [26].

Fig. 2 Mobile cloud computing architecture [26]

Although MCC has several advantages, it has an inherent limitation, namely the long propagation distance from the end user to the remote cloud centres, which results in excessively long latency for mobile applications. MCC is thus not adequate for a wide range of emerging mobile applications that are latency-critical. Presently, new network architectures are being designed to better integrate the concept of cloud computing into mobile networks.

2.2 Fog Computing

As described in [27], to address some of the limitations of cloud computing under the IoT paradigm, the research community has recently proposed the concept of fog computing, which aims at bringing cloud service features closer to what is referred to as “things,” including sensors, embedded systems, mobile phones, cars, etc. These things form part of what in this chapter has been labelled IoT edge devices. It is important to note that, in this section, when referring to the “edge” or “edge devices” we refer to the IoT edge, unless otherwise stated, trying to keep the reference authors’ original essence while differentiating from the Internet edge concept.

The first formal definition of fog computing was stated in 2012 by Bonomi et al. [28] from CISCO as: “Fog computing is a highly virtualised platform that provides compute, storage and networking services between end devices and traditional cloud computing data centres, typically, but not exclusively located at the edge of the network.” Since then, different definitions have emerged under distinct scenarios and contexts.

In [29] fog computing is defined as “a model to complement the cloud for decentralising the concentration of computing resources (for example, servers, storage, applications and services) in data centres towards users for improving the quality of service and their experience.” The cloud and fog synergistically interplay to enable new types and classes of IoT applications that would not have been possible when relying on stand-alone cloud computing [16]. In this scenario, it is expected that a huge number of heterogeneous, ubiquitous and decentralised devices will communicate and potentially cooperate among themselves and with the network to perform storage and processing tasks without the intervention of third parties. These tasks can support basic network functions or new services and applications that run in a sandboxed environment [25]. This platform can extend in locality from IoT end devices and gateways all the way to cloud data centres, but is typically located at the network edge. Fog augments cloud computing and brings its functions closer to where data are produced (e.g., sensors) or need to be consumed (e.g., actuators) [16], enabling computing directly at the edge of the network and delivering new applications and services. The computational, networking, storage and acceleration elements of this new model are known as fog nodes. These are not completely fixed to the physical edge, but should be seen as a fluid system of connectivity [30].

Fog-centric architecture serves a specific subset of business problems that cannot be successfully addressed using only traditional cloud-based architectures or solely intelligent edge devices. Although fog computing is an extension of the traditional cloud-based computing model, in which implementations of the architecture can reside in multiple layers of a network’s topology, all the benefits of the cloud should be preserved with this extension to fog, including containerisation, virtualisation, orchestration, manageability, and efficiency [30].

Furthermore, several surveys have tried specifically to define this new paradigm, its challenges, possible applications and application scenarios; see for example [25, 31, 32]. Although there are several definitions, all of them agree that the fog computing model moves computation from the cloud closer to the edge of the network (Internet), and potentially right up to the edge of the IoT, that is, to the things: sensors and actuators. Moreover, this idea is enhanced by considering fog computing as a system-level horizontal architecture that distributes resources and services of computing, storage, control and networking anywhere along the continuum from cloud to things, with the aim of accelerating the velocity of decision making [30].

Fig. 3 IoT system deployment models (Adapted from [30])

As indicated by the OpenFog Consortium [30], fog computing is often erroneously called edge computing, but there are key differences. Fog works with the cloud, whereas edge is defined by the exclusion of the cloud. Fog is multilayer and hierarchical, whereas edge tends to be limited to three or four layers. In addition to computation, fog also addresses networking, storage, control and acceleration. Figure 3 shows a subset of the combinations of fog and cloud deployments that address various use cases, as framed by the layered view of IoT systems [30]. Each fog element may represent a hierarchy of fog clusters fulfilling the same functional responsibilities. Depending on the scenario, multiple fog and cloud elements may collapse into a single physical deployment. Each fog element may also represent a mesh of peer fog nodes in use cases like connected cars, electric vehicle charging, and closed-loop traffic systems. In these use cases, fog nodes may securely discover and communicate with each other to exchange context-specific intelligence.

2.3 Edge Computing

As mentioned in the previous section, it is common in the literature to find edge and fog computing defined as the same concept, and the two indeed share several characteristics: both aim to bring cloud resources and services closer to the things that generate data. In this work, however, edge and fog computing are differentiated in the sense that in the former “edge” refers to the edge of the Internet, while in the latter “edge” refers to the edge of the IoT. Taking this into consideration, a definition of fog computing was provided in the previous section, while the definition of edge computing is presented in this section.

In [33], it is mentioned that the term edge computing was first coined around 2002 and was mainly associated with the deployment of applications over Content Delivery Networks (CDNs). The main objective of this approach was to benefit from the proximity and resources of CDN edge servers to achieve massive scalability. An edge node includes routers, mobile base stations and switches that route network traffic [29]. In this case, “edge” must be understood as the edge of the Internet, as defined above. These devices perform sophisticated processing in order to handle the packets coming in over the different subnetworks they aggregate. Examples of edge devices are appliances at the front end of a data centre that perform functions such as XML acceleration, load balancing, or other content processing, as well as devices at the entry point of an enterprise that perform security-related functions such as firewalls, intrusion detection, and virus checking [34]. This concept places applications, data and processing at the logical extremes of a network rather than centralising them. Placing data and data-intensive applications at the edge reduces the volume and distance that data must be moved [30].

However, notice that devices which have a direct connection to the Internet can also be considered part of the Internet edge, referred to as edge devices in [35]. For example, a smartphone is the edge between body things and the cloud, a gateway in a smart home is the edge between home things and the cloud, and a micro data centre or a cloudlet is the edge between a mobile device and the cloud. Figure 4 shows a simplified view of this paradigm.

Fig. 4 Simplified view of the edge computing paradigm [35]

2.3.1 Mobile Edge Computing

The concept of mobile edge computing (MEC) was first proposed by the European Telecommunications Standards Institute (ETSI) in 2014 [36] as a new platform that “provides IT and cloud-computing capabilities within the Radio Access Network (RAN) in close proximity to mobile subscribers”. In recent years, driven by the visions of the Internet of things and 5G communications, MEC has seen renewed interest, causing a paradigm shift in mobile computing from centralised MCC towards MEC. The main feature of MEC is to co-locate computing, network control and storage resources at the edge of the mobile Radio Access Network [37], aiming at reducing latency. Hence, MEC is not replacing but complementing the cloud computing model: the delay-sensitive parts of an application can be executed on MEC servers, whereas the delay-tolerant, compute-intensive parts can be executed on cloud servers. MEC aims to enable connected mobile devices to execute real-time, compute-intensive applications directly at the network edge. The characteristic features that distinguish MEC are its closeness to end users, mobility support, and the dense geographical deployment of MEC servers [15]. In Fig. 5, a representation of the MEC architecture is presented.
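The split between delay-sensitive and delay-tolerant work can be sketched as a simple placement rule. The following fragment is purely illustrative; the function name, the latency figures and the capacity limit are hypothetical assumptions, not drawn from any MEC specification.

```python
def place_task(deadline_ms, cpu_cycles, cloud_rtt_ms=100, mec_capacity=1e9):
    """Toy placement rule for MEC offloading.

    Latency-critical tasks run on the nearby MEC server; delay-tolerant,
    compute-heavy tasks go to the distant cloud. All figures are illustrative.
    """
    if deadline_ms < cloud_rtt_ms:
        # The cloud round-trip alone would miss the deadline.
        return "mec"
    if cpu_cycles > mec_capacity:
        # Too compute-heavy for the resource-constrained edge server.
        return "cloud"
    return "mec"
```

A real MEC orchestrator would additionally weigh server load, user mobility and energy, but the same latency-versus-capacity trade-off sits at the core.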

Fig. 5 Mobile edge computing architecture [38]

Fig. 6 Taxonomy of mobile edge computing [38]

The concept of fog computing was proposed by CISCO as a generalisation of MEC in which the definition of edge devices becomes broader, ranging from smartphones to set-top boxes. Frequently, these two areas overlap and the terminologies are used interchangeably. However, as mentioned before, fog computing goes beyond the Internet edge, covering the continuum from the cloud up to the IoT edge. These two concepts are thus interrelated and share several characteristics. Although the taxonomy proposed in [38], shown in Fig. 6, was developed for MEC, the parameters it takes into account can be considered valid for cloud and fog computing as well. These parameters include: (a) Characteristics, (b) Actors, (c) Access Technologies, (d) Applications, (e) Objectives, (f) Computation Platforms, and (g) Key Enablers. A description of the concepts included in each parameter can be found in [38].

2.3.2 Cloudlets

As mentioned earlier, although a remote cloud helps mobile devices to perform memory-intensive or computation-intensive tasks, this approach suffers from low bandwidth and long latencies. Hence, in order to solve this problem, another cloud-related concept has emerged under the name of cloudlet. A cloudlet was first defined in [39] as “a trusted, resource-rich computer or cluster of computers that’s well-connected to the Internet and available for use by nearby mobile devices.” Moreover, as stated later in [40], a cloudlet represents the middle tier of a 3-tier hierarchy: mobile device - cloudlet - cloud. It can be seen as a ‘data centre in a box’ whose goal is to ‘bring the cloud closer’ to mobile devices. Cloudlets adopt the same idea as edge computing, which is to deploy cloud resources in the proximity of mobile users at the edge of the Internet network and to process the computing requests of mobile devices in real time [41].

Hence, as represented in Fig. 7, instead of using a distant public cloud, users can offload their jobs to a one-hop-proximity, low-latency and high-bandwidth cloudlet. In this solution, the mobile devices create a connection to the cloudlet via WiFi for handling the users’ requests. However, if the resources of the cloudlet are insufficient to address a user’s request, the cloudlet forwards it to a public cloud via a wide area network (WAN) connection. Both the cloudlet and the public cloud use virtualisation technology to provide computational resources in the form of Virtual Machines (VMs) on top of Physical Machines (PMs) [36].
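This fall-back behaviour can be sketched in a few lines. The class below is a hypothetical illustration; its name, the fixed VM-slot model and the string results are assumptions made for the sketch, not part of the cited architecture.

```python
class Cloudlet:
    """One-hop cloudlet with a fixed number of VM slots.

    Requests that exceed the local capacity are forwarded to the distant
    public cloud over the WAN, mirroring the overflow path in Fig. 7.
    """

    def __init__(self, vm_slots):
        self.free_slots = vm_slots

    def offload(self, request):
        if self.free_slots > 0:
            # Serve locally: one-hop WiFi, low latency, high bandwidth.
            self.free_slots -= 1
            return f"cloudlet handled {request}"
        # Local resources exhausted: fall back to the public cloud.
        return f"forwarded {request} to public cloud"
```

The key design point is that the mobile device never has to know which tier served it; the cloudlet makes the overflow decision transparently.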

Fig. 7 Cloudlet architecture [36]

3 Cloud, Fog and Edge Computing Implementations Overview

Having presented and defined the concepts of cloud, fog and edge computing, this section gives an overview of selected implementations of these paradigms. The idea is to give the reader a flavour of each of these technologies when put into practice, from the point of view of distributed data processing.

3.1 Cloud Computing Implementation Review

The cloud computing model has evolved over time and has been adopted to implement a wide spectrum of applications, ranging from highly computation-intensive applications down to lightweight services. In this section, two application implementations are reviewed, which the authors consider relevant and which demonstrate the usability of the cloud computing model.

3.1.1 Cloud-Based Pervasive Patient Health Monitoring

Abawajy and Hassan presented an IoT- and cloud-based architecture for remote healthcare monitoring [42], shown in Fig. 8. The approach is referred to as the pervasive patient health monitoring (PPHM) system infrastructure. The suitability of the proposed PPHM infrastructure was demonstrated through a case study considering the real-time monitoring, using ECG, of a patient suffering from congestive heart failure. As can be seen in Fig. 8, the proposed architecture is three-tier, with the following components: Collection Station, Data Centre, and Observation Station.

Fig. 8

Internet of things and cloud-based architecture for remote healthcare monitoring [42]

The Collection Station consists of an IoT subsystem tasked with the remote physiological and activity monitoring of patients. The core monitoring infrastructure of the IoT subsystem is a wireless body sensor network (BSN). The personal server provides the link between the IoT subsystem and the cloud infrastructure. The personal server is a dedicated per-patient machine (e.g., a tablet or smartphone) with built-in features such as a GPS module, a Bluetooth radio module, and an SQLite database. It is assumed that the personal server can interact with various local networks such as WiFi and LTE.

The Data Centre Subsystem is a cloud-based system where the heavy tasks of storing, processing, and analysing the patient health data collected by the IoT subsystem are performed. The use of cloud storage offers the benefits of scalability and on-demand accessibility at any time and from any place. The cloud also hosts the middleware system, virtual sensors, and application services that allow medical staff to analyse and visualise patients’ data, as well as to identify and raise alerts when events requiring urgent intervention are observed.

The Observation Station is where data-driven clinical observation and intervention take place through a monitoring centre. The healthcare actors involved in the monitoring process (doctors, patients, and nursing staff) take part in clinical observation, patient diagnosis, and intervention. The monitoring centre manages all access requests for patient data. Accordingly, if an authorised user issues a data request to the cloud, it is handled by the monitoring centre. If the requested data is available in the sensor data storage, the data is returned to the user.
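The access-mediation step can be illustrated with a short sketch (names such as `authorised_users` and `sensor_data_storage` are hypothetical placeholders, not the paper’s API):

```python
def handle_request(user, patient_id, authorised_users, sensor_data_storage):
    """The monitoring centre mediates every request for patient data:
    reject unauthorised users, otherwise return the data if it is available."""
    if user not in authorised_users:
        return None  # request rejected by the monitoring centre
    return sensor_data_storage.get(patient_id)  # None if no data collected yet

storage = {"patient-1": [72, 75, 71]}  # e.g. recent ECG-derived heart rates
assert handle_request("dr_smith", "patient-1", {"dr_smith"}, storage) == [72, 75, 71]
assert handle_request("stranger", "patient-1", {"dr_smith"}, storage) is None
```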

An interesting aspect of the approach proposed by Abawajy and Hassan is that they demonstrated its suitability through a case study consisting of the real-time monitoring of a patient suffering from congestive heart failure. Experimental evaluation of the proposed PPHM infrastructure showed that it is a flexible, scalable, and energy-efficient remote patient health monitoring system. Details of the system implementation and the results obtained can be found in the paper [42].

3.1.2 Cloud Computing Applied to Big Geospatial Data Challenges

The second implementation example presented here is that proposed by Yang et al. [43]. In their work the authors propose a cloud computing based framework to address big data challenges in the area of geospatial science, see Fig. 9, and investigate how cloud computing can be utilised to address them. The big data challenge in geospatial science is characterised by the so-called four Vs [43]: (a) the volume of geospatial data and the velocity at which they are produced have far exceeded a stand-alone computer’s storage and computing ability; (b) the variety of geospatial data in format and spatiotemporal resolution makes it difficult to find an easy-to-use tool to analyse these data; and (c) the veracity of geospatial data, in terms of accuracy and uncertainty, spans a wide range.

Fig. 9

Cloud computing based framework for big data processing in geospatial science [43]

A grand challenge for processing capacity is the transformation of big data’s four Vs into a fifth V, ‘value’: processing big geospatial data so as to add value to scientific research, engineering development and business decisions. Such transformations pose grand challenges to data management and access, analytics, mining, system architecture and simulations. Cloud computing provides computing as a utility service with five advantageous characteristics: (a) rapid and elastic provisioning of computing power; (b) pooled computing power to better utilise and share resources; (c) broadband access for fast communication; (d) on-demand access to computing as a utility service; and (e) pay-as-you-go charging for the parts used, without the significant upfront cost of traditional computing resources. It can therefore be utilised to enable solutions for the transformation of the four Vs into the fifth V (value). To exemplify how the proposed architecture can enable this transformation, four examples of big data processing are reviewed in the paper: climate studies, geospatial knowledge mining, land cover simulation, and dust storm modelling.

For reasons of space, only the case of climate studies is reviewed in this overview. The combined complexities of volume, velocity, variety, and veracity can be addressed with cloud-based advanced data management strategies and a service-oriented data analytical architecture that help process, analyse and mine climate data. For example, climate simulation poses the challenge of obtaining enough computing resources for scientific experiments when analysing big simulation data or running a large number of model simulations with different model inputs. This problem can be addressed using cloud computing as follows [43]: (a) the climate models can be published as a service (Model as a Service, MaaS), and enough VMs can be provisioned on demand with specific model configurations for each ensemble modelling run; (b) the application is deployed as a service with a web portal to support model operation and monitoring; and (c) the workflow involving different analytics is operated as a service (Workflow as a Service, WaaS) with intuitive graphical user interfaces (GUIs). With this, big climate data analytics are supported by cloud computing at the computing infrastructure level.
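As a rough illustration of the MaaS idea, one VM could be provisioned per ensemble member, each carrying its own model configuration. The sketch below only simulates provisioning, and its names and parameters are invented for illustration (a real system would call a cloud provider’s API):

```python
def provision_ensemble(model_configs):
    """Model-as-a-Service sketch: provision one VM per ensemble member, each
    with its own model configuration. Provisioning is only simulated here."""
    return [
        {"vm_id": f"vm-{i}", "model_config": cfg, "status": "running"}
        for i, cfg in enumerate(model_configs)
    ]

# Two ensemble members with different (hypothetical) forcing parameters.
runs = provision_ensemble([{"co2_ppm": 400}, {"co2_ppm": 560}])
assert [r["vm_id"] for r in runs] == ["vm-0", "vm-1"]
```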

The proposed cloud-based service-oriented workflow system for climate model study can be seen in Fig. 10. This architecture includes [43]: (a) the model service, responsible for compiling and running models on VMs, which are provisioned from a system snapshot containing the modelling software environment needed to run a model; (b) the VM monitor service, which provides the cloud platform with VM status information for resource scheduling; (c) the data analysis service, which feeds the model output as input for analytics while analysing data in parallel to address data-intensive issues; and (d) the data publishing service, which enables users to access the analysis results in real time via the Internet. All of these services are controllable through a GUI that enables users to drag and connect services together to build a complex workflow, so that the system can automatically transition between the applications specified by the workflow and run them on the cloud with automatically provisioned VMs [43].

Fig. 10

Cloud-based service-oriented workflow system for climate model study [43]

For further details on the other three big data processing problems addressed using cloud computing, the reader is referred to [43].

3.2 Fog Computing Implementation Review

With the emergence of fog computing, which brings cloud service features closer to the things generating data, the question now is how processing, storage and analytics tasks are segmented between the fog and the cloud. Deciding which tasks go to the fog and which go to the backend cloud is, in general terms, application specific. Naturally, certain functions are better suited to being performed at fog nodes, while others are better suited to the cloud side. Still, customary backend cloud computing will remain an important part of IoT systems as fog computing emerges. This segmentation can be planned, but it can also be adjusted dynamically according to the network state, for example, based on changes in processor loads, link bandwidths, storage capacities, fault events, security threats, cost targets, etc. [30]. Hence, several approaches exist which propose the decentralisation of computing, storage, control and networking resources and services along the continuum from the cloud to the things. In the next sections some proposals for fog computing implementations are reviewed.

3.2.1 Reference Architecture for Fog Computing

Perhaps one of the main fog computing proposals is the reference architecture for fog computing from the OpenFog Consortium [30]. This is a system-level horizontal architecture which considers functional boundaries in fog computing as fluid, meaning that a multitude of combinations can exist to physically deploy fog-cloud resources based on domain-specific solutions. These deployments will fit one of four possible domain scenarios. Depending on the scenario, multiple fog and cloud elements may collapse into a single physical deployment. Figure 11 presents the four fog hierarchical domain deployment scenario models. For a complete description of each of these scenario models the reader is referred to [30].

Real world deployments may involve multiple tenants, with fog and cloud deployments owned by multiple entities. As the OpenFog Consortium argues, much fog computing usage will occur as represented in scenarios 2 and 3 of Fig. 11. The three-layer fog hierarchies shown are for illustrative purposes only; real world fog deployments may have more or fewer levels, and different vertical application use cases may use a fog hierarchy differently. For example, in a smart city there may be fog nodes at the region, neighbourhood, street corner, and building level, while in a smart factory the hierarchy may be divided by assembly lines, manufacturing cells, and machines [30].

Fig. 11

Fog hierarchical deployment models [30]

3.2.2 Fog Computing Deployment Based on Docker Containerization

An interesting study reviewing practical implementation aspects of fog architectures is the one proposed in [44]. This approach considers a fog computing solution based on two primary directions of gateway node improvement: (i) a fog-oriented framework for IoT applications based on innovative scalability extensions of the open-source Kura gateway (Kura is an open-source project for IoT-cloud integration and management [45]), and (ii) Docker-based containerisation (Docker is an open-source software that automates the deployment of applications inside software containers [46]) over challenging and resource-limited fog nodes, i.e., Raspberry Pi devices [47].

The standard IoT-cloud architecture using Kura is the three-layer one shown in Fig. 12a. This architecture has an intermediate fog layer of geographically distributed gateway nodes, positioned at the edge of network localities that are densely populated by IoT sensors and actuators. This architecture is extended under the Kura framework with the introduction of local brokers for scalability purposes, as shown in Fig. 12b. While in standard Kura the IoT gateways simply aggregate all the data gathered by their sensors and send them to an MQTT broker running on the cloud, delegating all the associated computation to the global cloud, in the proposed extended architecture the Kura IoT gateways work as fog nodes which can scalably serve as local MQTT brokers and dynamically coordinate among themselves, without the need for continuous support from global cloud resources.

Fig. 12

a IoT-cloud architecture based on Kura; b inclusion of an MQTT broker on each gateway [44]. Sx \(=\) Sensor x, Cx \(=\) Client MQTT x, B-k \(=\) Broker at side k, B \(=\) Broker at cloud

Therefore, as depicted in Fig. 12b, MQTT brokers are included on each gateway in order to collect sensed data at the gateway side and, after local processing and inferencing, to send filtered/processed/aggregated data to the cloud. As described by the authors [44], this extension to the fog architecture offers the following advantages: hierarchical topologies, gateway-level MQTT message aggregation, real-time message delivery and reactions, actuation capacity and message priorities, locality awareness and locality-oriented optimisations, and gateway-cloud connection optimisation. For more details the reader is referred to [44].
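The gateway-side aggregation and filtering behaviour can be sketched as follows; `publish_to_cloud` stands in for an MQTT publish to the cloud broker, and the window size and alert threshold are made-up parameters, not values from [44]:

```python
def aggregate_and_forward(readings, publish_to_cloud, window=5, threshold=30.0):
    """Fog-gateway sketch: buffer raw sensor readings locally, react to urgent
    values immediately, and forward only one aggregate per window to the cloud."""
    buffer = []
    for r in readings:
        buffer.append(r)
        if r > threshold:                 # urgent value: real-time reaction at the edge
            publish_to_cloud({"alert": r})
        if len(buffer) == window:         # otherwise send a single aggregated message
            publish_to_cloud({"avg": sum(buffer) / window})
            buffer.clear()

sent = []
aggregate_and_forward([20, 21, 22, 23, 24], sent.append)
assert sent == [{"avg": 22.0}]  # five raw readings became one cloud message
```

The point of the sketch is the traffic reduction: only aggregates and urgent alerts cross the gateway-cloud link, which is exactly what the local brokers in Fig. 12b enable.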

Two additional gateway extensions are proposed by Bellavista and Zanni [44]: support for cluster and for mesh organisations of IoT gateways, shown in Fig. 13a, b, respectively.

Fig. 13

a Support for cluster organisation of IoT gateways, b support for mesh organisation of IoT gateways [44]

The most significant advantages associated with the support of cluster/mesh organisations of IoT gateways include: Kura gateway specialisation, locality exploitation and data quality, geo-distribution, scalability, and security and privacy.

The authors report that, with the proposed extensions and via proper configuration tuning and implementation optimisations, good scalability and limited overhead can be coupled with the significant advantages of containerisation in terms of flexibility and easy deployment, even when working on top of existing, off-the-shelf, limited-cost gateway nodes [44].

3.2.3 Fog Computing for Healthcare Service Delivery

Another interesting approach to fog computing, developed in the context of healthcare service provisioning, is presented in [48]. The proposed architecture contains three main components: (a) fog nodes, which support multiple communication protocols in order to aggregate data from various heterogeneous IoT devices; (b) fog servers, lightweight cloud servers responsible for collecting, storing, processing, and analysing the information from IoT devices and for providing predefined on-demand applications and services; and (c) the cloud, which provides the data warehouse for permanent storage and performs big data analysis and other back-end applications. The proposed architecture is represented in Fig. 14.

Fig. 14

Architecture integrating IoT, fog and cloud computing for healthcare service provisioning [48]

In the architecture shown in Fig. 14, fog nodes, fog servers, and IoT devices are considered distributed entities according to their geographic location, while the cloud is centralised and located at the healthcare provider’s premises. The end users are equipped with a variety of IoT devices, such as wearable or implanted medical sensors, presence and motion sensors, mobile devices, and environmental sensors. IoT devices interact among each other using different communication technologies such as 3G, LTE, WiFi, WiMAX, 6LoWPAN, Ethernet, or ZigBee. These devices are connected, directly or via wired and wireless communication protocols, to a fog node. Fog nodes act as intermediate points between the IoT devices and the edge network; they perform limited computational operations and provide pushing services for receiving and uploading data. The data from IoT devices is transmitted to the fog server. Fog servers perform protocol conversion, data storage, processing, filtering, and analysis. Fog servers are able to make decisions and provide services to the end users based on predefined policies and rules, without the necessity of interacting with the cloud server [48]. However, it is assumed that fog servers have limited hardware capabilities and cannot fully support the creation of new services and applications. Hence, fog servers have to interact with the cloud in order to provide prediction models and enable big data analysis, as well as new service execution and orchestration. This means that the fog cannot replace the cloud; rather, the two have to cooperate in order to provide timely value-added services to the end users. For example, data which has been processed, analysed, and temporarily stored in fog servers can be transmitted to the cloud for permanent storage and further analysis or, if there is no need to store them, removed from the fog servers.
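The division of labour between fog server and cloud described above can be sketched as a rule table evaluated locally, with escalation to the cloud for anything the rules do not cover. The rule and function names below are illustrative, not taken from [48]:

```python
def fog_server_handle(sample, rules, send_to_cloud):
    """Apply the fog server's predefined policies locally; escalate to the cloud
    only when no rule covers the sample (e.g. it needs big data analysis)."""
    for condition, action in rules:
        if condition(sample):
            return action(sample)   # local, low-latency decision
    return send_to_cloud(sample)    # outside the fog server's capabilities

# A single illustrative rule: raise an alarm on a high heart rate.
rules = [(lambda s: s["heart_rate"] > 120, lambda s: "raise_alarm")]
assert fog_server_handle({"heart_rate": 135}, rules, lambda s: "sent_to_cloud") == "raise_alarm"
assert fog_server_handle({"heart_rate": 70}, rules, lambda s: "sent_to_cloud") == "sent_to_cloud"
```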

In order to illustrate the benefits offered by the architecture integrating IoT, fog, and cloud computing, the authors presented two use case scenarios: the first is related to daily monitoring for the provisioning of healthcare services; the second refers to an eCall service system providing immediate assistance to people involved in a road collision. The main benefits found for such an architecture are faster and more accurate treatment delivery, a reduction of medical costs, improved doctor–patient relationships, and the delivery of personalised treatment oriented to users’ needs and preferences; for more details the reader is referred to [48].

3.3 Edge Computing Implementation Review

As described previously, in this work edge computing refers to placing applications, data and processing capabilities at the edge of the Internet. Therefore, in this section a review of various implementations of edge computing is presented.

3.3.1 Edge Computing on Mobile Applications

Hu et al. [49] quantified the impact of edge computing on mobile applications, exploring how edge computing infrastructure improves latency and energy consumption compared with a cloud computing model. The authors considered five network configurations (no offload, cloud via WiFi, cloudlet via WiFi, cloudlet via LTE, and cloud via LTE) with three statically pre-partitioned applications from existing research: a face recognition application (FACE), an augmented reality application (MAR), and a physics-based computer graphics example (FLUID). The proposed experimental setup is shown in Fig. 15. The results show that edge computing significantly improves response time and energy consumption for mobile devices on WiFi and LTE networks: local processing consumes less power and is faster than sending the data to a distant cloud. The results also show that blindly offloading computation to the cloud can be a losing strategy, as offloading to a distant cloud can result in lower performance and higher energy costs than running locally on the mobile device. Edge computing achieves performance and energy improvements by offloading computing services to the edge of the Internet for mobile devices [49].

Fig. 15

Edge computing setup for experimentation in [49]
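A first-order model helps explain why blind offloading can lose: remote execution pays a round trip plus transfer time before its faster CPU helps. The sketch below uses made-up numbers, not the measurements of [49]:

```python
def response_time_ms(input_bits, cpu_cycles, local_hz,
                     remote_hz=None, bandwidth_bps=None, rtt_ms=0.0):
    """Run on the device (remote_hz=None), or pay transfer time plus a round
    trip to run the same cycles on a faster remote machine."""
    if remote_hz is None:
        return 1000 * cpu_cycles / local_hz
    return rtt_ms + 1000 * (input_bits / bandwidth_bps + cpu_cycles / remote_hz)

# Hypothetical task: 1 MB input, 2 Gcycles, 1 GHz phone, 8 GHz servers.
local = response_time_ms(8e6, 2e9, local_hz=1e9)
cloudlet = response_time_ms(8e6, 2e9, 1e9, remote_hz=8e9, bandwidth_bps=50e6, rtt_ms=5)
cloud = response_time_ms(8e6, 2e9, 1e9, remote_hz=8e9, bandwidth_bps=5e6, rtt_ms=80)
assert cloudlet < cloud < local  # the nearby cloudlet wins; the distant cloud barely does
```

With a smaller compute load or an even slower link, the same model makes the distant cloud lose to local execution, which is the "losing strategy" the authors observed.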

3.3.2 Femto Cloud System

An interesting approach is that proposed by Habak et al. [50], which considers how a collection of co-located devices can be orchestrated to provide a cloud service at the edge. As an answer to this question, the authors propose the FemtoCloud system, which provides a dynamic, self-configuring, multi-device cloudlet by coordinating a cluster of mobile devices. The general FemtoCloud system architecture is shown in Fig. 16.

Fig. 16

The FemtoCloud system architecture [50]

The FemtoCloud architecture consists of two main components, the control device and the co-located mobile devices, as shown in Fig. 16. The co-located devices can appear in different scenarios, including passengers with mobile devices using public transit services, students in classrooms, and groups of people sitting in a coffee shop, for example. It is assumed, however, that the devices exhibit cluster stability, meaning that there is predictability in the length of time a given device is available for use in the FemtoCloud. A representation of cluster stability in different settings is shown in Fig. 17.

Fig. 17

Mobile cluster stability representation [50]

The control device is assumed to be provided as part of the infrastructure of the cluster scenario environment, i.e., the coffee shop, the university, etc. The architecture identifies the functionality to be realised in the controller and in the mobile devices, and the information to be communicated within and between the two. The FemtoCloud controller is responsible for deciding which mobile devices will be added to the compute cluster in the current environment. A critical problem that must be solved at the controller is the scheduling of tasks onto mobile devices, where the transmission of data and the receipt of results all happen over a shared wireless channel. Hence, the control device acts as a WiFi hotspot, allowing the mobile devices to connect to it using infrastructure mode. The control device is responsible for providing an interface to the task originators and for managing the mobile devices inside the cloud. The more devices in a setting, the more potential for those devices to have tasks they want to offload, but also the more potential for idle resources to use in serving offloaded tasks.

A client service, running on the mobile devices, estimates the computational capability of each device and, along with user input, determines the computational capacity available for sharing. This client leverages device sensors, user input, and utilisation history to build and maintain a user profile. The control device works in collaboration with the FemtoCloud client service installed in the mobile devices to acquire information about device characteristics and user profiles, and uses this information to assign tasks to devices according to a heuristic. The control device is also responsible for estimating the user presence time and for configuring the participating mobile devices as a cloud offering compute as a service.
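A greedy sketch of this kind of controller heuristic is shown below: each task goes to the device that can finish it soonest while still predicted to be present. This is an illustration only; the actual FemtoCloud heuristic in [50] is more elaborate:

```python
def assign_tasks(tasks, devices):
    """Greedy assignment: largest task first, to the device that finishes it
    soonest and is still predicted to be present when the task completes.
    tasks:   list of (task_id, cpu_cycles)
    devices: dict name -> {"hz": speed, "presence_s": predicted stay, "busy_s": load}"""
    schedule = {}
    for task_id, cycles in sorted(tasks, key=lambda t: -t[1]):
        finish_times = {
            name: d["busy_s"] + cycles / d["hz"]
            for name, d in devices.items()
            if d["busy_s"] + cycles / d["hz"] <= d["presence_s"]
        }
        if finish_times:  # otherwise the task stays unassigned (or goes elsewhere)
            best = min(finish_times, key=finish_times.get)
            devices[best]["busy_s"] = finish_times[best]
            schedule[task_id] = best
    return schedule

devices = {"phone": {"hz": 1e9, "presence_s": 10, "busy_s": 0},
           "tablet": {"hz": 2e9, "presence_s": 4, "busy_s": 0}}
assert assign_tasks([("t1", 4e9), ("t2", 2e9)], devices) == {"t1": "tablet", "t2": "phone"}
```

The presence check is what makes cluster stability matter: a fast device that is about to leave is useless for a long task, so predictability of stay duration directly improves the schedule.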

A FemtoCloud system prototype was developed and used, in addition to simulations, to evaluate the performance of the system. The authors argue that their evaluation demonstrates the potential of FemtoCloud clustering to provide a meaningful compute resource at the edge. For details of the implementation and results the reader is referred to [50].

4 Conclusions

In a traditional cloud-based architecture, application intelligence and storage are centralised in data centres. It was believed that this architecture would satisfy the needs of most IoT applications. However, this idea started to break down as IoT applications began demanding real-time responses (for control loops or healthcare monitoring applications, for example), generating increasing amounts of data, and consuming cloud services in ways that overstretch the network and cloud infrastructure over limited network bandwidth. In order to relieve the strain that data transmission to the cloud places on the network, different approaches for providing computational, networking, and storage capabilities closer to the end users have been proposed; among them, the fog and edge computing approaches stand out as the most relevant.

Fog computing has been proposed as a system-level architecture that can optimise the distribution of computational, networking, and storage capabilities in a hierarchy of levels along the continuum from the cloud to IoT end devices. It seeks to provide the exact balance of capacity among the three basic capabilities at the level of the network where they are optimally located. Nevertheless, fog computing should be considered not as a replacement for the cloud, but as a supplement to it for the most critical aspects of network operation. Fog also supplements and greatly expands the capabilities of intelligent endpoint devices and other smart objects.

Fog computing can provide enhanced features such as user mobility support, location awareness, dense geographical distribution, and low latency and delay guarantees that cannot be supported inherently by the cloud computing approach. These characteristics are significant for provisioning delay-sensitive services such as healthcare and emergency services.

The aim of edge computing is to place computing and storage resources at the edge of the Internet, in close proximity to mobile devices or sensors. Edge nodes include the base stations, routers and switches that route network traffic. Placing data and data-intensive applications at the edge reduces the volume of data that must be moved and the distance it must travel. Hence, computing on edge nodes close to application users gives application providers a platform on which to improve their services.

In this chapter, the two main paradigms that evolved from the initial concept of cloud computing have been presented and some application implementations have been reviewed. The idea was to define each concept from the point of view of the decentralisation of data processing, networking and storage capabilities. As seen in the applications reviewed, the choice of a given paradigm for a particular application is problem specific, requiring an analysis according to the requirements and characteristics of each approach.