1 Introduction

The terms Internet of Things (IoT) and Smart Cities (SCs) are nowadays widely adopted: the former refers to technological advancements that offer unprecedented levels of connectivity to both users and machines, whilst the latter identifies urban scenarios where city problems are tackled with novel IT solutions. Both mostly leverage the enormous diffusion of Cloud Computing (CC) and mobiles (e.g., smartphones, tablets), boosted by mobile broadband (MBB) technology, which currently represents one of the most dynamic market segments worldwide. Higher data rates, more reliable coverage and improved Quality of Service led to a penetration rate of 47 % for MBB and an overall network coverage of 69 % of the world population (89 % if we consider the urban population only) in 2015 [1]. Worldwide mobile subscriptions are expected to reach 9.2bn (6.1bn for smartphones) in 2020 [1], up from the current 7.1bn (2.6bn for smartphones). The number of connected devices is expected to skyrocket, reaching nearly 30bn entities in 2020.

In this highly dynamic scenario, a new paradigm is emerging, known as Sensing as a Service (S2aaS) [2], which aims at solving typical SC challenges by exploiting CC and IoT infrastructures and by offering multiple sensing capabilities in order to satisfy heterogeneous sensing requests coming from different geographical areas. Mobiles, with their rich set of embedded (or pluggable external) sensors and their high pervasiveness, currently represent the most suitable way to offer such sensing capabilities on a large scale without resorting to traditional Wireless Sensor Network (WSN) approaches. Therefore, S2aaS is firmly rooted in the Mobile Crowd Sensing (MCS) paradigm [3], which allows collecting data directly from mobiles and overcomes typical limitations of WSNs (thanks to wider coverage areas, a high number of deployable nodes, and more reliable communication and connectivity). Users can choose monitoring modalities themselves (participatory sensing) or delegate their mobiles to send data automatically (opportunistic sensing). Both approaches can be properly combined in S2aaS to satisfy different needs, such as directly requesting mobile owners to perform measurements or simply sending automatic sensing tasks to mobiles in a given area.

Urban scenarios offer a promising arena for MCS applications, where citizens can consume/provide information about specific situations occurring around them. We believe that this can improve the S2aaS paradigm, leading to what we define as Urban Mobile Sensing as a Service (UMS2aaS), a name that points out how deeply it is tailored to SC challenges and issues. Citizens could be dynamically scattered across huge areas with multiple sensing purposes and they could acquire contextual awareness from the surrounding environment [4]. Similarly, they could be engaged in collaborative, large-scale monitoring experiences that widen the scope of traditional sensing campaigns [5].

In this paper, we propose a platform capable of paving the way for the deployment of UMS2aaS solutions by: (1) identifying noise and electromagnetic (EM) monitoring as suitable urban sensing scenarios; (2) proposing a mobile app for gathering data from sensors and optional users’ comments; (3) proposing a cloud-based data management system; (4) estimating platform data growth.

From a technical point of view, we designed, developed and tested (in a city in Southern Italy) a system prototype that gathers data from mobiles and sends them to a context broker application, which forwards them to a Hadoop-based server farm. Then, a complete ETL (Extract-Transform-Load) pipeline processes the measurements in a Data Warehouse (DWH) system: they are aggregated w.r.t. sensing location, device type, timestamp, and serving network type/provider. These functionalities are achieved by merging a set of components from the FIWARE middleware [6] with our platform.

The paper is organized as follows: Sect. 2 briefly examines the MCS paradigm and our research purposes; the proposed platform is detailed in Sect. 3; the platform prototype is discussed in Sect. 4; Sect. 5 draws conclusions.

2 State-of-the-Art and Research Purposes

MCS is the enabling element for the S2aaS and UMS2aaS paradigms and it helps address multiple urban monitoring issues such as traffic and road safety [3]; air [7] and water [8] pollution; noise [9]; flooding [10]; earthquakes [11]; and large-scale events [12]. Let us now consider the two monitoring scenarios identified for our research. As for noise monitoring, the majority of MCS applications are for personal use only: they mimic Sound Level Meter (SLM) interfaces and allow users to check how loud their surrounding environment is (e.g., Advanced Decibel Meter, Sound Meter Pro). However, they do not provide data aggregation on a geographical/temporal basis. Very few research works address urban noise mapping, such as the “Ear-Phone” project [9], where smartphones are used to predict outdoor sound levels, or the “2Loud?” project [13], which uses iPhones to assess nocturnal indoor noise near highways in Australia.

Conversely, MCS solutions for EM field level assessment are not yet comparably widespread. The majority of them refer only to Wi-Fi indoor coverage analysis [14] or outdoor Access Point (AP) localization [15], with very few proposals considering 3G/4G systems; these are specifically tailored to evaluating traffic data for network operators rather than users [16] or to quantifying signal strength for single devices [17].

MCS-based noise and EM monitoring currently suffer from a series of limitations: (1) absence of functionalities tailored to city managers for improving citizens’ life quality; (2) users’ involvement as mere data collectors, without providing them with educational outcomes or trying to raise awareness; (3) lack of extensive monitoring purposes. Our platform aims at filling these gaps. Firstly, we want to increase users’ awareness about phenomena under observation by adding educational contents in the mobile app. Secondly, we aim at complying with S2aaS by adopting proper architectural design solutions and a general-purpose data modelling approach, easily customizable for different sensing scenarios. Thirdly, our platform will act as a preliminary, low-cost, large-scale and sufficiently accurate monitoring tool for locating areas with potential pollution risks where more accurate sensing campaigns can be performed.

We chose noise and EM monitoring to test our platform, since European citizens are particularly concerned about these topics. Urban noise is considered one of the most relevant factors in the worsening of life quality [18] (due to congested roads, high traffic, wrong or obsolete urban planning) and still very few interventions are made by city managers and local administrations to reduce citizens’ noise exposure [19]. Several scientific research works examining the correlations between noise and health effects reinforce the need for proper monitoring, since noise exposure may cause progressive hearing loss, stress, distraction, sleep fragmentation, socio-behavioral changes, hypertension and other long-term or chronic diseases [20].

Similarly, concern about EM pollution is due to the increasing number of base stations installed across our cities. However, whilst citizens can quite easily perceive their exposure to noise by referring to the “loudness” of the surrounding emitting sources, exposure to EM fields is far more difficult to evaluate. Despite this inherent complexity, and although no scientific research works have yet determined a direct correlation between EM fields and medium/long-term health effects, public opinion is becoming more and more sensitive to this problem [21]. From a monitoring point of view, whilst mobile-embedded microphones can provide sufficient accuracy in assessing noise levels, EM fields cannot be sensed so easily: mobile internal antennas provide neither broadband metering nor a direct quantification of the effective electric field levels at a given point (they can only assess the received signal strength from their serving cell). Therefore, we carefully selected the physical quantities under observation and introduced some error-mitigation policies (Subsect. 4.2).

3 The Proposed System

3.1 Adopted Quantifiers for Noise and Signal Strength Exposure

Our system provides opportunistic measurements of both noise and signal strength. Noise measurements can also be collected in a participatory way. To quantify noise exposure, we adopted the well-known A-weighting scale, which measures the Sound Pressure Level (SPL) in units of dB(A) [22] and accounts for the dependence of perceived loudness on frequency. The SPL is an instantaneous measurement; therefore, current noise regulations also require considering the Equivalent Sound Level LEQ(T) [22] to cope effectively with sounds that vary in time and have different durations. The LEQ(T) averages, in dB(A), the SPL values measured during a given time window T (which typically ranges from 30 s to 24 h), thus smoothing spikes and outliers. Although mobile-embedded microphones differ from professional sound metering equipment due to a series of limitations (e.g., optimization for voice reception rather than environmental noise; reduced sensitivity; heterogeneous usage conditions, etc.), several recent studies demonstrated the effectiveness of MCS applications for noise monitoring scenarios, reporting mismatches between ± 1.5 dB and ± 5 dB [23].
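As a minimal sketch of how LEQ(T) relates to the sampled SPL values, the snippet below applies the standard energy-average definition of the equivalent sound level to a list of equally spaced SPL samples. The function name `leq` and the sample values are illustrative assumptions and are not taken from the app's implementation.

```python
import math


def leq(spl_samples_dba):
    """Equivalent sound level, in dB(A), of equally spaced SPL samples.

    Each sample is converted from dB(A) to a relative energy, the energies
    are averaged over the window, and the mean is converted back to dB(A).
    """
    if not spl_samples_dba:
        raise ValueError("at least one SPL sample is required")
    mean_energy = sum(10 ** (spl / 10) for spl in spl_samples_dba) / len(spl_samples_dba)
    return 10 * math.log10(mean_energy)


# Hypothetical window made of two samples, 60 dB(A) and 70 dB(A).
window_leq = leq([60.0, 70.0])
```

Because the average is computed on energies rather than on decibel values, the result for this window lies closer to the louder sample than the arithmetic mean of 65 dB(A) would suggest.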

Signal strength measurements estimate the power level received by mobile antennas and can be used as a quantifier for electric field exposure in the range 0.9-2.4 GHz, even if they are not as accurate as broadband field probes. For both UMTS and LTE networks, we refer to the RSSI (Received Signal Strength Indicator) quantity [24], which expresses in dBm the total received power over the carrier frequency. The RSSI includes signals from the co-channel serving cell, interference from non-serving neighboring cells, thermal noise, etc. However, each mobile is able to provide the RSSI of its serving network provider only; therefore, the RSSI is always a portion of the overall signal power available in a given location (since signals from other providers are also present but not sensed by that mobile), which calls for post-processing analyses (Subsect. 3.3). Additionally, we refer to the RSRP (Reference Signal Received Power), for both UMTS and LTE. It represents the linear average over the power contributions, in Watt, of the resource elements carrying cell-specific reference signals along the carrier frequency (therefore RSRP is always lower than RSSI). RSRP expresses, in ASU (i.e., Arbitrary Strength Unit, an integer value proportional to the received signal strength), the contribution of the pilot channels compared to RSSI.

3.2 Data Modeling

Data coming from sensors are multidimensional [25]; a typical solution for dealing with them is to follow a DWH approach [26], according to which data are processed in an ETL pipeline that allows us to clean, transform and store measurements before aggregating them and making them available to final users. We adopted the Dimensional Fact Model (DFM) [26], which is a graphical conceptual model based upon the fact entity (i.e., any concept evolving in time that is relevant to decision-making processes). We identified two facts: noise (Fig. 1A) and signal strength measurements (Fig. 1B). Facts (the central rounded boxes) are described qualitatively by fact attributes and quantitatively by fact measures (i.e., numerical properties or calculations, listed in the bottom part of the fact). Noise fact measures are SPL and current/max/min/average LEQ(T), both in dB(A). Similarly, signal strength fact measures are the RSRP value (in ASU) and the current/max/min/average RSSI value (in dBm). Each analysis coordinate of a fact is called a dimension and it consists of several dimensional attributes organized as a directed tree departing from the fact (the attributes are the circles connected by lines to the fact; the dimension is the root circle). Dimensional attributes qualify the finite domain of their dimension along with its different degrees of granularity (e.g., the temporal dimension can vary from seconds to days, weeks, months; a product is described by its name, series, brand, etc.). The dimensions shared among multiple facts are the conformed hierarchies: time (timestamp, date/month/year); position (latitude, longitude, town, province, region, country); sensor type (external or embedded); device type (model and brand) and outlier condition. The device type also stores the IMEI (International Mobile Equipment Identity) code, which uniquely identifies each mobile. The signal strength fact also has the following dimensions: MNC (Mobile Network Code, i.e., the network provider) and network (e.g., GSM, UMTS, LTE). The noise fact also has an optional dimension representing the user’s annotations about the source (e.g., type, annoyance, distance, etc.). The unit of measurement is the descriptive attribute (depicted as a simple line departing from the fact) for both facts.
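To make the two facts and their shared dimensions more concrete, the following sketch renders them as plain Python dataclasses. It is only an illustration of the conceptual model: all class and field names are assumptions introduced here, and the actual DWH tables of the platform may be organized differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Position:
    """Conformed 'position' dimension, from coordinates up to country."""
    latitude: float
    longitude: float
    town: str
    province: str
    region: str
    country: str


@dataclass(frozen=True)
class Device:
    """Conformed 'device type' dimension; the IMEI identifies the mobile."""
    imei: str
    model: str
    brand: str


@dataclass(frozen=True)
class NoiseMeasurement:
    """Noise fact: measures in dB(A) plus its dimensions."""
    timestamp: datetime
    position: Position
    device: Device
    sensor_type: str                  # "embedded" or "external"
    is_outlier: bool
    spl_dba: float
    leq_dba: float
    annotation: Optional[str] = None  # optional user's comment on the source


@dataclass(frozen=True)
class SignalStrengthMeasurement:
    """Signal strength fact: RSSI in dBm and RSRP in ASU plus its dimensions."""
    timestamp: datetime
    position: Position
    device: Device
    sensor_type: str
    is_outlier: bool
    mnc: str                          # Mobile Network Code (network provider)
    network: str                      # e.g. "GSM", "UMTS", "LTE"
    rssi_dbm: float
    rsrp_asu: int


# Hypothetical record with made-up values.
lecce = Position(40.35, 18.17, "Lecce", "Lecce", "Apulia", "Italy")
phone = Device(imei="000000000000000", model="ExampleModel", brand="ExampleBrand")
sample = NoiseMeasurement(datetime(2016, 3, 1, 10, 15), lecce, phone,
                          "embedded", False, spl_dba=65.1, leq_dba=62.4)
```

Sharing the `Position` and `Device` classes between the two facts mirrors the role of the conformed hierarchies in the DFM.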

Fig. 1. DFM representation: noise (on the left) and signal strength (on the right) measurements.

3.3 Platform Architecture and ETL Pipeline

Our platform consists of a mobile sensing app and of a cloud-based system tasked with data management. The app works on Android mobile devices (exploiting Android 4.2 APIs); it mimics a professional SLM user interface and collects peak, average and current values of SPL and LEQ(T) on customizable temporal windows, as required by EU and Italian noise regulations. It also collects RSSI and RSRP values that assess the power of the signal received by the mobile. Measurements are stored locally (short-term history) and sent to the cloud-hosted system for data aggregation and filtering. The data brokering functionality is achieved by using Orion, a Generic Enabler (GE) from the FIWARE middleware [6] that provides publishing and subscribing operations on collected data. Another FIWARE GE, Cosmos, offers the HDFS-based persistent storage (but other solutions are under examination at the moment). Orion data are persisted in Cosmos thanks to the FIWARE Cygnus connector. Figure 2 depicts the proposed three-layer logical architecture. The first layer consists of non-persistent sensor data storage on mobiles (implemented via SQLite), of persistent storage on the cloud (implemented via Apache Hive) and of relational DBs for law regulations, device technical specifications and administrative divisions. The second layer has context-brokering capabilities for managing multiple sensors as well as data filtering (thanks to Pentaho CE, a freeware ETL application), integration and reporting functionalities. The third layer offers a Web app for accessing data reporting and integration results. Mobiles and a limited number of fixed monitoring stations represent data sources. We also developed a Web app for data visualization purposes, according to requirements elicited from users (i.e., city managers, citizens, students).
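The brokering step described above can be illustrated with a sketch of the context entity that a mobile might publish to the broker. The example assumes the NGSI v2 JSON entity representation; the NGSI version, the entity type `NoiseMeasurement` and all attribute names are assumptions, since the data schema actually exchanged with Orion is not detailed here. No network call is performed: the code only builds and serializes the payload.

```python
import json


def build_noise_entity(entity_id, leq_dba, spl_dba, latitude, longitude, timestamp_iso):
    """Build a hypothetical NGSI v2 entity describing one noise measurement."""
    return {
        "id": entity_id,
        "type": "NoiseMeasurement",  # hypothetical entity type
        "leq": {"value": leq_dba, "type": "Number"},
        "spl": {"value": spl_dba, "type": "Number"},
        # NGSI v2 encodes a geo:point as a "latitude, longitude" string.
        "location": {"value": f"{latitude}, {longitude}", "type": "geo:point"},
        "timestamp": {"value": timestamp_iso, "type": "DateTime"},
    }


# Made-up measurement; in an NGSI v2 deployment this JSON body would be sent
# with an HTTP POST to the broker's /v2/entities endpoint.
entity = build_noise_entity("noise-0001", 62.4, 65.1, 40.3515, 18.175,
                            "2016-03-01T10:15:00Z")
payload = json.dumps(entity)
```

A signal strength measurement could be published in the same way, with RSSI, RSRP, MNC and network type as attributes.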

Fig. 2. Platform logical architecture.

The ETL pipeline is responsible for data management, for outlier identification and removal, and for aggregating the RSSI measurements coming from mobiles that are served by different network providers but located in the same area within a relatively short time window. By doing so, it is possible to achieve a more realistic evaluation of the overall received power in a given area, since each mobile is able to quantify only the RSSI provided by its serving network operator.
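The aggregation rule itself is not spelled out in the text. One plausible sketch, shown below, assumes that readings taken in the same area and time window are first averaged per provider and then summed across providers, with both operations carried out on linear power (mW) rather than on dBm values. The function names and the MNC keys are hypothetical.

```python
import math


def dbm_to_mw(dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10)


def mw_to_dbm(mw):
    """Convert a power level from milliwatts to dBm."""
    return 10 * math.log10(mw)


def aggregate_rssi(rssi_by_provider):
    """Estimate the overall received power, in dBm, for one area and time window.

    rssi_by_provider maps a provider code (e.g. the MNC) to the list of RSSI
    readings, in dBm, collected by the mobiles served by that provider.
    """
    total_mw = 0.0
    for readings in rssi_by_provider.values():
        if readings:
            # Average one provider's readings in the linear domain...
            total_mw += sum(dbm_to_mw(r) for r in readings) / len(readings)
    if total_mw == 0.0:
        raise ValueError("no RSSI readings to aggregate")
    # ...then express the sum over all providers back in dBm.
    return mw_to_dbm(total_mw)


# Two hypothetical providers, each sensed at -70 dBm in the same cell of the map.
overall_dbm = aggregate_rssi({"01": [-70.0], "10": [-70.0]})
```

Under this assumption, two providers received at the same level raise the estimate by about 3 dB with respect to a single-provider reading, which is the kind of correction the aggregation is meant to capture.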

4 Prototype Platform Analysis

4.1 On-Site Trials

A group of students from our University tested the platform in a central area of the city of Lecce (Southern Italy). The selected area presents high-traffic hotspots (two roundabouts and two 4-lane roads) and two base stations hosting multiple antennas from different network providers (on the rooftop of two multi-storey buildings).

As for the mobile app usage test, we evaluated the opportunistic sensing mode by collecting measurements in 1-hour time windows while walking across the area. Once started, the app does not require any further intervention by the user, who can examine measurements at any time, as indicated in Fig. 3A (app overall page for opportunistic measurements). Both LEQ(T) and SPL values are reported and plotted on an XY graph. Additionally, the selected observation time window T, the current RSSI and the serving network type are indicated. The user can stop the sensing session with a dedicated button (page bottom). The participatory sensing mode allows users to decide when to perform a measurement and whether to enrich it with comments describing noise sources w.r.t. location (indoor/outdoor), nature (artificial/natural), estimated distance from the observer, and typology (amongst a set of predefined values). It is also possible to quantify perceived nuisance levels (by moving a slider on a 10-value scale) and to add free-text comments. This mode is available for noise measurements only, since the assessment of EM emitting sources is much more difficult for unskilled users.

Fig. 3. Mobile UI for opportunistic measurements (on the left) and georeferenced map of noise measurement locations along with sensed LEQ (30s) values (on the right).

Users can benefit from a Web application that georeferences and visualizes the measurements coming from a given area as points on a map, with a colour ramp directly proportional to the measured values: both LEQ(T) and RSSI values can be plotted on this map. Figure 3B reports the LEQ (30s) values sensed across the selected area. Measurements can be interpolated as well, thus achieving an intensity map, which is a surface map where adjacent measurements are interpolated according to a given algorithm in order to compute values also for those points where no measurements were actually performed. Intensity maps are extremely useful for understanding how measured levels are distributed throughout the urban environment. Map renderings are achieved by forwarding data, after the ETL process, to a CartoDB instance, an open-source, Software as a Service cloud platform for GIS map storage and Web visualization.
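The interpolation algorithm behind the intensity maps is not named in the text. Purely as an illustration, the sketch below uses inverse distance weighting (IDW), a common choice for this kind of surface map; it is an assumption and not necessarily the method applied by the platform or by CartoDB. Coordinates are treated as planar, so latitude/longitude pairs would first need to be projected.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    points is a list of (px, py, value) tuples in planar coordinates;
    closer measurements receive a larger weight (1 / distance**power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            # The query point coincides with a measurement: return it as-is.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one measurement is required")
    return weighted_sum / weight_total


# Two hypothetical LEQ(30s) measurements, 2 length units apart.
measurements = [(0.0, 0.0, 60.0), (2.0, 0.0, 70.0)]
midpoint_value = idw(measurements, 1.0, 0.0)
```

Evaluating `idw` on a regular grid covering the monitored area would yield the raster that a map layer can then colour according to the chosen ramp.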

4.2 Measurement Accuracy and Privacy Concerns

One of the most relevant issues about MCS is measurement accuracy, since mobile-embedded sensors are typically less accurate than professional metering equipment. We tackled this aspect in a two-fold way: on the one hand, noise measurements have been validated instrumentally against a known sound sample thanks to a professional SLM; on the other hand, both noise and signal strength measurements are examined during the ETL pipeline in order to remove outliers. The instrumental validation involved a 30 s steady, mid-level, broadband noise source, for which the measurements gathered from different smartphone models were compared with those from a professional, portable, Class-1 SLM (i.e., DeltaOhm HD9019). After these trials, we achieved acceptable accuracy, with an average ±5 dB bias between MCS and SLM measurements, thus confirming that smartphones are suitable as preliminary monitoring tools. Outlier detection is performed by a univariate algorithm that removes measurements with excessive amplitude in a given temporal window. We opted for a slightly modified version of Tukey’s method [27], which is simple and quite effective with datasets following either a normal distribution or a not highly skewed lognormal distribution.
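For reference, the textbook form of Tukey’s method can be sketched as follows. Note that this is the standard rule based on the interquartile range (IQR) with the usual factor k = 1.5, not the slightly modified version used in the platform, whose details are not given here; the function name and the sample values are illustrative.

```python
import statistics


def tukey_filter(values, k=1.5):
    """Split values into (kept, outliers) using Tukey's fences.

    A value is an outlier when it lies below Q1 - k*IQR or above Q3 + k*IQR,
    where Q1 and Q3 are the first and third quartiles of the window.
    """
    if len(values) < 4:
        # Too few samples for meaningful quartiles: keep everything.
        return list(values), []
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers


# Hypothetical LEQ values, in dB(A), from one temporal window.
kept, outliers = tukey_filter([50, 52, 51, 53, 49, 90])
```

In this made-up window the 90 dB(A) reading falls above the upper fence and is set aside, while the remaining five values are kept.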

We also considered privacy issues, in order to reduce users’ concerns about their potential tracking or identification. Any information or metadata capable of identifying the device owner is discarded and users are notified about this when they start the app for the first time. Mobile devices are only indexed by their IMEI code, which does not allow tracing back to the respective owners (therefore, mobiles are traceable but their owners are unknown to both platform managers and other application end users).

4.3 Data Estimation for a Smart City Scenario

The proposed platform exploits mobile devices and their embedded sensors; therefore, the number of prospective users is significant and the deployment can be regarded as a real Smart City scenario, where several hundred data providers can be enrolled on a very large geographical scale. This subsection is devoted to estimating the data occupancy growth for our platform. Firstly, we estimated the average storage occupation of a single sensor data measurement at nearly 4 kB. Then, by considering energy consumption issues and typical users’ behaviors, we hypothesized that a plausible data collection pattern would consist of 30 raw measurements per hour, over a time window of 6 h per day. We also hypothesized 20 days of usage per month and 10 months of usage per year. Finally, if we plan to involve 5000 users during the first year of deployment and to double this quantity each year, we obtain the estimations reported in Table 1, according to which the DWH storage will grow by 30 GB per month and 314 GB per year in the third year after the system deployment.

Table 1. Data growth estimations up to 3 years.
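The usage assumptions stated above can be turned into a small calculation of the raw data volume produced by the mobiles. The sketch below computes only this raw, pre-ETL volume, assuming decimal units (1 GB = 10^6 kB) and a user base that doubles every year; it is not meant to reproduce the DWH figures of Table 1, whose derivation is not detailed here. All names are illustrative.

```python
KB_PER_MEASUREMENT = 4        # average size of one stored measurement
MEASUREMENTS_PER_HOUR = 30
HOURS_PER_DAY = 6
DAYS_PER_MONTH = 20
MONTHS_PER_YEAR = 10
FIRST_YEAR_USERS = 5000


def users_in_year(year):
    """Number of users in a given year (1-based), doubling every year."""
    return FIRST_YEAR_USERS * 2 ** (year - 1)


def raw_gb_per_month(year):
    """Raw data sent by all users in one month of the given year, in GB."""
    kb_per_user = (KB_PER_MEASUREMENT * MEASUREMENTS_PER_HOUR
                   * HOURS_PER_DAY * DAYS_PER_MONTH)
    return kb_per_user * users_in_year(year) / 1_000_000  # kB -> GB (decimal)


def raw_gb_per_year(year):
    """Raw data sent by all users over the usage months of the given year."""
    return raw_gb_per_month(year) * MONTHS_PER_YEAR


for year in (1, 2, 3):
    print(year, users_in_year(year), raw_gb_per_month(year), raw_gb_per_year(year))
```

Changing any of the constants at the top makes it easy to explore how sensitive the raw volume is to the sampling pattern or to the enrolment rate.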

5 Conclusions and Further Developments

In this paper, we examined how the Mobile Crowd Sensing (MCS) paradigm can be exploited as an enabling factor for the fulfilment of the so-called Sensing as a Service model (S2aaS) in an urban context, thus aiming at an Urban Mobile Sensing as a Service (UMS2aaS) model. Two monitoring scenarios have been identified, related to typical life quality concerns of European citizens: noise and EM field exposure. Therefore, we designed, developed and preliminarily tested a mobile app allowing us to gather (1) noise measurements by using smartphone-embedded microphones and (2) received signal power levels (RSSI) by using smartphone internal antennas. The platform also includes a DWH system for managing sensing data and a Web app providing users with multiple views of the collected measurements. The platform exploits some components of the FIWARE middleware for data brokering and storage functionalities. Preliminary tests have been performed in a central area of the city of Lecce, demonstrating the platform's suitability for assessing both noise levels and RSSI. A series of improvements are currently under evaluation, such as introducing other sensing tasks and providing the system with publishing/subscribing functionalities, in order to schedule sensing tasks and request them from mobile devices scattered across a given geographical area. We will also study proper policies to address energy consumption issues, in order to make the platform capable of sending the requested sensing tasks only to those devices having enough energy to fulfil them for a sufficient lapse of time.