1 Introduction

Air pollution is one of the most important challenges facing the world today as a result of technological development [1, 2]. It can be defined from several aspects. In terms of pathogenesis, it is pollution due to the presence of living or invisible organisms, such as bacteria and fungi, in an environmental medium such as water, air, or soil. As a chemical phenomenon, air pollution is an imbalance of the ecosystem caused by chemical effects; these pollutants can take the form of solid particles, liquid droplets, or gases. From the scientific point of view, it is a change in the harmonic interaction among the components of the ecosystem that paralyzes the efficiency of the system and makes it lose its ability to perform its natural role in the self-disposal of pollutants. This research presents an intelligent predictive design to address this phenomenon [3].

There are several measures of prediction error. The root mean square error (RMSE) measures the amount of error between two data sets; in other words, it compares a predicted value with an observed (known) value. It is also known as the root mean square deviation and is one of the most widely used statistics in GIS [4]:

\({\text{RMSE}} = \sqrt{\frac{\sum_{i=1}^{n} (F_i - A_i)^2}{n}}\)

where \(F_i\) are the forecasts (predicted values), \(A_i\) are the observed values (known results), and \(n\) is the sample size.

Cross-entropy loss is another widely used loss function, mainly in classification problems [5]. It is given by

\(H = -\sum_{i} y_i \log(\hat{y}_i)\)

where \(y_i\) is the target label and \(\hat{y}_i\) is the output of the classifier. The cross-entropy loss function is used when the output is a probability distribution, and in that setting it is preferred [6].

The symmetric mean absolute percentage error (SMAPE) is an accuracy measure based on percentage (or relative) errors. It is usually defined as [7]:

\({\text{SMAPE}} = \frac{1}{n}\sum_{t=1}^{n} \frac{|F_t - A_t|}{(|A_t| + |F_t|)/2}\)

where \(A_t\) is the actual value and \(F_t\) is the forecast value. The absolute difference between \(A_t\) and \(F_t\) is divided by half the sum of the absolute values of the actual and forecast values; this ratio is summed over every fitted point \(t\) and divided by the number of fitted points \(n\). If the actual and forecast values are both 0, the SMAPE score is set to 0 as well. This paper uses SMAPE to evaluate the quality of a prediction by comparing predicted to observed values.
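As a concrete illustration of the two measures used most heavily in this paper, the following minimal Python sketch (the function names are ours) computes RMSE and SMAPE for a pair of series, including the convention that SMAPE is 0 when both values are 0:

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error between observed and predicted values."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.sqrt(np.mean((forecast - actual) ** 2))

def smape(actual, forecast):
    """Symmetric mean absolute percentage error; a term is 0 when both values are 0."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    safe = np.where(denom == 0, 1.0, denom)                # avoid division by zero
    terms = np.where(denom == 0, 0.0, np.abs(forecast - actual) / safe)
    return terms.mean()

# Example: rmse([3, 5, 7], [2.5, 5, 8]) and smape([3, 0], [2.5, 0])
```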

Forecasting is part of the decision-making process: finding estimated values for the future based on past data [8]. Several types of prediction can be distinguished [9]. First, predictive modeling refers to the task of developing a model that predicts the target's value as a function of explanatory variables; the main aim is to predict the value of a specific attribute from the values of the other attributes. Second, traditional prediction: during the first half of the twentieth century, many methods of extrapolating the future were used for decision-making. They were part of the planning process and succeeded in helping planners predict and make rational decisions about the future, but they are considered traditional means of dealing with the future when compared with modern methods and techniques in this field. Traditional methods include prediction by guessing, which depends on the intuitive way an individual assesses some aspects of the future; such predictions may fail more often than they succeed (Fig. 1).

Fig. 1 Relationships among the main three challenges

Deep learning techniques are a set of multi-level learning techniques derived from machine learning [10, 11], a field in which the computer tests algorithms and programs, learning to improve and develop them by itself. Modern computer vision, speech recognition, and future-prediction programs are all products of deep learning [12, 13]. The need for this approach has grown with the emergence of big data, because of its ability to handle such data. The computer needs preliminary data to understand the relationships between objects; in other words, deep learning is a set of algorithms that allow the machine to learn from itself and from events, developing itself through layers of neurons [14, 15]. The greater the number of neural layers, the greater the performance of the machine. Deep learning is distinguished from other techniques with a fixed level of learning by its ability to develop as the volume of data increases. To ensure the quality of machine learning through deep learning, as much data as possible must be provided. The relationship among these terms can be pictured as concentric circles, as shown in Fig. 2.

Fig. 2 Relationship among AI, machine learning and deep learning

In this paper, we present a new forecaster built by combining two techniques, LSTM and PSO. We first develop one of them, building DSN-PSO, to enhance the performance of the LSTM so that the result is highly efficient, low cost, and easy to use. Before building the forecaster, we design an electric circuit consisting of several devices (LoRa, the Waspmote platform, and five sensors).

LoRa is a modulation technique that allows sending data at extremely low data rates over extremely long ranges; for more detail see [16].

The Waspmote platform is an open-access architecture that allows devices and sensors to be connected as a platform; for more detail see [16, 17]. A sensor is a device, module, machine, or subsystem used in many applications to read data for a specific event or change in a specific environment in real time; for more detail see [18].

In this work, we deal with five types of sensors: the Grove Laser PM2.5 sensor (HM3301) to measure PM2.5 and PM10, the MQ-7 to measure carbon monoxide (CO), the MQ131 to measure ozone (O3), an NO2 sensor to measure nitrogen dioxide, and an SO2 sensor to measure sulfur dioxide, all used to collect data in real time. Figure 3 shows the electrical circuit that connects the main parts of the station.

Fig. 3 Connecting the main sensors of the designed station through the LoRa module

The main points this work attempts to achieve are:

  • Increase the accuracy of predicting air pollution levels in the coming days, so that precautionary measures can be taken against the risks of such pollution and attempts made to reduce it.

  • Provide an integrated system that forms part of the electronic management and chemical safety of laboratories.

  • Support decisions by health and environmental authorities and avoid early pollution risks by educating people.

  • Provide important statistical information in raw form that can contribute to the treatment of sources of air pollution, whether produced by human activity, such as factories and houses, or by nature, such as forest fires and volcanoes.

  • Keep the system inexpensive so that it does not burden the Ministries of Health and Environment.

  • Achieve an innovative method for the safety of personnel working in laboratories that deal with these chemicals, complying with the requirements of UNESCO for chemical safety.

A sensor is a device that detects the physical or chemical state of its surroundings: some measure temperature, some measure pressure, some measure gases, and some measure air quality. It converts the signals incident upon it into electrical impulses that can be measured or counted by a device such as a computer [27]. In other words, a sensor is a device, module, or subsystem that aims to detect events or changes in its environment and send the information to other electronics, often a computer processor. A sensor is always used together with other electronic devices.

There are many sensors to measure the concentrations that cause air pollution, but in this paper, we will focus on the sensors specific to our work:

1.1 Sensor–Grove PM2.5 Laser (HM3301)

It is a new generation of laser dust-detection sensor, used for continuous real-time detection of dust in the air. It measures PM2.5 and PM10 concentrations.

The main features of this sensor:

  • High sensitivity to dust particles 0.3 μm or greater.

  • Continuous detection of dust concentration in the air in real time.

  • Based on laser light scattering technology, readings are accurate, stable and consistent.

  • Low noise.

  • Energy consumption is very low.

1.2 Sensor–MQ-7

The MQ-7 gas sensor has high sensitivity to carbon monoxide. It can be used to detect different gases containing CO, and it is low in cost and suitable for different applications.

The main features of this sensor:

  • High sensitivity to combustible gas (CO) in a wide range.

  • Stable performance, long life, and low cost.

  • Simple drive circuit.

1.3 Sensor-MQ131

The MQ131 gas sensor is highly sensitive to ozone.

The main features of this sensor:

  • Good sensitivity to ozone over a wide range.

  • Long life and low cost.

  • Simple drive circuit.

1.4 Sensor–WSP1110 nitrogen dioxide sensor

Low-cost electrochemical nitrogen dioxide sensors provide exciting new opportunities for rapid and distributed outdoor air pollution measurements. This type of sensor is stable, long-lasting, requires little energy, and is capable of accurately measuring concentrations at the parts-per-billion (ppb) level.

The main features of this sensor:

  • High sensitivity, stable performance and long-life time

  • Small in size and light in weight

  • 5 V voltage, low consumption

  • Quick response reset function, simple drive circuit

  • Long-term stability (50 ppm overload).

1.5 Sensor–SO2

The SO2 sensor is designed to measure sulfur dioxide for applications in air quality monitoring, industrial safety, and air purification monitoring.

The main features of this sensor:

  • Small in size with low profile (15 × 15 × 3 mm).

  • Long life (10 years life expectancy).

  • Fast response (15 s typical).

2 Related works

The issue of air quality prediction is one of the critical topics related to human lives and health. The aim of the work presented herein is to develop a new method for such prediction that is based on the huge amount of available data and operates on data series. This section first reviews previous studies by researchers in this area and compares them based on the database used in each case, the methods applied to assess the results, the advantages of each method, and its limitations.

Li et al. [19] used a long short-term memory extended (LSTME) neural network model with combined spatial–temporal links to predict concentrations of air pollutants. In that approach, the LSTM layers automatically extract potential intrinsic properties from historical air pollutant data and accompanying data, while meteorological data and timestamp data are also incorporated into the proposed model to improve its performance. The technique was evaluated using three measures (RMSE, MAE, and MAPE) and compared with the STANN, ARMA, and SVR models. The work presented herein is similar in its use of the LSTM approach as part of a recurrent neural network structure but differs in its use of another evaluation measure.

Lifeng et al. [20] reported that the best predictions of air quality could be obtained using the GM(1,1) model with fractional-order accumulation, i.e., FGM(1,1), to find the expected average annual concentrations of PM2.5, PM10, SO2, NO2, 8-h O3, and 24-h O3. The measure used in that work was the MAPE. Application of the FGM(1,1) method resulted in much better performance compared with the traditional GM(1,1) model, revealing that the average annual concentrations of PM2.5, PM10, SO2, NO2, 8-h O3, and 24-h O3 would decrease from 2017 to 2020. The work presented herein is similar in that it predicts the concentrations of air pollutants and finds ways to address them, but differs in its use of the LSTM method for the predictions.

Wen et al. [21] combined a convolutional neural network (CNN) and LSTM neural network (NN), as well as meteorological and aerosol data, to refine the prediction performance of the model. Data collected from 1233 air quality monitoring stations in Beijing and the whole of China were used to verify the effectiveness of the proposed model (C-LSTME). The results showed that the model achieved better performance than state-of-the-art technologies for predictions over different durations at various regional and environmental scales. The technique was evaluated using three measures (RMSE, MAE, and MAPE). In comparison, the LSTM approach is also applied in an RNN in this work, but after having identified the best structure for the network. In addition, another evaluation measure is used herein.

Shang et al. [22] described a prediction method based on a classification and regression tree (CART) approach in combination with the ensemble extreme learning machine (EELM) method. Subgroups were created by dividing the datasets using a shallow hierarchical tree through the CART approach. At each node of the tree, EELM models were constructed using the training samples of the node, minimizing the validation errors sequentially in all of the subtrees of each tree by identifying the number of hidden neurons, where each node is considered a root. Finally, the EELM models on each path to a leaf are compared with the root of each leaf, selecting only the path with the smallest error to check the leaf. The measures used in that work were the RMSE and MAPE. The experimental results revealed that such a method can address the issue of global–local duplication of the prediction method at each leaf and that the combined CART–EELM approach worked better than the random forest (RF), ν-SVR, and EELM models, while also showing superior performance compared with EELM or seasonal k-means EELM. The work presented herein is similar in that it uses the same set of six air pollution indexes (PM2.5, O3, PM10, SO2, NO2, CO) but differs in terms of the mechanism applied to reduce air pollutants, applying the RNN method.

Li et al. [23] applied a new air quality forecasting method and proposed a new positive analysis mechanism that includes complex analysis, improved prediction units, data pretreatment, and air quality control problems. The system analyzes the original series using an entropy model and a data processing process. The multiobjective multiverse optimization (MOMVO) algorithm is used to achieve the required performance, revealing that the least-squares (LS)SVM achieved the best accuracy in addition to stable predictions. Three measures were used for the evaluation in that work, viz. RMSE, MAE, and MAPE. The results of the application of the proposed method to the dataset revealed good performance for the analysis and control of air quality, in addition to the approximation of values with high precision. The work presented herein uses the same evaluation measures but differs in its use of the LSTM approach in the RNN after identifying the best structure for the network.

Kim et al. [24] aimed to build annual-average integrated empirical geographic (IEG) regression models for the contiguous USA for six criteria pollutants during 1979–2015; to explore systematically the impact on model performance of the number of variables selected for inclusion in a model; and to provide publicly available model predictions. They computed annual-average concentrations from regulatory monitoring data for PM10, PM2.5, NO2, SO2, CO, and ozone at all monitoring sites for 1979–2015.

3 Building IFCsAP

The model presented in this paper consists of two phases. The first includes building the station as an electrical circuit to collect data on six concentrations in real time and save them on the master computer for preparation and processing in the next phase. The second phase focuses on processing the dataset after splitting it based on the station identifier; this phase passes through many levels of learning to produce a forecaster that can deal with huge datasets. All the activities of this research are summarized in Fig. 5, while the algorithm of the IFCsAP model is described in the main algorithm. To make the model easier to understand, we explain its first phase in Fig. 3 and its second phase in Fig. 4. The main concentrations used, with their allowable limits, are listed below (and encoded in the sketch after the list).

  • PM2.5: 10 µg/m3 (average allowable value per year), 25 µg/m3 (average allowable value in 24 h).

  • PM10: 20 µg/m3 (average allowable value per year), 50 µg/m3 (average allowable value in 24 h).

  • O3: 100 µg/m3 (average allowable value over eight hours). The recommended maximum value, previously set at 120 µg/m3 over eight hours, has been reduced to 100 µg/m3 based on recent findings of relationships between daily mortality and ozone levels in locations where the concentration of the substance is less than 120 µg/m3.

  • NO2: 40 µg/m3 (average allowable value per year), 200 µg/m3 (average allowable value per hour).

  • SO2: 20 µg/m3 (average allowable value in twenty-four hours), 500 μg/m3 (average allowable value in 10 min).
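The allowable limits above can be encoded directly for automated checking. A minimal sketch follows; the dictionary layout and function name are our own illustration:

```python
# Allowable limits from the list above, in µg/m3, keyed by (pollutant, averaging period).
LIMITS = {
    ("PM2.5", "annual"): 10, ("PM2.5", "24h"): 25,
    ("PM10", "annual"): 20,  ("PM10", "24h"): 50,
    ("O3", "8h"): 100,
    ("NO2", "annual"): 40,   ("NO2", "1h"): 200,
    ("SO2", "24h"): 20,      ("SO2", "10min"): 500,
}

def exceeds(pollutant, period, value_ugm3):
    """True if an averaged reading exceeds its allowable value."""
    return value_ugm3 > LIMITS[(pollutant, period)]
```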

Fig. 4 Block diagram of IFCsAP model


The dataset was collected through two types of resources: a public web resource, represented by the KDD Cup 2018 dataset, and the station built with multiple sensors to capture concentrations. The dataset needs to be handled before building the predictor, as follows (a preprocessing sketch follows this list).

  • Split the dataset by station and save each part in a separate file holding the name of that station.

  • Treat missing values by dropping each row that has one or more missing values.

  • Apply normalization to each column of each station's dataset so that the values of each concentration lie in the range [0, 1].
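To make these steps concrete, the following pandas/scikit-learn sketch performs the split, the drop, and the normalization in one pass. The column names (`station_id` and the six pollutant columns) are illustrative assumptions, not the names used in the original dataset:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

POLLUTANTS = ["PM2.5", "PM10", "NO2", "CO", "O3", "SO2"]  # assumed column names

def preprocess(csv_path):
    df = pd.read_csv(csv_path)
    for station, group in df.groupby("station_id"):       # assumed station column
        clean = group.dropna(subset=POLLUTANTS).copy()    # drop rows with missing values
        scaler = MinMaxScaler(feature_range=(0, 1))       # scale each column to [0, 1]
        clean[POLLUTANTS] = scaler.fit_transform(clean[POLLUTANTS])
        clean.to_csv(f"{station}.csv", index=False)       # one file per station
```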

3.1 Developed long short-term memory (DLSTM)

This paper shows how PSO can be employed, by building a new algorithm called DSN-PSO as explained in Algorithm 2, to enhance the performance of the deep learning algorithm LSTM (for more detail, see the main steps for training the LSTM–RNN in the “Appendix”) by determining its structure and parameters. This is explained in detail in Algorithm 3 (Table 1).
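Algorithm 2 is given as an image in the original; as a hedged sketch of the underlying idea, a plain PSO can search over the LSTM's structural hyperparameters, with each particle scored by the validation error of a briefly trained network. The bounds, the decoding scheme, and the `fitness` stub below are our own illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

ACTIVATIONS = ["tanh", "relu", "sigmoid"]  # assumed candidate activation functions

def decode(p):
    """Map a continuous particle position to discrete LSTM settings."""
    return int(round(p[0])), int(round(p[1])), ACTIVATIONS[int(round(p[2]))]

def fitness(p):
    n_layers, n_units, activation = decode(p)
    # Train a small LSTM with these settings for a few epochs and
    # return its validation SMAPE (training code omitted in this sketch).
    raise NotImplementedError

def pso_search(fitness, bounds, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5):
    """Minimise `fitness` over the box `bounds` = [(lo, hi), ...] with plain PSO."""
    lo = np.array([b[0] for b in bounds], float)
    hi = np.array([b[1] for b in bounds], float)
    pos = lo + np.random.rand(n_particles, len(bounds)) * (hi - lo)
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = np.random.rand(), np.random.rand()
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return decode(gbest)

# Example: 1-3 hidden layers, 50-300 nodes per layer, activation index 0-2.
# best_layers, best_units, best_act = pso_search(fitness, [(1, 3), (50, 300), (0, 2)])
```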

Table 1 The parameters utilized in the models

3.2 Running the IFCsAP model

We will train the model and predict concentration movements for several epochs and see whether the predictions get better or worse over time. The algorithm below shows how the IFCsAP model is executed.


3.3 Evaluation stage

The symmetric mean absolute percentage error (SMAPE) is used in this paper as the measure to determine the accuracy and robustness of the predictor:

\({\text{SMAPE}} = \frac{1}{n}\sum_{t=1}^{n} \frac{|F_t - A_t|}{(|A_t| + |F_t|)/2}\)

where \(n\) is the number of samples, \(F_t\) is the forecast value, \(A_t\) is the actual value, and \(t\) indexes every fitted point.

If the actual and forecast values are both 0, the SMAPE score is set to 0. For each station, the concentration levels of PM2.5, PM10, NO2, CO, O3, and SO2 are forecast for the next 48 h. We calculate the values of this measure daily, continuously over one month, then sort these values and compute the average of the 25 lowest daily SMAPE scores. The main steps of the evaluation are shown in detail in Algorithm 5 (see the scoring sketch below).
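A minimal sketch of this scoring protocol, assuming the daily SMAPE values for a month have already been computed (for instance with the `smape` function shown in the introduction):

```python
import numpy as np

def monthly_score(daily_smape, k=25):
    """Average of the k lowest daily SMAPE scores over the month."""
    scores = np.sort(np.asarray(daily_smape, dtype=float))
    return float(scores[:k].mean())

# Example: scores for 30 consecutive days of 48-h forecasts for one pollutant.
# monthly_score([0.41, 0.38, 0.55, 0.47, 0.62, 0.39, 0.50, 0.44, 0.58, 0.36,
#                0.49, 0.53, 0.40, 0.45, 0.61, 0.37, 0.48, 0.52, 0.43, 0.56,
#                0.42, 0.46, 0.59, 0.35, 0.51, 0.54, 0.60, 0.38, 0.47, 0.57])
```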


4 Results of IFCsAP model

The results, and the justification for them, are explained in detail in this section.

4.1 Pre-processing

This stage consists of multiple steps performed on the dataset after collecting it; each step addresses one problem in the dataset, as discussed below.

4.1.1 Split station

The second column of Table 2 shows the result of splitting the dataset based on the name of the station, where each station is saved in a separate file holding its name.

4.1.2 Missing values [8]

Missing values are one of the problems affecting the final results of any model, especially a prediction model: the results of a predictor become more accurate when the predictor is built on true values; otherwise the results cannot be trusted. Therefore, in this model we drop any record that has missing values in each station. In general, the stations have different rates of missing values, as explained in the third column of Table 2 and in Fig. 6.

Table 2 Ratio after pre-processing the missing values
Fig. 5 Flowchart of research work activities

Table 2 describes the dataset after splitting it into 35 stations, each with the same number of records (8886) and six features. It also shows the dataset after handling the missing values, together with the dropping rate. Figure 6 shows the percentage of records with missing values in each station.

Fig. 6 Percentage of records with missing values in each station

4.1.3 Normalization

The dataset is normalized with MinMaxScaler so that values fall in the range [0, 1] [20, 25]. This is a necessary step for the proposed predictor. The main purpose of the normalization stage is to bring all values into the same range while preserving the nature of each feature in the dataset.

4.1.4 Split the dataset

Cross-validation is one of the best techniques for evaluating the performance of a given model. Because badly selected training and testing samples harm performance, cross-validation provides methods for wisely selecting the best samples for training and testing a given model, as shown in Table 3 (attached in the “Appendix”) and Fig. 7; a selection sketch follows.
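As a hedged sketch of this selection step (the paper reports both five- and ten-fold splits; `evaluate` stands in for training the model on one split and returning its SMAPE):

```python
import numpy as np
from sklearn.model_selection import KFold

def best_split(n_samples, evaluate, n_splits=10):
    """Try each fold as the test set and keep the split with the lowest error."""
    kf = KFold(n_splits=n_splits, shuffle=False)   # no shuffling: keep the time order
    splits = list(kf.split(np.arange(n_samples)))
    errors = [evaluate(train_idx, test_idx) for train_idx, test_idx in splits]
    return splits[int(np.argmin(errors))]          # (train_idx, test_idx) of best fold
```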

Table 3 Apply cross-validation for each station
Fig. 7 Distribution of the dataset based on five-fold cross-validation

Table 4 illustrates the idea of cross-validation; in this paper, ten-fold cross-validation is used for each station to determine the best samples from the training dataset to build the model and from the testing dataset to evaluate it.

Table 4 The best split for the dataset of each station

We note that the station with the highest percentage of missing values has a very high SMAPE score compared with stations with the lowest percentages of missing values. We conclude that using the dropping process makes the predictor's results more accurate compared with other methods used to handle missing values (Fig. 8).

Fig. 8 SMAPE based on the traditional LSTM

4.2 DSN-PSO

Selecting suitable parameters for any deep learning algorithm is considered one of the main challenges in the field; in general, all known LSTMs take a very long time to run before giving results. This section shows how DSN-PSO solves this problem and overcomes this challenge: the optimal structure and main parameters were found for DLSTM.

In other words, values such as the number of hidden layers, the number of nodes in each hidden layer, the weights among layers, the biases, and the type of activation function of the deep learning network are essential parameters that fundamentally affect DLSTM performance. In general, networks rely on the trial-and-error principle to select their parameters, which leads to long implementation times. Therefore, the main parameters of DLSTM are produced by DSN-PSO, as shown in Table 5, while Table 6 shows the best parameters representing the structure of DLSTM compared with the parameters of the traditional LSTM.

Table 5 The parameters of DSN-PSO
Table 6 The best parameters of DLSTM resulting from applying DSN-PSO, compared with the parameters of the traditional LSTM

Table 6 shows the best parameters (number of hidden layers, number of nodes in each hidden layer, weights, biases, and activation function) resulting from the DSN-PSO algorithm, which represent the initial structure of the DLSTM (Table 7).

Table 7 The difference between the actual and predicted values resulting from IFCsAP

4.3 DLSTM

DLSTM is based mainly on the LSTM algorithm, which is capable of handling large data and retains information over long periods because each cell contains a memory. In this stage, the parameters produced by DSN-PSO, which represent the structure of DLSTM, are forwarded to DLSTM together with the station's dataset generated from the best split of the ten-fold cross-validation, to train the DLSTM. The prediction values for each station (Station #1 to Station #35) are then computed based on the best split resulting from the ten-fold cross-validation. The best parameters produced by DSN-PSO as the structure of DLSTM represent one input layer with six nodes, each node representing one of the six concentrations; one hidden layer containing 250 nodes; and one output layer. All other parameters and the activation function are described in Table 8. We used 150 iterations, feeding in a batch size of 24 at each iteration (see the sketch below).
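A hedged Keras sketch of this structure; the look-back window length is our assumption, since the paper does not state it here:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

WINDOW = 24      # assumed look-back window, in hourly steps
N_FEATURES = 6   # PM2.5, PM10, NO2, CO, O3, SO2

model = Sequential([
    LSTM(250, input_shape=(WINDOW, N_FEATURES)),  # one hidden layer with 250 nodes
    Dense(N_FEATURES),                            # one output per concentration
])
model.compile(optimizer="adam", loss="mse")

# X_train has shape (samples, WINDOW, 6); y_train has shape (samples, 6).
# model.fit(X_train, y_train, epochs=150, batch_size=24)
```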

Table 8 SMAPE evaluation

A comparison of the actual and predicted values produced by DLSTM for the first station is shown in Fig. 9.

Fig. 9 Actual and predicted values for the first station

The predicted values for Station #18, based on the best split resulting from the ten-fold cross-validation, are compared with the real values in Fig. 10.

Fig. 10 Actual and predicted values for Station #18

The predicted values for Station #34, based on the best split resulting from the ten-fold cross-validation, are compared with the real values in Fig. 11.

Fig. 11 Actual and predicted values for Station #34

4.4 SMAPE evaluation

After building the DLSTM on the training dataset for each station, the model is evaluated by computing the SMAPE on the testing dataset.

The resulting score for each concentration is the average of its 25 lowest daily SMAPE scores. If a concentration misses a day, the score of that concentration on that day is imputed with the baseline score, as shown in the “Appendix” under Table 8.

5 Comparison between the traditional LSTM and IFCsAP based on SMAPE values

To demonstrate the success of the IFCsAP model, we compare the SMAPE values produced by the traditional LSTM and by IFCsAP, as shown in Table 9.

Table 9 Comparison of SMAPE between the traditional LSTM and IFCsAP

The table above shows the SMAPE results of the IFCsAP model in comparison with those of the traditional LSTM, which used the same dataset from the pre-processing stage (i.e., the same dropping, normalization, and training/testing split resulting from the ten-fold cross-validation) applied at each station. We found that the SMAPE results of the IFCsAP model are better than those of the traditional LSTM, as shown in Fig. 12.

Fig. 12 Comparison of SMAPE between the traditional LSTM and the IFCsAP model

6 Summary

The air quality index dataset is huge and requires intelligent, deep computation to extract useful patterns from it. The advantage of this dataset is that it is diverse and large, which allows accurate and reliable decisions. In addition, the data used in this work were obtained from more than one station, which is itself a challenge when building a stable prediction system for their behavior. A limitation of this dataset is that the concentrations that cause air pollution are usually unequal and unfamiliar to non-experts; the data contain missing values and are taken from stations that differ in the environments assigned to them.

DSN-PSO determines the parameters and activation function of DLSTM. Its advantage is that the execution time of the LSTM is reduced; its limitation is that it increases the complexity of the LSTM.

DLSTM is a development of LSTM by means of DSN-PSO; PSO is used to determine the optimal number of hidden layers, number of nodes in each hidden layer, weights, biases, and activation function. The advantage of DLSTM is that it is capable of dealing with huge data and contains memory cells to store information over the long term; its limitation is that it contains a huge number of parameters.

Evaluation is the process of calculating the error between the actual value and the predicted value. There are different types of error measures, including prediction measures (e.g., MSE, RMSE, MAE, and MAPE) and confusion-matrix measures (e.g., accuracy, F-score, and false positives). In this research, the SMAPE evaluation is used.

  • How can particle swarm optimization be useful in building a recurrent neural network (RNN)?

    PSO gradually modifies the behavior of each particle in a particular environment, depending on the behavior of its neighbors, until the optimal solution is obtained.

    On the other hand, neural networks use the trial-and-error principle in selecting their basic parameters, modifying them gradually until acceptable values are reached.

    Based on these properties of PSO and neural networks, we used the PSO principle to find the optimal parameters and the activation function of the neural network.

  • How can a multi-layer model be built by combining the two technologies, LSTM-RNN and particle swarm optimization?

    By building a new predictor called IFCsAP that combines DSN-PSO and DLSTM, where DSN-PSO is used to find the best structure and parameters for the LSTM, while DLSTM is used to predict the concentrations of air pollutants.

  • Is the SMAPE measure enough to evaluate the results of the suggested predictor?

    Yes, SMAPE is sufficient to evaluate the results of the predictor for the next 48 h.

  • What benefit results from building a predictor by combining DSN-PSO and DLSTM?

Combining DSN-PSO and DLSTM reduces the execution time by defining the network parameters, but at the same time it increases the computational complexity.

7 Conclusions

We can summarize the main contributions of this paper as follows. We built an integrated platform based on physical and software entities in the form of an integrated station (H/W and S/W) that is used for essential needs only and reduces the damage resulting from air pollution; the platform saves effort and cost through sensor programming, activating the sensors' role of reading data on pollution-causing concentrations in real time, increasing performance, and reducing effort, time, and cost. The station measures the concentrations that cause air pollution and relies on the principle of intelligent data analysis (IDA): data are collected from the stations, which are considered class nodes, over the wireless network that was built (LoRa and Waspmote) and sent to the computer, which is considered the master node. IFCsAP is fed with the data collected in real time, preliminary processing is performed on the data, and the predictor results are evaluated using the symmetric mean absolute percentage error (SMAPE). Using this measure, we evaluate the predicted levels of PM2.5, PM10, NO2, CO, O3, and SO2 for the next 48 h for each station.

Data often contain a proportion of missing or incomplete values, which increases the prediction or classification error. This problem can be addressed by deleting the entries that contain such values, yielding a more accurate forecast. The purpose of normalization is to convert the data into a specified range of values so that they can be dealt with more accurately in the subsequent stages of processing; in our work, the data were converted into the range [0, 1], because the activation function deals with data in that range.

IFCsAP was designed to deal with one of the most important problems facing the environment at the present time, resulting from increased pollution due to electronic waste, factories, and laboratories, and from the lack of real projects in Iraq to reduce air pollution rates. The designed model proved its accuracy and efficiency in predicting the concentrations that cause air pollution. The model is distinguished by a new tool called DLSTM, which is characterized by its ability to deal with large data and by its memory, which enables it to retain information over long periods. Experiments showed that the combination of the two designed tools, DLSTM and DSN-PSO, achieves more accurate results and reduces implementation time. The first tool, DSN-PSO, was used to select the best parameters determining the structure of the second tool, DLSTM, thus improving the performance of the deep learning models and producing an IFCsAP predictor that delivers more accurate and efficient results.

The following points give ideas for future work. PSO could be explored to tune other LSTM parameters, such as the learning rate, the maximum error, and the number of epochs, instead of relying on trial-and-error principles that take too long to find the optimal parameters for the LSTM network. PSO could also be combined with other deep learning models to find the best parameters and activation functions, instead of relying on trial and error to find the optimal network structure. Other types of swarm optimization (such as ant colony optimization, the cuckoo search algorithm, and glowworm swarm optimization) or a genetic algorithm could be used to find the best parameters and activation functions for the LSTM.