Abstract
Improving public health is a shared responsibility, so it is worthwhile to design a smart system based on modern technologies that reduces the time and effort required of the competent health and environmental authorities, makes their work environment smart and easy so as to enable creativity and innovation, and reduces the material costs borne by the state in this domain. Finding a solution that balances the three contradictory goals shown in Figure (1), namely optimizing time utilization, minimizing the human errors that accompany manual effort, and reducing material costs, is an important and difficult problem in the fields of health and environment. Therefore, Internet of Things technology and intelligent big data analysis were used to design an integrated system of hardware and software entities deployed at different, specific locations to collect information on the concentrations that cause air pollution. The aim of this work is to build a programmable system, called the intelligent forecaster of concentrations causing air pollution (IFCsAP), capable of predicting pollutant concentrations over the next 48 h and making the machine the primary source of information after these concentrations are collected and stored in real time. On this basis, we rely on modern technologies to reach this goal. The proposed design is highly efficient, cost-effective, and easy to use, and it can be deployed in any environment with air-pollutant concentrations.
The main objective of the proposed system is to issue periodic reports (covering the next 48 h) based on information received from the different stations in real time. These reports are based on the readings of sensors planted at each station, where each sensor measures one or more concentrations that cause air pollution. The designed system consists of three basic phases, the first being the construction of an integrated electronic circuit comprising several devices (modem, LoRa, Waspmote platform, Arduino, and five sensors).
1 Introduction
Air pollution is one of the most important challenges facing the world today as a result of technological development [1, 2]. It can be defined from several aspects. In terms of pathogenesis, pollution is due to the presence of living or invisible organisms, such as bacteria and fungi, in an environmental medium such as water, air, or soil. Chemically, air pollution is an imbalance of the ecosystem caused by chemical effects, where the pollutants can take the form of solid particles, liquid droplets, or gases. From the scientific point of view, it is a change in the harmonic movement between the components of the ecosystem that paralyzes the efficiency of the system and causes it to lose its natural ability to dispose of pollutants on its own. This research presents an intelligent predictive design to address this phenomenon [3]. There are different measures of error. The root mean square error (RMSE) measures how much error there is between two data sets; in other words, it compares a predicted value with an observed or known value. It is also known as the root mean square deviation and is one of the most widely used statistics in GIS [4]: \({\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} (F_{i} - A_{i} )^{2} }}{n}}\) where F denotes the forecasts (expected values), A the observed values (known results), and n the sample size. Cross-entropy loss is another loss function, mostly used in regression and classification problems. It is given by [5] \(H\left( {F,A} \right) = - \mathop \sum \nolimits_{i} A_{i} \log \left( {F_{i} } \right)\) where \(A_{i}\) is the target label and \(F_{i}\) is the output of the classifier. The cross-entropy loss function is used when the output is a probability distribution, and thus it is preferred in that setting [6]. Finally, the symmetric mean absolute percentage error (SMAPE) is an accuracy measure based on percentage (or relative) errors.
It is usually defined as [7]: \({\text{SMAPE}} = \frac{1}{n}\mathop \sum \nolimits_{t = 1}^{n} \frac{{\left| {F_{t} - A_{t} } \right|}}{{\left( {\left| {A_{t} } \right| + \left| {F_{t} } \right|} \right)/2}}\) where \(A_{t}\) is the actual value and \(F_{t}\) is the forecast value. The absolute difference between \(A_{t}\) and \(F_{t}\) is divided by half the sum of the absolute values of the actual value and the forecast value. This quantity is summed over every fitted point t and divided by the number of fitted points n. If the actual value and the forecast value are both 0, the SMAPE term is set to 0 as well. This paper uses SMAPE to evaluate the quality of a prediction by comparing predicted with observed values.
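The three error measures above can be sketched directly in Python with NumPy; the sample values in the print statements are purely illustrative:

```python
import numpy as np

def rmse(forecast, actual):
    """Root mean square error between forecasts F and observed values A."""
    f, a = np.asarray(forecast, float), np.asarray(actual, float)
    return float(np.sqrt(np.mean((f - a) ** 2)))

def cross_entropy(target, output, eps=1e-12):
    """Cross-entropy loss -sum(target * log(output)) for probability outputs."""
    out = np.clip(np.asarray(output, float), eps, 1.0)
    return float(-np.sum(np.asarray(target, float) * np.log(out)))

def smape(actual, forecast):
    """Symmetric MAPE; a term where both values are 0 contributes 0."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    denom = (np.abs(a) + np.abs(f)) / 2.0
    safe = np.where(denom == 0, 1.0, denom)
    return float(np.mean(np.where(denom == 0, 0.0, np.abs(f - a) / safe)))

print(rmse([2.0, 4.0], [1.0, 3.0]))        # 1.0
print(smape([100.0, 0.0], [110.0, 0.0]))   # mean of 10/105 and 0
```

Note the zero-handling in `smape` matches the convention stated above: a point where both the actual and forecast values are 0 is scored as 0 rather than dividing by zero.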
Forecasting is a decision-making process that estimates future values on the basis of past data [8]. There are three types of prediction [9]. First, a predictive model addresses the task of developing a model that predicts a target value as a function of explanatory variables; the main aim is to predict the value of a specific attribute from the values of the other attributes. Second, traditional prediction: during the first half of the twentieth century, many methods of extrapolating the future were used for decision-making. They are part of the planning process, and although they succeeded in helping planners predict and make rational decisions about the future, they are considered traditional means of dealing with the future when compared with modern methods and techniques in this field. Traditional methods include prediction by guessing, which depends on the intuitive way an individual assesses some aspects of the future; such predictions may fail more often than they succeed (Fig. 1).
Deep learning techniques are a set of multi-level learning techniques derived from machine learning [10, 11], a field in which the computer improves and develops algorithms and programs by learning on its own. Modern computer vision, speech recognition, and future-prediction programs are all products of deep learning [12, 13]. The need for these methods has grown with the emergence of big data because of their ability to handle such data. The computer needs preliminary data to understand the relationships between objects; one could say that deep learning is a set of algorithms that allow the device to learn from itself and from events, so that the device learns and then develops itself through its neural layers [14, 15]. The greater the number of neural layers, the better the performance of the device. Deep learning is distinguished from other techniques, which reach only a certain level of learning, by its ability to keep improving as the volume of data increases. To ensure the quality of machine learning through deep learning, as much data as possible must be provided. The relationship among these terms can be pictured as concentric circles, as shown in Fig. 2.
In this paper, we present a new forecaster based on a synthesis of two techniques, LSTM and PSO, after developing one of them into DSN-PSO to enhance the performance of the other (LSTM) so as to achieve high efficiency, low cost, and ease of use. Before building the forecaster, we design an electric circuit consisting of several devices (LoRa, Waspmote platform, and five sensors).
LoRa is a modulation technique that allows sending data at extremely low data rates over extremely long ranges; for more detail see [16].
The Waspmote platform is an open-access architecture that allows devices and sensor platforms to be connected; for more detail see [16, 17]. A sensor is a device, module, machine, or subsystem used in many applications to read data for a specific event or change in a specific environment in real time; for more detail see [18].
In this work, we deal with five types of sensors used to collect data in real time: the Grove Laser PM2.5 Sensor (HM3301) to measure PM2.5 and PM10, the MQ-7 to measure carbon monoxide (CO), the MQ131 to measure ozone, an NO2 sensor to measure nitrogen dioxide (NO2), and an SO2 sensor to measure sulfur dioxide (SO2). Figure 3 shows the electrical circuit that connects the main parts of a station.
The main points this work attempts to achieve are:
-
Increase accuracy in estimating air pollution levels in the coming days, so that precautionary measures can be taken against the risks of such pollution and attempts made to reduce it.
-
This integrated system is part of the electronic management and chemical safety of laboratories.
-
Support decisions from the health and environmental communities and avoid pollution risks early by educating people.
-
Provide important statistical information in raw form that can contribute to the treatment and guidance of air-pollution sources, whether produced by human activity (such as factories and houses) or by nature (such as forest fires and volcanoes).
-
The system is inexpensive and therefore does not burden the Ministries of Health and Environment.
-
Achieve an innovative method for the safety of personnel working in laboratories that handle these chemicals, complying with UNESCO requirements for chemical safety and security.
A sensor is a device that detects a physical or chemical ambient state: some measure temperature, some pressure, some gases, and some air quality. It converts the incident signals into electrical impulses that can be measured or counted by a device such as a computer [27]. In other words, a sensor is a device, module, or subsystem that detects events or changes in its environment and sends the information to other electronics, often a computer processor. A sensor is always used together with other electronic devices.
There are many sensors that measure the concentrations causing air pollution, but in this paper we focus on those specific to our work:
1.1 Sensor–Grove PM2.5 Laser (HM3301)
This is a new-generation laser dust-detection sensor used for continuous, real-time detection of dust in the air. It measures PM2.5 and PM10 concentrations.
The main features of this sensor:
-
High sensitivity to dust particles 0.3 μm or greater.
-
Continuous detection of dust concentration in the air in real time.
-
Based on laser light scattering technology, readings are accurate, stable and consistent.
-
Low noise.
-
Energy consumption is very low.
1.2 Sensor–MQ-7
The MQ-7 gas sensor has high sensitivity to carbon monoxide and can be used to detect different gases containing carbon monoxide; it is low in cost and suitable for different applications.
The main features of this sensor:
-
High sensitivity to combustible gas (CO) in a wide range.
-
Stable performance, long life, and low cost.
-
Simple drive circuit.
1.3 Sensor-MQ131
The MQ131 gas sensor is highly sensitive to ozone.
The main features of this sensor:
-
Good sensitivity to ozone over a wide concentration range.
-
Long life and low cost.
-
Simple drive circuit.
1.4 Sensor–WSP1110 nitrogen dioxide sensor
Low-cost electrochemical nitrogen dioxide sensors provide exciting new opportunities for rapid and distributed outdoor air pollution measurements. This type of sensor is stable, long lasting, requires little energy, and is capable of accurately measuring concentrations at the parts-per-billion level.
The main features of this sensor:
-
High sensitivity, stable performance and long-life time
-
Small in size and light in weight
-
5 V voltage, low consumption
-
Quick response reset function, simple drive circuit
-
Long-term stability (50 ppm overload).
1.5 Sensor–SO2
SO2 sensor is designed to measure sulfur dioxide for applications in: air quality monitoring, industrial safety and air purification monitoring.
The main features of this sensor:
-
Small in size with low profile (15 × 15 × 3 mm).
-
Long life (10 years life expectancy).
-
Fast response (15 s typical).
2 Related works
The issue of air quality prediction is one of the critical topics related to human lives and health. The aim of the work presented herein is to develop a new method for such prediction based on the huge amount of data that is available and operating on data series. This section first reviews previous studies by researchers in this area and compares them based on the database used in each case, the methods applied to assess the results, the advantages of each method, and its limitations.
Li et al. [19] used a long short-term memory extended (LSTME) neural network model with combined spatial–temporal links to predict concentrations of air pollutants. In that approach, the LSTM layers automatically extract potential intrinsic properties from historical air pollutant and accompanying data, while meteorological data and timestamp data are also incorporated into the proposed model to improve its performance. The technique was evaluated using three measures (RMSE, MAE, and MAPE) and compared with the STANN, ARMA, and SVR models. The work presented herein is similar in its use of the LSTM approach as part of a recurrent neural network structure but differs in its use of another evaluation measure.
Lifeng et al. [20] reported that the best predictions of air quality could be obtained using the GM(1,1) model with fractional-order accumulation, i.e., FGM(1,1), to find the expected average annual concentrations of PM2.5, PM10, SO2, NO2, 8-h O3, and 24-h O3. The measure used in that work was the MAPE. Application of the FGM(1,1) method resulted in much better performance than the traditional GM(1,1) model, revealing that the average annual concentrations of PM2.5, PM10, SO2, NO2, 8-h O3, and 24-h O3 will decrease from 2017 to 2020. The work presented herein is similar in that it predicts the concentrations of air pollutants and finds ways to address them, but differs in its use of the LSTM method for the predictions.
Wen et al. [21] combined a convolutional neural network (CNN) and LSTM neural network (NN), as well as meteorological and aerosol data, to refine the prediction performance of the model. Data collected from 1233 air quality monitoring stations in Beijing and the whole of China were used to verify the effectiveness of the proposed model (C-LSTME). The results showed that the model achieved better performance than state-of-the-art technologies for predictions over different durations at various regional and environmental scales. The technique was evaluated using three measures (RMSE, MAE, and MAPE). In comparison, the LSTM approach is also applied in a RNN in this work, but after having identified the best structure for the network. In addition, another evaluation measure is used herein.
Shang et al. [22] described a prediction method based on a classification and regression tree (CART) approach in combination with the ensemble extreme learning machine (EELM) method. Subgroups were created by dividing the datasets using a shallow hierarchical tree built with the CART approach. At each node of the tree, EELM models were constructed using the training samples of that node, treating each node as a root, so as to sequentially minimize the validation errors in all of its subtrees by identifying the number of hidden neurons. Finally, the EELM models along each root-to-leaf path are compared, and only the path with the smallest error is selected for that leaf. The measures used in that work were the RMSE and MAPE. The experimental results revealed that such a method can address the issue of global–local duality of the prediction method at each leaf and that the combined CART–EELM approach worked better than the random forest (RF), ν-SVR, and EELM models, while also showing superior performance compared with seasonal EELM or k-means EELM. The work presented herein is similar in that it uses the same set of six air-pollution indexes (PM2.5, O3, PM10, SO2, NO2, CO) but differs in the mechanism applied, using the RNN method.
Li et al. [23] applied a new air quality forecasting method and proposed a new positive analysis mechanism that includes complex analysis, improved prediction units, data pretreatment, and air quality control problems. The system analyzes the original series using an entropy model and a data processing process. The multiobjective multiverse optimization (MOMVO) algorithm is used to achieve the required performance, revealing that the least-squares (LS)SVM achieved the best accuracy in addition to stable predictions. Three measures were used for the evaluation in that work, viz. RMSE, MAE, and MAPE. The results of the application of the proposed method to the dataset revealed good performance for the analysis and control of air quality, in addition to the approximation of values with high precision. The work presented herein uses the same evaluation measures but differs in its use of the LSTM approach in the RNN after identifying the best structure for the network.
Kim et al. [24] aimed to build annual-average integrated empirical geographic (IEG) regression models for the contiguous USA for six criteria pollutants during 1979–2015, to explore systematically the impact of the number of variables included in a model on its performance, and to provide publicly available model predictions. They computed annual-average concentrations from regulatory monitoring data for PM10, PM2.5, NO2, SO2, CO, and ozone at all monitoring sites for 1979–2015.
3 Building IFCsAP
The model presented in this paper consists of two phases. The first includes building the station as an electrical circuit that collects the data related to the six concentrations in real time and saves them on the master computer for preparation and processing in the next phase. The second phase focuses on processing the dataset after splitting it by station identifier; this phase passes through several levels of learning to produce a forecaster that can deal with big datasets. All the activities of this research are summarized in Fig. 5, while the algorithm of the IFCsAP model is described in the main algorithm. To make the model easier to understand, the first phase is explained in Fig. 3 and the second phase in Fig. 4. The main concentrations considered, with their allowable limits, are:
-
PM2.5: 10 µg/m3 (average allowable value per year), 25 µg/m3 (average allowable value in 24 h).
-
PM10: 20 µg/m3 (average allowable value per year), 50 µg/m3 (average allowable value in 24 h).
-
O3: 100 µg/m3 (average allowable value in eight hours). The recommended maximum value, previously set at 120 µg/m3 in eight hours, has been reduced to 100 µg/m3 based on recent findings of relationships between daily mortality and ozone levels in locations where the concentration is less than 120 µg/m3.
-
NO2: 40 µg/m3 (average allowable value per year), 200 µg/m3 (average allowable value per hour).
-
SO2: 20 µg/m3 (average allowable value in twenty-four hours), 500 μg/m3 (average allowable value in 10 min).
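As a small illustration, the guideline limits listed above can be encoded in a lookup table and checked against incoming short-term readings; the dictionary layout and function name here are assumptions for illustration, not part of the IFCsAP implementation:

```python
# Guideline limits quoted in the text, as (annual mean, short-term mean) in µg/m3.
# For SO2 the text gives 20 µg/m3 over 24 h and 500 µg/m3 over 10 min; the
# 10-min value is used as the short-term limit here.
LIMITS = {
    "PM2.5": (10, 25),     # year / 24 h
    "PM10":  (20, 50),     # year / 24 h
    "O3":    (None, 100),  # 8 h
    "NO2":   (40, 200),    # year / 1 h
    "SO2":   (None, 500),  # 10 min
}

def exceeds_short_term(readings):
    """Return the pollutants whose short-term reading exceeds its limit."""
    return [p for p, v in readings.items()
            if p in LIMITS and LIMITS[p][1] is not None and v > LIMITS[p][1]]

print(exceeds_short_term({"PM2.5": 30.0, "NO2": 150.0}))   # ['PM2.5']
```

Such a check could drive the precautionary-measure reports mentioned earlier, flagging only the concentrations whose forecast values cross the allowable limits.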
The dataset was collected through two types of resources: a public web source, represented by the KDD Cup 2018 dataset, and the stations built with multiple sensors to capture concentrations. The dataset needed to be handled as follows before building the predictor.
-
Split the dataset by station and save each part in a separate file holding the name of that station.
-
After that, treat missing values by dropping each row that has one or more missing values.
-
Finally, apply normalization to each column of each station's dataset so that the concentration values lie in the range [0, 1].
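The three preprocessing steps above can be sketched with pandas. The column names and the tiny inline dataset are assumptions for illustration; real data come from the stations and the KDD Cup 2018 files:

```python
import numpy as np
import pandas as pd

COLS = ["PM2.5", "PM10", "NO2", "CO", "O3", "SO2"]   # assumed column names
df = pd.DataFrame({
    "stationId": ["s1", "s1", "s1", "s2", "s2"],
    "PM2.5": [10.0, np.nan, 30.0, 40.0, 60.0],
    "PM10":  [20.0, 22.0, 24.0, 40.0, 60.0],
    "NO2":   [5.0, 6.0, 7.0, 8.0, 9.0],
    "CO":    [0.4, 0.5, 0.6, 0.7, 0.8],
    "O3":    [80.0, 90.0, 100.0, 110.0, 120.0],
    "SO2":   [3.0, 4.0, 5.0, 6.0, 7.0],
})

per_station = {}
for name, g in df.groupby("stationId"):
    g = g.copy()
    # 1) one dataset (or file) per station identifier
    # 2) drop every row holding one or more missing values
    g = g.dropna(subset=COLS)
    # 3) min-max normalize each concentration column into [0, 1]
    g[COLS] = (g[COLS] - g[COLS].min()) / (g[COLS].max() - g[COLS].min())
    per_station[name] = g

print(per_station["s1"]["PM2.5"].tolist())   # [0.0, 1.0]
```

In the real pipeline each `per_station[name]` would be written to its own file named after the station, as described above.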
3.1 Develop long short-term memory (DLSTM)
This paper shows how PSO can be employed, through a new algorithm called DSN-PSO as explained in Algorithm 2, to enhance the performance of the deep learning algorithm LSTM (for more detail, see the main steps for training the LSTM–RNN in the "Appendix") by determining its structure and parameters. This is explained in detail in Algorithm 3 (Table 1).
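The idea can be sketched as a minimal particle swarm searching over the network structure. The details of Algorithm 2 are not reproduced here; the fitness function below is a hypothetical stand-in (the real DSN-PSO would train an LSTM with the candidate structure and return its validation error), and the bounds and swarm parameters are illustrative assumptions:

```python
import random

def fitness(layers, nodes):
    # Hypothetical stand-in: the real DSN-PSO would train an LSTM with this
    # (#hidden layers, #nodes) structure and return its validation SMAPE.
    return abs(layers - 1) + abs(nodes - 250) / 250.0

def pso_structure_search(n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Each particle encodes (#hidden layers, #nodes per hidden layer).
    pos = [[rng.uniform(1, 4), rng.uniform(10, 500)] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=lambda p: fitness(round(p[0]), round(p[1])))[:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(2):
                # Standard PSO velocity update: inertia + cognitive + social.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if (fitness(round(pos[i][0]), round(pos[i][1]))
                    < fitness(round(pbest[i][0]), round(pbest[i][1]))):
                pbest[i] = pos[i][:]
                if (fitness(round(pbest[i][0]), round(pbest[i][1]))
                        < fitness(round(gbest[0]), round(gbest[1]))):
                    gbest = pbest[i][:]
    return round(gbest[0]), round(gbest[1])

layers, nodes = pso_structure_search()
print(layers, nodes)   # converges toward a structure such as (1, ~250)
```

The key contrast with trial-and-error tuning is that each particle's candidate structure is scored once per iteration and the swarm is pulled toward the best structure found so far, rather than a human re-running the network with hand-picked settings.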
3.2 Running the IFCsAP model
We train and predict concentration movements for several epochs and observe whether the predictions improve or worsen over time. The algorithm shows how the IFCsAP model is executed.
3.3 Evaluation stage
The symmetric mean absolute percentage error (SMAPE) is used in this paper as the measure for determining the accuracy and robustness of the predictor.
\({\text{SMAPE}} = \frac{1}{n}\mathop \sum \nolimits_{t = 1}^{n} \frac{{\left| {F_{t} - A_{t} } \right|}}{{\left( {\left| {A_{t} } \right| + \left| {F_{t} } \right|} \right)/2}}\) where n is the number of samples, \(F_{t}\) is the forecast value, \(A_{t}\) is the actual value, and t indexes every fitted point. If the forecast value and the actual value are both 0, the SMAPE score is set to 0. At each station, the concentration levels of PM2.5, PM10, NO2, CO, O3, and SO2 are forecast for the next 48 h. We calculate the value of this measure daily over one contiguous month, then sort these values and compute the average of the 25 lowest daily SMAPE scores. The main steps of the evaluation are shown in detail in Algorithm 5.
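The monthly scoring procedure can be sketched as follows; the synthetic 30-day data are purely illustrative, standing in for one station's 48-hour actual/forecast series:

```python
import numpy as np

def smape(actual, forecast):
    """SMAPE as defined above; a term where both values are 0 contributes 0."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    denom = (np.abs(a) + np.abs(f)) / 2.0
    safe = np.where(denom == 0, 1.0, denom)
    return float(np.mean(np.where(denom == 0, 0.0, np.abs(f - a) / safe)))

def monthly_score(daily_actual, daily_forecast, keep=25):
    """Daily SMAPE over the month, sorted; average of the `keep` lowest scores."""
    scores = sorted(smape(a, f) for a, f in zip(daily_actual, daily_forecast))
    return float(np.mean(scores[:keep]))

# 30 synthetic days of 48-hour actual/forecast pairs (forecasts within ±10%).
rng = np.random.default_rng(0)
actual = [rng.uniform(10, 100, 48) for _ in range(30)]
forecast = [a * rng.uniform(0.9, 1.1, 48) for a in actual]
print(round(monthly_score(actual, forecast), 4))
```

Keeping only the 25 lowest daily scores, as the text specifies, makes the monthly figure robust to a few bad days (for example, days with sensor faults).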
4 Results of IFCsAP model
The results, together with their justification, are explained in detail in this section.
4.1 Pre-processing
This stage consists of multiple steps performed on the dataset after collecting it; each step addresses one problem in the dataset, as discussed below.
4.1.1 Split station
The second column of Table 4 shows the result of splitting the dataset by station name, where each station is saved in a separate file holding its name.
4.1.2 Missing values [8]
Missing values are one of the problems that affect the final results of any model, especially a prediction model: the results of a predictor are more accurate when it is built on true values, and otherwise they cannot be trusted. Therefore, in this model we drop any record that has missing values at each station. The stations have different rates of missing values, as shown in the third column of Table 2 and in Fig. 5.
Table 2 describes the dataset after splitting it into 35 stations, each with the same number of records (8886) and six features. It also shows the dataset after handling the missing values, together with the dropping rate. Fig. 6 shows the percentage of records with missing values at each station.
4.1.3 Normalization
The dataset is normalized using MinMaxScaler so that values fall in the range [0, 1] [20, 25]. This is a necessary step for the proposed predictor. The main purpose of the normalization stage is to bring all values into the same range while preserving the nature of each feature in the dataset.
4.1.4 Split the dataset
Cross-validation is one of the best techniques for evaluating the performance of a given model. Because badly selected training and testing samples degrade performance, cross-validation provides methods for wisely selecting the best samples for training and testing, as shown in Table 3 (attached in the "Appendix") and Fig. 7.
Table 4 illustrates the idea of cross-validation; in this paper, ten-fold cross-validation is used for each station to determine the best split of samples between the training dataset used to build the model and the testing dataset used to evaluate it.
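One way to realize ten splits over a station's series is sketched below. The paper's exact fold construction is not given, so this assumes forward-chaining splits, a common choice for sequence data because the training set then never contains readings from after the test window:

```python
import numpy as np

def forward_chain_splits(n_records, n_splits=10):
    """Ten forward-chaining train/test splits over one station's series."""
    fold = n_records // (n_splits + 1)
    splits = []
    for k in range(1, n_splits + 1):
        train = np.arange(0, k * fold)                              # all past records
        test = np.arange(k * fold, min((k + 1) * fold, n_records))  # the next block
        splits.append((train, test))
    return splits

splits = forward_chain_splits(8886)    # record count per station from Table 2
print(len(splits))                     # 10
```

Each candidate model would be trained on `train` and scored on `test` for every fold, and the fold giving the best score kept, mirroring the "best split from ten cross-validations" used in the following subsections.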
We note that the station with the highest percentage of missing values has a very high SMAPE score compared with the stations with the lowest percentages. We conclude that dropping rows makes the predictor results more accurate than other methods of processing missing values (Fig. 8).
4.2 DSN-PSO
Selecting suitable parameters for any deep learning algorithm is considered one of the main challenges in the field; in general, known LSTM implementations take a very long time to produce results. This section shows how DSN-PSO solves this problem and overcomes this challenge: the optimal structure and main parameters were found for DLSTM.
In other words, the number of hidden layers, the number of nodes in each hidden layer, the weights among layers, the biases, and the activation function type of the deep network are essential parameters that fundamentally affect DLSTM performance. In general, networks rely on the trial-and-error principle to select their parameters, which leads to long implementation times. Therefore, the main parameters of DLSTM are produced by DSN-PSO, as shown in Table 5, while Table 6 shows the best parameters representing the structure of DLSTM compared with the parameters of a traditional LSTM.
Table 6 lists the best parameters (number of hidden layers, number of nodes in each hidden layer, weights, biases, and activation function) produced by the DSN-PSO algorithm, which represent the initial structure of the DLSTM (Table 7).
4.3 DLSTM
DLSTM is based mainly on the LSTM algorithm, which is capable of handling large data and retains information for long periods because each cell contains a memory. In this stage, the parameters produced by DSN-PSO, which represent the structure of DLSTM, are forwarded together with each station's dataset, generated from the best split of the ten-fold cross-validation, to train the DLSTM. The prediction values for each station (Station #1 ... Station #35) are then computed based on the best split resulting from the ten-fold cross-validation. The best structure produced by DSN-PSO for DLSTM consists of one input layer with six nodes, each representing one of the six concentrations; one hidden layer containing 250 nodes; and one output layer. All other parameters and the activation function are described in Table 8. We used 150 iterations, entering a batch of size 24 in each iteration.
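The memory-cell behavior described above can be illustrated with a minimal NumPy forward pass of a single LSTM cell. The sizes match the structure reported here (6 input features, 250 hidden units, a 24-step window), but the random weights are illustrative only; this is not the trained DLSTM:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b, H=250):
    """One forward step of an LSTM cell with hidden size H."""
    z = W @ x + U @ h + b                  # all four gate pre-activations at once
    i = 1 / (1 + np.exp(-z[:H]))           # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))        # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))      # output gate
    g = np.tanh(z[3*H:])                   # candidate cell state
    c_new = f * c + i * g                  # memory cell: keep old info, add new
    h_new = o * np.tanh(c_new)             # exposed hidden state
    return h_new, c_new

rng = np.random.default_rng(0)
H, D = 250, 6                              # hidden units, input concentrations
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h = c = np.zeros(H)
for t in range(24):                        # one batch-sized window of readings
    x = rng.normal(size=D)
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)                             # (250,)
```

The cell state `c` is what lets the network carry information across many time steps: the forget gate decides how much of it survives each step, which is the property the text attributes to DLSTM's long-term memory.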
A comparison between the actual values and the prediction values produced by DLSTM for the first station is shown in Fig. 9.
The prediction values for Station #18, based on the best split resulting from the ten-fold cross-validation, are compared with the real values in Fig. 10.
The prediction values for Station #34, based on the best split resulting from the ten-fold cross-validation, are compared with the real values in Fig. 11.
4.4 SMAPE evaluation
After building the DLSTM on the training dataset for each station, the model is evaluated by computing the SMAPE on the testing dataset.
The resulting score for each concentration is the average of the 25 lowest daily SMAPE scores. If a concentration misses a day, its score for that day is imputed with the baseline score, as shown in the "Appendix" under Table 8.
5 Comparison between traditional LSTM and IFCsAP based on SMAPE values
To demonstrate the success of the IFCsAP model, we compare the SMAPE values obtained from the traditional LSTM and from IFCsAP, as shown in Table 9.
Table 9 shows the SMAPE results of the IFCsAP model compared with those of the traditional LSTM, which used the same dataset from the pre-processing stage (i.e., the dropping, the normalization, and the same training/testing split resulting from the ten-fold cross-validation) at each station. We found that the SMAPE results of the IFCsAP model are better than those of the traditional LSTM, as shown in Fig. 12.
6 Summary
The air quality index dataset is huge and requires intelligent, deep computation to extract useful patterns. The advantage of this dataset is that it is diverse and large, which leads to accurate and reliable decisions. In addition, the data used in this work were obtained from more than one station, which is itself a challenge when building a stable behavioral prediction system. The limitations of this dataset are that the concentrations causing air pollution are usually unequal and unfamiliar to non-experts, that it contains missing values, and that it is taken from stations in different environments.
DSN-PSO determines the parameters and activation function of DLSTM. Its advantage is that it reduces the execution time of LSTM; its limitation is that it increases the complexity of LSTM.
DLSTM is a development of LSTM by DSN-PSO, where PSO is used to determine the optimal number of hidden layers, number of nodes in each hidden layer, weights, biases, and activation function. The advantage of DLSTM is its ability to deal with huge data, containing memory cells that retain information over the long term; its limitation is that it contains a huge number of parameters.
Evaluation is the process of calculating the amount of error between the actual value and the predicted value. There are different types of error measures, including prediction measures (e.g., MSE, RMSE, MAE, MAPE) and confusion-matrix measures (e.g., accuracy, F-score, false positives). This research uses the SMAPE evaluation.
-
How can particle swarm optimization be useful in building a recurrent neural network (RNN)?
PSO gradually modifies the behavior of each particle in a particular environment, depending on the behavior of its neighbors, until the optimal solution is obtained.
On the other hand, neural networks use the trial-and-error principle to select their basic parameters, modifying them gradually until acceptable values are reached.
Based on the two points above, we used the PSO principle to find the optimal parameters and the activation function of the neural network.
-
How can a multi-layer model be built by combining the two technologies LSTM-RNN and particle swarm?
By building a new predictor called IFCsAP that combines DSN-PSO and DLSTM, where DSN-PSO is used to find the best structure and parameters for the LSTM while DLSTM is used to predict the concentrations of air pollutants.
-
Is the SMAPE measure sufficient to evaluate the results of the suggested predictor?
Yes, SMAPE is sufficient to evaluate the results of the predictor over the next 48 h.
-
What benefit results from building a predictor that combines DSN-PSO and DLSTM?
Combining DSN-PSO and DLSTM reduces the execution time by defining the network parameters automatically, but at the same time it increases the computational complexity.
7 Conclusions
We can summarize the main contributions of this paper as follows. An integrated platform based on hardware and software entities was built in the form of an integrated station (H/W, S/W) that is used for essential needs only and reduces the damage resulting from air pollution; this platform saves effort and cost through sensor programming, activating the sensors' role in reading real-time data on the concentrations that cause air pollution, thereby increasing performance and reducing effort, time, and cost. A special station was built to measure the concentrations that cause air pollution, depending on the principle of intelligent data analysis (IDA), where data are collected from the stations, considered as class nodes, over the wireless network that was built (LoRa and Waspmote) to the computer, considered as the master node. The IFCsAP is fed with the data collected in real time, preliminary processing is performed on the data, and the predictor results are then evaluated using the symmetric mean absolute percentage error (SMAPE). Through this measure, we evaluate the levels of PM2.5, PM10, NO2, CO, O3, and SO2 for the next 48 h at each station. The data often contain a proportion of missing or incomplete values, which increases the prediction or classification error; this problem can be addressed by deleting the entries that contain such values, creating a more accurate forecast. The purpose of the normalization process is to convert data into a specified range of values so that they can be handled more accurately in the subsequent processing stages. In our work, the data were converted into the range [0, 1] because the activation function deals with data in that range.
The IFCsAP was designed to address one of the most important environmental problems at present: rising pollution caused by electronic waste, factories and laboratories, together with the lack of real projects in Iraq to reduce air-pollution rates. The designed model proved its accuracy and efficiency in predicting the concentrations that cause air pollution. It is distinguished by a new tool called DLSTM, characterized by its ability to handle large volumes of data and by a memory that enables it to retain information over long periods. Experiments showed that combining the two tools designed in this work, DLSTM and DSN-PS, achieves more accurate results and reduces execution time. The first tool, DSN-PS, was used to select the best parameters for the structure of the second tool, DLSTM, thereby improving the performance of the deep learning models and producing an IFCsAP predictor with more accurate and efficient results.
The following points give directions for future work: explore PSO to tune other LSTM parameters, such as the learning rate, maximum error and number of epochs, instead of relying on trial-and-error principles that take too long to find the optimal parameters for the LSTM network; combine PSO with other deep learning models to find the best parameters and activation functions, again replacing trial-and-error search over the network structure; and apply other types of swarm optimization (ant colony optimization, the cuckoo search algorithm, glowworm swarm optimization) or a genetic algorithm to find the best parameters and activation functions for the LSTM.
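As a sketch of the first suggestion, a minimal PSO loop could search over the learning rate and number of epochs. This is not the paper's DSN-PS implementation; `validation_error` below is a hypothetical stand-in for training the LSTM with the candidate settings and returning its validation SMAPE:

```python
import random

def pso(objective, bounds, n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm optimizer minimizing `objective` over box bounds."""
    dim = len(bounds)
    pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best
    for _ in range(iters):
        for k in range(n_particles):
            for j in range(dim):
                r1, r2 = random.random(), random.random()
                vel[k][j] = (w * vel[k][j]
                             + c1 * r1 * (pbest[k][j] - pos[k][j])
                             + c2 * r2 * (gbest[j] - pos[k][j]))
                lo, hi = bounds[j]
                pos[k][j] = min(max(pos[k][j] + vel[k][j], lo), hi)
            val = objective(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val

def validation_error(params):
    """Hypothetical stand-in for: train the LSTM with (learning rate, epochs)
    and return its validation SMAPE. Here a smooth bowl with minimum at (0.01, 80)."""
    lr, epochs = params
    return (lr - 0.01) ** 2 + (epochs - 80) ** 2 / 1e4

random.seed(42)  # fixed seed so the stochastic search is reproducible
best, err = pso(validation_error, bounds=[(1e-4, 0.1), (10, 200)])
```

In a real run, each objective evaluation would train the DLSTM, so far fewer particles and iterations would be affordable; the epoch count is treated as continuous here and would be rounded before use.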
Abbreviations
- DLSTM: Developed long short-term memory
- LSTM: Long short-term memory
- PSO: Particle swarm optimization
- SMAPE: Symmetric mean absolute percentage error
- PM2.5: Particulate matter with a diameter of less than 2.5 µm
- PM10: Particulate matter 10 µm or less in diameter
- O3: Ozone, the unstable triatomic form of oxygen
- SOx: Sulfur oxides
- CO: Carbon monoxide
- NOx: Nitrogen oxides
- ⊙: Element-wise (Hadamard) product
- ⊗: Outer product
- σ: Sigmoid function
- \(a_{t}\): Input activation
- \(i_{t}\): Input gate
- \(f_{t}\): Forget gate
- \(o_{t}\): Output gate
- \({\text{State}}_{t}\): Internal state
- \({\text{Out}}_{t}\): Output
- W: Weights of the input
- U: Weights of the recurrent connections
- \(V_{i}^{t}\): Velocity of particle i in the swarm in dimension j at iteration t
- \(X_{i}^{t}\): Position of particle i in the swarm in dimension j at iteration t
- \(c_{1}\): Acceleration factor related to Pbest
- \(c_{2}\): Acceleration factor related to gbest
- \(r_{1}^{t}\), \(r_{2}^{t}\): Random numbers between 0 and 1
- t: Number of iterations, determined by the type of problem
- \(G_{{{\text{best}},i}}^{t}\): Gbest position of the swarm
- \(P_{{{\text{best}},i}}^{t}\): Pbest position of the particle
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Appendices
Appendix
1.1 Main steps for training LSTM–RNN
In this section, we show the main steps required to make a decision based on the LSTM-RNN and how its variables are updated.
Step 1: The forward components

Step 1.1: Compute the gates

Memory cell (input activation):
\(a_{t} = \tanh \left( {W_{a} x_{t} + U_{a} {\text{Out}}_{t - 1} + b_{a} } \right)\)

Input gate:
\(i_{t} = \sigma \left( {W_{i} x_{t} + U_{i} {\text{Out}}_{t - 1} + b_{i} } \right)\)

Forget gate:
\(f_{t} = \sigma \left( {W_{f} x_{t} + U_{f} {\text{Out}}_{t - 1} + b_{f} } \right)\)

Output gate:
\(o_{t} = \sigma \left( {W_{o} x_{t} + U_{o} {\text{Out}}_{t - 1} + b_{o} } \right)\)

Then find:

Internal state:
\({\text{State}}_{t} = a_{t} \odot i_{t} + f_{t} \odot {\text{State}}_{t - 1}\)

Output:
\({\text{Out}}_{t} = \tanh \left( {{\text{State}}_{t} } \right) \odot o_{t}\)

where \(\sigma\) is the sigmoid function and \(\odot\) is the element-wise (Hadamard) product.
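As an illustration (not the paper's own code), the forward step of Step 1 can be written directly in NumPy; the parameter-dictionary keys and array shapes are assumptions:

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, the σ of the gate equations."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward_step(x, out_prev, state_prev, P):
    """One LSTM forward step. P maps hypothetical keys such as "Wa", "Ua", "ba"
    to the input weights W, recurrent weights U and biases b of each gate."""
    a = np.tanh(P["Wa"] @ x + P["Ua"] @ out_prev + P["ba"])  # memory cell (input activation)
    i = sigmoid(P["Wi"] @ x + P["Ui"] @ out_prev + P["bi"])  # input gate
    f = sigmoid(P["Wf"] @ x + P["Uf"] @ out_prev + P["bf"])  # forget gate
    o = sigmoid(P["Wo"] @ x + P["Uo"] @ out_prev + P["bo"])  # output gate
    state = a * i + f * state_prev                           # internal state
    out = np.tanh(state) * o                                 # output
    return out, state
```

Iterating this step over the time dimension of a sequence produces the outputs that the backward components below differentiate.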
Step 2: The backward components

Step 2.1: Find
\(\Delta_{t}\), the output difference as computed by any subsequent layer, and
\(\Delta {\text{Out}}_{t}\), the output difference as computed by the next time-step.

Step 2.2: This gives
\(\delta {\text{Out}}_{t} = \Delta_{t} + \Delta {\text{Out}}_{t}\)
\(\delta {\text{State}}_{t} = \delta {\text{Out}}_{t} \odot o_{t} \odot \left( {1 - \tanh^{2} \left( {{\text{State}}_{t} } \right)} \right) + \delta {\text{State}}_{t + 1} \odot f_{t + 1}\)
\(\delta a_{t} = \delta {\text{State}}_{t} \odot i_{t} \odot \left( {1 - a_{t}^{2} } \right)\)
\(\delta i_{t} = \delta {\text{State}}_{t} \odot a_{t} \odot i_{t} \odot \left( {1 - i_{t} } \right)\)
\(\delta f_{t} = \delta {\text{State}}_{t} \odot {\text{State}}_{t - 1} \odot f_{t} \odot \left( {1 - f_{t} } \right)\)
\(\delta o_{t} = \delta {\text{Out}}_{t} \odot \tanh \left( {{\text{State}}_{t} } \right) \odot o_{t} \odot \left( {1 - o_{t} } \right)\)

with \(\delta {\text{gates}}_{t} = \left[ {\delta a_{t} ,\delta i_{t} ,\delta f_{t} ,\delta o_{t} } \right]^{T}\) and \(\Delta {\text{Out}}_{t - 1} = U^{T} \cdot \delta {\text{gates}}_{t}\).

Step 3: Update the internal parameters
\(W \leftarrow W - \lambda \sum\nolimits_{t} {\delta {\text{gates}}_{t} \otimes x_{t} }\)
\(U \leftarrow U - \lambda \sum\nolimits_{t} {\delta {\text{gates}}_{t + 1} \otimes {\text{Out}}_{t} }\)
\(b \leftarrow b - \lambda \sum\nolimits_{t} {\delta {\text{gates}}_{t + 1} }\)

where \(\lambda\) is the learning rate.
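The backward components for a single time-step can likewise be sketched in NumPy. This is an illustration under assumed names, with `cache` holding the forward-pass quantities:

```python
import numpy as np

def lstm_backward_step(delta_t, delta_out_next, cache, f_next, dstate_next):
    """Gate gradients for one time-step of LSTM backpropagation.
    cache = (a, i, f, o, state, state_prev) from the forward pass;
    f_next and dstate_next come from the following time-step (zeros at the end)."""
    a, i, f, o, state, state_prev = cache
    d_out = delta_t + delta_out_next                            # δOut_t
    d_state = (d_out * o * (1 - np.tanh(state) ** 2)
               + dstate_next * f_next)                          # δState_t
    d_a = d_state * i * (1 - a ** 2)                            # δa_t
    d_i = d_state * a * i * (1 - i)                             # δi_t
    d_f = d_state * state_prev * f * (1 - f)                    # δf_t
    d_o = d_out * np.tanh(state) * o * (1 - o)                  # δo_t
    # Stacked gate gradients feed the parameter updates and, via U.T, ΔOut_{t-1}
    return np.concatenate([d_a, d_i, d_f, d_o]), d_state
```

Running this from the last time-step back to the first, then applying the SGD updates with the accumulated gate gradients, completes one training iteration.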
2.1 Example of LSTM
This example shows the calculations performed in an LSTM.
Assume the internal weights:
Let the input data be:
Forward pass, t = 0:
Backward pass, t = 1:
Compute the difference between the outputs:
So, \(\Delta_{1} = \partial_{x} E = 0.77197 - 1.25 = - 0.47803\).
\(\Delta {\text{out}}_{1} = 0\) because there are no future time-steps.
At this point we propagate \(\Delta {\text{out}}\) back and continue the calculation.
Backward pass, t = 0:
After completing the backward pass, let λ = 0.1 and update the parameters.
Based on SGD, the updated parameters are:
\(W_{a} = \left[ {\begin{array}{*{20}c} {0.45267} \\ {0.25922} \\ \end{array} } \right],\;\;U_{a} = \left[ {0.15104} \right],\;\;b_{a} = \left[ {0.20364} \right]\).
\(W_{i} = \left[ {\begin{array}{*{20}c} {0.95022} \\ {0.80067} \\ \end{array} } \right],\;\;U_{i} = \left[ {0.80006} \right],\;\;b_{i} = \left[ {0.65028} \right]\).
About this article
Cite this article
Al-Janabi, S., Alkaim, A., Al-Janabi, E. et al. Intelligent forecaster of concentrations (PM2.5, PM10, NO2, CO, O3, SO2) caused air pollution (IFCsAP). Neural Comput & Applic 33, 14199–14229 (2021). https://doi.org/10.1007/s00521-021-06067-7