Introduction

Fine particle (PM2.5) concentrations in the air contribute to harmful health effects (Rodríguez-Cotto et al. 2014). In November 2018, the Brazilian Environmental Council (CONAMA 2018) updated air quality standards, including the standard for PM2.5. This update, however, will be implemented in steps. Initially, the intermediate standard for PM2.5 will be 20 μg m−3 per year and 60 μg m−3 in 24 h. The final standard will meet the World Health Organization (WHO) recommendations, i.e., 10 μg m−3 per year and 25 μg m−3 in 24 h. In this study, we compare PM2.5 levels with the guidelines recommended by the WHO.

Rio de Janeiro state has the oldest air quality monitoring network in Brazil and one of the oldest in Latin America, having been in operation since the 1960s (Gioda et al. 2016). However, only in 2010 did the Rio de Janeiro network start PM2.5 monitoring (Ventura et al. 2017a, b).

Due to the mortality and morbidity caused by PM2.5, air pollution controls have become urgent (Liu and Peng 2018; Pope et al. 2018). Making predictions based on time series prediction techniques is fundamental to the analysis and support needed for environmental agencies to make decisions (Relvas and Miranda 2018; Mehdipour et al. 2018).

Statistical models of air quality forecasting have been sparsely used in South America to predict critical pollution episodes, hampering the ability to control emissions on critical days (Perez 2012). There are many statistical models available to predict air pollutant concentrations, such as principal component analysis with multiple linear regression (PCA-MLR) (e.g., Ul-Saufie et al. 2013), autoregressive integrated moving average (ARIMA) (e.g., Díaz-Robles et al. 2008), nearest neighbor model (NNM) (e.g., Perez 2012), and artificial neural network (ANN) (e.g., Perez and Reyes 2006; Chattopadhyay and Chattopadhyay 2012; Luna et al. 2014). Among these models, artificial neural networks have proven useful and effective for predicting PM2.5 concentrations at the local scale (Perez et al. 2000; Mckendry 2002; Ordieres et al. 2005; Thomas and Jacko 2007; Voukantsis et al. 2011). On the other hand, the Holt–Winters (HW) model, a widely known and used seasonal time series prediction technique (Dantas et al. 2017), has not previously been applied to predict air quality levels. According to Tratar and Strmčnik (2016), the Holt–Winters model is very simple and robust. This model has been used to predict sales (Ribeiro et al. 2017), energy demand (Tratar and Strmčnik 2016), food industry (Veiga et al. 2014), air transportation demand (Dantas et al. 2017), traffic in cities (Baghyasree et al. 2014), tourism (Lim et al. 2009), power generation (Muche 2014), and solid waste generation (Bezerra 2006). The method models data through an average (level), a trend, and a seasonal factor, all of which are updated by exponential smoothing (Winters 1960).

This study aims to evaluate two air quality forecasting models (Holt–Winters [HW] and artificial neural networks [ANN]) using time series of PM2.5 concentrations from three different areas (rural, urban, and industrial) in the metropolitan region of Rio de Janeiro, Brazil. This is the longest PM2.5 time series obtained in Rio de Janeiro and one of the longest in South America. To our knowledge, this is the first study developed in a tropical region of South America applying ANN and HW models to predict PM2.5 daily concentrations.

Material and methods

Sampling

The metropolitan region of Rio de Janeiro (MRRJ) has different air pollution sources associated with a complex topography that inhibits air mass movement and consequently affects pollutant dispersions (Ventura et al. 2017a, b).

For this study, three sampling sites in MRRJ were selected: (1) Seropédica (22° 45′ 28.14″ S/43° 41′ 5.85″ W), a region with rural characteristics; (2) Taquara (22° 56′ 58.34″ S/43° 21′ 33.94″ W), an urban area with high population concentration, heavy traffic, and some pharmaceutical industries; and (3) Duque de Caxias (22° 40′ 26.50″ S/43° 17′ 12.99″ W), the main industrial area in MRRJ, comprising many petrochemical industries, a power plant, and a refinery, in addition to heavy traffic.

PM2.5 samples were collected by the Environment Institute of Rio de Janeiro State (INEA) from January 2011 to December 2013, including rainy days, totaling 180 samples per site. The samples were collected on glass fiber filters using Hi-Vol samplers (AGVMP252/Energy, Brazil) with a flow rate of 1.14 m3 min−1, simultaneously at all sites, every 6 days, for 24 h (NBR 13412 method, similar to ASTM-D 4096 method).

Meteorological data

Meteorological data were measured every 15 min by public agencies (INMET and INEA) from January 2011 to December 2013. The meteorological variables were temperature (T), relative humidity (RH), wind speed (WS), and atmospheric pressure (P). Although solar radiation is a good meteorological parameter for modeling, it was not monitored at all sites; therefore, this parameter was not used in the models. Wind direction was also not used because several studies did not show a significant correlation with PM2.5 (Luna et al. 2014; Ventura et al. 2017a, b; Mehdipour et al. 2018).

Descriptive statistics for these variables are presented in Table S1 in Supplementary Material.

Prediction of PM2.5 concentration applying Holt–Winters model

Holt–Winters is an exponential smoothing prediction model, in use since the 1960s, that captures the linear trend and seasonality of a time series; it is an extension of the simple exponential smoothing method. The HW model applies three exponential smoothing formulae. The mean is smoothed to give a local average value (level) for the series (Eq. 1). The trend is then smoothed (Eq. 2), and lastly each seasonal sub-series is smoothed separately to give a seasonal estimate for each season (Eq. 3). For a series with a trend and a seasonal component, these formulae can be applied using the HW additive or multiplicative method; Eqs. 1–3 are written in the additive form.

$$ {L}_t=\alpha \left({Y}_t-{S}_{t-s}\right)+\left(1-\alpha \right)\left({L}_{t-1}+{b}_{t-1}\right) $$
(1)
$$ {b}_t=\beta \left({L}_t-{L}_{t-1}\right)+\left(1-\beta \right){b}_{t-1} $$
(2)
$$ {S}_t=\gamma \left({Y}_t-{L}_t\right)+\left(1-\gamma \right){S}_{t-s} $$
(3)

where α, β, and γ are the smoothing parameters, Yt is the observed value at time t, Lt is the smoothed level at time t, bt is the trend estimate at time t, St is the seasonal component at time t, and s is the number of seasons per year.

An additive model is used when the amplitude of seasonal variation remains constant, and a multiplicative model is used when the amplitude of the seasonal variation increases in time (Winters 1960). The Holt–Winters additive model was used in this study, and its forecast was calculated as the sum of the level, trend, and seasonal components (Eq. 4), where Ft+m is the forecast m periods ahead of time t.

$$ {F}_{t+m}={L}_t+m{b}_t+{S}_{t-s+m} $$
(4)
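
The additive recursions of Eqs. 1–3 and the additive forecast (the sum of level, trend, and seasonal terms) can be sketched in a few lines. The study itself used R; the Python below is only an illustrative sketch. The function name `holt_winters_additive` and the initialisation of the level, trend, and seasonal terms from the first two seasonal cycles are assumptions, not details taken from the study.

```python
def holt_winters_additive(y, s, alpha, beta, gamma, horizon):
    """Additive Holt-Winters; returns (one-step-ahead fits, forecasts)."""
    if len(y) < 2 * s:
        raise ValueError("need at least two full seasonal cycles")
    # Initialisation (an assumption): level = mean of the first cycle,
    # trend = mean per-step change between the first two cycles,
    # seasonal terms = deviations of the first cycle from its mean.
    first = sum(y[:s]) / s
    second = sum(y[s:2 * s]) / s
    level = first
    trend = (second - first) / s
    seasonal = [v - first for v in y[:s]]
    fitted = []
    for t in range(s, len(y)):
        s_prev = seasonal[t - s]                     # S_{t-s}
        fitted.append(level + trend + s_prev)        # forecast of y[t] made at t-1
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)  # Eq. 1
        trend = beta * (new_level - level) + (1 - beta) * trend              # Eq. 2
        seasonal.append(gamma * (y[t] - new_level) + (1 - gamma) * s_prev)   # Eq. 3
        level = new_level
    n = len(y)
    # Additive forecast: F_{t+m} = L_t + m * b_t + S_{t-s+m}; the seasonal
    # index wraps around when the horizon exceeds one seasonal cycle.
    forecasts = [level + m * trend + seasonal[n - s + (m - 1) % s]
                 for m in range(1, horizon + 1)]
    return fitted, forecasts
```

Calling `holt_winters_additive(series, s=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=5)` on a list of concentrations would return the in-sample one-step-ahead fits and the next five forecasts; the parameter values here are placeholders.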

All data values in a series contribute to the calculation of the prediction model (Winters 1960). The smoothing parameters (α, β, and γ) were chosen to give the lowest mean squared error (MSE). The seasonality was set to four seasons per year. The training set was the first 173 observations of the PM2.5 time series. The tests were performed on the last 5 and last 10 observations of the PM2.5 time series, to assess forecasts of the next 5 and 10 PM2.5 concentrations, respectively.
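
One simple way to choose α, β, and γ by lowest MSE is an exhaustive grid search over the one-step-ahead errors of the training series. The sketch below is in Python for illustration only (the study used R); the grid search, the 0.1 grid step, the initialisation, and the names `hw_one_step_mse` and `fit_smoothing_parameters` are assumptions, since the text does not state which optimiser was used.

```python
from itertools import product


def hw_one_step_mse(y, s, alpha, beta, gamma):
    """Mean squared one-step-ahead error of additive Holt-Winters on y."""
    first = sum(y[:s]) / s
    second = sum(y[s:2 * s]) / s
    level, trend = first, (second - first) / s   # assumed initialisation
    seasonal = [v - first for v in y[:s]]
    sq_err = 0.0
    for t in range(s, len(y)):
        s_prev = seasonal[t - s]
        sq_err += (y[t] - (level + trend + s_prev)) ** 2
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal.append(gamma * (y[t] - new_level) + (1 - gamma) * s_prev)
        level = new_level
    return sq_err / (len(y) - s)


def fit_smoothing_parameters(y, s, step=0.1):
    """Return ((alpha, beta, gamma), mse) minimising the training MSE on a grid."""
    grid = [round(k * step, 10) for k in range(1, int(round(1 / step)))]
    best = min(product(grid, repeat=3),
               key=lambda p: hw_one_step_mse(y, s, *p))
    return best, hw_one_step_mse(y, s, *best)
```

In use, the last observations would be held out first (for example `train = series[:-10]`), the parameters fitted on `train` with `s=4`, and the held-out values compared with the forecasts.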

Prediction of PM2.5 concentrations by applying artificial neural network

Artificial neural networks were originally developed to mimic basic biological neural systems, which are composed of neurons. An artificial neuron is a mathematical structure that seeks to simulate the shape, behavior, and functions of a biological neuron. An artificial neural network corresponds to a set of artificial neurons organized in layers (input, hidden, and output) (Chattopadhyay and Chattopadhyay 2012; Relvas and Miranda 2018). According to Luna et al. (2014), this model has been successfully used by other authors to predict air quality levels.

To apply the artificial neural networks, it was necessary to set the number of lags (input variables) and the number of neurons in each hidden layer. The number of lags was defined from the autocorrelation function (ACF) by selecting the last lag whose autocorrelation exceeds the 95% confidence limit (Fig. S1, Suppl. Material). No validation set was used, and training was conducted with 1000 iterations of the batch gradient descent method. The training set was the initial 173 PM2.5 concentrations, leaving out the last 10 observations of each input variable, which were used to test the predictions of the next 5 and 10 PM2.5 concentrations. Due to the low number of input variables, topologies (number of neurons in the hidden layer) equal to 1 were used for Taquara and Seropédica and equal to 2 for Duque de Caxias.
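
The lag selection described above can be sketched as follows, in Python for illustration only (the study used R). The sketch assumes the usual large-sample 95% bound of ±1.96/√n and reads the criterion as "the largest lag whose sample autocorrelation lies outside that bound"; the function names and the `max_lag` default are hypothetical.

```python
import math


def sample_acf(y, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    if denom == 0:
        return [0.0] * max_lag  # constant series: no autocorrelation structure
    return [sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def last_significant_lag(y, max_lag=20):
    """Largest lag whose |ACF| exceeds the approximate 95% bound (0 if none)."""
    bound = 1.96 / math.sqrt(len(y))
    acf = sample_acf(y, max_lag)
    significant = [k for k, r in enumerate(acf, start=1) if abs(r) > bound]
    return max(significant) if significant else 0


def lagged_matrix(y, p):
    """Build network inputs (the p previous values) and targets (the next value)."""
    inputs = [y[t - p:t] for t in range(p, len(y))]
    targets = y[p:]
    return inputs, targets
```

With `p = last_significant_lag(series)`, `lagged_matrix(series, p)` would give one input row of `p` past concentrations for each target concentration.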

Prediction of PM2.5 concentrations by applying ANN associated to meteorological parameters

Before being input to the prediction model, the meteorological variables were converted to daily means. The lowest MSE was used as the criterion for the choice of topology (Table S2, Suppl. Material). The MSE averages for Taquara, Seropédica, and Duque de Caxias were based on 100 initializations of the network, calculated for the next 10 predicted PM2.5 concentrations, considering the input variables and different topologies. The results indicated that the best topology has 2 hidden neurons for Taquara and Seropédica and 4 for Duque de Caxias.
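
The topology selection described above (mean test MSE over repeated random initialisations, for each candidate number of hidden neurons) can be sketched with a minimal one-hidden-layer network trained by batch gradient descent. This is Python for illustration only, not the R code used in the study. The sigmoid hidden units, linear output, learning rate, weight range, and all function names are assumptions, and inputs are assumed to be scaled to roughly [0, 1] beforehand.

```python
import math
import random


def _forward(W1, W2, x):
    """Forward pass: sigmoid hidden layer, linear output; biases are the last weights."""
    xb = list(x) + [1.0]
    h = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, xb)))) for row in W1]
    hb = h + [1.0]
    return xb, h, hb, sum(w * v for w, v in zip(W2, hb))


def train_mlp(X, y, hidden, epochs, lr, rng):
    """Batch gradient descent on the squared error (constant factor folded into lr)."""
    n_in, n = len(X[0]), len(X)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    for _ in range(epochs):
        g1 = [[0.0] * (n_in + 1) for _ in range(hidden)]
        g2 = [0.0] * (hidden + 1)
        for x, target in zip(X, y):          # accumulate gradients over the whole batch
            xb, h, hb, out = _forward(W1, W2, x)
            err = out - target
            for j in range(hidden + 1):
                g2[j] += err * hb[j]
            for j in range(hidden):
                delta = err * W2[j] * h[j] * (1.0 - h[j])
                for i in range(n_in + 1):
                    g1[j][i] += delta * xb[i]
        # One weight update per pass over the data (batch mode).
        W2 = [w - lr * g / n for w, g in zip(W2, g2)]
        W1 = [[w - lr * g / n for w, g in zip(row, grow)]
              for row, grow in zip(W1, g1)]
    return W1, W2


def mse(model, X, y):
    W1, W2 = model
    return sum((_forward(W1, W2, x)[3] - t) ** 2 for x, t in zip(X, y)) / len(X)


def choose_topology(X_tr, y_tr, X_te, y_te, sizes,
                    n_init=100, epochs=1000, lr=0.1, seed=0):
    """Return (best hidden size, {size: mean test MSE over n_init initialisations})."""
    rng = random.Random(seed)
    mean_mse = {}
    for hidden in sizes:
        scores = [mse(train_mlp(X_tr, y_tr, hidden, epochs, lr, rng), X_te, y_te)
                  for _ in range(n_init)]
        mean_mse[hidden] = sum(scores) / n_init
    return min(mean_mse, key=mean_mse.get), mean_mse
```

The defaults `n_init=100` and `epochs=1000` mirror the numbers quoted in the text; in pure Python they are slow, so smaller values are advisable when experimenting.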

Comparison between the prediction models

After a model is specified, its performance should be verified or validated by comparing its forecasts with historical data using accuracy measures. The root mean square error (RMSE) of the predictions of the next 5 and 10 PM2.5 concentrations was adopted to compare the PM2.5 concentrations predicted in MRRJ by the Holt–Winters and artificial neural network models. Furthermore, to improve the assessment of the forecasting models, determination coefficients (R2) were generated from the estimated values against the real values for each model, separated by wet and dry season and for each sampling year (2011–2013).
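
The two accuracy measures can be written compactly. The sketch below is in Python for illustration only (the study used R). It assumes that R2 is the squared Pearson correlation between observed and estimated values, which is what a regression of one on the other yields; the text does not spell out the definition, and the function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (here, μg m−3)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return float("nan")  # undefined when either series is constant
    return cov ** 2 / (var_o * var_p)
```

To reproduce a breakdown by season or year, the same functions would be applied to the subset of observed and estimated pairs belonging to each period.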

All statistical analyses were performed using the free statistical computing language R (R Development Core Team 2014).

Results and discussion

PM2.5 concentrations

Table 1 shows the maximum, minimum, and annual average PM2.5 concentrations from 2011 to 2013, and the number of times that concentrations exceeded the daily guideline recommended by the WHO (25 μg m−3). In Seropédica, a rural area, the average PM2.5 concentrations were the lowest registered (10–11 μg m−3). These averages were slightly above the annual limit recommended by the WHO (10 μg m−3), and only 2 exceedances occurred.
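
A summary of this kind (minimum, maximum, annual mean, and number of daily exceedances per year) can be computed as sketched below, in Python for illustration only. The function name and the input layout, a list of `(year, concentration)` pairs, are assumptions; the two limits are the WHO values quoted in the text.

```python
from collections import defaultdict

WHO_DAILY_LIMIT = 25.0   # μg m−3, 24-h guideline quoted in the text
WHO_ANNUAL_LIMIT = 10.0  # μg m−3, annual guideline quoted in the text


def summarize_by_year(samples):
    """samples: iterable of (year, 24-h PM2.5 concentration in μg m−3)."""
    by_year = defaultdict(list)
    for year, conc in samples:
        by_year[year].append(conc)
    summary = {}
    for year, values in sorted(by_year.items()):
        mean = sum(values) / len(values)
        summary[year] = {
            "min": min(values),
            "max": max(values),
            "mean": mean,
            # A day counts as an exceedance only if strictly above the limit.
            "exceedances": sum(v > WHO_DAILY_LIMIT for v in values),
            "above_annual_guideline": mean > WHO_ANNUAL_LIMIT,
        }
    return summary
```

Applied separately to the samples of each site, this would yield the quantities of the kind reported in Table 1.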

Table 1 Maximum, minimum, and average PM2.5 concentrations (μg m−3) and the exceeding number of the air quality guidelines from the World Health Organization (WHO 2006)

The monitoring data from Duque de Caxias revealed that from 2011 to 2013, the annual average PM2.5 concentrations were similar (18–20 μg m−3) and exceeded the WHO limit, with 9–13 exceedance days. The maximum concentration in this area was two times higher than in Seropédica in 2013. These high concentrations are related to the proximity of an industrial complex and roads with heavy traffic.

Compared to other sites, Taquara presented the highest concentrations in 2011 (30 μg m−3) and 2012 (23 μg m−3). In 2011, it registered the highest number of exceedances (35) of the daily standard, three times higher than others in the study. On the positive side, the annual average PM2.5 concentrations fell by 23% from 2011 to 2013. These results were similar to the ones observed by Godoy et al. (2009) and Ventura et al. (2017a, b), which also indicated Taquara as the place with the worst air quality among the sites analyzed in MRRJ.

Comparing the three sites, the annual average PM2.5 concentration was highest in Taquara and lowest in Seropédica. These results suggest that vehicle emissions are the main source of fine particles in MRRJ, with Taquara as most representative of urban areas.

Predictions of PM2.5 concentration

Holt–Winters model

The Holt–Winters model was applied to the time series of PM2.5 concentrations monitored from January 2011 to December 2013 in the rural, urban, and industrial areas (Fig. 1). The estimates from the HW model reproduced the central values (median) observed at all sites very well. However, even when taking into account the seasonality and the linear trends, this model was not able to explain PM2.5 concentration peaks in Seropédica (Fig. 1a). For data from Duque de Caxias, in contrast, the model estimated the concentrations successfully (Fig. 1c). The effectiveness of this model for the industrial region was due to the fact that emissions are continuous, so seasonality is an extremely important factor in determining the success of this prediction model.

Fig. 1 Forecast PM2.5 concentration using Holt–Winters model in rural (a), urban (b), and industrial (c) areas

For a method to be adequate for predictions, estimating the concentration peaks is extremely important, because peak days indicate critical episodes of air pollution, demanding that authorities take action to control the emission source and minimize future concentrations. The HW model showed itself to be weak at rendering this kind of estimation in rural and urban areas.

The HW model did not satisfactorily predict PM2.5 concentrations in urban areas. According to Winters (1960), this model is not suitable for estimating variables with cyclical frequencies, which is what occurs in cities due to the cycle of vehicle circulation, which can vary during weekends and holidays.

Artificial neural network model

The same time series for PM2.5 concentrations applied to the HW model was used in artificial neural networks (ANN1) in the industrial, rural, and urban areas (Fig. 2). For urban and rural areas, ANN1 estimated the PM2.5 concentrations better than the HW model, as can be seen in Fig. 2a, b, where the values observed corresponded well to the values estimated. However, for the industrial area (Fig. 2c), the model was inaccurate.

Fig. 2 Forecast PM2.5 concentration using artificial neural networks in rural (a), urban (b), and industrial (c) areas

In industrial areas, PM2.5 concentrations have a more constant profile than in urban areas, due to the fact that emissions in industrial areas follow a certain linearity while urban areas have a more cyclical profile due to irregular traffic emissions. Besides, any changes in the daily activity of the region, such as traffic accidents or civil work, can influence the monitored concentrations.

Artificial neural network with meteorological variables

It is well established that air quality relies strongly on meteorological conditions. Therefore, the effects of different meteorological variables are already implied in the structure of the time series of a determined pollutant, such as PM2.5. Due to the complexity of the correlation and also because of the presence of noise, an explicit consideration of variable effects, such as temperature, wind speed, and relative humidity, can yield a better prediction of particle concentrations (Perez et al. 2000; Luna et al. 2014). Therefore, the time series of PM2.5 concentrations measured from 2011 to 2013, together with meteorological data, were evaluated by artificial neural networks (ANN2) (Fig. 3).

Fig. 3 Forecast PM2.5 concentration using artificial neural networks associated to meteorological variables in rural (a), urban (b), and industrial (c) areas

The artificial neural network model, when associated to meteorological data, improved at estimating PM2.5 concentrations in urban and rural areas (Fig. 3a, b). An increase in information can improve the performance of the artificial neural network model because it facilitates the learning of the synaptic weights. Many studies (e.g., Ordieres et al. 2005; Thomas and Jacko 2007; Perez 2012) have already indicated that an adequate choice of input variables could be the most important step in statistical modeling, considerably increasing the predictive power of models when local meteorological variables are added. On the other hand, this model remained insufficient to explain the concentrations observed in the industrial area (Fig. 3c).

Predicted models accuracy

In order to evaluate numerically the accuracy of the models (HW, ANN1, and ANN2), the root mean square error (RMSE) was used for the stages of simulation (training) and the estimation of the next 5 and 10 PM2.5 concentrations (Table 2). For the three forecasting models, the training RMSE varied from 3.6 to 11.1 μg m−3 and the prediction RMSE ranged from 4.2 to 14.9 μg m−3. The lower RMSE in the training stage is expected, since the more information the model receives about the variation under study, the better its response will be.

Table 2 Root mean square error (RMSE) applied to PM2.5 concentrations (μg m−3) using artificial neural networks and Holt–Winters models

For future concentrations, the estimated RMSE were always higher for the next 10 PM2.5 concentrations. However, the difference in the prediction ranged only from 4 to 11% in relation to the forecast of the next 5 PM2.5 concentrations.

Many researchers found that the meteorological conditions input into the ANN model improved the results’ precision (e.g., Thomas and Jacko 2007; Perez et al. 2000). In this study, a 20–30% reduction was observed in the RMSE when meteorological data was added to ANN models. Furthermore, RMSE results for ANN2 were 37% to 62% smaller than for the HW model. This was possible because meteorological variables introduce seasonal information, improving the generalization. According to Ospina and Zamprogno (2003), the ANN model reports a better performance during long periods of time, because it adjusts more quickly to structural changes through time.

Studies using ANN to predict atmospheric particulate matter concentrations found RMSE between 4 and 37 μg m−3 (Perez et al. 2000; Mckendry 2002; Ordieres et al. 2005; Thomas and Jacko 2007; Voukantsis et al. 2011). Therefore, the results of this study were very similar to the smallest RMSE observed in previous ones.

Predicted models assessment

In the rural area, the three models evaluated presented a determination coefficient (R2) above 0.9 (Fig. 4), which shows that all three are good PM2.5 estimators for this type of zone. When evaluated to determine whether the estimates were influenced by sampling year or season (Table 3), it was verified that all the models maintained steady performance.

Fig. 4 PM2.5 concentration observed versus estimated to rural area from HW (a), ANN1 (b), and ANN2 (c)

Table 3 Coefficient of determination (R2) of PM2.5 concentrations (μg m−3) observed against estimated for each area by year and period

Artificial neural network models were more appropriate to estimate PM2.5 concentration in urban areas, with an R2 higher than 0.95 (Fig. 5), regardless of sampling years or whether it was wet or dry season (Table 3). The Holt–Winters model proved ineffective for the prediction of this pollutant in this zone, since its R2 did not reach 0.7.

Fig. 5 PM2.5 concentration observed versus estimated to urban area from HW (a), ANN1 (b), and ANN2 (c)

It is possible to verify in Fig. 6 that the model that best estimated PM2.5 concentrations in the industrial area was the Holt–Winters (R2 = 0.83). The artificial neural network models presented a low determination coefficient (R2 < 0.7). When the prediction by sampling years was evaluated, the R2 was verified at 0.84 ± 0.05. However, it can be seen in Table 3 that the wet season, which is from November to March, hinders the prediction of the HW model, since the concentrations fluctuate more in their values, due to the concentration drops on rainy days. This same behavior was also repeated in the urban area.

Fig. 6 PM2.5 concentration observed versus estimated to industrial area from HW (a), ANN1 (b), and ANN2 (c)

It is noteworthy in Table 3 that the addition of daily meteorological information did not significantly improve (< 3%) the artificial neural network model’s (ANN2) performance when compared to the model without that information (ANN1) in rural and urban areas, where ANN proved most useful.

Conclusion

Predictive models such as Holt–Winters and artificial neural networks are powerful tools to support decisions about air quality management. The models aid in the anticipation of critical air pollution episodes, e.g., with regard to fine particulate matter (PM2.5). These models, when applied to different observed data in Brazil, showed good accuracy, with RMSE ranging from 4.2 to 14.9 μg m−3. Overall, both models have enough precision to be considered useful tools in air pollution management by environmental agencies, allowing those agencies to warn the population about future adverse conditions. Moreover, they will help to implement palliative control actions to avoid predicted critical episodes.

The Holt–Winters model, though not previously used for air quality prediction, proved efficient at forecasting PM2.5 concentrations in industrial and rural areas where emissions are relatively constant throughout the year. However, it has been shown inadequate in areas with seasonal influences, such as wet periods, due to the fluctuation of concentrations on rainy days.

The artificial neural networks models achieved consistent predictions of PM2.5 concentrations in urban and rural areas, as their predictive power is not subject to cyclical influences. However, the input of meteorological variables into the artificial neural network model was expected to improve the modeling result in estimating PM2.5 concentrations, but this was not verified since the R2 did not increase by more than 5%.