Air quality modeling in the Oviedo urban area (NW Spain) by using multivariate adaptive regression splines

Nieto, P. J. García; Antón, J. C. Álvarez; Vilán, J. A. Vilán; García-Gonzalo, E.

doi:10.1007/s11356-014-3800-0

Research Article
Published: 21 November 2014

Volume 22, pages 6642–6659, (2015)

Environmental Science and Pollution Research

P. J. García Nieto¹,
J. C. Álvarez Antón²,
J. A. Vilán Vilán³ &
E. García-Gonzalo¹

Abstract

The aim of this research work is to build a regression model of air quality by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (northern Spain) at a local scale. To accomplish the objective of this study, an experimental data set made up of nitrogen oxides (NO_x), carbon monoxide (CO), sulfur dioxide (SO₂), ozone (O₃), and dust (PM₁₀) was collected over 3 years (2006–2008). The US National Ambient Air Quality Standards (NAAQS) establish the limit values of the main pollutants in the atmosphere in order to protect the health of otherwise healthy people. Firstly, this MARS regression model captures the main idea of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations, using the MARS technique, the conclusions of this research work are presented.

Introduction

Air pollution is the introduction into the atmosphere of chemicals, particulates, or biological materials that cause discomfort, disease, or death to humans, damage other living organisms such as food crops, or damage the natural environment or built environment (García Nieto 2001; García Nieto 2006; Lutgens and Tarbuck 2012). The atmosphere is a complex dynamic natural gaseous system that is essential to support life on planet Earth. For instance, the stratospheric ozone depletion due to air pollution has long been recognized as a threat to human health as well as to the Earth’s ecosystems. Additionally, the urban air quality is listed as one of the world’s worst toxic pollution problems in the 2008 Blacksmith Institute World’s Worst Polluted Places report (Phalen 2011; Domike and Zacaroli 2013). Indeed, air pollution is one of the important environmental problems in metropolitan and industrial cities (García Nieto 2001; García Nieto 2006; Seinfeld and Pandis 2006; Lutgens and Tarbuck 2012) such as Oviedo (Principality of Asturias, Spain). The World Health Organization states that 2.4 million people die each year from causes directly attributable to air pollution, with 1.5 million of these deaths attributable to indoor air pollution. The health effects caused by air pollution may include difficulty in breathing, wheezing, coughing, and aggravation of existing respiratory and cardiac conditions. These effects can result in increased medication use, increased doctor or emergency room visits, more hospital admissions, and premature death (Wark et al. 1997; Wang et al. 2004; Lutgens and Tarbuck 2012). The human health effects of poor air quality are far reaching but principally affect the body’s respiratory system and the cardiovascular system (Anderson 2009; García Nieto 2001). Individual reactions to air pollutants depend on the type of pollutant a person is exposed to, the degree of exposure, the individual’s health status, and genetics (Anderson 2009). 
A new economic study of the health impacts and associated costs of air pollution in the Los Angeles Basin and San Joaquin Valley of Southern California shows that more than 3800 people die prematurely (approximately 14 years earlier than normal) each year because air pollution levels violate federal standards (Lutgens and Tarbuck 2012). The number of annual premature deaths is considerably higher than the fatalities related to auto collisions in the same area, which average fewer than 2000 per year (Anderson 2009; Brimblecombe 2011). Diesel exhaust (DE) is a major contributor to combustion-derived particulate matter air pollution (Lucking et al. 2008). In several human experimental studies (Törnqvist et al. 2007), using a well-validated exposure chamber setup, DE has been linked to acute vascular dysfunction and increased thrombus formation. This provides a plausible mechanistic explanation for the previously described association between particulate matter air pollution and increased cardiovascular morbidity and mortality (García Nieto 2001; Karaca et al. 2005; García Nieto 2006; Lutgens and Tarbuck 2012).

Oviedo is the capital city of the Principality of Asturias in northern Spain. It is also the name of the municipality that contains the city. Oviedo, which is the administrative and commercial center of the region, also hosts the annual Prince of Asturias Awards. This prestigious event, held in the city’s Campoamor Theatre, recognizes international achievement in eight categories. Oviedo University’s international campus attracts many foreign scholars from all over the globe. The city of Oviedo has a population of 225,973 inhabitants. It covers a land area of 186.65 km², and it has an altitude of 232 m above sea level and a density of 1210.68 inhabitants/km². The climate of Oviedo, as with the rest of northwest Spain, is more varied than that of southern parts of Spain. Summers are generally humid and warm, with considerable sunshine, but also some rain. Winters are cold and very rainy, with some very cold snaps. The cold is especially felt in the mountains surrounding the city of Oviedo, where snow is present from October till May. Both rain and snow are regular weather features of Oviedo’s winters. In addition, the Soto de Ribera coal-fired power plant is located 7 km south of the city of Oviedo, in the district of Ribera de Arriba, at an altitude of 126.50 m above sea level (see Fig. 1a, b). This plant provides most of the electrical energy used in the city of Oviedo. The economy of Oviedo is based on coal (e.g., Hunosa Ltd.), limestone and dolomite quarries located in Naranco mount and the Olloniego area, livestock rearing, a strong tertiary sector, etc. (Karaca et al. 2005; Lutgens and Tarbuck 2012; García Nieto et al. 2013). Figure 1a shows the geographical location of the three meteorological stations and the Soto de Ribera coal-fired power plant.

There are many air pollution indicators affecting human health (Comrie and Diem 1999; Elbir et al. 2000; Akkoyunku and Ertürk 2003; Godish 2004; Suárez Sánchez et al. 2011). Automatic measurements of atmospheric pollutants, such as CO, NO, NO₂, SO₂, O₃, and particulate matter (PM₁₀), are increasingly important due to their harmful effects on human health (Wark et al. 1997; García Nieto 2001; Wang et al. 2004; García Nieto 2006; Lutgens and Tarbuck 2012). The EU and many national environmental agencies have set standards and air quality guidelines for allowable levels of these pollutants in the air (Cooper and Alley 2002; Wang et al. 2004; Suárez Sánchez et al. 2011). The aim of this research work is to construct a model for the averaged next-month pollution that would be applicable for use by the authority responsible for air pollution regulation in the appropriate region of the country. Artificial neural networks of the multilayer perceptron (MLP) type have frequently been used as pollution models in recent years (Boznar et al. 1993; Haykin 1999; Hooyberghs et al. 2005; Suárez Sánchez et al. 2011). An artificial neural network (ANN), usually called a neural network (NN), is a mathematical or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation (Kukkonen et al. 2003). In most cases, an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are nonlinear statistical data modeling tools (Gardner and Dorling 1999; Chaloulakou et al. 2003; Karaca et al. 2006). They are usually used to model complex relationships between inputs and outputs or to find patterns in data.
An MLP is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. The MLP utilizes a supervised learning technique called backpropagation for training the network. The MLP is a modification of the standard linear perceptron and, unlike the latter, can distinguish data that are not linearly separable (Haykin 1999; Bishop 2006; Suárez Sánchez et al. 2011).

In this research work, a model based on the multivariate adaptive regression splines (MARS) technique is proposed (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995; Vapnik 1999; Hastie et al. 2003; Chou et al. 2004; Xu et al. 2004; de Cos Juez et al. 2009) for the study of the considered pollutants: CO, NO, NO₂, SO₂, O₃, and particulate matter (PM₁₀). The data used for learning and testing were collected over 3 years, from 2006 to 2008. The results of numerical experiments based on the application of the MARS technique have confirmed good accuracy of daily modeling for all considered pollutants. These detailed results will be presented and discussed in this paper.

The MARS technique is a form of regression analysis introduced by Jerome Friedman in 1991 (Friedman 1991). It is a nonparametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions. This technique (García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012) has been applied widely and successfully in recent years in many fields of science and engineering, including biomedicine and bioinformatics (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995; Vapnik 1999; Hastie et al. 2003; Chou et al. 2004; Xu et al. 2004; de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012). MARS models are more flexible than linear regression models, and they are simple to understand and interpret. The MARS technique can handle both numeric and categorical data, and it tends to be better than recursive partitioning for numeric data because hinges are more appropriate for numeric variables than the piecewise constant segmentation used by recursive partitioning (Vapnik 1999; Bishop 2006). Building MARS models often requires little or no data preparation. The hinge functions automatically partition the input data, so the effect of outliers is contained. In this respect, the MARS technique is similar to recursive partitioning, which also partitions the data into disjoint regions, although using a different method. MARS models tend to have a good bias-variance trade-off, and they are flexible enough to model nonlinearity and variable interactions (de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012).

In this research work, the MARS technique (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995) was used as an automated learning tool to build three MARS models for nitrogen dioxide (NO₂), sulfur dioxide (SO₂), and aerosol particles less than 10 μm (PM₁₀) as a function of other measured pollutants relevant to air quality: nitric oxide (NO), carbon monoxide (CO), and ozone (O₃). The aim was to make accurate concentration estimates of the three abovementioned pollutants (NO₂, SO₂, and PM₁₀) (Schnelle and Brown 2001; Colbeck 2008; Hewitt and Jackson 2009; Suárez Sánchez et al. 2011). MARS models were used as an alternative to the traditional regression approaches. The three MARS models were found to be better suited to tackling nonlinear regression problems such as those associated with air quality and studied in this research work.

This study is structured as follows: firstly, the materials and methods used to carry out this study are described; next, the obtained results are presented and discussed; and finally, the main conclusions drawn from the results are described.

Materials and methods

Sources and types of air pollution

A substance in the air that can be harmful to humans and the environment is known as an air pollutant. Pollutants can be in the form of solid particles, liquid droplets, or gases. In addition, they may be natural or man made. Pollutants can be classified as primary or secondary. Usually, primary pollutants are directly emitted from a process, such as ash from a volcanic eruption, the carbon monoxide gas from a motor vehicle exhaust, or sulfur dioxide released from factories. Secondary pollutants are not emitted directly. Rather, they form in the air when primary pollutants react or interact. An important example of a secondary pollutant is ground-level ozone, one of the many secondary pollutants that make up photochemical smog (Wark et al. 1997; Schnelle and Brown 2001; Monteiro et al. 2005; García Nieto et al. 2013). Some pollutants may be both primary and secondary: that is, they are both emitted directly and formed from other primary pollutants.

Major primary pollutants produced by human activity include the following (Wark et al. 1997; Friedlander 2000; García Nieto 2001; Cooper and Alley 2002; Wang et al. 2004; Karaca et al. 2005; García Nieto 2006; Seinfeld and Pandis 2006; Vincent 2007; Colbeck 2008; Hewitt and Jackson 2009; Suárez Sánchez et al. 2011; Lutgens and Tarbuck 2012; García Nieto et al. 2013):

Particulate matter (PM₁₀), alternatively referred to as atmospheric particulate matter or fine particles, are tiny particles of solid or liquid suspended in a gas. PM₁₀ are particles with a diameter of 10 μm or less. Sources of particulates can be man made or natural. Increased levels of fine particles in the air are linked to health hazards such as heart disease, altered lung function, and lung cancer.
Sulfur oxides (SO_x), especially sulfur dioxide, a chemical compound with the formula SO₂. Since coal and petroleum often contain sulfur compounds, their combustion generates sulfur dioxide. Further oxidation of SO₂, usually in the presence of a catalyst such as NO₂, forms H₂SO₄ and thus acid rain (Wark et al. 1997; Wang et al. 2004; Lutgens and Tarbuck 2012).
Nitrogen oxides (NO_x), especially nitrogen dioxide, are emitted from high-temperature combustion and are also produced naturally during thunderstorms by electric discharge. Nitrogen dioxide is a chemical compound with the formula NO₂. This reddish-brown toxic gas has a characteristic sharp, biting odor. The initial product formed is nitric oxide (NO). When NO oxidizes further in the atmosphere, nitrogen dioxide (NO₂) forms. Commonly, the general term NO_x is used to describe these gases.
Carbon monoxide (CO) is a colorless, odorless, nonirritating but very poisonous gas. It is a product of the incomplete combustion of fuels such as natural gas, coal, or wood. Vehicular exhaust is a major source of carbon monoxide.
Volatile organic compounds (VOCs) are important outdoor air pollutants. In this field, they are often divided into the separate categories of methane (CH₄) and nonmethane (NMVOCs). These pollutants are not considered in this study.

Secondary pollutants considered here are as follows (Wark et al. 1997; Anderson et al. 2001; Cooper and Alley 2002; Godish 2004; Wang et al. 2004; Seinfeld and Pandis 2006; Vincent 2007; Weinhold 2008; Jerrett et al. 2009; Lutgens and Tarbuck 2012; García Nieto et al. 2013):

Particulate matter formed from gaseous primary pollutants and compounds in photochemical smog. Smog is a kind of air pollution. The word smog is a portmanteau of smoke and fog.
Ground-level ozone (O₃), which is formed from NO_x and VOCs. Ozone (O₃) is a key constituent of the troposphere (it is also an important constituent of certain regions of the stratosphere commonly known as the ozone layer). Photochemical and chemical reactions involving it drive many of the chemical processes that occur in the atmosphere by day and by night. The negative effects of ozone are well documented. Short-term exposure to elevated levels of ozone causes eye and lung irritations (Jerrett et al. 2009).

With respect to trends in air quality, the Clean Air Act of 1970 mandated the setting of standards for four of the primary pollutants (aerosols, sulfur dioxide, carbon monoxide, and nitrogen oxides) as well as the secondary pollutant ozone. At the time, these five pollutants were recognized as being the most widespread and objectionable. Today, with the addition of lead, they are known as the criteria pollutants and are covered by the US National Ambient Air Quality Standards (see Table 1) (Godish 2004; Lutgens and Tarbuck 2012; García Nieto et al. 2013). The primary standard for each pollutant shown in Table 1 is based on the highest level that can be tolerated by humans without noticeable ill effects, minus a 10–50 % margin for safety reasons.

Table 1 National Ambient Air Quality Standards by US Environmental Protection Agency (USEPA) (Godish 2004; Lutgens and Tarbuck 2012; García Nieto et al. 2013)

Experimental data set

The Section of Industry and Energy of the government of Asturias has three automatic air quality monitoring stations distributed throughout the city of Oviedo (see Fig. 1). A measuring station is a sampling point, regardless of the number of parameters monitored and the analysis techniques applied. It consists of a group of systems and procedures to evaluate and assess the presence of pollution agents in the atmosphere. Every 15 min, these three stations measure the following primary and secondary pollutants: sulfur dioxide (SO₂), nitrogen oxides (NO and NO₂), carbon monoxide (CO), particulate matter less than 10 μm (PM₁₀), and ozone (O₃). The data are collected, processed, and delivered every day as an average for the entire city. Therefore, we have daily data for the pollutants listed above from January 2006 to December 2008. Additionally, this data set is averaged monthly and shown in Table 2.
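
The aggregation from 15-min readings to daily and monthly averages described above can be sketched as follows. This is only an illustrative sketch in Python, not the processing chain used by the monitoring network: the function names `daily_means` and `monthly_means` and the synthetic readings are assumptions introduced here, and the averaging across the three stations is omitted for brevity.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def daily_means(readings):
    """Average (timestamp, value) readings by calendar day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


def monthly_means(daily):
    """Average daily means by (year, month)."""
    by_month = defaultdict(list)
    for day, value in daily.items():
        by_month[(day.year, day.month)].append(value)
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Hypothetical data: two days of 15-min readings (96 per day),
# constant at 20.0 on the first day and 30.0 on the second day.
start = datetime(2006, 1, 1)
readings = [
    (start + timedelta(minutes=15 * k), 20.0 if k < 96 else 30.0)
    for k in range(192)
]

daily = daily_means(readings)
monthly = monthly_means(daily)
print(daily)
print(monthly)
```

Averaging the daily means, rather than all raw readings of the month, gives every day the same weight even if some days have missing 15-min samples; whether the original data processing made this choice is not stated in the text.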

Table 2 Monthly average air pollution concentrations in the Oviedo urban area from January 2006 to December 2008

The automatic analyzers at the monitoring stations use the physical and chemical properties of the gaseous pollutants and particulate matter to determine their concentrations. The methods currently used by the automatic analyzers for the above gaseous pollutants and particulate matter are as follows (Seinfeld and Pandis 2006; Singal 2012):

The sulfur dioxide analyzers use the principle of pulsed fluorescence, based on the fact that SO₂ molecules absorb ultraviolet (UV) radiation at a wavelength in the range of 210–410 nm, entering a short-lived excited state. Subsequently, the molecules decay to a lower energy state, emitting a pulse of fluorescent light of a greater wavelength in the range from 240 to 410 nm. The intensity of the emitted fluorescent light is proportional to the concentration of SO₂.
Chemiluminescence is an analytical technique based on the measurement of the amount of light generated by a chemical reaction. The analyzers of nitrogen oxides (NO_x) use this principle to determine the concentrations of nitric oxide (NO) and nitrogen dioxide (NO₂), taking into account their chemical reaction with ozone (O₃).
The analyzers of carbon monoxide (CO) are based on the ability of this gas to absorb energy at specific wavelengths. Specifically, the absorption of infrared light is measured in the region of maximum absorption of this pollutant.
The operating principle for measuring the particulate matter is the attenuation of beta radiation. Beta radiation is passed through the deposited particles. The growing layer of particles reduces the intensity of the beta radiation beam, which is measured by an ionization chamber. The electrical output signal is proportional to the actual mass sampled.
The operating principle of ozone analyzers is known as the ultraviolet photometry method and consists of measuring the amount of ultraviolet light at a wavelength of 254 nm absorbed by the ozone in a sample. The operating principle is based on the Beer-Lambert law.
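
For reference, the Beer-Lambert law mentioned for the ozone analyzers can be written in its standard form as shown below. The symbols are assumptions introduced here and do not come from the analyzer documentation: I₀ is the UV intensity at 254 nm measured without ozone in the optical path, I is the intensity transmitted through the sample, σ is the absorption coefficient of ozone at that wavelength, ℓ is the optical path length, and c is the ozone concentration.

$$ I={I}_0\;{e}^{-\sigma\;c\;\ell}\kern1em \Longrightarrow \kern1em c=\frac{1}{\sigma\;\ell}\;\ln \left(\frac{I_0}{I}\right) $$

Under these assumptions, the concentration follows directly from the ratio of the two measured intensities.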

Multivariate adaptive regression splines method

Multivariate adaptive regression splines (MARS) is a multivariate nonparametric classification/regression technique introduced by Friedman (Friedman 1991; Friedman and Roosen 1995). The computational implementation of this regression modeling has been performed using the free programming language and software environment for statistical computing known as the R project (James et al. 2013; Lantz 2013). The theoretical model explained below has already been presented by the authors in previous research (García Nieto et al. 2011; García Nieto et al. 2012). Nevertheless, the technique is also presented in this paper so that the reader can fully understand the research. Its main purpose is to predict the values of a continuous dependent variable, y (n×1), from a set of independent explanatory variables, X (n×p). The MARS model can be represented as

$$ \mathbf{y}=f\left(\mathbf{X}\right)+\mathbf{e} $$

(1)

where f is a weighted sum of basis functions that depend on X and e is an error vector of dimension (n×1).

The MARS model does not require any a priori assumptions about the underlying functional relationship between dependent and independent variables. Instead, this relation is uncovered from a set of coefficients and piecewise polynomials of degree q (basis functions) that are derived entirely from the regression data (X,y). The MARS regression model is constructed by fitting basis functions to distinct intervals of the independent variables. Generally, piecewise polynomials, also called splines, have pieces smoothly connected together. In MARS terminology, the joining points of the polynomials are called knots, nodes, or breakpoints. These will be denoted by the small letter t. For a spline of degree q, each segment is a polynomial function. MARS uses two-sided truncated power functions as spline basis functions, described by the following equations (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995; Vapnik 1999; Hastie et al. 2003; Chou et al. 2004; Xu et al. 2004; de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012):

$$ {\left[-\left(x-t\right)\right]}_{+}^q=\begin{cases}{\left(t-x\right)}^q & \mathrm{if}\ x<t\\ 0 & \mathrm{otherwise}\end{cases} $$

(2)

$$ {\left[+\left(x-t\right)\right]}_{+}^q=\begin{cases}{\left(x-t\right)}^q & \mathrm{if}\ x\ge t\\ 0 & \mathrm{otherwise}\end{cases} $$

(3)

where q(≥0) is the power to which the splines are raised and which determines the degree of smoothness of the resultant function estimate. When q=1, which is the case in this study, only simple linear splines are considered.
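
The pair of truncated power functions in Eqs. (2) and (3) can be illustrated with a short Python sketch. It follows the standard convention in which the right-hand function equals (x − t)^q for x ≥ t. The function names, the knot value, and the sample points are hypothetical and serve only as an illustration; the computations in this study were carried out in R.

```python
def hinge_left(x, t, q=1):
    """Eq. (2): [-(x - t)]_+^q, nonzero only for x < t."""
    return (t - x) ** q if x < t else 0.0


def hinge_right(x, t, q=1):
    """Eq. (3): [+(x - t)]_+^q, nonzero only for x >= t."""
    return (x - t) ** q if x >= t else 0.0


knot = 30.0                      # hypothetical knot location t
samples = [10.0, 30.0, 45.0]     # hypothetical values of one predictor

left = [hinge_left(x, knot) for x in samples]
right = [hinge_right(x, knot) for x in samples]
print(left)   # [20.0, 0.0, 0.0]
print(right)  # [0.0, 0.0, 15.0]
```

With q = 1, each function is zero on one side of the knot and linear on the other, so a pair of them can represent a change of slope at t.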

The MARS model of a dependent variable $ \mathbf{y} $ with M basis functions (terms) can be written as (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995; Chou et al. 2004; Xu et al. 2004; de Cos Juez et al. 2009)

$$ \widehat{\mathbf{y}}={\widehat{f}}_M\left(\mathbf{x}\right)={c}_0+{\displaystyle \sum_{m=1}^M{c}_m{B}_m\left(\mathbf{x}\right)} $$

(4)

where ŷ is the dependent variable predicted by the MARS model; c₀ is a constant; B_m(x) is the mth basis function, which may be a single spline basis function or a product of two or more such functions (an interaction term); and c_m is the coefficient of the mth basis function.
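
As an illustration of Eq. (4), the following Python sketch evaluates a MARS-type prediction as a constant plus a weighted sum of basis functions, each basis function being a product of linear (q = 1) hinge factors. The coefficients, knots, and input values are hypothetical and are not the fitted models of this study; the representation of a term as a list of (variable index, knot, sign) triples is an assumption made here for clarity.

```python
from math import prod


def hinge(x, t, sign):
    """Linear hinge: max(0, sign * (x - t)); sign = +1 or -1."""
    return max(0.0, sign * (x - t))


def basis_value(term, x):
    """Basis function B_m(x): product of hinge factors (variable index, knot, sign)."""
    return prod(hinge(x[j], t, s) for (j, t, s) in term)


def mars_predict(c0, coefs, terms, x):
    """Eq. (4): c0 + sum over m of c_m * B_m(x)."""
    return c0 + sum(c * basis_value(term, x) for c, term in zip(coefs, terms))


# Hypothetical model with three basis functions over x = (x0, x1, x2).
c0 = 20.0
coefs = [0.5, -0.2, 0.01]
terms = [
    [(0, 30.0, +1)],                   # max(0, x0 - 30)
    [(1, 0.6, -1)],                    # max(0, 0.6 - x1)
    [(0, 30.0, +1), (2, 40.0, +1)],    # interaction of x0 and x2
]

x = [50.0, 0.4, 60.0]
y_hat = mars_predict(c0, coefs, terms, x)
print(round(y_hat, 2))
```

Because every basis function is an explicit product of hinges, the contribution of each term to the prediction can be read off directly, which is one reason MARS models are considered easy to interpret.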

Both the variables to be introduced into the model and the knot positions for each individual variable have to be optimized. For a data set X containing n objects and p explanatory variables, there are N = n×p pairs of spline basis functions, given by Eqs. (2) and (3), with knot locations x_ij (i = 1, 2, …, n; j = 1, 2, …, p).
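
The count N = n×p can be made concrete with a minimal Python sketch that lists one candidate (variable, knot) combination per data value, each of which yields a pair of basis functions. The small data matrix and the function name `candidate_pairs` are hypothetical.

```python
def candidate_pairs(X):
    """List one (variable index j, knot x_ij) candidate per entry of X."""
    return [(j, row[j]) for row in X for j in range(len(row))]


# Hypothetical data set: n = 4 objects, p = 3 explanatory variables.
X = [
    [12.0, 0.4, 31.0],
    [18.0, 0.5, 44.0],
    [25.0, 0.7, 52.0],
    [30.0, 0.3, 39.0],
]

pairs = candidate_pairs(X)
n, p = len(X), len(X[0])
print(len(pairs), n * p)
```

In practice, repeated data values produce identical candidates, so the number of distinct knots may be smaller than n×p.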

A two-step procedure is followed to construct the final model. First, in order to select the consecutive pairs of basis functions of the model, a two-at-a-time forward stepwise procedure is implemented (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995; García Nieto et al. 2012). This forward stepwise selection of basis functions leads to a very complex and overfitted model. Such a model, although it fits the data well, has poor predictive abilities for new objects. Second, to improve the prediction, the redundant basis functions are removed one at a time using a backward stepwise procedure. To determine which basis functions should be included in the model, MARS utilizes the generalized cross-validation (GCV) (Chou et al. 2004; Xu et al. 2004; de Cos Juez et al. 2009). The GCV is the mean squared residual error divided by a penalty that depends on the model complexity. The GCV criterion is defined in the following way (Friedman 1991; Sekulic and Kowalski 1992; Friedman and Roosen 1995; Vapnik 1999; Hastie et al. 2003; Chou et al. 2004; Xu et al. 2004; de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012):

$$ \mathrm{GCV}(M)=\frac{\frac{1}{n}{\displaystyle \sum_{i=1}^n{\left({y}_i-{\widehat{f}}_M\left({\mathbf{x}}_i\right)\right)}^2}}{{\left(1-C(M)/n\right)}^2} $$

(5)

where C(M) is a complexity penalty that increases with the number of basis functions in the model and which is defined as (Friedman and Roosen 1995; Xu et al. 2004; de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012)

$$ C(M)=\left(M+1\right)+d\;M $$

(6)

where M is the number of basis functions in Eq. (4) and the parameter d is a penalty for each basis function included in the model. It can also be regarded as a smoothing parameter. Large values of d lead to fewer basis functions and therefore smoother function estimates. For more details about the selection of the d parameter, see Friedman and Roosen (1995), de Cos Juez et al. (2009), García Nieto et al. (2011), Vidoli (2011), and García Nieto et al. (2012). In this study, the parameter d equals 2, and the maximum interaction level of the spline basis functions is restricted to 3.
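
Equations (5) and (6) can be evaluated with a few lines of Python. The sketch below uses d = 2 as in this study, but the observed and fitted values are synthetic (every residual equals ±1) and were chosen only so that the effect of the complexity penalty is easy to see; the function names are assumptions.

```python
def complexity(M, d=2.0):
    """Eq. (6): C(M) = (M + 1) + d * M."""
    return (M + 1) + d * M


def gcv(y, y_hat, M, d=2.0):
    """Eq. (5): mean squared residual divided by (1 - C(M)/n)^2."""
    n = len(y)
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / n
    return mse / (1.0 - complexity(M, d) / n) ** 2


# Synthetic example with n = 20 observations and residuals of +1 or -1.
y = [24.0, 30.0, 41.0, 51.0, 36.0, 22.0, 15.0, 13.0, 19.0, 27.0,
     38.0, 45.0, 33.0, 25.0, 18.0, 14.0, 21.0, 29.0, 40.0, 47.0]
y_hat = [yi + (1.0 if i % 2 == 0 else -1.0) for i, yi in enumerate(y)]

gcv_small = gcv(y, y_hat, M=2)   # C(2) = 7
gcv_large = gcv(y, y_hat, M=5)   # C(5) = 16
print(gcv_small, gcv_large)
```

For the same residual error, the model with more basis functions receives a larger GCV value, which is why the backward stepwise procedure only keeps a basis function when it reduces the residual error enough to offset the penalty.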

The importance of the variables in the MARS model

Once the MARS model is constructed, it is possible to evaluate the importance of the explanatory variables used to construct the basis functions. Establishing predictor importance is in general a complex problem that requires the use of more than one criterion. In order to obtain reliable results, it is convenient to use the GCV criterion explained above together with the Nsubsets criterion (which counts the number of model subsets in which each variable is included) and the residual sum of squares (RSS) (Friedman and Roosen 1995; Xu et al. 2004; de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012).
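
The Nsubsets criterion can be sketched in Python as a simple count. The list of model subsets below is hypothetical: it is assumed here to hold, for each model size examined during the backward stepwise procedure, the variables used by the best model of that size. It does not reproduce the results of this study.

```python
def nsubsets(subsets):
    """Count, for each variable, the number of model subsets that include it."""
    counts = {}
    for subset in subsets:
        for variable in set(subset):
            counts[variable] = counts.get(variable, 0) + 1
    return counts


# Hypothetical subsets of predictors retained at successive model sizes.
pruning_subsets = [
    ["NO"],
    ["NO", "CO"],
    ["NO", "CO", "O3"],
    ["NO", "CO", "O3"],
]

counts = nsubsets(pruning_subsets)
ranking = sorted(counts, key=counts.get, reverse=True)
print(counts)
print(ranking)
```

A variable that appears in more subsets is ranked as more important; the GCV and RSS criteria complement this count by measuring how much the error changes when the variable enters or leaves the model.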

Analysis of results and discussion

It is possible to study the trend in concentrations of the preceding pollutants in the years 2006, 2007, and 2008 (Wark et al. 1997; Schnelle and Brown 2001; Godish 2004; Wang et al. 2004; García Nieto et al. 2013). In the first place, Fig. 2 shows the sulfur dioxide concentrations each month during the years 2006, 2007, and 2008. It is possible to observe that the emission peaks occur during late autumn and early winter, namely from November to February each year, reaching the maximum around Christmas 2006: 51 μg/m³. Similarly, minimum concentrations (13 μg/m³ in August 2007) occur during the summer months because power consumption is lower and there is less traffic in the city. This trend is general throughout the years studied, and it is consistent with expectations. Following the US Environmental Protection Agency (USEPA) Air Quality Standards (see Table 1), the maximum allowable concentration of SO₂ expressed as annual arithmetic mean is 80 μg/m³. The annual arithmetic means for this gas during the years 2006, 2007, and 2008 were 24, 23, and 24 μg/m³, respectively. Therefore, the concentrations of this gas were below the maximum permitted and met air quality standards for a healthy person during these 3 years, including emission peaks.

Secondly, Fig. 3 shows the nitric oxide concentrations each month during the years 2006, 2007, and 2008. Again, the emission peaks occur during late autumn and early winter, reaching the maximum in December 2006: 74 μg/m³. Similarly, minimum concentrations (13 μg/m³ in July 2007) occur during the summer months. Although the initial product of the combustion is nitric oxide (NO), this gas is rapidly oxidized and converted to NO₂. Its residence time in the atmosphere is very short, and the USEPA Air Quality Standards do not consider it.

Thirdly, the nitrogen dioxide (NO₂) concentrations each month during the years 2006, 2007, and 2008 are also shown in Fig. 2. Note that the emission peak occurs from November to February each year, reaching the maximum emission in December 2006: 53 μg/m³. Similarly, minimum concentrations occur in August 2006 and October 2007 with values of 26 and 23 μg/m³, respectively. Following the USEPA Air Quality Standards (see Table 1), the maximum allowable concentration of NO₂ expressed as annual arithmetic mean is 100 μg/m³. The annual arithmetic means for this gas during the years 2006, 2007, and 2008 were 34, 37, and 36 μg/m³, respectively. Thus, the nitrogen dioxide (NO₂) concentrations are also below the maximum permitted and meet air quality standards for a healthy person during these 3 years, including emission peaks. It is also important to highlight a certain flattening of the concentrations of this gas over time.

Fourthly, Fig. 2 also shows the carbon monoxide (CO) concentrations each month during the years 2006, 2007, and 2008. The main emission peak occurs in December 2006 and January 2007 with a value of 0.85 mg/m³. Similarly, minimum concentrations occur during the summer months: 0.32 mg/m³ in July 2006, 0.31 mg/m³ in August 2007, and 0.35 mg/m³ in July 2008, respectively. This trend is a quasi-sinusoidal signal with peaks and valleys, although the height of the peaks and valleys appears to be decreasing in time (García Nieto 2001; Godish 2004; García Nieto 2006). Similarly, following the USEPA Air Quality Standards (see Table 1), the maximum allowable concentration of CO expressed as annual arithmetic mean is 3.33 mg/m³. The annual arithmetic means for this gas during the years 2006, 2007, and 2008 were 0.52, 0.50, and 0.44 mg/m³, respectively. Hence, the concentrations of CO were below the highest level that can be tolerated by humans according to USEPA Air Quality Standards during these 3 years, including emission peaks.

Fifthly, Fig. 3 shows the monthly particulate matter (PM₁₀) concentrations during the years 2006, 2007, and 2008. The emission peaks occur in December 2006, January 2007, March 2008, and August 2008, with values of 71, 72, 75, and 69 μg/m³, respectively. However, the quasi-sinusoidal trend is generally lost in the case of particulate matter: the aerosol concentrations stay within a band of approximately 50 to 70 μg/m³, regardless of the month and the season. From the standpoint of air quality, following the USEPA Air Quality Standards (see Table 1), the maximum allowable concentration of PM₁₀ expressed as annual arithmetic mean is 50 μg/m³. The annual arithmetic means for this pollutant during the years 2006, 2007, and 2008 were 60, 59, and 56 μg/m³, respectively. Therefore, the aerosol concentrations exceeded the allowable maximum for a healthy person during these 3 years, including emission peaks. This behavior can give rise to serious health problems for the population, such as chronic diseases and deaths.
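As an illustrative check of the comparisons made so far, the following minimal Python sketch contrasts the annual arithmetic means quoted above for NO₂, CO, and PM₁₀ with the corresponding limit values. Only the numerical values come from the text; the dictionary and function names are hypothetical and were introduced solely for this illustration.

```python
# Annual-mean limits and observed annual means quoted in the text above.
# Units: ug/m3 for NO2 and PM10, mg/m3 for CO.
# All identifiers below are illustrative assumptions, not part of the study.
LIMITS = {"NO2": 100.0, "CO": 3.33, "PM10": 50.0}
ANNUAL_MEANS = {
    "NO2": {2006: 34.0, 2007: 37.0, 2008: 36.0},
    "CO": {2006: 0.52, 2007: 0.50, 2008: 0.44},
    "PM10": {2006: 60.0, 2007: 59.0, 2008: 56.0},
}


def exceedances(limits, annual_means):
    """Return {pollutant: [years whose annual mean exceeds the limit]}."""
    return {
        pollutant: [
            year
            for year, mean in sorted(by_year.items())
            if mean > limits[pollutant]
        ]
        for pollutant, by_year in annual_means.items()
    }


result = exceedances(LIMITS, ANNUAL_MEANS)
for pollutant, years in result.items():
    if years:
        status = "exceeds the limit in " + ", ".join(map(str, years))
    else:
        status = "within the limit in all 3 years"
    print(f"{pollutant}: {status}")
```

Under these values, only PM₁₀ is flagged, which agrees with the discussion above.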

Figure 3 also shows the monthly ozone (O₃) concentrations during the years 2006, 2007, and 2008. The ozone concentration varies over time as an oscillating sawtooth, reaching its maximum values during the spring and summer months: 46 μg/m³ in June 2006, 54 μg/m³ in May 2007, and 59 μg/m³ in April 2008. This trend holds throughout the years studied, since ozone is associated with photochemical reactions, which require strong sunlight. The Clean Air Act directs the USEPA to establish National Ambient Air Quality Standards for several pollutants, including ground-level ozone, and cities out of compliance with these standards are required to take steps to reduce their levels. In May 2008, the USEPA lowered its ozone standard from 80 to 75 μg/m³. This proved controversial, since the agency’s own scientists and advisory board had recommended lowering the standard to 60 μg/m³, and the World Health Organization recommends 51 μg/m³. Many public health and environmental groups also supported the 60-μg/m³ standard. The annual arithmetic means for this gas in the city of Oviedo during the years 2006, 2007, and 2008 were 26, 39, and 41 μg/m³, respectively. Thus, the concentrations of this gas were below the maximum permitted and met air quality standards during these 3 years, including emission peaks. In April 2008, however, a peak was reached very close to the allowable limit, which posed a danger to the health of the population of Oviedo. Furthermore, there is a great deal of evidence showing that high concentrations of ozone, created by high concentrations of pollution and daylight UV rays at the Earth’s surface, can harm lung function and irritate the respiratory system. Exposure to ozone and the pollutants that produce it has been linked to premature death, asthma, bronchitis, heart attack, and other cardiopulmonary problems. According to USEPA scientists, susceptible people can be adversely affected by ozone levels as low as 40 μg/m³.

Finally, in order to explain the peaks of SO₂, NO₂, NO, and CO concentrations during the period November 2006–January 2007, it is necessary to take into account the role of meteorology. To this end, Fig. 4 shows the average monthly temperature as well as the average monthly precipitation in the city of Oviedo from January 2006 to December 2008. Note that the temperature during this period (November 2006–January 2007) is lower than in the same months of the remaining years. As a consequence, electricity and heating consumption was greater, and therefore the concentrations of these pollutants also increased in this period.

To fix ideas, the list of input variables taken into account in this research work is shown in Table 3. The total number of dependent variables used to build the MARS models was 3: nitrogen dioxide (NO₂), sulfur dioxide (SO₂), and particulate matter less than 10 μm (PM₁₀). Indeed, we have built three different MARS models taking as dependent variables NO₂, SO₂, and PM₁₀, respectively, and as independent input variables (predictor variables) the other variables listed in Table 3.

Table 3 Set of input variables used in this study

In this work, three second-order MARS models have been used, so that the basis functions of the models consist of linear and second-order splines, and the maximum number of terms was not limited (no pruning). The results of the MARS models computed using all the available observations are shown in Tables 4, 5, and 6, which list the 25, 43, and 33 main basis functions of the three MARS models, respectively, together with their coefficients. Please note that the hinge function is $h(x)=\max(0,x)$; that is, $h(x)=x$ (the bisector of the first quadrant) if x is greater than 0, and $h(x)=0$ if x is less than or equal to 0. The MARS model is thus a nonparametric regression technique that can be seen as an extension of linear models that automatically models nonlinearities and interactions as a weighted sum of basis functions built from these hinge functions (Friedman and Roosen 1995; Xu et al. 2004; de Cos Juez et al. 2009; García Nieto et al. 2011; Vidoli 2011; García Nieto et al. 2012). The predicted response for NO₂, SO₂, and PM₁₀ is a better fit to the original values, since the MARS model has automatically produced kinks in the predicted dependent variable to take nonlinearities into account.
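To make the structure of such a model concrete, the following minimal Python sketch evaluates a weighted sum of hinge-function basis terms. It assumes that "second-order" refers to products of at most two hinge functions, as is usual for MARS models with interaction degree 2. The coefficients, knots, and function names are hypothetical and are not taken from Tables 4, 5, or 6.

```python
def hinge(x):
    """h(x) = max(0, x): equal to x for x > 0 and 0 otherwise."""
    return max(0.0, x)


def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum_i c_i * B_i(x) for a MARS-style model.

    Each term is (coefficient, factors). Each factor is
    (variable, knot, sign) and contributes hinge(sign * (x[variable] - knot)).
    A term with two factors is a second-order (interaction) basis function.
    """
    total = intercept
    for coefficient, factors in terms:
        basis = 1.0
        for variable, knot, sign in factors:
            basis *= hinge(sign * (x[variable] - knot))
        total += coefficient * basis
    return total


# Hypothetical coefficients and knots, chosen only for illustration;
# they are NOT the fitted values reported in the paper.
toy_terms = [
    (0.8, [("SO2", 10.0, +1)]),                   # 0.8 * h(SO2 - 10)
    (-0.5, [("O3", 40.0, -1)]),                   # -0.5 * h(40 - O3)
    (0.02, [("NO", 20.0, +1), ("CO", 0.4, +1)]),  # 0.02 * h(NO - 20) * h(CO - 0.4)
]
sample = {"SO2": 15.0, "O3": 30.0, "NO": 30.0, "CO": 0.9}
prediction = mars_predict(sample, intercept=25.0, terms=toy_terms)
print(f"Toy NO2 prediction: {prediction:.2f}")
```

Each hinge term is zero on one side of its knot and linear on the other, which is what produces the kinks in the predicted response.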

Table 4 List of basis functions of the MARS model for the NO₂ pollutant and their coefficients $c_i$

Table 5 List of basis functions of the MARS model for the SO₂ pollutant and their coefficients $c_i$

Table 6 List of basis functions of the MARS model for the PM₁₀ pollutant and their coefficients $c_i$

According to the results shown in Table 7, the most important variables for the prediction of NO₂ (output variable) are, in decreasing order of importance: (1) sulfur dioxide (SO₂), (2) nitric oxide (NO) and carbon monoxide (CO) alike, (3) particulate matter less than or equal to 10 μm (PM₁₀), and finally (4) ozone (O₃).

Table 7 Evaluation of the importance of the variables that form the model for the NO₂ pollutant according to criteria Nsubsets, GCV, and RSS

Similarly, the results shown in Table 8 indicate that the most important variables for the prediction of SO₂ (output variable) are, in decreasing order of importance: (1) NO₂, (2) NO and O₃ alike, (3) PM₁₀, and finally (4) CO.

Table 8 Evaluation of the importance of the variables that form the model for the SO₂ pollutant according to criteria Nsubsets, GCV, and RSS

Additionally, the results shown in Table 9 indicate that the most important variables for the prediction of PM₁₀ (output variable) are, in decreasing order of importance: (1) NO₂, (2) NO, (3) CO, (4) SO₂, and finally (5) O₃.

Table 9 Evaluation of the importance of the variables that form the model for the PM₁₀ pollutant according to criteria Nsubsets, GCV, and RSS

Furthermore, a graphical representation of the terms that constitute the three MARS models can be seen in Figs. 5, 6, and 7, respectively.

It is very important to select the model that best fits the experimental data. The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. In this research, the goodness-of-fit criterion considered for the regression analysis was the coefficient of determination $R^2$ (Freedman et al. 2007; de Cos Juez et al. 2009; García Nieto et al. 2011; García Nieto et al. 2012). This ratio indicates the proportion of the total variation in the dependent variables (NO₂, SO₂, and PM₁₀ in our case) explained by the MARS models. A data set takes values $t_i$, each of which has an associated modeled value $y_i$. The former are called the observed values and the latter are often referred to as the predicted values. Variability in the data set is measured through different sums of squares:

$ {\mathrm{SS}}_{\mathrm{tot}}={\displaystyle \sum_{i=1}^n{\left({t}_i-\overline{t}\right)}^2} $: the total sum of squares, proportional to the sample variance
$ {\mathrm{SS}}_{\mathrm{reg}}={\displaystyle \sum_{i=1}^n{\left({y}_i-\overline{t}\right)}^2} $: the regression sum of squares, also called the explained sum of squares
$ {\mathrm{SS}}_{\mathrm{err}}={\displaystyle \sum_{i=1}^n{\left({t}_i-{y}_i\right)}^2} $: the residual sum of squares

In the previous sums, $ \overline{t} $ is the mean of the n observed data:

$$ \overline{t}=\frac{1}{n}{\displaystyle \sum_{i=1}^n{t}_i} $$

(7)

Bearing in mind the above sums, the general definition of the coefficient of determination is

$$ {R}^2\equiv 1-\frac{{\mathrm{SS}}_{\mathrm{err}}}{{\mathrm{SS}}_{\mathrm{tot}}} $$

(8)
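As a worked illustration of Eqs. (7) and (8), the following minimal Python sketch computes the three sums of squares and the coefficient of determination. The observed and predicted values are hypothetical numbers chosen for easy arithmetic, not data from this study, and the function names are our own illustrative choices.

```python
def sums_of_squares(observed, predicted):
    """Return (SS_tot, SS_reg, SS_err) as defined in the text."""
    n = len(observed)
    t_bar = sum(observed) / n  # Eq. (7): mean of the observed values
    ss_tot = sum((t - t_bar) ** 2 for t in observed)
    ss_reg = sum((y - t_bar) ** 2 for y in predicted)
    ss_err = sum((t - y) ** 2 for t, y in zip(observed, predicted))
    return ss_tot, ss_reg, ss_err


def r_squared(observed, predicted):
    """Eq. (8): R^2 = 1 - SS_err / SS_tot."""
    ss_tot, _, ss_err = sums_of_squares(observed, predicted)
    return 1.0 - ss_err / ss_tot


# Hypothetical observed (t_i) and predicted (y_i) concentrations.
t = [30.0, 42.0, 35.0, 50.0, 28.0]
y = [32.0, 40.0, 36.0, 47.0, 30.0]

ss_tot, ss_reg, ss_err = sums_of_squares(t, y)
r2 = r_squared(t, y)
print(f"SS_tot={ss_tot}, SS_reg={ss_reg}, SS_err={ss_err}, R^2={r2:.3f}")
```

Note that Eq. (8) uses only the residual and total sums of squares; it does not assume that the total sum of squares equals the sum of the other two, which need not hold for an arbitrary set of predictions such as the one above.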

A coefficient of determination value of 1.0 indicates that the regression curve fits the data perfectly. In this research work, the three fitted MARS models for NO₂, SO₂, and PM₁₀ have coefficients of determination equal to 0.85, 0.82, and 0.75, respectively. Additionally, their correlation coefficients were 0.92, 0.91, and 0.87, respectively. These results indicate a very high goodness of fit for the three MARS models analyzed.

In order to guarantee the predictive ability of the three MARS models, cross-validation (Picard and Cook 1984; Efron and Tibshirani 1997) was the standard technique used for the three MARS models built in this research work. In this technique, the data set is randomly divided into a number of disjoint subsets of equal size, and each subset is used once as a validation set, whereas the remaining subsets are put together to form a training set. In the simplest case, the average accuracy over all the validation sets is used as an estimator of the accuracy of the method. In this research work, tenfold cross-validation was used; that is to say, to calculate the error criterion, the models were built using 90 % of the sample and tested with the remaining 10 %, thus simulating as closely as possible the real conditions under which the model would be built in order to later fit it to new observation data unrelated to the construction of the models.

Finally, this research work was able to estimate the concentrations of NO₂ from 2006 to 2008 in good agreement with the experimental concentrations of NO₂ actually observed (see Fig. 8). Similarly, Figs. 9 and 10 show good agreement between the experimental concentrations of SO₂ and PM₁₀, respectively, and their predicted concentrations using the MARS models from 2006 to 2008.
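The tenfold cross-validation procedure described above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the code used in this study: the function names are hypothetical, a trivial mean predictor stands in for the MARS fit, and the mean squared error stands in for the error criterion.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Randomly split range(n_samples) into k disjoint folds.

    Returns a list of (training, validation) index lists; each fold is used
    once for validation while the other k - 1 folds form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # sizes differ by at most 1
    splits = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, validation))
    return splits


def cross_validate(xs, ys, fit, score, k=10, seed=0):
    """Average the validation score of `fit` over the k folds."""
    scores = []
    for training, validation in kfold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in training], [ys[i] for i in training])
        scores.append(
            score(model, [xs[i] for i in validation], [ys[i] for i in validation])
        )
    return sum(scores) / len(scores)


# Stand-in model (assumption): always predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def mean_squared_error(model, valid_x, valid_y):
    return sum((t - model) ** 2 for t in valid_y) / len(valid_y)


xs = list(range(100))
ys = [2.0 * x + 1.0 for x in xs]  # hypothetical data
cv_error = cross_validate(xs, ys, fit_mean, mean_squared_error, k=10)
print(f"Tenfold cross-validated MSE of the mean predictor: {cv_error:.1f}")
```

With 100 samples and k = 10, each model is built on 90 % of the data and tested on the remaining 10 %, as in the text. Any fitting routine with the same `fit` and `score` interface could replace the stand-in model.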

Conclusions

In the first place, this research described the steps for the construction of three MARS models to estimate, quickly and with a high degree of accuracy, the concentrations of NO₂, SO₂, and PM₁₀ from 2006 to 2008. We have provided examples of real applications and simple explanations of two commonly used statistics for the selection of the best-fitting models: the coefficient of determination and the correlation coefficient.

Secondly, the MARS models are potentially useful for predicting pollutant concentrations in the atmosphere. In other words, the methodology developed here could be applied to other industrial cities with similar or different sources of pollutants, but it would be necessary to take into account the specific nature of each location.

Finally, the results of this research on the development of models of local pollutant concentrations are a valuable tool for acid rain mitigation projects and for research on the effect of particulate matter on human health. Furthermore, there is increasing interest in using mathematical models with good physical properties to understand the behavior of pollutants in the atmosphere in order to improve air quality and reduce the number of deaths. In this way, this model can be embedded in other, more general models of the atmosphere. Additionally, one of the main findings of this study was the order of priority (hierarchy) of the predictor variables involved in the estimation of the dependent variables: NO₂, SO₂, and PM₁₀. Furthermore, this paper presents examples of real applications and simple explanations of the statistical calculations used for the selection of the best-fit models.

References

Akkoyunku A, Ertürk FA (2003) Evaluation of air pollution trends in Istanbul. Int J Environ Pollut 18:388–398
Anderson HR (2009) Air pollution and mortality: a history. Atmos Environ 43(1):142–152
Anderson W, Prescott GJ, Packham S, Mullins J, Brookes M, Seaton A (2001) Asthma admissions and thunderstorms: a study of pollen, fungal spores, rainfall, and ozone. Q J Med 94(8):429–433
Bishop CM (2006) Pattern recognition and machine learning. Springer, New York
Boznar M, Lesjack M, Mlakar P (1993) A neural network based method for short-term predictions of ambient SO₂ concentrations in highly polluted industrial areas of complex terrain. Atmos Environ 270:221–230
Brimblecombe P (2011) Air pollution episodes. Enc Environ Health 39–45
Chaloulakou A, Saisana M, Spyrellis N (2003) Comparative assessment of neural networks and regression models for forecasting summertime ozone in Athens. Sci Total Environ 313:1–13
Chou S–M, Lee T–S, Shao YE, Chen I–F (2004) Mining the breast cancer pattern using artificial neural networks and multivariate adaptive regression splines. Expert Syst Appl 27(1):133–142
Colbeck I (2008) Environmental chemistry of aerosol. Wiley, New York
Comrie AC, Diem JE (1999) Climatology and forecast modeling of ambient carbon monoxide in Phoenix. Atmos Environ 33:5023–5036
Cooper CD, Alley FC (2002) Air pollution control. Waveland Press, New York
de Cos Juez FJ, Sánchez Lasheras F, García Nieto PJ, Suárez Suárez MA (2009) A new data mining methodology applied to the modelling of the influence of diet and lifestyle on the value of bone mineral density in post-menopausal women. Int J Comput Math 86(10):1878–1887
Domike JR, Zacaroli AC (2013) The Clean Air Act handbook. American Bar Association, Washington
Efron B, Tibshirani R (1997) Improvements on cross-validation: the .632+ bootstrap method. J Am Stat Assoc 92(438):548–560
Elbir T, Muezzinoglu A, Bayram A (2000) Evaluation of some air pollution indicators in Turkey. Environ Int 26(1–2):5–10
Freedman D, Pisani R, Purves R (2007) Statistics. W.W. Norton & Company, New York
Friedlander SK (2000) Smoke, dust and haze: fundamentals of aerosol dynamics. Oxford University Press, New York
Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19:1–141
Friedman JH, Roosen CB (1995) An introduction to multivariate adaptive regression splines. Stat Methods Med Res 4:197–217
García Nieto PJ (2001) Parametric study of selective removal of atmospheric aerosol by coagulation, condensation and gravitational settling. Int J Environ Heal R 11:151–162
García Nieto PJ (2006) Study of the evolution of aerosol emissions from coal-fired power plants due to coagulation, condensation, and gravitational settling and health impact. J Environ Manag 79(4):372–382
García Nieto PJ, Sánchez Lasheras F, de Cos Juez FJ, Alonso Fernández JR (2011) Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain). J Hazard Mater 195:414–421
García Nieto PJ, Alonso Fernández JR, Sánchez Lasheras F, de Cos Juez FJ, Díaz Muñiz C (2012) A new improved study of cyanotoxins presence from experimental cyanobacteria concentrations in the Trasona reservoir (northern Spain) using the MARS technique. Sci Total Environ 430:88–92
García Nieto PJ, Combarro EF, del Coz Díaz JJ, Montañés E (2013) A SVM-based regression model to study the air quality at local scale in Oviedo urban area (northern Spain): a case study. Appl Math Comput 219(17):8923–8937
Gardner MW, Dorling SR (1999) Neural network modelling and prediction of hourly NO_x and NO₂ concentrations in urban air in London. Atmos Environ 33(5):709–719
Godish T (2004) Air quality. Lewis Publishers, Boca Raton
Hastie T, Tibshirani R, Friedman J (2003) The elements of statistical learning. Springer, New York
Haykin S (1999) Neural networks, comprehensive foundation. Prentice Hall, New Jersey
Hewitt CN, Jackson AV (2009) Atmospheric science for environmental scientists. Wiley, New York
Hooyberghs J, Mensink C, Dumont D, Fierens F, Brasseur O (2005) A neural network forecast for daily average PM₁₀ concentrations in Belgium. Atmos Environ 39(18):3279–3289
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning: with applications in R. Springer, New York
Jerrett M, Burnett RT, Arden Pope C III, Ito K, Thurston G, Krewski D, Shi Y, Calle E, Thun M (2009) Long-term ozone exposure and mortality. New Engl J Med 360(11):1085–1095
Karaca F, Alagha O, Ertürk F (2005) Statistical characterization of atmospheric PM₁₀ and PM_2.5 concentrations at a non-impacted suburban site of Istanbul, Turkey. Chemosphere 59(8):1183–1190
Karaca F, Nikov A, Alagha O (2006) NN-AirPol: a neural-network-based method for air pollution evaluation and control. Int J Environ Pollut 28(3–4):310–325
Kukkonen J, Partanen L, Karpinen A, Ruuskanen J, Junninen H, Kolehmainen M, Niska H, Dorling S, Chatterton T, Foxall R, Cawley G (2003) Extensive evaluation of neural networks models for the prediction of NO₂ and PM₁₀ concentrations, compared with a deterministic modelling system and measurements in central Helsinki. Atmos Environ 37:4539–4550
Lantz B (2013) Machine learning with R. Packt Publishing, Birmingham
Lucking AJ, Lundback M, Mills NL, Faratian D, Barath SL, Pourazar J, Cassee FR, Donaldson K, Boon NA, Badimon JJ, Sandstrom T, Blomberg A, Newby DE (2008) Diesel exhaust inhalation increases thrombus formation in man. Eur Heart J 29(24):3043–3051
Lutgens FK, Tarbuck EJ (2012) The atmosphere: an introduction to meteorology. Prentice Hall, New York
Monteiro A, Lopes M, Miranda AI, Borrego C, Vautard R (2005) Air pollution forecast in Portugal: a demand from the new air quality framework directive. Int J Environ Pollut 5:1–9
Phalen RN (2011) Introduction to air pollution science. Jones & Bartlett Learning, Burlington
Picard R, Cook D (1984) Cross-validation of regression models. J Am Stat Assoc 79(387):575–584
Schnelle KB, Brown CA (2001) Air pollution control technology handbook. CRC Press, Boca Raton
Seinfeld JH, Pandis SN (2006) Atmospheric chemistry and physics: from air pollution to climate change. Wiley, New York
Sekulic SS, Kowalski BR (1992) MARS: a tutorial. J Chemometr 6:199–216
Singal SP (2012) Air quality monitoring and control strategy. Alpha Science International, Oxford
Suárez Sánchez A, García Nieto PJ, Riesgo Fernández P, del Coz Díaz JJ, Iglesias-Rodríguez FJ (2011) Application of a SVM-based regression model to the air quality study at local scale in the Avilés urban area (Spain). Math Comput Model 54(5–6):1453–1466
Törnqvist HK, Mills NL, Gonzalez M, Miller MR, Robinson SD, Megson IL, MacNee W, Donaldson K, Söderberg S, Newby DE, Sandström T, Blomberg A (2007) Persistent endothelial dysfunction in humans after diesel exhaust inhalation. Am J Resp Crit Care Med 176(4):395–400
Vapnik V (1999) The nature of statistical learning theory. Springer, New York
Vidoli F (2011) Evaluating the water sector in Italy through a two stage method using the conditional robust nonparametric frontier and multivariate adaptive regression splines. Eur J Oper Res 212(13):583–595
Vincent JH (2007) Aerosol sampling: science, standards, instrumentation and applications. Wiley, Chichester, England
Wang LK, Pereira NC, Hung YT (2004) Air pollution control engineering. Humana Press, New York
Wark K, Warner CF, Davis WT (1997) Air pollution: its origin and control. Prentice Hall, New York
Weinhold B (2008) Ozone nation: EPA standard panned by the people. Environ Health Persp 116(7):A302–A305
Xu QS, Daszykowski M, Walczak B, Daeyaert F, de Jonge MR, Heeres J, Koymans LMH, Lewi PJ, Vinkers HM, Janssen PA, Massart DL (2004) Multivariate adaptive regression splines—studies of HIV reverse transcriptase inhibitors. Chemometr Intell Lab 72(1):27–34

Acknowledgments

The authors wish to acknowledge the computational support provided by the Department of Mathematics at the University of Oviedo. Additionally, this paper has been funded by the Government of the Principality of Asturias through funds from the Programme of Science, Technology and Innovation (PCTI) of Asturias 2006–2009, cofinanced by 80 % within the priority Focus 1 of the Operational Programme FEDER of the Principality of Asturias 2007–2013 (research project FC–11–PC10–19). Finally, we would like to thank Anthony Ashworth for his revision of English grammar and spelling of the manuscript.

Author information

Authors and Affiliations

Department of Mathematics, Faculty of Sciences, University of Oviedo, 33007, Oviedo, Spain
P. J. García Nieto & E. García-Gonzalo
Department of Electrical Engineering, Campus de Viesques, University of Oviedo, 33204, Gijón, Spain
J. C. Álvarez Antón
Department of Mechanical Engineering, University of Vigo, 36200, Vigo, Spain
J. A. Vilán Vilán

Corresponding author

Correspondence to P. J. García Nieto.

Additional information

Responsible editor: Michael Matthies

About this article

Cite this article

Nieto, P.J.G., Antón, J.C.Á., Vilán, J.A.V. et al. Air quality modeling in the Oviedo urban area (NW Spain) by using multivariate adaptive regression splines. Environ Sci Pollut Res 22, 6642–6659 (2015). https://doi.org/10.1007/s11356-014-3800-0

Received: 13 September 2014
Accepted: 29 October 2014
Published: 21 November 2014
Issue Date: May 2015
DOI: https://doi.org/10.1007/s11356-014-3800-0
