1 Introduction

A number of recent studies have shown that air pollutants emitted from various sources have serious impacts on the environment and human health, particularly in urban areas. The U.S. EPA has designated six commonly found air pollutants as “criteria pollutants”: particulate matter (PM), ground-level ozone (O3), carbon monoxide (CO), sulfur oxides (SO2), nitrogen oxides (NOx), and lead (Pb) (US EPA 2012). Epidemiological studies worldwide have demonstrated that exposure to these pollutants is associated with numerous effects on human health, including increased respiratory symptoms, hospitalization for heart or lung diseases, and even premature death (WHO 2006; Jerrett et al. 2009; Clark et al. 2010; Lenters et al. 2010). In particular, PM2.5 (particulate matter with an aerodynamic diameter of less than 2.5 µm) can penetrate deeply into the lungs, causing or aggravating a variety of respiratory and cardiovascular illnesses, and can even lead to premature death (Pope et al. 2002; Pope and Dockery 2006; Laden et al. 2006; Turner et al. 2011; Lepeule et al. 2012). NOx (= NO + NO2), the main precursor of ozone and nitric acid, can also form particulates that cause respiratory problems and impair visibility. NO2 in particular, as an indicator of surface air quality, is associated with mortality (Steib et al. 2003; Burnett et al. 2004; Samoli et al. 2006) and respiratory morbidity (Brook et al. 2007). Exposure to CO can reduce the oxygen-carrying capacity of blood, which can cause myocardial ischemia (reduced oxygen to the heart), often accompanied by chest pain (angina), and mortality (Henz and Maeder 2005; Satran et al. 2005; Henry et al. 2006). SO2 is also linked with a number of adverse health effects and mortality (Krzyzanowski and Wojtyniak 1991; Liu et al. 2003).

Historically, the health risks of air pollutants have been assessed one pollutant at a time. In reality, people are exposed to a wide variety of pollutants every day and are increasingly aware of the potential health implications. Different air pollutants have different physical, chemical and toxic characteristics and therefore affect human health in different ways, as mentioned above. Moreover, there are potential interactions among the various factors, which produce combined impacts on human health. Health risk assessment is therefore evolving away from a focus on individual pollutants toward a multi-factor integrated risk assessment involving multiple air pollutants, which is referred to as cumulative risk assessment (US EPA 2007). The estimated effects of each pollutant, and of the interactions among the various factors, on the environment and human health may be inaccurate or uncertain to various degrees, and this uncertainty should be considered when the various factors and sources of information are combined. Integrated risk assessment of air pollution therefore carries inherent complexities and uncertainties, especially for human health. It is thus desirable to explore an efficient way to evaluate the multi-factor integrated health risk due to air pollution, in support of decision making in air quality management and planning.

Over the past decades, several stochastic methodologies have been developed for assessing the health risks of air pollution (Kontos et al. 1999; Bhattacharya et al. 2000; Economopoulou and Economopoulos 2002; Oettl et al. 2003; Cangialosi et al. 2008; Carnevale et al. 2012). However, when the uncertain factors, such as the pollutants’ physical, chemical and toxic characteristics, media conditions, receptor sensitivities, and dose–response effects, cannot be expressed as probability distributions, such stochastic methods are inapplicable. For example, if the probability of contracting cancer through exposure to site-related chemicals cannot be estimated because the marginal changes in exposure are relatively small, the health risk cannot be evaluated with a dose–response model.

In addition, when multiple factors are involved in the risk assessment, their latent interactions must also be considered in order to obtain a more complete and reliable assessment. A number of methods, such as screening techniques, stochastic optimization, and factorial analysis, have been proposed for dealing with various uncertainties and multi-factor interactions in air quality management (Maqsood and Huang 2003; Li et al. 2006; An and Eheart 2007; Lin et al. 2008; Qin et al. 2008; Lu et al. 2010; Qin et al. 2010; Wang and Huang 2013; Wang et al. 2013). These methods can deal with uncertainties that exist in various forms (e.g., interval numbers, fuzzy sets, and probability distributions). However, they cannot reveal the interactive effects of uncertainties on integrated risk assessment.

Since the publication of a seminal paper by Zadeh in 1965, fuzzy set theory has been established as an ideal method for dealing with various kinds of uncertainty and vagueness (Zadeh 1999). Fuzzy set theory differs from probability theory. Probability theory tries to make predictions about events from a state of partial knowledge, whereas fuzzy logic is concerned with the degree of truth, that is, with fuzziness and partial or relative truth. Many studies have reported risk assessments of environmental problems based on fuzzy set theory. For example, Smith (1995) developed a fuzzy aggregation approach for environmental quality evaluation. Chen et al. (1998) developed an integrated fuzzy risk assessment approach for evaluating environmental risks derived from petroleum-contaminated sites. Chen et al. (2003) also proposed a hybrid fuzzy-stochastic risk assessment approach for examining uncertainties associated with both source/media conditions and evaluation criteria in a groundwater quality management system. Sadiqa and Husain (2005) developed a fuzzy-based methodology for an aggregative environmental risk assessment of drilling waste. Lopez et al. (2008) developed a fuzzy expert system to characterize contaminated soil as a function of human and environmental risks. Sadiq and Tesfamariam (2009) applied intuitionistic fuzzy sets to the analytic hierarchy process (AHP) to handle both vagueness- and ambiguity-related uncertainties in the environmental decision-making process. Topuz et al. (2011) used the analytic hierarchy process and fuzzy logic to handle problems caused by the complexity of the environment and uncertain data when integrating environmental and human health risk assessment for industrial hazardous materials. Ocampo-Duque et al. (2013) developed a probabilistic fuzzy hybrid model to assess river water quality and compute an integrative water quality index using fuzzy logic reasoning. Mesa-Frias et al. (2014) developed a framework to quantify the uncertainty in the health impacts of environmental interventions and to evaluate the impacts of poor housing ventilation based on fuzzy set theory.

Despite the usefulness of fuzzy set theory, few studies have reported its application to air pollution risk assessment. Li et al. (2008) proposed an integrated fuzzy-stochastic modeling approach for quantifying uncertainties associated with both source/medium conditions and evaluation criteria and for assessing air pollution risks. Reshetin (2008) described the application of a fuzzy set formalism to model and assess the risk of carcinogenesis and additional mortality associated with air pollution. Kaya and Kahraman (2009) evaluated air pollution levels by using fuzzy specification limits and fuzzy standard deviation to obtain process capability indices. Wang and Huang (2013) developed an interactive fuzzy boundary interval programming (IFBIP) approach by incorporating the interactive fuzzy programming (IFP) and the interval-parameter linear programming (ILP) methods within a general framework for air quality management under uncertainty. However, none of these studies has applied the fuzzy set approach to the integrated health risk assessment of multi-factor air pollution.

The objective of this study is to develop a fuzzy set-OWA (FSOWA) approach to assess the integrated health risk associated with multi-source and multi-factor air pollution. A degree of fuzziness is incorporated into the air quality criteria by using fuzzy sets, so that absolute (crisp) criteria are avoided. The health risk and the relative importance of the various pollution factors are aggregated by two models (Max–min and Max-product composition) with uncertainties taken into account. The main advantage of FSOWA is that it can reveal the potential interactions among the various pollution factors and quantify the uncertainties using fuzzy sets and fuzzy membership functions for integrated health risk assessment. The developed FSOWA approach is examined with a case study of the state of California based on four criteria air pollutants (PM2.5, NO2, SO2, and CO). The gridded pollutant concentrations for 2008 predicted by the GIS-based multi-source and multi-box (GMSMB) modeling system (Wang and Chen 2013) are used as the inputs of the FSOWA modeling approach to evaluate the integrated health risk.

2 Methodology

2.1 GMSMB modeling system

The GMSMB modeling system, developed by Wang and Chen (2013), consists of two air quality models: the spatial multi-box model and the multi-source and multi-grid Gaussian model. Conventional box models are usually applied to areas of a few tens of square kilometers. In order to extend the GMSMB model to a large area, a multi-box model is employed, which is spatially extended by integrating it with the Geographic Information System (GIS) for simulating area source dispersions. However, the box model simulates the formation of pollutants within the box without providing any information about the local pollutant concentrations. The formulas of the state-of-the-art AERMOD model are therefore employed in the multi-source and multi-grid Gaussian model for simulating point source dispersions. The GMSMB modeling system is employed to estimate the gridded spatial concentration distributions of air pollutants, which are applied as inputs to the FSOWA modeling approach. More details about the GMSMB modeling system can be found in Wang and Chen (2013).

2.2 Fuzzy set-OWA approach

2.2.1 Fuzzy set theory

Fuzzy set theory, an extension of classical set theory, was first proposed by Lotfi Zadeh (1965). The theory provides a mathematical framework for handling categories that permit partial membership (or membership in degree), so that complex systems that are difficult to describe with conventional set theory can be modeled. A fuzzy set is characterized by a membership function, which represents numerically the degree to which an element belongs to the set (Zimmermann 1992). According to Zadeh’s definition (Zadeh 1965), if X is a collection of objects denoted generically by x, a fuzzy set A in X is defined as a set of ordered pairs of elements x and their membership grades:

$$ A = \left\{ \left( x, \mu_{A}\left( x \right) \right) \mid x \in X \right\} $$
(1)

where \( \mu_{A}(x) \) is the membership grade of x in A. The membership function is the mapping \( \mu_{A}: X \to [0, 1] \), which can take any value in the unit interval. The closer the value of \( \mu_{A}(x) \) is to unity, the more strongly x belongs to A.

A more convenient notation was proposed by Zadeh (1972). When X is a finite set \( \{x_{1}, x_{2}, \ldots, x_{n}\} \), a fuzzy set on X is expressed as

$$ A = \mu_{A} \left( x_{1} \right)/x_{1} + \mu_{A} \left( x_{2} \right)/x_{2} + \cdots + \mu_{A} \left( x_{n} \right)/x_{n} = \sum\limits_{i = 1}^{n} \mu_{A} \left( x_{i} \right)/x_{i} $$
(2)

For two fuzzy sets A and B defined on the universe X, the classical operations, including intersection, union and complement for a given element x belonging to X are carried out based on the minimum and maximum, i.e.

$$ \mu_{A \cap B} \left( x \right) = \hbox{min} \left\{ {\mu_{A} \left( x \right),\mu_{B} \left( x \right)} \right\}\,\,\,\forall \left( x \right) \in X $$
(3)
$$ \mu_{A \cup B} \left( x \right) = \hbox{max} \left\{ {\mu_{A} \left( x \right),\mu_{B} \left( x \right)} \right\}\,\,\,\,\forall \left( x \right) \in X $$
(4)
$$ \mu_{{\bar{A}}} \left( x \right) = 1 - \mu_{A} \left( x \right)\,\,\,\,\forall \left( x \right) \in X $$
(5)
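
The three operations in Eqs. (3)–(5) can be illustrated with a minimal Python sketch. The universe `x1…x3`, the membership grades, and the function names `f_intersection`, `f_union` and `f_complement` are illustrative assumptions, not data or code from this study.

```python
# Fuzzy sets on a finite universe, stored as {element: membership grade}.
# The sets A and B below are hypothetical examples, not values from the study.

def f_intersection(a, b):
    """Eq. (3): pointwise minimum of the membership grades."""
    return {x: min(a[x], b[x]) for x in a}

def f_union(a, b):
    """Eq. (4): pointwise maximum of the membership grades."""
    return {x: max(a[x], b[x]) for x in a}

def f_complement(a):
    """Eq. (5): one minus the membership grade."""
    return {x: 1.0 - a[x] for x in a}

A = {"x1": 0.2, "x2": 0.7, "x3": 1.0}
B = {"x1": 0.5, "x2": 0.4, "x3": 0.0}

print(f_intersection(A, B))
print(f_union(A, B))
print(f_complement(A))
```

Storing each set as a dictionary mirrors the pair notation of Eq. (1); both sets are assumed to be defined on the same universe.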

There are three other operations on fuzzy sets that are important, namely, concentration, dilation, and aggregation. Concentration and dilation modify one set, similar to the complement, whereas aggregation is another connective between sets, similar to union and intersection (Beliakov et al. 2007).

A fuzzy aggregation operator combines several fuzzy sets in a desirable way to produce a single fuzzy set. The aggregation operator on n fuzzy sets, where n ≥ 2, is formally defined by a function \( F: [0, 1]^{n} \to [0, 1] \) with the following properties (Cornelis et al. 2010):

  (i) \( F(\underbrace{0, 0, \ldots, 0}_{n\text{ times}}) = 0 \) and \( F(\underbrace{1, 1, \ldots, 1}_{n\text{ times}}) = 1 \)

  (ii) \( a \le b \) implies \( F\left( a \right) \le F\left( b \right) \) for all \( a, b \in [0, 1]^{n} \)

2.2.2 Fuzzy relation

Fuzzy relations generalize the concept of relations by admitting the notion of partial association between elements of universes (Pedrycz et al. 2011). Given two universes X and Y, a fuzzy relation R is any fuzzy subset of the Cartesian product of X and Y (Zadeh 1971). Equivalently, a fuzzy relation on X × Y is a mapping R: X × Y → [0, 1]. Fuzzy relations play an important role in fuzzy modeling, fuzzy diagnosis, and fuzzy control.

For a pair (x, y), the membership grade R(x, y) = 1 denotes that the two elements x and y are fully related. Conversely, R(x, y) = 0 means that these elements are unrelated, whereas intermediate values, 0 < R(x, y) < 1, indicate a partial association. The basic operations on fuzzy relations, namely union, intersection, and complement, conceptually follow the corresponding operations on fuzzy sets, since fuzzy relations are fuzzy sets formed on multidimensional spaces (Zadeh 1975).

One mechanism for constructing fuzzy relations is the Cartesian product extended to fuzzy sets. The concept closely follows the one adopted for ordinary sets, since it involves pairs of points of the underlying universes, each with an associated membership degree (Pedrycz and Gomide 2007).

If U: Z × X → [0, 1] and V: Z × Y → [0, 1] are fuzzy relations on the finite universes \( X = \{x_{1}, x_{2}, \ldots, x_{n}\} \), \( Z = \{z_{1}, z_{2}, \ldots, z_{p}\} \), and \( Y = \{y_{1}, y_{2}, \ldots, y_{m}\} \), represented by the (p × n) and (p × m) fuzzy relational matrices \( [u_{ki}] \) and \( [v_{kj}] \), respectively, and \( R = [r_{ij}] \) is the (n × m) fuzzy relational matrix associated with a fuzzy relation R: X × Y → [0, 1], then the three relations are linked by the composition:

$$ V = U \bullet R $$
(6)

Denote by \( U^{k} \) the kth row of U and by \( V^{k} \) the kth row of V, k = 1, 2, …, p. Let \( R^{j} \) be the jth column of R, j = 1, 2, …, m. Equation (6) can be rewritten as follows:

$$ \begin{pmatrix} V^{1} \\ V^{2} \\ \vdots \\ V^{p} \end{pmatrix} = \begin{pmatrix} U^{1} \\ U^{2} \\ \vdots \\ U^{p} \end{pmatrix} \bullet \begin{pmatrix} R^{1} & R^{2} & \cdots & R^{m} \end{pmatrix} = \begin{pmatrix} U^{1} \bullet R^{1} & U^{1} \bullet R^{2} & \cdots & U^{1} \bullet R^{m} \\ \vdots & \vdots & \ddots & \vdots \\ U^{p} \bullet R^{1} & U^{p} \bullet R^{2} & \cdots & U^{p} \bullet R^{m} \end{pmatrix} $$
(7)

where • denotes a composition operator built from a t-norm and a t-conorm; the two instances used here are the max–min and the max-product compositions (Zadeh 1971).
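
As a rough illustration of Eqs. (6) and (7), the sketch below composes two small relational matrices with both compositions. The matrices `U` and `R` and the helper names `compose`, `max_min` and `max_product` are hypothetical and chosen only for this example; the entry rule assumed is \( v_{kj} = \max_{i} t(u_{ki}, r_{ij}) \), with t being either the minimum or the product.

```python
def compose(U, R, t_norm):
    """Composition V = U • R for matrices given as lists of rows.

    U is p x n, R is n x m and the result is p x m, with
    v[k][j] = max over i of t_norm(u[k][i], r[i][j]).
    """
    n, m = len(R), len(R[0])
    return [[max(t_norm(row[i], R[i][j]) for i in range(n)) for j in range(m)]
            for row in U]

def max_min(U, R):
    """Max-min composition: the t-norm is the minimum."""
    return compose(U, R, min)

def max_product(U, R):
    """Max-product composition: the t-norm is the algebraic product."""
    return compose(U, R, lambda a, b: a * b)

# Hypothetical relational matrices: U is 2 x 2, R is 2 x 3.
U = [[0.3, 0.8],
     [1.0, 0.5]]
R = [[0.6, 0.2, 0.9],
     [0.4, 1.0, 0.1]]

print(max_min(U, R))
print(max_product(U, R))
```

Passing the t-norm as an argument keeps the two compositions in one routine, which makes it easy to see that they differ only in how each pair \( (u_{ki}, r_{ij}) \) is combined before the maximum is taken.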

2.2.3 Fuzzy set-ordered weighted averaging (FSOWA) approach

When fuzzy set theory is used to produce an aggregate fuzzy set by operating on the membership grades of fuzzy sets, two potential pitfalls in the aggregation process must be considered: exaggeration and eclipsing (Ott 1978). Exaggeration occurs when the individual pollution factors have low values (i.e., in an acceptable range), yet the aggregate comes out unacceptably high. Eclipsing is the converse phenomenon, in which one or more of the pollution factors have relatively high values (i.e., in an unacceptable range), yet the aggregate comes out acceptably low. The extent of both phenomena typically depends on the method of aggregation (Veiga 1995).

There are many different aggregation operators (Beliakov et al. 2007). To simultaneously reduce both exaggeration and eclipsing, an ordered weighted averaging (OWA) operator (Yager 1988), which is based on averaging aggregation, is employed. Yager (1988) defined the OWA operator as follows: a mapping \( F: R^{n} \to R \) (where R = [0, 1]) is an ordered weighted averaging operator of dimension n if it is associated with a weighting vector \( (\omega_{1}, \omega_{2}, \ldots, \omega_{n})^{T} \) such that

$$ \omega_{i} \in \left[ {0,1} \right],\,\,\sum\limits_{i = 1}^{n} {\omega_{i} } = 1 $$
(8)
$$ {\text{and}}\,\,\,\,\,F\left( {a_{1} ,a_{2} , \ldots a_{n} } \right) = \sum\limits_{j = 1}^{n} {b_{j} } \cdot \omega_{j} $$
(9)

where b j is the jth largest element of \( (a_{1} , \, a_{2} , \ldots,a_{n} ) \). The OWA operators are symmetric aggregation functions that allocate weights according to the input values, thus can emphasize the largest, smallest or mid-range inputs. By using OWA operators, the exaggeration and eclipsing in aggregation function can be reduced.
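
A minimal Python sketch of Eqs. (8) and (9) is given below. The input values and weighting vectors are illustrative assumptions; the function name `owa` is likewise hypothetical. The three calls show how the choice of weights moves the result between the maximum, the minimum and the arithmetic mean of the inputs.

```python
def owa(values, weights):
    """Ordered weighted average, Eq. (9): sum of b_j * w_j,
    where b_j is the j-th largest input value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    # Eq. (8): each weight lies in [0, 1] and the weights sum to 1.
    if any(w < 0 or w > 1 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    ordered = sorted(values, reverse=True)  # b_1 >= b_2 >= ... >= b_n
    return sum(b * w for b, w in zip(ordered, weights))

a = [0.2, 0.9, 0.5]           # hypothetical membership grades

print(owa(a, [1, 0, 0]))      # all weight on the largest input
print(owa(a, [0, 0, 1]))      # all weight on the smallest input
print(owa(a, [1/3, 1/3, 1/3]))  # equal weights give the arithmetic mean
```

Because the weights attach to ranked positions rather than to particular inputs, reordering `a` does not change the result, which is the symmetry property mentioned above.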

Generally, a triangular fuzzy number (TFN) or a trapezoidal fuzzy number (ZFN) is used to represent a fuzzy variable. A ZFN can be represented by four vertices (a, b, c, d) on the universe of discourse, where a is the minimum value, [b, c] is the most likely interval, and d is the maximum value. A TFN is a special case of a ZFN in which b = c. In this study, the TFN is used because the relationship between health risk and each air pollution factor is considered to be linear and relatively stable over a given period. The TFN-based assessment is characterized by two distinct factors: the membership grade (i.e. fuzzy relation) and the weight coefficient. The membership grade is the degree to which each pollution factor belongs to each fuzzy air quality criterion. The weight coefficient denotes the relative importance of each pollution factor to air quality and is used to identify the different scales of health impact among the various pollution factors. The higher the weight coefficient, the larger the impact.
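
The relationship between the two shapes can be sketched in Python as follows. The function name `trapezoid` and the vertex values are hypothetical; the piecewise-linear form is the usual textbook definition of a trapezoidal membership function and is given here as an assumption, not as the membership function used later in this study.

```python
def trapezoid(x, a, b, c, d):
    """Membership grade of x in a trapezoidal fuzzy number (a, b, c, d),
    assuming a <= b <= c <= d. With b == c it reduces to a triangle."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0                    # plateau: the most likely interval
    if x < b:
        return (x - a) / (b - a)      # rising edge between a and b
    return (d - x) / (d - c)          # falling edge between c and d

# Hypothetical vertices, for illustration only.
print(trapezoid(1.5, 1, 2, 3, 4))     # on the rising edge
print(trapezoid(2.5, 1, 2, 3, 4))     # on the plateau
print(trapezoid(2.0, 1, 2, 2, 3))     # TFN (b == c): grade at the peak
```

The plateau test is placed before the edge tests so that degenerate cases such as a = b or c = d do not lead to a division by zero.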

Incorporating the OWA operators into the max–min and max-product compositions (Zadeh 1971), we have two fuzzy set-ordered weighted averaging (FSOWA) models for air quality integrated health risk assessment:

Max–min composition model:

$$ F = \left( {f_{j} } \right) = \hbox{max} \left\{ {\hbox{min} \left( {w_{1} ,r_{1j} } \right),\hbox{min} \left( {w_{2} ,r_{2j} } \right), \ldots ,\hbox{min} \left( {w_{m} ,r_{mj} } \right)} \right\}\,\,\, \,\,\,j = 1,{ 2}, \ldots ,n $$
(10)

Max-product composition model:

$$ F = \left( {f_{j} } \right) = \hbox{max} \left\{ {\left( {w_{1} \cdot r_{1j} } \right),\left( {w_{2} \cdot r_{2j} } \right), \ldots ,\left( {w_{m} \cdot r_{mj} } \right)} \right\}\,\,\,\,\,\,\, j = 1,{ 2}, \ldots ,n $$
(11)

where \( F = (f_{j}) \) represents the membership grade (possibility) for integrated health risk level j to occur; \( w_{i} \) is the degree of importance for pollutant i; and \( r_{ij} \) is the membership grade for the fuzzy relation between pollutant i and risk level j. By using these two models, the two potential pitfalls, exaggeration and eclipsing, can be simultaneously reduced in the integrated health risk assessment.

The FSOWA approach is a generalized aggregation transformation that provides flexible aggregation ranging between the minimum and the maximum operators. It can quantify the different impact scales of various pollution factors on air quality and stress the maximum effect. For an air quality system containing several pollutants with high concentrations, the integrated health risk level can be obtained through the above models.

2.2.4 Integrated health risk assessment

The integrated health risk caused by multiple air pollutants is assessed based on the gridded spatial concentration distributions predicted by the GMSMB modeling system. The assessment consists of five stages: (1) quantification of fuzzy health risk levels using six fuzzy sets based on the U.S. EPA Air Quality Index (AQI) (US EPA 2009a); (2) construction of fuzzy membership functions; (3) calculation of relative importance (i.e. the weighting coefficient \( w_{i} \) for each pollution factor); (4) construction of the FSOWA models; and (5) assessment of integrated health risk based on the FSOWA models. An overview of the system framework and the five stages of integrated health risk assessment (shaded boxes) is shown in Fig. 1.

Fig. 1 Framework of FSOWA approach

2.2.4.1 Quantification of fuzzy health risk levels

The fuzzy health risk levels are represented by the classifying representative values \( e_{i} \) and the benchmarks \( s_{i} \). According to the Air Quality Index (AQI) defined by the U.S. EPA, air quality is divided into six levels on a scale that runs from 0 to 500 (US EPA 2009a). The higher the AQI value, the higher the risk level of air pollution and the greater the health concern. An AQI value of 50 indicates that the air quality is considered satisfactory and that air pollution poses little or no risk; this is the level that the U.S. EPA has set as the annual mean value in the NAAQS for these pollutants to protect public health. An AQI value of 100 generally corresponds to acceptable air quality, although there may be a moderate health concern for a very small number of people. An AQI value over 100 represents unhealthy air quality, and a value over 300 represents hazardous air quality (US EPA 2009a). Using the AQI calculator developed by the U.S. EPA (US EPA 2009b), the AQI values can be converted to pollutant concentrations, as shown in Table 1. Table 1 lists only four criteria pollutants (i.e. PM2.5, NO2, SO2 and CO), because in this study the pollutant concentrations are provided by the GMSMB modeling system, which cannot be used for O3 and Pb.

Table 1 AQI values and corresponding concentration intervals of pollution factors (US EPA 2009a)

Based on Table 1, the concentration intervals can be transformed into fuzzy sets, which are represented by the classifying representative values \( e_{i} \) and the benchmarks \( s_{i} \), the latter being the limit of the safe level of each pollution factor. Six fuzzy sets, i.e. six health risk levels, are defined to represent air quality: ‘Good’, ‘Moderate’, ‘Low unhealthy’, ‘Unhealthy’, ‘Very unhealthy’ and ‘Hazardous’. For the first risk level, the upper limit concentration values are taken as the classifying representative values \( e_{1} \), since they are close to the annual mean values of the NAAQS for these pollutants (except for CO, for which the value is half of the 8-hour mean) (US EPA 2006). For the remaining risk levels, the average concentration values of each interval are taken as the classifying representative values \( e_{i} \). According to the AQI, the second risk level is “acceptable and only a moderate health concern for a very small number of people”, while the third risk level means “although general public is not likely to be affected, members of sensitive groups may experience health effects, especially for people with lung disease, older adults and children who are at a greater risk from exposure to ozone and particles in the air”. Thus, the second risk level is considered the limit of the safe level, and its upper limit concentration values are taken as the benchmarks \( s_{i} \), as shown in Table 2.

Table 2 Fuzzy health risk levels

2.2.4.2 Construction of fuzzy membership functions

The membership function represents the degree to which a specified concentration belongs to each fuzzy health risk level. The triangular fuzzy number (TFN) is used to determine the membership functions based on the classifying representative values \( e_{i} \) of each health risk level. The linear membership functions are as follows:

$$ r_{m} \left( c_{i} \right) = \begin{cases} 1, & c_{i} \le e\left( 1 \right)\ \text{or}\ c_{i} \ge e\left( 6 \right) \\ \dfrac{e\left( m + 1 \right) - c_{i}}{e\left( m + 1 \right) - e\left( m \right)}, & e\left( m \right) \le c_{i} < e\left( m + 1 \right) \\ 0, & \text{else} \end{cases} \qquad r_{m + 1} \left( c_{i} \right) = 1 - r_{m} \left( c_{i} \right), \quad e\left( m \right) \le c_{i} < e\left( m + 1 \right) $$
(12)

where \( r_{m}(c_{i}) \) denotes the membership grade of each pollution factor with respect to the classifying representative value of risk level m; m is the index of the health risk level; \( c_{i} \) is the pollutant concentration; and e(m) denotes the classifying representative value of risk level m. Following the membership functions, the fuzzy function curves for the health risk levels are created, as shown in Fig. 2.
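
One possible reading of Eq. (12) is sketched in Python below. The representative values in `e_demo` are hypothetical round numbers chosen only to make the arithmetic easy to follow; they are not the values of Table 2. The sketch assumes that the first case of Eq. (12) assigns full membership to level 1 when the concentration is at or below e(1) and to level 6 when it is at or above e(6).

```python
def membership_grades(c, e):
    """Membership grades [r_1, ..., r_M] of concentration c, following Eq. (12).

    e holds the classifying representative values e(1) < e(2) < ... < e(M).
    """
    grades = [0.0] * len(e)
    if c <= e[0]:
        grades[0] = 1.0               # at or below e(1): fully in the first level
    elif c >= e[-1]:
        grades[-1] = 1.0              # at or above e(M): fully in the last level
    else:
        for m in range(len(e) - 1):
            if e[m] <= c < e[m + 1]:
                r = (e[m + 1] - c) / (e[m + 1] - e[m])
                grades[m] = r             # grade of the lower level
                grades[m + 1] = 1.0 - r   # grade of the next level up
                break
    return grades

# Hypothetical representative values for six risk levels (not from Table 2).
e_demo = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

print(membership_grades(12.0, e_demo))
print(membership_grades(5.0, e_demo))
print(membership_grades(70.0, e_demo))
```

With this construction at most two adjacent levels receive a nonzero grade and the grades of a single pollutant always sum to 1.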

Fig. 2 Curves of the membership function for health risk levels

2.2.4.3 Calculation of weighting coefficient

In this study, the relative importance, i.e. the weighting coefficient \( w_{i} \) of each pollution factor, is measured against the corresponding benchmark \( s_{i} \), which is the limit of the safe level. When the pollutant concentration is lower than the benchmark, the pollutant is considered to have a smaller impact on air quality and thus to cause a lower health risk level. Conversely, when the pollutant concentration is higher than the benchmark, it is considered to have a larger impact on air quality and to cause a higher health risk level. Thus, the weighting coefficient \( w_{i} \) can be calculated using the following formula:

$$ w_{i} = \frac{c_{i}}{s_{i}} $$
(13)

where \( c_{i} \) is the predicted concentration of each pollutant and \( s_{i} \) is the benchmark of the fuzzy health risk level of each pollution factor.

In this case, the weight coefficient \( w_{i} \) is larger than 1 when the predicted pollutant concentration is higher than the benchmark. Consequently, the range of the aggregation function, defined above as \( F: [0, 1]^{n} \to [0, 1] \), should be extended to \( [0, r_{j} \cdot \max(w_{1}, w_{2}, \ldots, w_{n})] \), where \( r_{j} \) is the jth largest membership grade of \( (r_{1}, r_{2}, \ldots, r_{n}) \).
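
Equation (13) amounts to an element-wise ratio, as the short Python sketch below shows. The concentrations and benchmarks are hypothetical numbers chosen for illustration and are not the values of Table 2; the second pollutant is deliberately placed above its benchmark to show a weight larger than 1.

```python
def weighting_coefficients(concentrations, benchmarks):
    """Eq. (13): w_i = c_i / s_i for each pollution factor."""
    if len(concentrations) != len(benchmarks):
        raise ValueError("one benchmark is needed per pollutant")
    if any(s <= 0 for s in benchmarks):
        raise ValueError("benchmarks must be positive")
    return [c / s for c, s in zip(concentrations, benchmarks)]

# Hypothetical values: pollutant 1 below its benchmark, pollutant 2 above it.
c_demo = [18.0, 0.05]
s_demo = [36.0, 0.04]

w_demo = weighting_coefficients(c_demo, s_demo)
print(w_demo)   # first weight below 1, second weight above 1
```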

2.2.4.4 Construction of FSOWA modeling

As discussed in Sect. 2.2.3, the two models, i.e. the Max–min composition model and the Max-product composition model are used in this study.

2.2.4.5 Assessment of integrated health risk level

By loading the pollutant concentrations predicted by the GMSMB modeling system, the integrated health risk levels caused by the four criteria pollutants can be assessed using the FSOWA modeling approach. The results from the two models (Eqs. 10, 11) can be cross-verified against each other, and the maximum integrated risk level is taken as the result of the health risk assessment.

3 Case study

3.1 Overview of the study area

The state of California is chosen as the study area because it has a wide variety of climates, geographic features, meteorological factors and emission sources. Located on the West Coast of the United States, it is the most populous and third-largest state, with an area of 160,000 square miles (414,000 km2). The capital of the state is Sacramento. The five largest cities are Los Angeles, San Diego, San Jose, San Francisco, and Fresno (California Department of Finance 2009). The diverse geography ranges from the Pacific Coast in the west, to the Sierra Nevada Mountains in the east, to the Mojave Desert areas in the southeast and to the Redwood-Douglas fir forests of the northwest. The center of the state is dominated by the Central Valley, a major agricultural area. The climate is often compared to that of the Mediterranean because of its warm, dry summers and mild, wet winters. Farther inland, summers are hot and dry, and at higher altitudes the weather is more typical of a four-season cycle with cold, snowy winters. For the purpose of managing air resources on a regional scale, the state of California is divided into 15 air basins by the California Air Resources Board (CARB) based on similar meteorological and geographic conditions and the state political boundaries, as shown in Fig. 3 (California Air Resources Board (CARB) 2009a).

Fig. 3 Map of air basins in the state of California (CARB 2009a)

3.2 Prediction of pollutant concentrations

In the GMSMB modeling system, the state of California is horizontally divided into 10 km × 10 km grid cells (Wang and Chen 2013). The emission inventory data are obtained from the Air Emission Inventory Database of the CARB (CARB 2009b). The surface meteorological data, including ambient temperature, wind speed and direction with frequency distributions, humidity, precipitation and cloud cover measured from over 800 surface meteorological sites, are extracted from the CARB’s real-time Air Quality and Meteorological Information System (AQMIS 2) (CARB 2009c). The upper air meteorological data of monthly average at the heights from 3 m and up from the ground are obtained from the NOAA (National Oceanic and Atmospheric Administration) Upper-Air Data products (NOAA 2010). All of these data are processed as the input to the GMSMB. The annual mean concentrations of four criteria pollutants (i.e. PM2.5, NO2, SO2 and CO) at each grid center are predicted for 2008. According to the predicted concentration intervals, the California air basins are divided into several relative pollution levels (level is defined by a specific concentration range being represented with a color on GIS map), as shown in Fig. 4.

Fig. 4 Annual mean spatial concentration distribution maps based on the air basins in the state of California in 2008 (Wang and Chen 2013)

Figure 4a shows that the state of California is evenly divided into four relative pollution levels according to the predicted concentration intervals of PM2.5 in 2008. The highest level (a range of 15.1–21.6 μg/m3, shown in dark red) is predicted in four regions: (a) the South Coast; (b) the San Joaquin Valley; (c) the San Francisco Bay Area; and (d) the Sacramento Valley. The maximum modeled concentration for these areas is 21.6 μg/m3, which is 1.8 times the California Ambient Air Quality Standards (CAAQS) (12 μg/m3) (CARB 2009d, same as below) and the National Ambient Air Quality Standards (NAAQS) (12 μg/m3) (US EPA 2006, same as below). The second highest PM2.5 pollution level (a range of 10.1–15.0 μg/m3, shown in salmon color) is found in: (a) the South Central Coast; (b) Mojave Desert Kern; and (c) San Diego. The maximum concentration for these areas is 15.0 μg/m3, which exceeds the NAAQS and the CAAQS by a factor of 1.25. The third highest PM2.5 pollution level (a range of 5.1–10.0 μg/m3, shown in dark pink) is obtained in: (a) the North Central Coast; (b) Mountain Counties; (c) the North Coast; and (d) Salton Sea. The maximum concentration for these areas is 10.0 μg/m3, which meets the NAAQS and the CAAQS. For the remaining regions, the lowest PM2.5 pollution level (0.0–5.0 μg/m3, shown in light pink) is predicted.

Figure 4b presents the predicted CO concentration distribution for the year 2008, which is marked with colors red to pink according to three pollution levels. The maximum modeling result is 8.5 ppm occurring in the South Coast and San Francisco Bay Area, which meets the NAAQS and the CAAQS (8 h value at 9 ppm).

Similarly, Fig. 4c, d present the predicted SO2 and NO2 concentration distributions for the year 2008, marked in red to pink based on the pollution levels. The maximum SO2 concentration is 0.007 ppm found in the South Coast, San Francisco Bay Area and San Joaquin Valley, which is lower than the NAAQS (0.030 ppm for certain areas). The highest NO2 concentrations also occur in the South Coast and San Joaquin Valley, with a maximum of 0.036 ppm, which is just over the CAAQS (0.030 ppm) and is lower than the NAAQS (0.053 ppm).

The modeling results from the GMSMB have been validated with the monitoring values obtained from the U.S. EPA Air Quality System (AQS) Database (US EPA 2010a). Since the annual average monitoring value for CO is not available, CO is not included in the model error analysis. The correlations between the modeling results and the monitoring values are analyzed with R 2 values, which are 0.89, 0.90 and 0.94, for PM2.5, NO2, and SO2, respectively. The modeling results show satisfactory agreement with the monitoring values with slope of 0.84 and intercept of 2.25 for PM2.5, slope of 0.82 and intercept of 0.003 for NO2, and slope of 0.90 and intercept of 0.0002 for SO2.
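
The slope, intercept and R 2 statistics quoted above can be obtained from an ordinary least-squares fit, sketched below in Python. Which variable was regressed on which is not stated in the text, so the sketch assumes monitored values on the x axis and modeled values on the y axis; the function name `linear_fit` and the paired data are hypothetical and are not the AQS monitoring values.

```python
def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept and its R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length (at least 2 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)   # squared Pearson correlation
    return slope, intercept, r_squared

# Hypothetical paired annual means (monitored, modeled), for illustration only.
monitored = [5.0, 10.0, 15.0, 20.0]
modeled = [6.0, 10.5, 14.0, 19.5]

slope, intercept, r2 = linear_fit(monitored, modeled)
print(slope, intercept, r2)
```

A slope below 1 combined with a positive intercept, as reported for PM2.5, would mean under this assumption that the model tends to overestimate at low concentrations and underestimate at high ones.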

3.3 Integrated health risk assessment

By loading the pollutant concentrations predicted by the GMSMB modeling system, the integrated health risk levels caused by the four criteria pollutants can be assessed using the FSOWA modeling approach. An arbitrary grid in the study area is taken as an example to illustrate the details of the computational process. The concentrations of the four criteria pollutants, PM2.5, NO2, SO2 and CO, in this grid are 18.5 μg/m3, 0.063 ppm, 0.045 ppm, and 3.3 ppm, respectively. The procedure of the integrated health risk assessment is as follows:

  (1) Calculation of the membership grades (r_ij) of pollutant concentrations. The membership grade (r_ij) is the degree to which the predicted concentration of each pollution factor in the grid belongs to each classifying representative value of the fuzzy health risk levels (see Table 2). The membership grade matrix R is obtained using Eq. (12):

    $$ R = \left[ r_{ij} \right] = \begin{bmatrix} 0.6900 & 0.3100 & 0 & 0 & 0 & 0 \\ 0.3750 & 0.6250 & 0 & 0 & 0 & 0 \\ 0.8000 & 0.2000 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} $$

  (2) Calculation of the weighting coefficient w_i of each pollution factor. The vector of weighting coefficients W is determined by Eq. (13):

    $$ W = (0.5226,\ 0.8533,\ 0.3125,\ 0.3511) $$

  (3) Calculation of the integrated health risk caused by the four pollution factors. Two integrated risk assessment results are obtained using the two FSOWA models (Eqs. 10, 11):

    $$ \begin{gathered} F_{1} = \left( 0.5226,\ 0.6250,\ 0,\ 0,\ 0,\ 0 \right) \hfill \\ F_{2} = \left( 0.3606,\ 0.5333,\ 0,\ 0,\ 0,\ 0 \right) \hfill \\ \end{gathered} $$
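The two results in step (3) can be reproduced from R and W with a few lines of code. The sketch below assumes that Eqs. (10) and (11) take the standard Max–min form, F_j = max_i min(w_i, r_ij), and Max-product form, F_j = max_i (w_i · r_ij), and that the rows of R follow the order PM2.5, NO2, SO2, CO. Both assumptions are consistent with the numbers of the worked example. The helper name `compose` is illustrative.

```python
# Membership grade matrix R from the worked example
# (rows assumed to be PM2.5, NO2, SO2, CO; columns are the six fuzzy risk levels).
R = [
    [0.6900, 0.3100, 0, 0, 0, 0],
    [0.3750, 0.6250, 0, 0, 0, 0],
    [0.8000, 0.2000, 0, 0, 0, 0],
    [1.0, 0, 0, 0, 0, 0],
]
# Weighting coefficients W from the worked example, in the same pollutant order.
W = [0.5226, 0.8533, 0.3125, 0.3511]


def compose(weights, grades, combine):
    """For each risk level j, take the maximum over pollutants i of combine(w_i, r_ij)."""
    columns = zip(*grades)  # iterate over risk levels instead of pollutants
    return [max(combine(w, r) for w, r in zip(weights, column)) for column in columns]


F1 = compose(W, R, min)                    # Max-min composition
F2 = compose(W, R, lambda w, r: w * r)     # Max-product composition

print([round(v, 4) for v in F1])  # [0.5226, 0.625, 0, 0, 0, 0]
print([round(v, 4) for v in F2])  # [0.3606, 0.5333, 0, 0, 0, 0]
```

In both models the level 2 entry is driven by NO2, which has the largest weight, while the level 1 entry is driven by PM2.5.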

The results are illustrated in Fig. 5, which shows the relationships between the integrated health risk and the fuzzy health risk levels.

Fig. 5 Integrated health risk assessment results using the Max–min composition and Max-product composition models: a the solution from the Max–min composition model; b the solution from the Max-product composition model. The shaded parts are the areas with higher membership grade

Figure 5a is the solution from the Max–min composition model, which indicates that the integrated health risk lies between level 1 (membership grade of 0.5226) and level 2 (membership grade of 0.6250), with the maximum membership grade corresponding to fuzzy health risk level 2. Figure 5b is the solution from the Max-product composition model, which also shows that the integrated risk lies between level 1 (membership grade of 0.3606) and level 2 (membership grade of 0.5333). In both models the maximum membership grade corresponds to fuzzy risk level 2, which cross-verifies that the integrated health risk in this grid belongs to fuzzy risk level 2. This means that the air quality is moderate, i.e. acceptable; however, there may be a moderate health concern for people who are more sensitive to air pollution.
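The assignment of the grid to a single risk level follows the maximum membership principle: the level with the largest membership grade is selected. A minimal sketch of this step is given below; the function name `risk_level` and the tie-breaking rule (the lowest level wins) are assumptions made for illustration.

```python
def risk_level(memberships):
    """Return the 1-based fuzzy risk level with the largest membership grade.

    On ties, the lowest level is returned (an assumption, not stated in the text).
    """
    best_index = max(range(len(memberships)), key=lambda j: memberships[j])
    return best_index + 1


F1 = [0.5226, 0.6250, 0, 0, 0, 0]  # Max-min result from the worked example
F2 = [0.3606, 0.5333, 0, 0, 0, 0]  # Max-product result from the worked example

levels = (risk_level(F1), risk_level(F2))
print(levels)  # (2, 2)
```

Agreement between the two levels is what the text refers to as cross-verification.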

4 Results and discussion

4.1 Results

Using the GMSMB and FSOWA modeling approach, the air quality integrated health risk assessment is obtained on 10 km × 10 km grids for the state of California in 2008 and is presented visually in ArcGIS, as shown in Fig. 6a.

Fig. 6 Air quality integrated health risk assessment for the state of California in 2008: a result from this study; b according to the AQI values based on county (US EPA 2010b)

Fig. 6a shows that the air quality in most areas of the state of California belongs to the first fuzzy health risk level (dark green areas), namely, the air quality is considered satisfactory, and air pollution poses little or no risk. Only a few areas (i.e. the South Coast, San Diego, the San Joaquin Valley, the Mojave Desert and the San Francisco Bay Area) belong to risk level 2 (yellow areas), indicating moderate air quality and some health concerns for more sensitive people. No area belongs to any of the remaining risk levels.

The integrated health risk assessment result is compared with the U.S. EPA Air Quality Index (AQI) Report compiled by county for 2008 (US EPA 2010b). In the AQI report, days are counted in five categories based on the AQI values: “good”, “moderate”, “unhealthy for sensitive groups”, “unhealthy”, and “very unhealthy”. The AQI values range from 12 to 92 and are presented visually in Fig. 6b. Air quality with 0 < AQI < 45 is considered good (dark green), 45 < AQI < 50 is considered potentially moderate (light green), and 50 < AQI < 100 is considered moderate (yellow). Table 3 lists the counties with AQI values higher than 45 and the number of days for each dominant pollutant during 2008.
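The color classes used for Fig. 6b amount to a simple binning of the county AQI values. The sketch below illustrates that binning. The text gives only strict inequalities, so the handling of the boundary values 45 and 50 (assigned here to the higher class) is an assumption, as is the function name `aqi_map_class`.

```python
def aqi_map_class(aqi):
    """Map a county AQI value to the class used for Fig. 6b.

    Boundary values 45 and 50 go to the higher class; this is an assumption,
    since the text only gives strict inequalities.
    """
    if not 0 < aqi < 100:
        raise ValueError("this sketch only covers the range 0 < AQI < 100")
    if aqi < 45:
        return "good"
    if aqi < 50:
        return "potentially moderate"
    return "moderate"


# The extreme AQI values reported for California in 2008.
print(aqi_map_class(12))  # good
print(aqi_map_class(92))  # moderate
```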

Table 3 Air quality index statistics for the state of California in 2008 (US EPA 2010b)

Figure 6 shows that the air quality integrated health risk assessment from this study (Fig. 6a) is quite consistent with the AQI statistical results (Fig. 6b) in most counties. Differences occur only in a few counties, such as those of the San Francisco Bay Area, Imperial, Inyo, Mariposa, and San Luis Obispo. The differences probably arise because different air pollutants are used in this study and in the AQI statistics. In this study, four criteria pollutants, i.e. PM2.5, NO2, SO2 and CO, are used for the air quality integrated health risk assessment, while in the AQI statistics two more pollutants, i.e. O3 and PM10, are used (US EPA 2010b). Table 3 shows that PM2.5 is the dominant pollutant for the AQI values among the four criteria pollutants, except in the counties with the differences mentioned above. This further suggests that the differences might be caused by O3 and PM10, which are not used in this study. In addition, the FSOWA modeling system is based on 10 km × 10 km grids, while the AQI statistics are based on counties; this is probably another reason for the differences.

4.2 Discussion

The purpose of this study is to develop an approach for determining the integrated health risk due to various air pollution factors, taking into account the different impact of each factor. Determining the integrated risk involves various uncertainties. They may arise from imprecise knowledge because of limited information, or from the random variability found in the stochastic nature of most real-world variables. It could be argued that the fuzzy-set method provides a better characterization of the uncertainties in circumstances with limited information about statistical parameters or imprecise knowledge. In this study, the fuzzy health risk levels are derived based on fuzzy set theory. They are defined by triangular membership functions, in which the benchmarks represent the lower and upper bounds and the classifying representative values represent the base points of the fuzzy evaluation criteria (Table 2). By taking into account the multiple air pollution factors and the relative impact of each pollution factor based on fuzzy set theory, two models (Max–min and Max-product composition) for the integrated health risk assessment are developed to deal with the uncertainties in the model parameters. The integrated risk is determined by the membership grade (r_i, magnitude) and the relative importance (w_i, weighting coefficient). The uncertainty in the health risk of each pollution factor is treated using fuzzy sets, and the integrated health risk is determined based on the OWA operators for ambient air quality. A potential limitation of the fuzzy set approach is that it does not incorporate knowledge regarding correlations and other statistical information about the parameters, which matters in circumstances where there is sufficient information to incorporate statistics such as means and correlations (Mesa-Frias et al. 2014).
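To make the role of the triangular membership functions concrete, the sketch below computes membership grades by linear interpolation between adjacent classifying representative values. It is only an illustration: the exact form of the membership function used in this study is not reproduced here, the function name `membership_grades` is hypothetical, and the representative values are invented placeholders, not the entries of Table 2. The sketch is, however, consistent with the worked example in Sect. 3.3, where each row of R has at most two adjacent non-zero grades that sum to one.

```python
def membership_grades(x, rep_values):
    """Grades of concentration x for each risk level.

    Assumes triangular membership functions peaking at strictly increasing
    representative values, so that x is shared linearly between the two
    neighbouring levels and the grades sum to 1.
    """
    n = len(rep_values)
    grades = [0.0] * n
    if x <= rep_values[0]:
        grades[0] = 1.0          # below the first representative value
    elif x >= rep_values[-1]:
        grades[-1] = 1.0         # above the last representative value
    else:
        for j in range(n - 1):
            lower, upper = rep_values[j], rep_values[j + 1]
            if lower <= x <= upper:
                grades[j + 1] = (x - lower) / (upper - lower)
                grades[j] = 1.0 - grades[j + 1]
                break
    return grades


# Hypothetical representative values for six risk levels (NOT from Table 2).
HYPOTHETICAL_REP_VALUES = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
grades = membership_grades(13.0, HYPOTHETICAL_REP_VALUES)
print([round(g, 2) for g in grades])  # [0.7, 0.3, 0.0, 0.0, 0.0, 0.0]
```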

The developed approach is applied to evaluate the integrated health risk of four criteria pollutants (PM2.5, NO2, SO2 and CO) in the state of California. The result shows that the “good” and “moderate” health risk levels dominate the air quality in the state of California, which means that the risk levels are not high enough to induce health problems in the general population. This is further supported by the comparison with the AQI report, in which the AQI values range from 12 to 92 (good to moderate, US EPA 2009a) in the state of California.

The case study illustrates that the proposed FSOWA approach offers a flexible means of assessing air pollution risk to human health. However, some differences are found between the result from this study and the AQI report. Besides the possible reasons mentioned above, the uncertainty in the predicted pollutant concentrations is probably another reason. The air pollution integrated health risk assessment is based on predicted or monitored pollutant concentrations, so the key to improving the assessment is the accuracy of these concentrations. The FSOWA modeling approach can be operated separately from the air quality model, which means that it can be combined with any air quality model, such as the U.S. EPA recommended AERMOD and CALPUFF models or more advanced numerical models, which may improve the assessment. In addition, the FSOWA modeling approach can be applied to much broader areas and to more pollution factors such as O3 and PM10, which may also bring the assessment closer to real-world conditions.

5 Conclusion

In this study, a fuzzy set-ordered weighted averaging (FSOWA) approach has been proposed for the integrated health risk assessment associated with multiple air pollution factors and evaluation criteria. FSOWA can handle the uncertainties in the integrated health risk assessment and can also characterize the potential interactions among various pollution factors and the combined impacts on human health. In the final aggregation process, two potential pitfalls, exaggeration and eclipsing, are of paramount importance. By using a flexible aggregation technique, the tolerance level for these two pitfalls can be incorporated into the integrated health risk assessment.

The FSOWA modeling approach is based on the spatial concentration distributions of various pollution factors. In FSOWA, a degree of fuzziness is incorporated into the air quality criteria by using fuzzy sets, and absolute criteria are therefore avoided. There is no sharp boundary between different air pollution risk levels; instead, the boundaries are fuzzy, with implications for the health risk levels. The health risk and relative importance of various pollution factors are aggregated by two models (Max–min and Max-product composition) with consideration of uncertainties. The integration of the FSOWA approach with a GIS-based air quality modeling system offers multiple benefits. The GIS implementation can provide essential information about the spatial concentration distributions of air pollutants for health risk assessment and risk area identification, which are very important for air quality management and living condition assessment in urban environments.

The developed approach has been applied to quantify the integrated health risk associated with four criteria pollution factors for the state of California in 2008. The results have been compared with the U.S. EPA AQI statistical report, and the comparison indicates that the developed FSOWA approach provides an effective, systematic and more realistic way of combining and quantifying fuzzy quantities to achieve a more complete and reliable integrated health risk assessment. The main advantage of FSOWA is its ability to reveal the potential interactions among various pollution factors and to quantify the uncertainties of their integrated impact using fuzzy sets and fuzzy membership functions for air pollution. This approach can also be applied to much broader environmental problems, such as surface water, groundwater and soil, with more pollution factors and parameters. However, it has some limitations, as mentioned in Sect. 4.2. In addition, it is only applicable to linear problems, i.e. those in which pollution factors and risk levels are linearly related; it cannot be used for nonlinear problems, such as dynamic systems. FSOWA could be extended by coupling it with other types of fuzzy numbers and risk analysis methodologies to handle various types of uncertainties.