Introduction

Along with the acceleration of urbanization process, air quality has become an important protection factor which influences urban sustainable development and ecological city construction. Particulate matter is one of the major pollutants that affect air quality in urban and even rural areas of the world (Gupta et al. 2006). The particles with aerodynamic diameters less than 2.5 μm are defined as fine particles (PM2.5) (Liu et al. 2005, 2009), while less than 10 μm are called inhalable particles (PM10). The density of PM2.5 ground observation network in China is relatively low, and the distribution of observation stations is uneven, so it’s difficult to reflect the spatial distribution characteristics of PM2.5 according to the ground monitoring data alone.

Remote-sensing techniques, which provide consistent measurements at broad-scale and frequent time intervals, have been increasingly adopted to assess surface level of particulate matter concentration at high spatial and temporal resolutions (Gupta and Christopher 2009a). Martin (2008) has reviewed the state-of-arts satellite remote-sensing sensors for air quality monitoring, and it has been revealed that aerosol remote sensing at visible wavelengths exhibits high sensitivity to boundary layer concentrations. A rigorous review of the sensors’ characteristics led to the hypothesis that moderate resolution imaging spectroradiometer (MODIS) instrument is most likely to achieve the best results and has been utilized to retrieve aerosol optical thickness (AOT) (Kaufman et al. 1997; Tanré et al. 1997). Wang and Christopher (2003) have demonstrated that satellite-derived AOT is a good surrogate for monitoring particulate matter air quality over the earth.

Wang and Martin (2007) have estimated near ground PM2.5 concentration in Beijing area with MODIS 1 km data based on the vertical and humidity correction of AOT inversion; the correlation with observed values is relatively high (R 2 = 0.47). Liu et al. (2005) have used aerosol vertical profiles simulated by atmospheric chemistry model to the vertical correction of satellite remote sensing AOT and extracted the extinction contribution of surface layer aerosol which can reflect better changes of near ground particles (R 2 = 0.46). Van Donkelaar et al. (2006) have demonstrated that the main influence factor between AOT and PM2.5 is the relative vertical profiles of aerosol extinction coefficient. Then, Donkelaar et al. (2010) have given a map of the average PM2.5 concentration from 2001 to 2006; in this map, even the relatively low PM2.5 concentration in east China is about 60–90 μg/m3, while the high value is more than 100 μg/m3. What’s more, the research also points out that the value is still 25 % lower than the real situation in some areas.

The works have considered the influence of various environmental, meteorological factors on AOT–PM2.5 relationship, using multivariate statistical regression model to describe the relationship between near ground particulate concentration and multiple factors including AOT. However, the regression equation is only applicable to the estimation of average conditions, a situation that underestimates high concentration or overestimates low concentration would arise (Gupta and Christopher 2009a). Previous research has shown that the artificial neural network (ANN) can achieve a better inversion of particulate concentration by introducing the meteorological factors in the process of constructing appropriate satellite AOT–PM relationship (Gupta and Christopher 2009b).

In this paper, ANN (Matthew 1990) is developed with satellite, ground, and meteorological data sources to assess the concentration of particulate matter. The spatiotemporal distribution of PM2.5 and PM10 concentration and also the short-term trends over China in the period of 2006 and 2010 have been analyzed. Furthermore, population exposure to PM2.5 of China in 2010 has been discussed and the per capita PM2.5 concentration and the population-weighted PM2.5 concentration have been calculated.

The rest of this paper is organized as follows. In Section “Data and methodology,” we give a brief description on the data sources and the mechanisms and schedules of proposed methods for particulate matter concentration estimation and short-term trends analysis. The spatiotemporal distribution and short-term trends of particulate matter concentration in the past 5 years over China are clarified and discussed in Section “Results and discussion.” Finally, a conclusion is drawn in Section “Conclusions.”

Data and methodology

Study area

For covering particulate matter problem in China, our spatial domain includes mainland China (excludes Hainan, Hongkong, Macao, and Taiwan). To be sure, Hainan province, Hongkong, Macao, and Taiwan are not included in the study area because of the influence of cloud and aerosol type. Macao is also left out of the study area due to the contradiction between its small area and spatial resolution (10 km) of the estimated concentration.

Data sources

Four data sources are used in the study; the details of data sources and their uncertainties are as follows.

  1. 1.

    MODIS atmosphere aerosol product (MOD04_L2). The MOD04_L2 is daily level-2 data, derived from National Aeronautics and Space Administration (NASA) Terra platform. It has been produced at the spatial resolution of a 10 × 10 km (at nadir) pixel array. The MOD04_L2 product is suitable for granules where the minimum solar zenith angle is less than or equal to 72°. Levy and Remer (2007) demonstrated that MODIS AOD product has 10–20 % uncertainty compared with surface station observation.

  2. 2.

    ERA-Interim, which is the latest global atmospheric reanalysis produced by European Centre for Medium-Range Weather Forecasts (ECMWF) (Dee et al. 2011). It now extends back to 1979, and the analysis continues to be extended forward in near-real-time. More detailed description of the ERA-Interim product archive can be found in the paper of Berrisford et al. (2009). Simmons et al. (2010) have found that ERA-Interim data agrees well with Climatic Research Unit and Hadley Centre analyses of monthly station temperature data (CRUTEM3), and the correlations between the CRUTEM3 and ERA-Interim data in North America and Asia exceed 99 %.

  3. 3.

    Ground-observed particulate matter concentration data. Due to less monitoring data of China, some American site monitoring data derived from the US Environmental Protection Agency (EPA) is used in this paper. Considering the fact that particulate matter concentration in China is much higher than USA, ground-observed particulate matter concentration data of China is introduced to achieve a more plausible estimation model.

    The EPA and its partners in the USA set up the air quality monitoring network to collect ground-level particulate matter concentration in air pollution monitoring. Total suspended particulate (TSP), PM10, and PM2.5 concentrations were obtained according to EPA-defined methodology.

    We consider monitoring data as the gold standard here, while acknowledging that the ground measurements are not error-free (instrument error variance of approximately 1.5 based on colocated monitors, relative to a trimmed variance of the measurements of 60) (Christopher et al. 2008). To assess the relationship between PM2.5 and AOD, we matched monitoring data to the nearest AOD pixel, omitting a small number of monitors for which the nearest pixel centroid is closer to another monitor.

    The AOD–PM relationship varies by region and aerosol characteristics, but network generalization can be improved by ensuring that the training data are extensive and comprehensive. An extensive training set would include all geographic regions of the Earth, all seasons, night and day cases, and so forth (Blackwell and Chen 2009). In this paper, the utilization of the US monitoring data along with the Chinese monitoring data is to assure the training dataset contains extensive sampling (that is, high, medium, and low monitoring values).

  4. 4.

    Gridded population dataset, which is derived from the Center for International Earth Science Information Network (CIESIN), contains the distribution of human population. It is a gridded data product that renders global population data at the scale and extent required to demonstrate the spatial relationship of human population and the environment across the globe.

Particulate matter concentration estimation

The ANN algorithm, which is designed for modeling complex relationships between inputs and outputs or to find patterns in data, has been proved to solve this problem (Gupta and Christopher 2009a, b; Yao et al. 2012a).

An ANN is typically defined by three types of layers of units: input, hidden, and output. Each layer can contain multiple neurons. A layer of input units is connected to a layer of hidden units, which is connected to a layer of output units. With the use of input and output sample sets, network structure builds up network-mapping relationship on a given input and output through its training, learning, and adjustment of neural network weights and thresholds. Trained network will be able to grasp the essential characteristics, not only in the input sample dataset but also in the given appropriate output dataset, that is, with the generalization. It is trained using the traditional error back-propagation (BP) algorithm. BP neural network is a forward multilayer network, which was first developed by Rumelhart et al. (1986). Besides, in order to minimize the error sum of squares, the Levenberg–Marquardt (LM) algorithm (Levenberg 1944) was utilized to improve BP network training. LM algorithm is an effective method of nonlinear least squares problem. It is superior to conventional BP algorithm with fast convergence and high approximation accuracy.

In this study, there are eight parameters in the input units of the constructed neural network, including latitude, longitude, single scattering albedo (SSA), aerosol optical thickness (AOT), wind speed, relative humidity, skin temperature (SKT), and planetary boundary layer height. Where, the AOT and SSA are retrieved from MODIS instruments, and the last four parameters are derived from ERA-Interim. There are 16 hidden layers in the constructed neural network model. The performance of the network is evaluated by calculating absolute percentage errors (APE) (Gupta and Christopher 2009a),

$$ \mathrm{APE}=\frac{100\times \langle \left|{Y}_{\mathrm{est}}-{Y}_{\mathrm{obs}}\right|\rangle }{\langle \left|{Y}_{\mathrm{obs}}\right|\rangle } $$
(1)

where, Y est is the estimated particulate matter concentration using trained network, and Y obs is the observed particulate matter concentration in validation data set.

To construct an ANN model for the particulate matter concentration estimation, about 1,078 samples distributed over USA and China have been used. The samples are randomly assigned to be three subsets. That is, 40 % samples are selected for training, 20 % for testing, and remaining 40 % are used for model validation.

Figure 1a, b shows the correlation between estimated results and ground monitoring values on the concentration of PM10 and PM2.5, respectively. The correlation coefficient (Heij et al. 2004) and APE are also calculated between the matched pairs of the estimated result and ground monitoring value to evaluate the estimation accuracy. The correlation coefficients of PM2.5 and PM10 are, respectively, 0.82 and 0.85, while the APEs are, respectively, about 25 and 29 %, respectively. The results demonstrate the feasibility of the proposed ANN model on estimating the concentration of particulate matter, and the constructed model will be utilized in China.

Fig. 1
figure 1

Accuracy of particulate estimation concentration

Short-term trend analysis

We first used least-square method to linearly fit the monthly average atmospheric particulate matters (PM10 and PM2.5). The results can reflect the interannual short-term trends of China. However, singular value may affect the results due to the shortage of the time series. Thus, a nonparameter Mann–Kendall test (Mann 1945) method was utilized to analyze the trends of time series and decrease the effect of possible singular values. Mann–Kendall test is used to assess the significance of the trend. It can be stated most generally as a test for whether estimated results tend to increase or decrease with ground monitoring values (monotonic change). However, the monthly observation sequence is typically seasonal distribution. Hirsch et al. (1982) improved the traditional Mann–Kendall method and developed the seasonal-based method (seasonal Mann–Kendall) to solve this problem. This method could be effective in preventing the disturbance that the abnormal value may have on judging the time series trend (Partal and Kahya 2006). In the meanwhile, it does not require the sample to be normally distributed and will not be affected by certain lack of value in the time series.

In this research, the trend is calculated with the monthly average value of the mentioned 5 years, meanwhile, we have considered the seasonal distribution of PM, so the seasonal Mann–Kendall trend test is used in the research. That is, 60 effective values are involved in the seasonal Mann–Kendall trend test. We have also performed significance test in the research, and the blank area in the figures means the short-term trend in these regions cannot pass through the significance test. The trend is calculated with monthly average value but indicates the yearly short-term changing characteristic. The details about seasonal Mann–Kendall test method can be found in Hirsch et al. (1982) and Yao et al. (2012b).

Population exposure assessment

Based on the yearly average PM2.5 concentration, we calculate an indicator (P-E) for long-term population exposure assessment. It’s calculated with Eq. 2 based on grid computing. Then, a provincial statistics is performed to access the population exposure in the provincial scale.

$$ P-E=\left(P{M}_i-\mathit{\mathsf{threshold}}\kern1em \mathit{\mathsf{value}}\right)\times {P}_i $$
(2)

where, PM i is defined as the ith pixel value of PM2.5 concentration, P i is the ith pixel value of population density. According to the WHO recommended value, the threshold value in Eq. 2 is 10 μg/m3.

In the meanwhile, we also calculate per capita PM2.5 concentration (P-C) with population-weighted PM2.5 concentration (P-W). Maybe, they are not about population exposure but useful to population health assessment while used in conjunction with incidence data of disease. P-C indicates the average experience of PM2.5 in a certain province, while P-W is emphasis on the actual experience to resident, that is, the weightiness of PM2.5 concentration in population density area is larger than suburb or population sparse area. They are, respectively, calculated with Eqs. 3 and 4.

$$ P-C=\frac{\sum_{i=1}^nP{M}_i}{n} $$
(3)
$$ P-W={\displaystyle \sum_{i=1}^n}\frac{P{M}_i\times {P}_i}{\sum_{i=1}^n{P}_i} $$
(4)

where, PM i is defined as the ith pixel value of PM2.5 concentration, P i is the ith pixel value of population density, and n is the total pixel number.

Results and discussion

Parameter effects and sensitivity analysis

Why not present all available inputs and let the neural network determine which are relevant? This approach is reasonable if there are no restrictions on the training set size or the time required for training. This is usually not the case, and the efficiency of the training process can be substantially increased if excessively noisy, unimportant, or redundant inputs are removed (Blackwell and Chen 2009).

In this paper, we used stepwise regression to ascertain the input training parameters, the step-by-step iterative construction of a regression model that involves automatic selection of independent variables. The main approach used in this research is backward elimination, which involves starting with all candidate variables, testing the deletion of each variable using a chosen model comparison criterion, deleting the variable (if any) that improves the model the most by being deleted, and repeating this process until no further improvement is possible (Chatfield 1995).

As for the number of hidden nodes, it has been proven that neural networks with a single-hidden layer are universal approximators capable of representing any real-valued continuous function to arbitrary precision over a finite domain if enough hidden nodes are used. However, networks with multiple hidden layers can sometimes perform better than single-hidden-layer networks, with fewer total nodes (Blackwell and Chen 2009).

The experiment shows that over 16 hidden nodes and 9,000 runs with random sampling provided sufficient data samples for extracting the principal information of the used training data, because the satellite retrieval accuracy of PM concentration almost remained the same as the number of hidden nodes and runs increased subsequently.

Spatiotemporal distribution of particulate matter

PM2.5 spatiotemporal distribution

Figure 2 shows annual average of PM2.5 concentration of China during 2006 to 2010.

Fig. 2
figure 2

Annual average of PM2.5 concentration of China in the past 5 years

Figure 3 shows the diagram of China’s three gradient terrains. The “three gradient terrains” portrays an outline of terrain changes like ladders along west–east direction. The first and second gradient terrains are divided by the profile along Kunlun, Qilian, and Hengduan mountains. The second and third gradient terrains are divided by the profile along Great Khingan, Taihang, Wushan, and Xuefeng mountains. From Figs. 2 and 3, we found that the spatial distribution of annual average PM2.5 concentration coincides with China’s three gradient terrains.

Fig. 3
figure 3

Diagram of three gradient terrains in China

Figure 4 shows the seasonal average of PM2.5 concentration of China in 2010. Seasonal meteorological division method is used, i.e., from March to May is spring (lunar calendar), June to August is summer, September to November is autumn, and December to February next year is winter.

Fig. 4
figure 4

Seasonal average of PM2.5 concentration of China in 2010

The results show obvious seasonal variation in China. The seasonal PM2.5 concentration presents that the north is higher in the spring and winter, lowest in summer. In autumn, the PM2.5 concentration of most provinces in southeast China appears high value. This may be associated with the dust weather in spring; heating system in winter, which increases the concentration of PM2.5, while the sufficient precipitation will decrease the concentration in summer. In autumn, the high value of PM2.5 in southern China may be associated with straw burning after crop harvest.

PM10 spatiotemporal distribution

Figure 5 shows annual average of PM10 concentration of China during 2006 to 2010. During 2006 to 2008, PM10 concentration of China, especially in Xinjiang, southeast coastal areas, and Guangdong province, show a clear downward trend. In 2009, the concentration of Hunan, Hubei, and adjacent provinces increased, and decreased again in 2010. There are several obvious high value regions of China in the past 5 years, such as, most of Xinjiang, Qinghai, Gansu, Ningxia, Hubei, and parts of Inner Mongolia. Some east coast provinces, e.g., Jiangsu and Anhui, are located in the median zone. The results are consistent with the 343 monitoring and statistical data released by Ministry of Environmental Protection of China. Beijing is not in the high value area throughout the country and lower than its neighborhoods.

Fig. 5
figure 5

Annual average of PM10 concentration of China in the past 5 years

Figure 6 shows the seasonal average of PM10 concentration of China in 2010. The seasonal PM10 concentration is a bit like PM2.5 concentration; nevertheless, there are apparent differences about the high value region in autumn between them. PM2.5 high value area appears in the southeast coast cities while PM10 prefers the central regions in the southern China.

Fig. 6
figure 6

Seasonal average of PM10 concentration of China in 2010

Analysis on the spatial difference of P2.5 and PM10

As can be seen in Figs. 2 and 5, there exists an obvious spatial difference between PM2.5 and PM10. As stated in the previous section, most high PM10 concentrate in Xinjiang, Qinghai, Gansu, Ningxia, Hubei, and parts of Inner Mongolia. Yet, the distribution of PM2.5 concentration is consistent with China’s three gradient terrains. We could use Angstrom exponent and explain what is happening.

Figure 7 shows the distribution of Angstrom exponent in China in 2010. Angstrom exponent is the exponent in the formula that is usually used to describe the dependency of the aerosol optical thickness, or aerosol extinction coefficient on wavelength. Angstrom exponent is a useful quantity to assess the particle size of atmospheric aerosols or clouds and the wavelength dependence of the aerosol/cloud optical properties. Kaufman et al. (1997) have demonstrated the Angstrom exponent less than 0.7 for large dust particles, and the value greater than 1.8 for fine particulate matter, or smoke particles.

Fig. 7
figure 7

Distribution of Angstrom exponent in China (2010). Data is from Terra MODIS

Angstrom exponent is inversely related to the average size of the particles in the aerosol: the smaller the particles, the larger the exponent. That is, for a given pixel in Fig. 7, the larger the Angstrom exponent value, the greater the proportion of PM2.5; the smaller the Angstrom exponent value, the greater the proportion of PM10. For example, we can see obvious high value of Angstrom exponent in Heilongjiang province in Fig. 7; the corresponding region demonstrates high PM2.5 concentration (Fig. 2) and low PM10 concentration (Fig. 5). It can explain the spatial difference of PM2.5 and PM10 concentrations in some extent.

Short-term trends of particulate matter

PM concentration changing trends consist of long-term trend and short-term trend. That is, PM concentration changing trends in China may point to a long-term increasing trend, with some short-term variability, which need to be checked in the future research. The short-term trends are not representative of the long-term changes in PM, but important when testing time series models (Foster and Rahmstorf 2011; Thompson et al. 2009).

Short-term PM variability means that individual years can have higher PM concentrations than the previous year, while the underlying decreasing trend continues. Short-term variations in PM may be due to the government behavior. For example, Beijing adopted many prevention and control measures to improve air quality for the 29th Olympic Games in 2008 (Zhang et al. 2010).

We adopted seasonal Mann–Kendall test to detect the short-term trends of PM concentrations. It is a rank-based procedure, which is robust to the influence of extremes and good for use with skewed variables (Partal and Kahya 2006). Figure 8 shows the significant changes of PM2.5 and PM10 of China in the past 5 years. The blank area in the figures means the change trends are not significant in these regions.

Fig. 8
figure 8

Change trends of PM2.5 and PM10 in China from 2006 to 2010

For PM10, most provinces present the tendency of reduction. The declination of concentration is about 10–20 μg/m3. Few of provinces show increase, such as Gansu, Shaanxi, Sichuan, Guizhou, and Guangdong. Besides, in Henan, Anhui, Jiangsu, and Hunan, the concentration is also increasing, and the corresponding level is about 16–30 μg/m3.

PM2.5 shows the same change trend as PM10; most provinces present the tendency of reduction. The declination of concentration is about 3–5 μg/m3. Few of provinces show increase, such as Hebei, Shandong, Anhui, Jiangxi, Guangdong, Guangxi, Sichuan, and Guizhou. The corresponding level is about 8–16 μg/m3.

Population exposure to PM2.5

Chinese long-term population exposure to PM2.5 in 2010 is shown in Fig. 9. The spatial distribution of the results is illustrated in both gird level and provincial level. It can be seen from the provincial statistics figure that municipalities get much higher exposure level than other provinces. Shanghai suffers the highest population exposure to PM2.5, followed by Beijing and then Tianjin, Jiangsu province. On the other hand, the grid-scale map demonstrates the spatial distribution of population exposure to PM2.5. Obviously, there is a totally different population exposure level even in a certain province. Most provincial capitals, such as Guangzhou, Nanjing, Chengdu, and Wuhan, face much higher exposure level than other regions of their province. Moreover, the PM2.5 exposure situation is more serious in southeast than northwest regions for Beijing-Tianjin-Hebei region.

Fig. 9
figure 9

PM2.5 population exposure level of China in 2010

The calculated results of per capita PM2.5 concentration and population-weighted PM2.5 concentration are summarized in Table 1. Provinces with highest per capita PM2.5 concentration are Guangdong, Shanghai, and Tianjin. The following higher places are Shandong, Anhui, and Zhejiang provinces. The high-level areas of population-weighted PM2.5 concentration are Hebei, Chongqing, and Shandong provinces, and followed by Henan, Anhui, Zhejiang, Guangdong, and Guangxi provinces. That is, the per capita PM2.5 concentration in Guangdong, Shanghai, and Tianjin is totally high, but population in the potential high pollution area is sparse.

Table 1 Provincial P-C and P-W PM2.5 concentration statistics of China in 2010 (μg/m3*person)

Although the per capita PM2.5 concentration in Hebei and Chongqing is lower than the high-level area, the population in the potential high pollution area is denser. Shandong province is a special case; it not only locates in the high-level concentration area, but also the population in the potential high pollution area is dense.

Conclusions

The main purpose of this paper is to describe the spatiotemporal distribution and short-term trends of fine particle (PM2.5) and inhalable particle (PM10) concentration over the China in the period of 2006–2010. Several data sets are used, e.g., MODIS atmosphere aerosol product, ERA-Interim, ground-based particulate matter observations, and gridded population dataset. The artificial neural network (ANN) was utilized to estimate the concentration of particulate matter, and seasonal Mann–Kendall test method was utilized to analyze the short-term trends.

Most high PM10 concentration appears in Xinjiang, Qinghai, Gansu, Ningxia, Hubei, and parts of Inner Mongolia. The distribution of PM2.5 concentration is consistent with China’s three gradient terrains. The results also show that the interannual PM10 and PM2.5 concentrations in the past 5 years present downward short-term trend, on the whole. The declinations of PM2.5 concentration and the PM10 concentration in most provinces are, respectively, about 3–5 and 10–20 μg/m3, while a fraction of provinces appear in the increasing trend of 8–16 and 16–30 μg/m3.

Municipalities get much higher exposure level than other provinces. Shanghai suffers the highest population exposure to PM2.5, followed by Beijing and then Tianjin, Jiangsu province. Most provincial capitals, such as Guangzhou, Nanjing, Chengdu, and Wuhan, face much higher exposure level than other regions of their province. Moreover, the PM2.5 exposure situation is more serious in southeast than northwest regions for Beijing-Tianjin-Hebei region. The per capita PM2.5 concentration shows that the high-level regions concentrated in Guangdong, Shanghai, and Tianjin while the population-weighted PM2.5 concentrated in Hebei, Chongqing, and Shandong provinces in 2010.

Further improvements on the models should consider the optimization of parameters by using more robust historical data to train and select parameters. The proposed method could be used as a supplementary tool in aid of station monitoring. Moreover, it can also be used to discuss the human health effects of particulate matter while integrated with respiratory tract disease morbidity data.