Introduction

Deterioration of air quality is a consequence of rapid urbanization and industrialization worldwide, especially in developing countries. As a result, in recent years, air quality has become a societal and scientific issue of concern all over the world. In India, the atmospheric aerosol load increased rapidly over the decade 2000–2010 (Dey and Di Girolamo 2011). Fine particles (PM2.5) present in ambient air have more detrimental health effects than coarse particles (PM10) because their small size allows them to reach and settle in the lungs. Long-term exposure to PM2.5 is a major health risk and can lead to premature death (Brauer et al. 2012; Pope et al. 2002). Health risks associated with exposure to PM2.5 have resulted in a loss in life expectancy of 6.3 ± 2.2 years for the Delhi region (Ghude et al. 2016).

Water-soluble ionic species, mineral dust, and carbonaceous aerosols are the constituents of atmospheric aerosols. Carbonaceous aerosols include a high organic carbon (OC) fraction, ranging from polar to non-polar organic compounds, and a low elemental carbon (EC) fraction. In India, a major and variable fraction of PM10 (≈ 8%) and PM2.5 (≈ 12%) mass is composed of organic carbon (Satsangi et al. 2012; Ram and Sarin 2010). Atmospheric organic aerosols are of potential importance due to their substantial environmental and health effects (IPCC 1995; Jacobson et al. 2000; Kawanaka et al. 2004; Monks et al. 2009).

The most significant sources of the organic compounds present in ambient air are fossil fuel combustion, biomass burning, plastic burning, secondary organic transformations, and marine sources. Organic matter present in ambient aerosols is composed of various classes of organic compounds, which include aliphatic hydrocarbons (alkanes, alkanols, alkenoic acids), aromatic hydrocarbons, and anhydrosugars. These organic compounds, in combination with volatile species, are used as markers for emission sources, transport pathways, and receptor locations (Simoneit 1989; Schauer et al. 1996).

Numerous studies have reported the chemical composition of fine particles, especially over Northern India and the Indo-Gangetic Plains (IGP) (Sharma et al. 2003; Chowdhury et al. 2007; Bi et al. 2008; Pavuluri et al. 2010; Ram and Sarin 2010; Ram et al. 2012; Giri et al. 2013; Bisht et al. 2015; Tiwari et al. 2016; Dumka et al. 2017). Delhi, the capital of India, is among the ten most polluted cities in the world (WHO 2014), yet published data on the composition of organic constituents, primarily n-alkanes, PAHs, and phthalates, in fine ambient aerosols (PM2.5) are limited (Chowdhury et al. 2007; Singh et al. 2011; Pant et al. 2015). Some investigations have been performed on aerosol-associated organic compounds over the Delhi region (Sharma et al. 2003, 2007; Miyazaki et al. 2009; Singh et al. 2011; Yadav et al. 2013a, b; Gupta et al. 2017). Recent studies have reported the composition and source contribution of organic compounds present in PM10 over Delhi (Gupta et al. 2018; Gupta and Gadi 2018). Source apportionment studies based on trace elements, ions, and OC and EC concentrations have been reported for Delhi (Chowdhury et al. 2007; Tiwari et al. 2013; Sharma et al. 2016; Sharma and Mandal 2017; Jain et al. 2017a, b). However, there is a dearth of studies in the National Capital Region (NCR), India, that measure the organic constituents of PM2.5 to trace their possible sources. Rapid urbanization has emerged as the foremost cause of poor air quality over the NCR. The NCR comprises the National Capital Territory (NCT) of Delhi and twenty-two districts from the adjoining states of Haryana, Rajasthan, and Uttar Pradesh. In previous studies, most of the NCR was unmonitored because the monitoring sites were situated in urban areas.

This paper presents a detailed investigation of the source identification of organic compounds present in PM2.5 across the NCR of India. Measurement sites were located in Delhi and in two adjoining states within the NCR (Uttar Pradesh and Haryana). An overall quantification and source analysis of PM2.5, with a special focus on organic constituents and their spatial and temporal variation, distribution, and dispersion in the NCR, has not been attempted so far. The concentrations of three classes of organic compounds, n-alkanes (C11–C35), polycyclic aromatic hydrocarbons (PAHs), and phthalates, in PM2.5 samples from Delhi and the NCR are reported. Using organic compounds to trace sources is advantageous because they exhibit distinct characteristics of the pollutant sources. Thus, the spatial and temporal variation of aerosols, with a focus on the estimation of organic compounds, can give a more realistic picture of the associated sources. Furthermore, source investigation depends on the various complex processes occurring in the atmosphere (Zhang and Wexler 2008; Buseck and Schwartz 2014). There is therefore a need to identify the local, regional, and continental sources of pollutants in the region, along with their advection to different parts of the region. Air mass trajectory analysis has been employed to demarcate the common transport pathways of pollutants at various locations in India (Norman et al. 2001; Rana et al. 2009; Budhavant et al. 2015; Sharma et al. 2016; Sharma and Mandal 2017; Gawhane et al. 2017; Sen et al. 2017). The transport routes of PM2.5 to Delhi and the surrounding states in the NCR are also presented, using back-trajectory analysis to determine regional and remote continental sources.

Materials and methods

Study area

In this study, three sampling sites were selected, one in Delhi and two in the adjacent states, Haryana and Uttar Pradesh. PM2.5 samples were collected at three different sites in the NCR during January 2016–June 2016.

Sampling site 1

Delhi, the capital of India, is spread over an area of 1483 km2 and is 160 km from the South Himalaya. Sampling was carried out on the campus of the Indira Gandhi Delhi Technical University for Women (IG), situated in the east of Delhi (28.39° N and 77.13° E). The University is located in the heart of the Old Delhi area, which is the busiest area with respect to both population and traffic density. Because of the high population density around the site, many anthropogenic activities such as biomass burning, plastic burning, and solid waste burning contribute to particulate matter in addition to vehicular emissions. The sampler was located on the rooftop of the building at a height of about 6 m above ground level (Fig. 1). The Inter State Bus Terminal (ISBT), which operates bus services to Delhi and seven adjoining states, is just 400 m away from the campus. Heavy traffic jams occur frequently during the morning and evening peak hours, and the traffic density is around 500–600 vehicles per hour (Singh et al. 2011).

Fig. 1 Map of sampling site

Sampling site 2

Multani Mal College, Modi Nagar (MN), Uttar Pradesh, is situated adjacent to the Ghaziabad Meerut National Highway No. 34. Traffic congestion occurs for several hours a day in the surrounding areas. Modi Nagar city is classified as an urban agglomeration and metropolitan region. It lies approximately 90 km north-east of site 1 in Delhi, at a latitude of 28.50° N and a longitude of 77.34° E. The site is surrounded by educational institutes, the Modi sugar mills, and small brick kiln industries. Sampling was done on the rooftop of the building at a height of about 15 m above ground level.

Sampling site 3

Central University of Haryana, Mahendragarh (HR), is an institutional area with a wide variety of flora. In 2013, the Mahendragarh district of Haryana was added to the NCR, which placed emphasis on the urban development of this region. There is no urban agglomeration inside the district. As per the 2011 census, 85.9% of the district's population resides in rural villages. The area is surrounded by agricultural plantations and moderately dense forests. The main bus stand is 15 km away from the site, which results in low traffic density in its vicinity. Sampling was carried out on the University campus, which covers 488 acres of land at Jant-Pali Village, Mahendragarh district, Haryana (28.21° N and 76.8° E). The sampler was placed on the rooftop of the building at a height of about 15 m above ground level.

Sampling period and methodology

PM2.5 samples were collected simultaneously at three different sites in the NCR twice or thrice a week during January 2016–June 2016. Fifty to sixty samples were collected at each site during the sampling period. Based on the meteorological parameters (Table 1), the sampling period was divided into two seasons, i.e., winter (January–February) and summer (March–June). The winter period in this study therefore represents late winter only (January–February). Simultaneous sampling could not be performed at all sites at the beginning of winter (November) because of technical snags in the samplers. A subsequent paper will cover the annual spatio-temporal variation of various organic constituents in fine ambient aerosols over the NCR of India (Shivani et al., in preparation). Sampling for ambient PM2.5 was performed for 24 h each day, starting at 10:30 a.m. (local time, GMT + 5:30). PM2.5 samples were collected with a fine particulate sampler (Envirotech Model: APM 550, flow rate 16.6 lpm). Quartz fiber filters (QFFs) (Pallflex, 47 mm) were used for the collection of PM2.5 samples. QFFs were pre-cleaned by baking at 550 °C for 5 h before sampling. Before and after sampling, QFFs were stored in a Secador desiccator (Tarsons) under controlled temperature and 25–40% relative humidity (RH) for 24 h to remove moisture from the filter surface. Filters were transported to and from the sampling site in cassettes sealed in polyethene bags. The filters were weighed twice, before and after sampling, using a microbalance with a resolution of ± 10 μg (Sartorius Model: GC1603S-OCE). Exposed filters were packed in aluminum foil to protect them from sunlight and stored in a deep freezer (Haier) at − 25 °C until further analysis.
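The gravimetric step described above can be sketched as follows. This is a minimal illustration, assuming that the PM2.5 mass concentration is the filter mass gain divided by the sampled air volume; the flow rate and sampling duration come from the text, while the filter masses are invented for the example.

```python
# Flow rate and duration are taken from the text; the filter masses are hypothetical.
FLOW_RATE_LPM = 16.6     # sampler flow rate, litres per minute
SAMPLING_HOURS = 24.0    # duration of one sample


def sampled_volume_m3(flow_lpm: float, hours: float) -> float:
    """Air volume drawn through the filter, in cubic metres."""
    litres = flow_lpm * hours * 60.0
    return litres / 1000.0


def pm25_ug_per_m3(pre_mass_ug: float, post_mass_ug: float, volume_m3: float) -> float:
    """Mass concentration from the filter mass gain (micrograms per cubic metre)."""
    return (post_mass_ug - pre_mass_ug) / volume_m3


volume = sampled_volume_m3(FLOW_RATE_LPM, SAMPLING_HOURS)  # about 23.9 m3 per sample
# Hypothetical filter: 150.000 mg before sampling and 153.070 mg after sampling.
example = pm25_ug_per_m3(150_000.0, 153_070.0, volume)
print(f"volume = {volume:.2f} m3, PM2.5 = {example:.1f} ug/m3")
```

With a nominal volume of about 23.9 m3 per 24-h sample, a balance resolution of ± 10 μg corresponds to roughly ± 0.4 μg m−3 in concentration.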

Table 1 Meteorological parameters during the study period (source: IMD, Delhi, Lucknow, and Chandigarh) (NA, not available)

Extraction of organic compounds in PM2.5 samples

Organic compounds in the PM2.5 samples were extracted in the laboratory using the ultrasonication technique. The organic compounds on the filters were extracted twice with 15 ml of dichloromethane (HPLC grade) using an ultrasonicator (Citizen) for 15 min. The volume of the extract was then reduced to 1 ml using a rotary evaporator at 30–40 °C. The extract was filtered through a membrane filter (PVDF 0.5-μm microsyringe), transferred into vials, and stored in a deep freezer at − 25 °C until analysis. The extract was then injected into a gas chromatograph (GC) (Shimadzu, GC-2010 Plus) equipped with an Rtx-5 molecular sieve column (30 m × 0.25 mm i.d., 0.25 μm film thickness) for the analysis of organic compounds. The oven temperature was held initially at 50 °C for 5 min, increased at 5 °C/min to 150 °C (held for 5 min), then at 8 °C/min to 250 °C (held for 10 min), and finally at 10 °C/min to 300 °C (held for 5 min). Organic compounds were identified by matching their retention times with those of external standards. Three classes of organic compounds were identified using the following standards: for n-alkanes [40147-U C8-C40 Alkanes Calibstd (Sigma-Aldrich)], for PAHs [polycyclic aromatic hydrocarbons 16 solution (36979) (Sigma-Aldrich)], and for phthalates [4S8231-SS EPA phthalate esters mix (Sigma-Aldrich)]. Five-point calibration curves were obtained at 0.2 ppm, 0.4 ppm, 0.6 ppm, 0.8 ppm, and 1 ppm for the different standards for the quantification of target compounds. The correlation coefficients (R2) of the calibration curves were > 0.98. The standard deviation for the duplicate analyses of the target compounds was below 15%.
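The quantification step can be illustrated with a short sketch of a five-point external calibration. The calibration levels are those stated above; the peak areas are invented, and the conversion to an ambient concentration assumes that ppm denotes μg per ml of extract, that the whole filter was extracted into a final volume of 1 ml, and that about 23.9 m3 of air was sampled (16.6 lpm for 24 h). None of these conversion details are confirmed by the text.

```python
import numpy as np

# Calibration levels stated in the text (ppm).
levels_ppm = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
# Hypothetical detector peak areas for one target compound (invented values).
peak_area = np.array([1020.0, 2105.0, 2980.0, 4110.0, 5050.0])

# Least-squares straight line: area = slope * concentration + intercept.
slope, intercept = np.polyfit(levels_ppm, peak_area, 1)
predicted = slope * levels_ppm + intercept

# Coefficient of determination of the calibration line.
ss_res = np.sum((peak_area - predicted) ** 2)
ss_tot = np.sum((peak_area - peak_area.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot


def extract_conc_ppm(area: float) -> float:
    """Invert the calibration line to get the concentration in the extract."""
    return (area - intercept) / slope


def ambient_conc_ng_m3(area: float, extract_ml: float = 1.0, air_m3: float = 23.904) -> float:
    """Assumed conversion: ppm taken as ug/ml, whole filter extracted."""
    mass_ug = extract_conc_ppm(area) * extract_ml
    return 1000.0 * mass_ug / air_m3


print(f"R2 = {r_squared:.4f}")
print(f"ambient concentration for area 3000: {ambient_conc_ng_m3(3000.0):.1f} ng/m3")
```

A calibration would be accepted under the criterion above only if `r_squared` exceeds 0.98.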

Air mass backward trajectory analysis and cluster analysis

The 5-day backward trajectories of air masses during the study period were simulated using the Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Draxler and Rolph 2003), available on the National Oceanic and Atmospheric Administration (NOAA) website. To ascertain the origins and transport routes of PM2.5 from emission sources to the sampling sites, the 5-day backward trajectories were started at 05:30 h coordinated universal time (UTC), arriving at a height of 500 m above ground level at the receptor site. Trajectory analysis was also performed using the TrajStat software to group the air mass trajectories during the study period. Cluster analysis was employed to combine the nearest trajectories arriving along similar routes into clusters (Wang et al. 2009). Trajectories were clustered using the angle distance, which specifically captures the direction from which the air masses arrive at the receptor site.
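The angle distance used for clustering can be sketched as below. This is an illustrative implementation of the angle distance as it is commonly defined for trajectory clustering (the mean angle, seen from the receptor, between paired points of two trajectories); it is not taken from the TrajStat source code. Coordinates are treated as planar (longitude, latitude) pairs, and the receptor position and trajectories are hypothetical.

```python
import numpy as np


def angle_distance(traj_a, traj_b, origin) -> float:
    """Mean angle (radians) subtended at the receptor by paired trajectory points.

    traj_a and traj_b are (n, 2) arrays of points at matching time steps.
    The receptor point itself must be excluded, otherwise a division by zero occurs.
    """
    a = np.asarray(traj_a, dtype=float) - origin
    b = np.asarray(traj_b, dtype=float) - origin
    dist_a_sq = np.sum(a ** 2, axis=1)         # squared distance: receptor to point on A
    dist_b_sq = np.sum(b ** 2, axis=1)         # squared distance: receptor to point on B
    dist_ab_sq = np.sum((b - a) ** 2, axis=1)  # squared distance between paired points
    # Law of cosines gives the cosine of the angle at the receptor.
    cos_angle = 0.5 * (dist_a_sq + dist_b_sq - dist_ab_sq) / np.sqrt(dist_a_sq * dist_b_sq)
    return float(np.mean(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


receptor = np.array([77.0, 28.5])                 # hypothetical receptor (lon, lat)
west = receptor + np.array([[-1.0, 0.0], [-2.0, 0.0]])
west_far = receptor + np.array([[-3.0, 0.0], [-6.0, 0.0]])
north = receptor + np.array([[0.0, 1.0], [0.0, 2.0]])

print(angle_distance(west, north, receptor))      # perpendicular approach directions
print(angle_distance(west, west_far, receptor))   # same direction, different speed
```

Because only the direction of approach enters the measure, two trajectories arriving from the same sector are close even if one travels much farther, which is why this distance suits the identification of source directions.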

Principal component analysis

Principal component analysis (PCA) is a powerful statistical approach for analyzing patterns in high-dimensional data. PCA uses a mathematical projection to transform a large number of variables in the original data set into a small number of uncorrelated variables, called principal components, while retaining the main features of the data. PCA was performed on the concentration data set of PAHs and phthalates using the Statistical Package for Social Sciences (SPSS) version 20. Principal components were obtained using varimax rotation, with the constraint that the maximum of the variance was explained by the first component and the remaining variance by the subsequent components. The Kaiser-Meyer-Olkin (KMO) measure and the Bartlett test of sphericity were employed to check whether the number of samples collected was adequate. The KMO value was 0.6, which is acceptable for the analysis of the data. The significance level of the Bartlett test of sphericity was less than 0.05, which indicated that the samples were adequate. The number of extracted components to be retained for analysis was decided from the scree plot. The components and their corresponding eigenvalues were plotted in decreasing order, and only components with eigenvalues > 1 were retained. PCA results can be interpreted as the contributions of each principal component to the total variance of the variables.
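The component-retention rule can be illustrated with a short NumPy sketch. It is not a reproduction of the SPSS workflow: varimax rotation and the KMO and Bartlett tests are omitted, and the data are synthetic stand-ins (60 samples of six species driven by two hidden sources) rather than measured concentrations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: three species follow source 1, three follow source 2, plus noise.
n_samples = 60
source_1 = rng.normal(size=n_samples)
source_2 = rng.normal(size=n_samples)
noise = 0.3 * rng.normal(size=(n_samples, 6))
data = np.column_stack([source_1] * 3 + [source_2] * 3) + noise

# PCA on the correlation matrix, i.e. on standardized variables.
corr = np.corrcoef(data, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns ascending eigenvalues; reorder to descending as in a scree plot.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Retain only components with eigenvalue > 1.
retained = eigenvalues > 1.0
explained_pct = 100.0 * eigenvalues / eigenvalues.sum()
loadings = eigenvectors[:, retained] * np.sqrt(eigenvalues[retained])

print("eigenvalues:", np.round(eigenvalues, 2))
print("components retained:", int(retained.sum()))
print("variance explained (%):", np.round(explained_pct[retained], 1))
```

The eigenvalues of a correlation matrix sum to the number of variables, so an eigenvalue above 1 means the component explains more variance than any single standardized variable.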

Results and discussion

Spatial variation of PM2.5 concentration

PM2.5 samples were collected simultaneously at the IG, MN, and HR sites in the National Capital Region (NCR), India. The average PM2.5 mass concentrations during the study period were 128.5 ± 51.5, 86.3 ± 28, and 74.5 ± 28.7 μg m−3 at the IG, MN, and HR sites respectively. The PM2.5 concentration exceeded the permissible NAAQS limits set by the Central Pollution Control Board (CPCB), India, for most of the sampling period except summer (Fig. 2). Monthly average PM2.5 concentrations for the three sites showed that the IG site was the most polluted site in the NCR during the sampling period. The PM2.5 mass concentration varied within the ranges 43.5–519.3 μg m−3, 39.7–226.6 μg m−3, and 32.1–205.8 μg m−3 at IG, MN, and HR respectively, with maxima during winter and minima during summer. The average PM2.5 mass concentration at the IG site, 128.5 ± 51.5 μg m−3, was higher than the value of 50.6 ± 20.4 μg m−3 reported in an earlier study (Singh et al. 2011). The HR site showed lower PM2.5 levels, which may be due to fewer emission sources in the vicinity of the site. Most of the Mahendragarh district of Haryana is undeveloped and less urbanized. The possible sources in the surroundings of the HR site are epicuticular waxes from plants, biomass burning for cooking, petrol-driven auto rickshaws, and diesel-driven commercial vehicles, along with construction activities in and around the campus. High PM2.5 levels were measured at the IG and MN sites, possibly due to the potential emission sources in the surrounding areas. The potential emission sources for these two sites are similar, with a major contribution from vehicular emissions. The sugar industry situated 2 km from the MN site may also have contributed to particulate pollution in the ambient air (Sahu et al. 2015). Small brick kiln industries located around the MN site may also enhance the PM2.5 concentrations. The enhanced levels of PM2.5 indicated that emission sources have increased in recent years. PM2.5 values reported in recent studies at Delhi are comparable with the values in this report and indicate that traffic is a major contributor to PM2.5 (Sharma et al. 2016; Sharma and Mandal 2017; Jain et al. 2017a). The probable sources that contributed to the higher PM2.5 concentrations in Delhi are vehicular emissions, resuspension of roadside dust, industrial emissions, and burning of biomass. This is further corroborated by a study reporting that the odd–even strategy only marginally changed the concentrations of markers of vehicular emissions (Sharma et al. 2007). The odd–even strategy seemed to be successful in lowering pollution levels in the air only for a few hours in the daytime. The goal could not be achieved because of the high emissions from the heavy-duty vehicles which were allowed to ply across the region during nighttime (08:00–11:00 h) (Kumar et al. 2017).

Fig. 2 Monthly variation of average PM2.5 concentration at IG, MN, and HR sites during the sampling period

It was concluded that rapid urbanization over Delhi and Modi Nagar contributed to the high PM2.5 levels, whereas the less developed area around the Mahendragarh district accounted for the low PM2.5 levels.

Temporal variation of PM2.5 concentration

Temporal variation in PM2.5 levels is reported for two seasons (winter and summer) during the sampling period. As expected, the seasonal variation was similar at all sites, with maxima in winter and minima in summer (Fig. 3), indicating the advection of air masses from an identical source region. Higher PM2.5 concentrations were observed in winter at the IG (223.54 ± 121.1 μg m−3), MN (126.8 ± 4.5 μg m−3), and HR (113 ± 29.8 μg m−3) sites. This may be due to low wind speed and high humidity (Table 1), in addition to a shallow boundary layer, resulting in less removal of pollutants. The high PM2.5 concentration during winter was associated with strong atmospheric stability and a low inversion layer, which trapped pollutants near the surface (Bisht et al. 2015). The increased number of biomass burning activities during winter, to meet the domestic fuel demand for cooking, together with uncontrolled open fires and solid waste burning for heating, also contributed to the high PM2.5 concentration (Bisht et al. 2015, 2016; Dumka et al. 2017; Vaishya et al. 2017). Furthermore, the higher mixing height during summer and stronger prevailing winds due to thermal circulations accounted for the lower PM2.5 concentration. Wind turbulence results in more dispersion of pollutants during summer than during winter (Mantis et al. 2005). In addition to the atmospheric stability, less wet deposition due to the lack of precipitation also reduces the dispersion of pollutants during summer (Perrino et al. 2011).

Fig. 3 Seasonal variation of average PM2.5 concentration at IG, MN, and HR sites during the sampling period

Organic compounds

The classes of organic compounds identified in this study are n-alkanes, polycyclic aromatic hydrocarbons (PAHs), and phthalates. The concentrations of organic compounds identified in the PM2.5 samples are presented in the Table 2.

Table 2 Concentration of organic compounds identified in PM2.5 for IG, MN, and HR sites during the sampling period

n-Alkanes

n-Alkanes are non-polar aliphatic hydrocarbons present in the ambient atmosphere and are emitted from mixed sources: the utilization of petroleum products and natural vegetation waxes. The latter source consists primarily of epicuticular wax from plants. The mean concentration of n-alkanes (C11–C35 homologue series) associated with PM2.5 was calculated and used to estimate the diagnostic parameters for source identification. The mean concentration of n-alkanes during the study period was 111.0 ± 66.7, 132.9 ± 87.5, and 145.5 ± 103.4 ng m−3 at the IG, MN, and HR sites respectively (Table 2). The comparatively high concentration of n-alkanes at HR may be due to substantial emissions from the thick vegetation layer in the surroundings. The low n-alkane concentration at the IG site may be associated with sparse vegetation and the surrounding dense transport network. Lower n-alkane concentrations were observed during summer at the MN and HR sites, whereas little difference in concentrations was observed at the IG site (Fig. 4). This can be explained by the high PNA% at the IG site in comparison to the other sites, which indicates that the contribution from petrogenic sources is consistent throughout the study period, since the traffic density remains uniform throughout the year. The small difference in concentrations between the two seasons may also be due to the smaller number of samples during the winter period.

Fig. 4 Average concentrations of total n-alkanes, PAHs, and phthalates (ng m−3) present in the PM2.5 of IG, MN, and HR sites during winter and summer

n-Alkanes have source contribution from natural vegetation waxes and petroleum utilization products. Carbon preference index (CPI) was calculated and used as a diagnostic ratio to ascertain the sources of n-alkanes (Simoneit and Mazurek 1982; Rogge et al. 1993a). CPI is obtained using the following expression

$$ \mathrm{CPI}=\frac{\sum_{i=13}^{35}{\mathrm{C}}_i}{\sum_{j=12}^{34}{\mathrm{C}}_j} $$
(1)

where i denotes the odd and j the even carbon numbers, and C is the corresponding n-alkane concentration. The n-alkanes had CPI values of 1.4, 1.8, and 2.2 at IG, MN, and HR respectively during the study period. A value greater than 2 indicates a predominant biogenic influence due to emissions of epicuticular wax from plant surfaces (Rogge et al. 1991; Percy et al. 1994; Singh et al. 2003). The contribution of wax n-alkanes to total n-alkanes is determined by subtracting the average of the concentrations of the next higher and the previous lower even carbon number homologues from the concentration of each odd n-alkane. Negative values were taken as 0. The concentration of wax n-alkane (CnWNA) and its percentage (WNA%) were calculated using the following relations (Simoneit et al. 1991)

$$ {\mathrm{C}}_{\mathrm{n}}\mathrm{WNA}={\mathrm{C}}_{\mathrm{n}}-\left(\frac{{\mathrm{C}}_{\mathrm{n}+1}+{\mathrm{C}}_{\mathrm{n}-1}}{2}\right) $$
(2)
$$ \mathrm{WNA}\%=\frac{\sum {\mathrm{C}}_{\mathrm{n}}\mathrm{WNA}}{\sum \mathrm{NA}}\times 100 $$
(3)

where ƩCnWNA is the sum of the wax n-alkane concentrations obtained from all odd n-alkanes and ƩNA is the sum of the concentrations of all n-alkanes.

The contribution of petroleum products was calculated as percentage of petrogenic alkanes, PNA%, using the following relation

$$ \mathrm{PNA}\%=100-\mathrm{WNA}\% $$
(4)
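Equations 1–4 can be sketched in a few lines of code. The homologue profile below is invented for illustration, and the treatment of the end members is an assumption: odd homologues that lack one even neighbour within the measured range (C11 and C35 here) are left out of the wax calculation, which the text does not specify.

```python
def cpi(conc: dict) -> float:
    """Carbon preference index: sum of odd C13-C35 over sum of even C12-C34 (Eq. 1)."""
    odd = sum(conc.get(n, 0.0) for n in range(13, 36, 2))
    even = sum(conc.get(n, 0.0) for n in range(12, 35, 2))
    return odd / even


def wax_n_alkanes(conc: dict) -> dict:
    """Wax contribution of each odd homologue (Eq. 2); negative values are set to 0."""
    wna = {}
    for n in sorted(conc):
        # Assumption: skip odd homologues without both even neighbours.
        if n % 2 == 1 and (n - 1) in conc and (n + 1) in conc:
            value = conc[n] - 0.5 * (conc[n + 1] + conc[n - 1])
            wna[n] = max(value, 0.0)
    return wna


def wna_percent(conc: dict) -> float:
    """Share of wax n-alkanes in total n-alkanes (Eq. 3)."""
    return 100.0 * sum(wax_n_alkanes(conc).values()) / sum(conc.values())


def pna_percent(conc: dict) -> float:
    """Share of petrogenic n-alkanes (Eq. 4)."""
    return 100.0 - wna_percent(conc)


# Hypothetical C11-C35 profile (ng m-3): even = 2.0, odd = 3.0, with a peak at C29.
profile = {n: (3.0 if n % 2 else 2.0) for n in range(11, 36)}
profile[29] = 8.0

print(f"CPI  = {cpi(profile):.2f}")
print(f"WNA% = {wna_percent(profile):.1f}")
print(f"PNA% = {pna_percent(profile):.1f}")
```

A flat profile with equal odd and even concentrations gives a CPI of 1 and a WNA% of 0, the purely petrogenic limit, whereas a pronounced odd-carbon maximum such as C29 raises both indices.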

The percentage contribution of wax n-alkanes, WNA%, was 8.7, 13.8, and 28.6 for IG, MN, and HR respectively. The increasing trend in CPI and WNA% across the three sites indicated that the petrogenic contribution was lowest where CPI and WNA% were highest (Fig. 5). The PNA% calculation [91.3% (CPI = 1.4), 86.2% (CPI = 1.8), and 71.4% (CPI = 2.2) for the IG, MN, and HR sites, respectively] showed that the contribution from petrogenic sources was highest at the IG site. It is evident from these values that the influence of biogenic sources was highest at the HR site, whereas that of petrogenic sources was highest at the IG site. The carbon number with maximum concentration (Cmax) was C35 at the IG and MN sites, whereas it was C29 at the HR site. The odd carbon numbered congeners C29, C31, and C33 were dominant at the HR site, with the highest concentration for C29, indicating emissions from plant leaf wax and other vegetation-related mixed sources (Rogge et al. 1993b, 1994). This is consistent with the vegetation cover present around the sampling site, which accounts for these emissions. n-Alkanes, which act as sunproof waxes, are used as additives in the tire solution during manufacturing for protection against oxidizing agents and against cracking caused by UV light (Rogge et al. 1993a). Hence, the predominance of high molecular weight n-alkanes (≥ C30 with a maximum at C35) at the IG and MN sites indicated emissions of products formed by the abrasion of tire material (Rogge et al. 1993a). In addition to CPI and WNA%, Cmax also corroborated the inference that the contribution of petrogenic sources was higher at the IG site than at the other two sites. The percentage contributions of biogenic and petrogenic sources showed that the major fraction was from petrogenic sources at all sites.

Fig. 5 CPI, WNA%, and PNA% of n-alkanes present in PM2.5 at IG, MN, and HR sites during the study period

Polycyclic aromatic hydrocarbons

Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous pollutants in the environment. Incomplete combustion of any organic material is responsible for the formation of PAHs in the ambient air. PAHs are mutagenic and carcinogenic in nature. The USEPA has listed 16 PAHs as priority pollutants based on their toxicity potential for human exposure. Of these, the USEPA considers seven, i.e., benzo(a)anthracene, chrysene, benzo(a)pyrene, benzo(b)fluoranthene, benzo(k)fluoranthene, dibenz(a,h)anthracene, and indeno(1,2,3-cd)pyrene, as probable human carcinogens (NTP 2005). In this study, the 16 priority-pollutant PAHs, naphthalene (Nap), acenaphthene (Ac), acenaphthylene (Acy), anthracene (Anth), phenanthrene (Ph), fluorene (Fl), fluoranthene (Flth), pyrene (Py), benzo(a)anthracene (BaA), chrysene (Chy), benzo(a)pyrene (BaP), benzo(b)fluoranthene (BbF), benzo(k)fluoranthene (BkF), benzo(g,h,i)perylene (BghiP), dibenz(a,h)anthracene (DahA), and indeno(1,2,3-cd)pyrene (IP), were identified in the PM2.5 samples.

The total PAHs concentration, ƩPAHs, was 287.9 ± 187.4, 305 ± 220.5, and 158.8 ± 100.7 ng m−3 during the study period at the IG, MN, and HR sites respectively (Table 2). The concentration of individual PAHs ranged from 2.4 to 60.2, from 2.5 to 55.7, and from 1.8 to 33.1 ng m−3 at the IG, MN, and HR sites, respectively. The particulate-bound PAHs, fluorene (Fl), fluoranthene (Flth), pyrene (Py), benzo(a)anthracene (BaA), chrysene (Chy), benzo(a)pyrene (BaP), benzo(b)fluoranthene (BbF), benzo(k)fluoranthene (BkF), benzo(g,h,i)perylene (BghiP), dibenz(a,h)anthracene (DahA), and indeno(1,2,3-cd)pyrene (IP), were similar at all sites during the study period. At all sites, PM2.5 samples had higher PAH concentrations in winter than in summer. The lower summer levels may be attributed to less fuel consumption along with high wind speeds and inversion heights. In addition, the lower ƩPAHs concentration during summer is probably due to increased dispersion as a result of higher photodecomposition of PAHs. This result is similar to the inferences reported by a previous study at the IG site (Singh et al. 2011). During winter, ƩPAHs at the IG site (343.5 ± 243.9 ng m−3) was more than three times the ƩPAHs concentration (96 ± 39.6 ng m−3) reported by the previous study at the same site (Singh et al. 2011). The summer ƩPAHs at the IG site (238.6 ± 75.7 ng m−3) was also higher than in the previous study (45.8 ± 22.1 ng m−3), which suggests that PAH emission sources (vehicular emissions and biomass burning) have increased in recent years and contribute significantly to the deterioration of air quality. This is corroborated by registration data: almost 10.1 million vehicles were registered in Delhi up to December 2016 (Transport Dept, Delhi Govt 2016), compared with 6.1 million vehicles registered up to 2009 (Transport Dept, Delhi Govt 2009). PAH levels at this site were higher than those in other studies (Chowdhury et al. 2007; Pant et al. 2015) (Table 3).

Table 3 Organic compounds concentration (ng m−3) in PM2.5 reported in Indian studies

Concentrations of the seven carcinogenic PAHs (Chy, BaA, BbF, BkF, BaP, DahA, and IP) were also calculated and are listed in Table 2. The total concentration of the seven carcinogenic PAHs, Ʃ7PAHs, was 181.9 ± 121.3 ng m−3, 214.6 ± 169.6 ng m−3, and 94.5 ± 64.5 ng m−3 at the IG, MN, and HR sites respectively. The percentage loading of the seven carcinogenic PAHs (Ʃ7PAHs) in the total PAHs (ƩPAHs) was 63.1%, 70.4%, and 59.5% at the IG, MN, and HR sites respectively. The high loadings observed at all sites are probably due to vehicular combustion (Singh et al. 2011). The PAHs reported in this study are classified on the basis of the number of aromatic rings present in the compounds. The PAHs contained two to six aromatic rings and were therefore categorised into four groups: two to three rings (Nap, Ac, Acy, Anth, Ph, Fl), four rings (Flth, Py, BaA, Chy), five rings (BaP, BbF, BkF, DahA), and six rings (IP, BghiP). The percentage contribution of the four groups to the total PAHs is presented in Fig. 6. It is clear that the five-ring PAHs had the highest contribution to the total PAHs at all sites. Similar observations were also reported by a previous study (Singh et al. 2011).
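The percentage loadings and the ring-group shares can be recomputed with a short sketch. The site means are the rounded values quoted above; the flat concentration profile used for the ring groups is purely hypothetical, because individual PAH concentrations are given only in Table 2.

```python
# Ring groups as defined in the text.
RING_GROUPS = {
    "2-3 rings": ["Nap", "Ac", "Acy", "Anth", "Ph", "Fl"],
    "4 rings": ["Flth", "Py", "BaA", "Chy"],
    "5 rings": ["BaP", "BbF", "BkF", "DahA"],
    "6 rings": ["IP", "BghiP"],
}


def ring_share(conc: dict) -> dict:
    """Percentage contribution of each ring group to the total PAH concentration."""
    total = sum(conc.values())
    return {
        group: 100.0 * sum(conc[pah] for pah in members) / total
        for group, members in RING_GROUPS.items()
    }


# Reported means (ng m-3): (sum of seven carcinogenic PAHs, sum of all 16 PAHs).
reported_means = {"IG": (181.9, 287.9), "MN": (214.6, 305.0), "HR": (94.5, 158.8)}
carcinogenic_share = {
    site: 100.0 * c7 / total for site, (c7, total) in reported_means.items()
}

# Hypothetical flat profile: every PAH at 1.0 ng m-3.
flat_profile = {pah: 1.0 for members in RING_GROUPS.values() for pah in members}
flat_share = ring_share(flat_profile)

for site, share in carcinogenic_share.items():
    print(f"{site}: {share:.1f}% of total PAHs are carcinogenic PAHs")
print(flat_share)
```

Recomputing from the rounded means gives shares within about 0.1 percentage points of the reported 63.1%, 70.4%, and 59.5%; the small difference at IG presumably reflects rounding of the underlying means.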

Fig. 6 Percentage contribution of PAHs rings to the total PAHs concentrations in the ambient air at IG, MN, and HR sites

PAHs are categorized as persistent organic pollutants, which accounts for their long residence time in the atmosphere and their long-distance transport from one place to another. It has been reported earlier that the transport routes of air masses are associated with PAH concentrations (Tan et al. 2011). To examine the relation between PAH concentrations at the sampling sites, a correlation analysis of five- to six-ring PAHs was performed between pairs of sites, since high molecular weight PAHs can travel long distances. The correlation between the IG and MN sites is good (R2 = 0.35–0.60), which indicates that the emission sources of five- to six-ring PAHs may be similar for these sites. On the other hand, five- to six-ring PAHs at the HR site showed the lowest correlation with the IG and MN sites (R2 = 0.10–0.20), which suggests that their emission sources may differ from those at the other two sites.
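The inter-site comparison can be sketched as the squared Pearson correlation between concentrations measured on the same sampling days at two sites. This is an assumed reading of the analysis, since the text does not state how the samples were paired, and the concentration series below are invented.

```python
import numpy as np


def r_squared(site_a, site_b) -> float:
    """Squared Pearson correlation between paired concentration series."""
    r = np.corrcoef(np.asarray(site_a, dtype=float), np.asarray(site_b, dtype=float))[0, 1]
    return float(r ** 2)


# Hypothetical same-day BaP concentrations (ng m-3) at two sites.
bap_site_1 = [12.0, 30.5, 18.2, 44.1, 25.3, 9.8, 37.6]
bap_site_2 = [10.1, 22.4, 20.9, 30.2, 27.5, 14.3, 26.0]

print(f"R2 = {r_squared(bap_site_1, bap_site_2):.2f}")
```

An R2 of 0.35–0.60 means that roughly one third to three fifths of the day-to-day variance at one site is shared with the other site, so similar sources are plausible but not demonstrated by the correlation alone.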

Molecular diagnostic ratios were calculated to distinguish between different sources of the PAHs present in the ambient air. Ratios between different pairs of PAHs, namely IP/(IP+BghiP), BaP/(BaP+Chy), IP/BghiP, and Phth/(Phth+Anth), were used to investigate the origin of the PAHs. The values are listed and compared with literature reports in Table 4. The ratios IP/(IP+BghiP) of 0.52–0.62 and IP/BghiP of 1.08–1.64 indicated that diesel emissions were dominant at all sites. In addition to the dominant diesel emissions, BaP/(BaP+Chy) of 0.72–0.81 suggests that gasoline emissions also contributed to the PAHs evaluated in the study. Phth/(Phth+Anth) of 0.09–0.52, used to distinguish between biomass burning and fossil fuel emissions, indicated that biomass burning emissions were dominant at all sites. The highest Phth/(Phth+Anth) ratio, 0.52, revealed that biomass burning was more extensive at the HR site than at the other two sites. This may be due to the extensive use of biomass fuels (fuelwood, agricultural residue, etc.) for cooking in the villages around the HR site (Gadi et al. 2003). A BaP/BghiP ratio higher than 0.6 indicated the contribution of traffic emissions to the PAH sources (Pandey et al. 1999). The molecular diagnostic ratios calculated in this study suggested that the high PAH levels in the ambient air originated from a mixed contribution of vehicular exhaust and biomass burning. These outcomes are similar to the observations reported in the previous study (Singh et al. 2011).
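The ratio calculations can be sketched as follows. The sample concentrations are invented; the function only evaluates the four ratios named above and does not encode source thresholds, which are given in Table 4. The last lines check that the two indeno(1,2,3-cd)pyrene ratios are mutually consistent, because IP/(IP+BghiP) = r/(1 + r) when r = IP/BghiP.

```python
def diagnostic_ratios(conc: dict) -> dict:
    """Evaluate the four PAH diagnostic ratios from one set of concentrations."""
    return {
        "IP/(IP+BghiP)": conc["IP"] / (conc["IP"] + conc["BghiP"]),
        "IP/BghiP": conc["IP"] / conc["BghiP"],
        "BaP/(BaP+Chy)": conc["BaP"] / (conc["BaP"] + conc["Chy"]),
        "Phth/(Phth+Anth)": conc["Phth"] / (conc["Phth"] + conc["Anth"]),
    }


def fraction_from_ratio(ratio: float) -> float:
    """Convert a ratio x/y into the fraction x/(x + y)."""
    return ratio / (1.0 + ratio)


# Hypothetical concentrations (ng m-3) for one sample.
sample = {"IP": 30.0, "BghiP": 20.0, "BaP": 24.0, "Chy": 8.0, "Phth": 3.0, "Anth": 9.0}
ratios = diagnostic_ratios(sample)
print(ratios)

# Consistency check on the reported ranges: IP/BghiP of 1.08-1.64
# should correspond to IP/(IP+BghiP) of about 0.52-0.62.
low, high = fraction_from_ratio(1.08), fraction_from_ratio(1.64)
print(f"{low:.2f}-{high:.2f}")
```

The converted range agrees with the reported IP/(IP+BghiP) values, so the two ratios carry the same information and support the same source inference rather than two independent ones.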

Table 4 Diagnostic ratios of PAHs present in ambient air of IG, MN, and HR sites

Phthalates

Phthalates, also known as phthalic acid esters (PAEs), are used as plasticizers in various industrial applications and cosmetic products. They are not chemically bonded to the polymer matrix (Bošnir et al. 2003) and hence easily escape from it, so they are found ubiquitously in the atmosphere (Kong et al. 2013; Ma et al. 2014). Studies on the health effects of phthalates have highlighted their adverse impacts on the population (Wolff et al. 2010; Pant et al. 2011; Desdoits-Lethimonier et al. 2012; Tranfo et al. 2012). In this study, the concentrations of six phthalates, dimethyl phthalate (DMP), diethyl phthalate (DEP), di-n-butyl phthalate (DBP), butyl benzyl phthalate (BBP), bis(2-ethylhexyl) phthalate (BEHP), and di-n-octyl phthalate (DOP), were determined and are listed in Table 2. The total concentrations of phthalates (ƩP6), 68.3 ± 44.0, 70.4 ± 40.4, and 64.5 ± 50.6 ng m−3 at the IG, MN, and HR sites, respectively, were comparable during the study period. These similar concentrations suggest that similar sources were responsible for the phthalates at all sites. BEHP and DOP were the dominant phthalates, with the highest concentrations in all samples at all sites. The predominance of BEHP corresponds to its use as an additive in PVC plastic products and in the manufacture of various polymers, and to emissions from plastic burning (Simoneit et al. 2005). Phthalate levels in the study suggested that the emissions could be from the burning of disposed waste, industrial emissions, vehicular exhaust (Lelieveld et al. 2001; Fu et al. 2010), interior parts of vehicles (Haddad et al. 2009), and indoor sources (Shi et al. 2012). The nearly equal phthalate levels at the IG (68.3 ± 44.0 ng m−3) and MN (70.4 ± 40.4 ng m−3) sites suggested that the emissions came mainly from the same anthropogenic activities, with plastic burning and vehicular and industrial emissions being the major sources.
The HR site is a semi-urban location with comparatively less industrial activity and lower traffic density. Agricultural activities are predominant in the area around the HR site. Thus, the comparable phthalate levels at the HR site (64.5 ± 50.6 ng m−3) may be largely due to emissions from plastic containers or plasticizer formulations used in fertilizers and insecticides in the agricultural areas around the site (Kong et al. 2012; Lenoir et al. 2012). Seasonal variation of phthalates was observed, with higher concentrations during summer than during winter at all sites. This may be due to increased vaporization of phthalates from plastics at the higher summer temperatures. Similar findings were reported in studies from China (Wang et al. 2006).

Health risk assessment

Human exposure to phthalates in PM2.5

Humans are exposed to phthalates in the environment through various routes, including inhalation, dietary intake, and dermal absorption (Clark et al. 2011). Human exposure assessment for phthalate levels in PM2.5 was performed for the inhalation pathway. It was assumed that maximum exposure to phthalates in the body is through inhalation. Thus, human exposure to phthalates via inhalation was calculated as daily intake (DI, ng kg−1 day−1) for different age groups, namely infants (< 1 year), toddlers (1–3 years), children (4–10 years), teenagers (11–18 years), and adults (> 18 years), using the following relation (USEPA 1997; Zhang et al. 2014):

$$ \mathrm{DI}=\left({\mathrm{C}}_{\mathrm{i}}\times \mathrm{IR}\times \mathrm{EF}\right)/\left(\mathrm{BW}\right) $$
(5)

where Ci is the measured concentration of phthalates in air (ng m−3), IR is the inhalation rate (m3 day−1), EF is the exposure fraction (dimensionless), which indicates the fraction of time people spend outdoors, and BW is the body weight (kg). IR, EF, and BW take different values for different age groups. IR for infants, toddlers, children, teenagers, and adults was taken as 4.5, 7.6, 10.9, 14.0, and 13.3 m3 day−1, respectively (EPA 2002). An EF of 0.21 was used for toddlers and children and 0.12 for infants, teenagers, and adults (EPA 2002). BW was assumed to be 5 kg for infants, 19 kg for toddlers, 29 kg for children, 53 kg for teenagers, and 63 kg for adults (USEPA 1998).
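Eq. (5) can be sketched in a few lines of Python using the IR, EF, and BW values stated above and the total phthalate concentrations (ƩP6) reported for the three sites. The names `AGE_GROUPS`, `SITE_TOTAL_PHTHALATES`, and `daily_intake` are illustrative assumptions, and small differences from the published DI values may arise from rounding of the inputs.

```python
# Exposure parameters per age group as stated in the text:
# (inhalation rate IR in m3 day-1, exposure fraction EF, body weight BW in kg)
AGE_GROUPS = {
    "infants": (4.5, 0.12, 5.0),
    "toddlers": (7.6, 0.21, 19.0),
    "children": (10.9, 0.21, 29.0),
    "teenagers": (14.0, 0.12, 53.0),
    "adults": (13.3, 0.12, 63.0),
}

# Mean total phthalate concentrations (ng m-3) reported for each site.
SITE_TOTAL_PHTHALATES = {"IG": 68.3, "MN": 70.4, "HR": 64.5}


def daily_intake(concentration, inhalation_rate, exposure_fraction, body_weight):
    """Eq. (5): DI = (Ci x IR x EF) / BW, in ng kg-1 day-1."""
    return concentration * inhalation_rate * exposure_fraction / body_weight


for site, conc in SITE_TOTAL_PHTHALATES.items():
    infant_di = daily_intake(conc, *AGE_GROUPS["infants"])
    for group, params in AGE_GROUPS.items():
        di = daily_intake(conc, *params)
        # The ratio to infants depends only on IR x EF / BW, not on the site.
        print(f"{site} {group:<10s} DI = {di:5.2f}  ratio to infants = {di / infant_di:.2f}")
```

Because the concentration cancels in the ratio between age groups, the ordering of DI values is governed entirely by IR × EF / BW.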

Daily intake of individual and total phthalates present in air was calculated using the above-mentioned parameters and is presented in Fig. 7. The DI values were comparable at all sites for corresponding age groups. Infants were the most affected age group, with the maximum DI for both individual and total phthalates at all sites. DI estimated for infants was 7.37, 7.60, and 6.90 ng (bw kg−1) day−1 at the IG, MN, and HR sites, respectively, with the higher exposure at IG and MN reflecting the phthalate levels measured there. Adults were at the lowest risk, with the minimum DI (1.63–1.78 ng (bw kg−1) day−1) among the age groups. Infants and adults share the same exposure fraction EF and were exposed to the same phthalate concentrations, so the difference in DI arises from their inhalation rates and body weights: the inhalation rate per unit body weight (IR/BW) is about 0.9 m3 kg−1 day−1 for infants but only about 0.2 m3 kg−1 day−1 for adults. This demonstrates that exposure to phthalates poses a greater health hazard to individuals of small body mass. The DI values followed the order infants > toddlers > children > teenagers > adults, with ratios of 1:0.77:0.73:0.29:0.23. This observation is comparable to the results of similar studies conducted in Shanghai, China (Li et al. 2018), and Tamil Nadu, India (Sampath et al. 2017). The ratios indicate that the DI for infants is about four times that for adults at the same pollutant level. Among the individual phthalates, BEHP and DOP showed the highest DI values at all sites. BEHP has been classified as a probable carcinogen (class B2) and BBP as a possible carcinogen (class C) by the USEPA (Alatriste-Mondragon et al. 2003). Together, BEHP and DOP contributed 82%, 85%, and 80% of the daily intake of total phthalates at IG, MN, and HR, respectively.

Fig. 7 Daily intake (ng (bw kg)−1 day−1) of individual phthalates and total phthalates present in ambient air through inhalation for different age groups at IG, MN, and HR sites

Health risk associated with PAH exposure

The health risk associated with exposure to the measured PAH levels in PM2.5 was evaluated in terms of lung cancer risk (LCR). The toxic potency of each PAH was assessed using the Toxic Equivalency Factor (TEF) approach (Nisbet and Lagoy 1992), with the TEF values established in the literature. The carcinogenic potency of each PAH measured in this study can thus be expressed in toxic equivalents relative to benzo[a]pyrene (BaP). BaP, being one of the most potent carcinogens, is used as the reference compound to evaluate the potential carcinogenicity of individual PAHs. PAH concentrations were therefore transformed to benzo[a]pyrene equivalents, BaPeq, and summed using the following relation:

$$ \mathrm{Total}\ {\mathrm{BaP}}_{\mathrm{eq}}=\sum {\mathrm{C}}_{\mathrm{i}}\times {\mathrm{TEF}}_{\mathrm{i}} $$
(6)

where Ci and TEFi are the measured concentration of each PAH and its corresponding toxic equivalency factor, respectively. The BaPeq for individual PAHs and the total BaPeq for the 16 PAHs and the seven carcinogenic PAHs are listed in Table 5. The cancer risk associated with PAH exposure was determined from BaPeq by employing the unit risk factor [URF (ng m−3)−1] in the following relation (Bian et al. 2016):

$$ \mathrm{Lung}\ \mathrm{cancer}\ \mathrm{risk}=\mathrm{URF}\times \mathrm{Total}\ {\mathrm{BaP}}_{\mathrm{eq}} $$
(7)

Table 5 Benzo[a]pyrene equivalents, BaPeq, and toxic equivalency factor for total PAHs and seven carcinogenic PAHs, Ʃ7PAHs

URF expresses the lung cancer risk when an individual is exposed to a BaPeq concentration of 1 ng m−3. The World Health Organization proposed a URF of 8.7 × 10−5 (ng m−3)−1 for lifetime exposure (70 years) to PAHs (WHO 2000). This implies that if 100,000 people are exposed to 1 ng m−3 of BaPeq, 8.7 cases of lung cancer are expected. Total BaPeq calculated from Eq. (6) was 69.76, 87.51, and 36.18 ng m−3 for the IG, MN, and HR sites, respectively. Compared with the value reported for China (40 ng m−3; Cui et al. 2018), the present values are higher for the IG and MN regions and comparable for the HR region. The lung cancer risks obtained with the URF of 8.7 × 10−5 (ng m−3)−1 are 606.9 × 10−5, 761.3 × 10−5, and 314.8 × 10−5 for the IG, MN, and HR sites, respectively, for lifetime exposure to PAHs. These values suggest that 6069, 7613, and 3148 people may suffer from lung cancer if one million people are exposed to 69.76, 87.51, and 36.18 ng m−3 of BaPeq for 70 years. The lung cancer risk associated with exposure to the seven carcinogenic PAHs (Ʃ7PAHs, as BaPeq) was 602.5 × 10−5, 757.9 × 10−5, and 312.5 × 10−5 for the IG, MN, and HR sites, respectively, which is nearly equal to that for total BaPeq (16 PAHs). Hence, the seven carcinogenic PAHs contribute the most to the lung cancer risk. The estimated cancer risks (6.0 × 10−3, 7.6 × 10−3, and 3.1 × 10−3) were higher than the range recommended by the USEPA (10−3) for lifetime exposure to carcinogenic pollutants (Rodricks et al. 1987).
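The arithmetic of Eqs. (6) and (7) can be checked with a short Python sketch. It uses the WHO unit risk factor and the total BaPeq values reported above; the function names and dictionary layout are illustrative assumptions, and no TEF values or individual PAH concentrations are supplied here because they belong to Table 5.

```python
URF = 8.7e-5  # unit risk factor, (ng m-3)-1, lifetime (70-year) exposure (WHO 2000)

# Total BaPeq (ng m-3) reported in the text for each site.
REPORTED_TOTAL_BAP_EQ = {"IG": 69.76, "MN": 87.51, "HR": 36.18}


def total_bap_eq(concentrations, tefs):
    """Eq. (6): sum of Ci x TEFi over all PAHs.

    Both arguments are dicts keyed by PAH name; concentrations in ng m-3.
    """
    return sum(conc * tefs[name] for name, conc in concentrations.items())


def lung_cancer_risk(bap_eq, urf=URF):
    """Eq. (7): lung cancer risk = URF x total BaPeq."""
    return urf * bap_eq


for site, bap_eq in REPORTED_TOTAL_BAP_EQ.items():
    risk = lung_cancer_risk(bap_eq)
    cases_per_million = risk * 1_000_000
    print(f"{site}: risk = {risk:.3e}, about {cases_per_million:.0f} cases per million")
```

Multiplying the risk by the size of the exposed population gives the expected number of cases, which is how the per-million figures in the text follow from the risks.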

Source apportionment of organic compounds

PCA was performed for the source apportionment of PAHs and phthalates present in PM2.5 using the Statistical Package for the Social Sciences (SPSS), version 20. The PCA used 22 variables, comprising the 16 PAHs and six phthalates. These 22 variables were transformed to principal components (PCs) using varimax rotation. PCs with eigenvalues > 1 were retained to extract the most characteristic information. The PCA results are presented in Fig. 8. Four principal components were retained, which explained 61.8%, 59.6%, and 58.6% of the total variance of the data for IG, MN, and HR, respectively.
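The study used SPSS for this analysis. The following is not the SPSS procedure but a minimal NumPy sketch of the same general steps, assuming PCA on the correlation matrix, retention of components with eigenvalue > 1, and a standard varimax rotation. The function names and the synthetic data matrix are hypothetical stand-ins for the measured 22-variable data set.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    n_vars, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation


def pca_varimax(data):
    """Return rotated loadings, retained eigenvalues, and variance fractions."""
    corr = np.corrcoef(data, rowvar=False)  # equivalent to PCA on standardized data
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]       # sort components by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                    # eigenvalue > 1 retention rule
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    rotated = varimax(loadings)
    explained = (rotated**2).sum(axis=0) / data.shape[1]
    return rotated, eigvals[keep], explained


# Synthetic stand-in for a (samples x 22 variables) concentration matrix.
rng = np.random.default_rng(0)
factors = rng.normal(size=(60, 3))
mixing = rng.normal(size=(3, 22))
data = factors @ mixing + 0.5 * rng.normal(size=(60, 22))

rotated, kept, explained = pca_varimax(data)
print("components retained:", kept.size)
print("variance explained per rotated component:", explained.round(3))
print("total variance explained:", round(float(explained.sum()), 3))
```

Variables with large absolute rotated loadings on the same component are then interpreted together as a candidate source, which is the reading applied to each site below. The rotation redistributes variance among the retained components but leaves their total unchanged.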

Fig. 8 PCA results showing plots of PC1 vs PC2, PC1 vs PC3, and PC1 vs PC4 for sampling sites

IG site

PC1 explained 20.5% of the total variance, with loadings from five- to six-ring PAHs. The high loadings of BbF, BaP, BghiP, and IP (> 0.5) indicated PAH emissions from vehicular exhaust. PC2 showed high loadings of Acy, Ac, Fl, Anth, Ph, and Py and described 14.7% of the variance. This indicated PAH emissions from fuel consumption for residential heating. High loadings of phthalates on PC3, with 14.6% of the variance, implied emissions from plastic burning and the disposal of plastic waste. PC4 showed high loadings of Chy, BkF, and DahA, with 11.9% of the variance. This can be explained by emissions associated with cooking activities: a hostel mess is located at a distance of 50 m from the sampling location, and its cooking emissions may contribute to the PAH levels in the environment. The PCA results are consistent with the aforementioned molecular diagnostic ratios, which indicated that vehicular and biomass burning emissions are the major sources of PAHs in the ambient atmosphere.

MN site

PC1 explained 21.4% of the total variance, with high loadings (> 0.5) from Ph, Fl, BaP, BghiP, and IP. This indicated PAH emissions from vehicular exhaust. PC2 showed high loadings of Chy, BaA, BkF, and DahA and described 13.9% of the variance. This indicated PAH emissions from industrial activities. The sugar industry situated 2 km from the MN site may also have contributed to the PAH levels in ambient air. High loadings of phthalates on PC3, with 13.3% of the variance, implied emissions from plastic burning and the disposal of plastic waste. PC4 accounted for 10.9% of the variance, with high loadings of Py, Anth, and Ac. This indicated PAH emissions from domestic heating.

HR site

PC1 accounted for 16.4% of the variance, with high loadings of Py, Chy, BaP, BghiP, and IP. This can be attributed to vehicular emissions in the vicinity of the site. PC2 described 16.6% of the variance, with high contributions from Flth, BaA, BkF, and DahA, which indicated PAH emissions associated with cooking activities. Cooking emissions from the hostel mess in the vicinity of the sampling site may contribute to the PAH levels in ambient air. High loadings of Acy, Ac, Fl, Anth, and Py on PC3, with 13.5% of the variance, were attributed to emissions from biomass and agricultural waste burning in the surrounding areas. PC4 accounted for 12.6% of the variance, with high loadings of phthalates, which indicated emissions from plastic burning, disposal of plastic containers, or plasticizer formulations used in fertilizers and insecticides in the agricultural area around the site.

Trajectory analysis

The 5-day backward trajectories of air mass parcels were simulated using the Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model for the three receptor sites, IG, MN, and HR (Fig. 9a). During the winter season, air mass parcels approached the receptor sites mainly from the Punjab, Haryana, Himachal Pradesh, and Pakistan regions. Long-range transport of air masses from the Afghanistan and Iran regions towards the IG receptor site was also observed, but it was not associated with high pollutant concentrations. At the HR receptor site during winter, in addition to air mass parcels from the Punjab, Haryana, and Pakistan regions (through the Thar Desert), air masses also approached from Rajasthan through the Thar Desert. During the summer season, the backward trajectories for all sites showed different air mass origins. At the IG receptor site, summer air mass parcels originated from Pakistan, travelling through Punjab and Haryana, and from the Thar Desert. In addition to transport from adjoining states, long-range transport originating from Myanmar and crossing the Bangladesh region also contributed to the accumulation of pollutants at the IG receptor site. Long-range transport from the North Atlantic Ocean through the Mediterranean Sea, Syria, Iran, Afghanistan, and Pakistan towards the MN site was also observed during summer. Summer air mass trajectories at the MN and HR receptor sites originated mainly from the Arabian Sea through the Thar Desert and from Bangladesh through Uttar Pradesh. These inferences are similar to the results for PM2.5 reported by studies conducted over the Delhi region (Lodhi et al. 2013; Singh and Beegum 2013; Bisht et al. 2015; Jain et al. 2017a, b; Sharma et al. 2017).

Fig. 9 5-day backward air mass trajectories using HYSPLIT (a) and clustered trajectories (b) plots for winter and summer seasons over sampling sites

The transport routes of air masses have been reported to be associated with PAH concentrations (Tan et al. 2011). To study the effect of air mass transport on PAH concentrations, the 5-day backward trajectories were grouped into two to three clusters for the sampling period (Fig. 9b). During winter, two clusters were obtained for all three sites, which show that the air masses originated mainly from Pakistan, Punjab, and Haryana. This indicates that the PAH emission sources were local and regional during winter. In contrast, during summer, the backward trajectories were clustered into two groups (HR site) and three groups (IG and MN sites). High wind speeds during summer resulted in long-range transport of air masses, which carried PAHs to the sampling sites. Together with the short sampling period during winter, this may account for the small difference in PAH concentrations between the two seasons.

Conclusion

The study quantified organic compounds associated with fine ambient aerosols (PM2.5) over the National Capital Region (NCR), India. PM2.5 levels exceeding the NAAQS limits prescribed by the CPCB, India, were observed during the study period (January–June 2016) at all three sites (Delhi, Modinagar, and Mahendragarh) in the NCR. The concentrations of n-alkanes (C11–C35), polycyclic aromatic hydrocarbons (PAHs), and phthalates were reported in this study. The levels of n-alkanes were highest at the HR site, with the maximum contribution from biogenic sources among the sites. Diagnostic ratios, CPI, WNA%, and PNA% confirmed the stronger influence of petrogenic sources at the IG and MN sites than at the HR site. PAH emission sources have increased in recent years and contribute significantly to the deterioration of air quality. Molecular diagnostic ratios for pairs of PAHs indicated that the high PAH levels in ambient air originated from a mixed contribution of vehicular exhaust (diesel, gasoline, and traffic emissions) and biomass burning. Phthalate levels in the study suggested that the emissions could be from waste disposal, industrial emissions, vehicular exhaust, and plastics used in vehicle interiors and containers. DI values calculated for phthalate levels in PM2.5 were about four times higher for infants than for adults at the same pollutant level. The estimated cancer risks for exposure to PAHs, 6.0 × 10−3, 7.6 × 10−3, and 3.1 × 10−3, were higher than the range recommended by the USEPA (10−3) for lifetime exposure to carcinogenic pollutants. The PCA revealed that vehicular emissions, biomass burning, and plastic burning were the major sources of PAHs and phthalates at the sampling sites. Air mass backward trajectories indicated local, regional, and long-range transport of PM2.5 and PAHs during the study period.
These findings will contribute to the development and implementation of suitable pollution reduction strategies over the NCR.