Introduction

Rapid urbanization and industrialization over the past decades have had a dramatic impact on air quality worldwide. Ambient air pollutants such as suspended particulate matter and volatile organic compounds, which originate from automobile exhaust, combustion exhaust, industrial processes, and domestic activities, pose risks to human health (Wong et al. 2002; Yeh et al. 2011). Mobile sources account for 82 % of the air pollution in Malaysia (Al Madhoun et al. 2010). Motor vehicle ownership in Malaysia has increased markedly: in 2013, there were 22,901,325 registered vehicles on the roads (Environment Quality Report 2013).

Automobile and industrial emissions are sources of volatile organic compounds (VOCs; Ghazali et al. 2010; Hsieh et al. 2005; Kim Oanh et al. 2008; Zielinska et al. 2010). Non-methane hydrocarbons (NMHCs) are an important group of VOCs in urban areas because they react with the hydroxyl radical (OH) to form photochemical compounds that are hazardous to health under long-term exposure (Baker et al. 2008; Tang et al. 2007; O’Donoghue et al. 2007; Cerón-Bretón et al. 2014). In addition, NMHCs act as precursors for the formation of secondary pollutants such as O3 (Chen et al. 2009; Duan et al. 2008; Kumar et al. 2008; Sadanaga et al. 2012; Tan et al. 2012; Tiwari et al. 2010).

NMHCs are emitted from anthropogenic sources and from biogenic sources such as vegetation and seawater (Bauri et al. 2015). The major anthropogenic sources are fossil fuel combustion (vehicle exhaust, heat generation, and industrial processes), evaporation during the storage and distribution of fuels, and solvent use (Liakakou et al. 2009; Saito et al. 2000; Sauvage et al. 2009).

Vehicular exhaust emissions are the most significant sources of NMHCs in urban areas (Guo et al. 2004; Baldauf et al. 2009). There are also numerous minor but non-negligible municipal sources, such as leakage of liquefied petroleum gas and natural gas, dry-cleaning agents, and incinerators (Chang et al. 2006). Moreover, there are several indoor sources, including off-gassing from furniture made of pressed-wood products, paints, floor varnishes, glues, smoking, cooking, and consumer products (Liu et al. 2014).

So and Wang (2004) examined the spatial distribution and seasonal variation of NMHCs in different areas of Hong Kong and found that the highest levels occurred at the roadsides and the lowest at a rural site. In terms of seasonal variation, the roadside sites showed the highest NMHC concentrations in summer while the rural site showed the lowest, which was attributed to the strong evaporation of alkanes in hot summer weather, especially in the urban environment.

Forty-eight NMHC species were measured by Saito et al. (2009) in Nagoya, Japan. The authors reported that the NMHC concentration was high from November to February and low from June to August. The seasonal pattern was influenced mainly by that of the alkanes emitted from traffic.

NMHC concentrations in a suburban region of south–central China were investigated by Zhang et al. (2009); samples were collected weekly and analyzed using gas chromatography (GC) mass spectrometry. The authors stated that NMHC levels varied seasonally and were higher in winter than in summer. Furthermore, the analysis identified vehicular emissions as the dominant source of NMHC species.

The spatial concentration of NMHC in the atmosphere of the Taipei metropolitan area was monitored by Ding and Wang (1998). A total of 20 species were identified and quantified in 50 samples which were analyzed using GC/flame ionization detector (FID). The results showed that gasoline-fueled car and motorcycle emissions are the two major sources of NMHC in Taipei, and a high correlation between NMHC concentration and traffic emission was reported when compared with other major cities.

Rappengluck and Fabian (1999) monitored NMHC at several locations in Munich using online gas chromatography methods. The results revealed low NMHC values compared with other cities worldwide. The data suggest that fuel evaporation and solvent releases can be added to traffic emissions as sources in summertime NMHC inventories.

The literature reviewed above shows a lack of studies on the prediction of NMHC concentrations in ambient air that could support decision making in Malaysia. In addition, the contribution of meteorological factors to the variance of NMHC concentration has yet to be quantified, and this is investigated in the present study. Investigating NMHC trends in Penang Island, which has an equatorial climate and is considered a mixed development area (urban and industrial zones), adds to the understanding of NMHC behavior under different weather conditions.

This study examined the diurnal variations and the best probability distribution describing the NMHC concentrations in Penang Island. The probabilities of different percentiles (50, 60, 70, 80, 90, 95, and 99) were calculated, and the return period was also predicted. Factor analysis was performed to investigate the contribution of the meteorological factors (temperature, humidity, wind speed, and wind direction) to the NMHC concentration.

Methods

Monitoring stations

The hourly and daily average concentrations of NMHC were monitored at the air quality monitoring station at Universiti Sains Malaysia (5.3569°N, 100.3014°E; distance to tide gauge 8301 m) in Penang Island for 2 years (2005–2006); Fig. 1 shows the location of the monitoring station. The station is situated in an urban area on the north-western coast of Peninsular Malaysia and experiences an equatorial, warm and humid climate with an average ambient temperature of 29 °C (Penang Weather Web 2014). The potential NMHC sources surrounding the station are traffic emissions and the nearby Bayan Lepas industrial area.

Fig. 1 Location of NMHC monitoring station in Penang Island

Measurement

Ambient NMHC concentrations were measured using the Synspec ALPHA M/TNMHC analyzer (model 115), a gas chromatograph equipped with an FID that contains a compact oven with a column that separates methane from total non-methane hydrocarbons (Synspec Website 2010). Carbon monoxide (CO) levels were monitored using a non-dispersive infrared analyzer from Teledyne Advanced Pollution Instrumentation (Teledyne API Website 2015), and the meteorological parameters (ambient temperature, humidity, wind speed, and wind direction) were measured using meteorological sensors from Vaisala (Vaisala Website 2015). All data were collected simultaneously over a period of 2 years, from 2005 to 2006. To ensure the quality and comparability of the measured data and the harmonization of the instruments, quality assessment procedures were carried out before and after the monitoring campaign.

Modeling and statistical analysis

Modeling of atmospheric dispersion is an important tool for regulators, policymakers, and environmental managers; it is used in health impact assessments for applications such as transport planning and assessment, emission abatement, and regulatory interventions (Deary and Uapipatanakul 2014).

Statistical models such as probability distributions are useful tools for modeling air pollutant data, because pollutant concentrations are usually random variables (Md. Yusof et al. 2010). In this study, probability plots were used to select the appropriate statistical distribution for the NMHC concentration. The log-normal, Weibull, and Gamma distributions were considered because they are the most popular probability density functions used to represent atmospheric concentrations (Ding and Wang 1998; Papanastasiou and Melas 2009). The analysis was carried out using the statistical software packages MATLAB, Minitab 14, and SPSS Version 15.
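As an illustration of distribution selection by probability plots, the following minimal Python sketch fits Weibull and lognormal models by straight-line regression on the transformed axes and compares the resulting R² values. It is not the MATLAB/Minitab/SPSS procedure used in the study; the function names, the plotting-position formula (i − 0.5)/n, and the synthetic sample are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def linear_fit(xs, ys):
    """Ordinary least squares; returns slope, intercept, and R^2."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept, sxy ** 2 / (sxx * syy)


def probability_plot_fits(sample):
    """Fit Weibull and lognormal models from their probability plots."""
    data = sorted(sample)
    n = len(data)
    log_x = [math.log(x) for x in data]
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]

    # Weibull plot: ln(-ln(1 - F)) = k * ln(x) - k * ln(scale)
    wy = [math.log(-math.log(1.0 - p)) for p in probs]
    k, c, r2_w = linear_fit(log_x, wy)
    weibull = {"shape": k, "scale": math.exp(-c / k), "r2": r2_w}

    # Lognormal plot: ln(x) = mu + sigma * z, z = standard normal quantile
    z = [NormalDist().inv_cdf(p) for p in probs]
    sigma, mu, r2_l = linear_fit(z, log_x)
    lognormal = {"mu": mu, "sigma": sigma, "r2": r2_l}
    return weibull, lognormal


# Synthetic stand-in for NMHC values in ppm; NOT the study's data.
rng = random.Random(42)
synthetic = [rng.weibullvariate(0.2, 2.0) for _ in range(1000)]
weibull_fit, lognormal_fit = probability_plot_fits(synthetic)
print(weibull_fit)
print(lognormal_fit)
```

The distribution whose probability plot is closest to a straight line (highest R²) would be the preferred candidate; a Gamma fit would need a numerical quantile function and is omitted from this sketch.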

Five indicators were used to check the goodness of fit of the tested distributions: prediction accuracy, absolute error, coefficient of determination, root mean square error, and index of agreement. In addition, the Kolmogorov–Smirnov test was used as a criterion to verify the goodness of fit (Tu et al. 2007).
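The source does not give formulas for these indicators, so the sketch below uses commonly cited definitions as assumptions: the root mean square error, the coefficient of determination taken as the squared Pearson correlation, the index of agreement in Willmott's form, and the Kolmogorov–Smirnov statistic as the largest gap between the empirical and theoretical CDFs. The function names are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination as squared Pearson correlation (assumed form)."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (assumed form); 1 means perfect agreement."""
    mo = sum(observed) / len(observed)
    num = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - num / den


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a theoretical CDF."""
    data = sorted(sample)
    n = len(data)
    d = 0.0
    for i, x in enumerate(data, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Toy values chosen for easy hand-checking; not measurements from the study.
obs = [1.0, 2.0, 3.0]
pred = [2.0, 3.0, 4.0]
print(rmse(obs, pred), r_squared(obs, pred), index_of_agreement(obs, pred))
print(ks_statistic([0.1, 0.4, 0.7], lambda x: x))  # uniform(0, 1) CDF
```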

Furthermore, the contribution of the meteorological factors (temperature, humidity, wind speed, and wind direction) to the variance of the NMHC concentration was investigated using factor analysis. The sines and cosines of the wind directions were used in the analysis because wind direction is a circular variable (Burgoyne et al. 1993).
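The reason for the sine and cosine transformation can be seen in a short Python sketch: directions of 1° and 359° are almost identical physically but far apart numerically, whereas their (sin, cos) pairs are close. The function name is hypothetical, and directions are assumed to be given in degrees.

```python
import math


def encode_wind_direction(degrees):
    """Map a wind direction in degrees to its (sin, cos) components."""
    radians = math.radians(degrees)
    return math.sin(radians), math.cos(radians)


just_east_of_north = encode_wind_direction(1.0)
just_west_of_north = encode_wind_direction(359.0)

# The raw angles differ by 358, but the encoded points are neighbors.
gap = math.dist(just_east_of_north, just_west_of_north)
print(just_east_of_north, just_west_of_north, gap)
```

Both components then enter the factor analysis as ordinary linear variables.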

Results and discussion

Table 1 presents the descriptive statistics of the mean, standard deviation, as well as the range of the NMHC concentrations observed at Penang Island for both 2005 and 2006.

Table 1 Descriptive statistics of NMHC levels

Figure 2 presents the diurnal variations of the NMHC concentrations in 2005 and 2006. The concentrations fluctuated over the 24-h period, peaking between 8:00 and 10:00 am, which coincides with the morning traffic rush hour.

Fig. 2 Diurnal variation of NMHC

As shown in Table 2, a significant Pearson’s correlation coefficient (0.88 < r < 0.98) was found between NMHC and CO concentrations at the island site, which indicates that NMHC and CO have a common source, namely vehicular emissions. Similar findings were reported by Christensen et al. (1999) and Zhang et al. (2009).
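For reference, Pearson's correlation coefficient between paired NMHC and CO readings can be computed as in the following sketch. The function name and the five paired values are hypothetical and serve only to show the calculation; they are not the values in Table 2.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired readings in ppm; NOT data from the study.
nmhc = [0.10, 0.15, 0.37, 0.30, 0.12]
co = [0.40, 0.60, 1.50, 1.20, 0.50]
r = pearson_r(nmhc, co)
print(round(r, 3))
```

A value of r close to 1 indicates that the two pollutants rise and fall together, which is what would be expected if both come mainly from the same emission source.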

Table 2 The correlation between NMHC concentrations and CO emissions

Figures 3 and 4 present the diurnal variations of the NMHC concentrations and temperature during the south-west monsoon (June to September) in 2005 and 2006, while Fig. 5 presents the diurnal variation of the NMHC concentration and temperature during the north-east monsoon (November 2005–March 2006). The trends in Figs. 3, 4, and 5 show that NMHC has a maximum and a secondary peak at around 10:00 and 22:00, respectively, in both years. The former peak is due to the traffic flow during the rush hour, while the latter is due to stable weather conditions (the average wind speed ranges between 2 and 4 m/s), which limit the dispersion of the pollutant. In contrast, and as expected, a lower NMHC concentration was noted at elevated ambient temperature, which is believed to be due to the reaction of NMHC with the hydroxyl radicals (OH) resulting from the photolysis of ozone (average O3 levels dropped from 0.050 to 0.010 ppm) in the presence of sunlight and water vapor in the atmosphere. The negative correlation between NMHC and temperature supports this argument.

Fig. 3 Diurnal variations of NMHC and temperature during South-West Monsoon in 2005

Fig. 4 Diurnal variations of NMHC and temperature during South-West Monsoon in 2006

Fig. 5 Diurnal variations of NMHC and temperature during North-East Monsoon in 2005–2006

As depicted in Fig. 3, the NMHC levels during the south-west monsoon in 2005 reached an average maximum of 0.37 ppm in the morning, when the average temperature was 24.5 °C. In the afternoon, the corresponding values were 0.1 ppm and 32.5 °C, a 73 % reduction in NMHC with a 25 % increase in ambient temperature, which further illustrates the influence of ozone photolysis on the removal of NMHC from the atmosphere.

Similarly, the NMHC concentrations during the south-west monsoon in 2006 reached an average maximum of 0.41 ppm in the morning, when the average temperature was 24.5 °C, while in the afternoon the NMHC level decreased to 0.17 ppm as the temperature increased to 32 °C (Fig. 4). This means that the NMHC concentration was reduced by 69 % with a 23 % increase in temperature, in line with the previous year.

Again, a 60 % reduction in NMHC concentration with a 25 % increase in ambient air temperature was observed during the north-east monsoon period (Fig. 5): the NMHC concentration peaked at 0.25 ppm at a temperature of 24.5 °C in the morning, while the corresponding values in the afternoon were 0.10 ppm and 32.5 °C.
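The percentages quoted for the three monsoon periods can be checked with a short calculation. The sketch below assumes that the NMHC reduction is expressed relative to the morning value and the temperature increase relative to the afternoon value; the latter is an inference from the quoted figures, not something the text states. The function names are hypothetical.

```python
def nmhc_reduction_percent(morning, afternoon):
    """Drop in NMHC, expressed relative to the morning value."""
    return 100.0 * (morning - afternoon) / morning


def temperature_increase_percent(morning, afternoon):
    """Rise in temperature, expressed relative to the afternoon value (assumed)."""
    return 100.0 * (afternoon - morning) / afternoon


# (NMHC morning, NMHC afternoon, T morning, T afternoon) as quoted in the text.
periods = {
    "south-west monsoon 2005": (0.37, 0.10, 24.5, 32.5),
    "south-west monsoon 2006": (0.41, 0.17, 24.5, 32.0),
    "north-east monsoon 2005-2006": (0.25, 0.10, 24.5, 32.5),
}

results = {}
for name, (c_am, c_pm, t_am, t_pm) in periods.items():
    results[name] = (
        round(nmhc_reduction_percent(c_am, c_pm)),
        round(temperature_increase_percent(t_am, t_pm)),
    )
    print(name, results[name])
```

Under these assumptions the calculation reproduces the quoted 73 % and 25 % for 2005, the 23 % temperature increase for 2006, and the 60 % and 25 % for the north-east monsoon. With the 2006 concentrations as quoted (0.41 and 0.17 ppm), the reduction evaluates to about 59 %, so the quoted 69 % may rest on values read differently from Fig. 4.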

The trends observed in Figs. 3, 4, and 5 clearly show the negative relationship between NMHC concentration and temperature. This is due to the reaction of NMHC with the hydroxyl radicals (OH) that result from the photolysis of ozone in the presence of sunlight and water vapor, as reported by Bonsang et al. (1990). NMHC reacts rapidly and readily with OH radicals (Coll et al. 2010; Nakashima et al. 2010), and this reaction is considered the most important atmospheric removal process for NMHC (Rudolph et al. 2002).

The statistical distribution results for the area surrounding the station in 2005 show that the NMHC concentrations were well represented by the Weibull distribution, which gives high prediction accuracy (99.6 %) and low error (1.8 %), as shown in Table 3. These results are consistent with the findings of Ding and Wang (1998), who reported that the concentrations of C2–C6 hydrocarbons were better fitted by the Weibull distribution.

Table 3 Statistical distribution for NMHC

In 2006, the lognormal distribution best described the NMHC concentrations, with 99.35 % accuracy. The Kolmogorov–Smirnov (K-S) test values in Table 3 support these results (the lower the value, the better the goodness of fit): in 2005, the lowest K-S value was 0.0444 for the Weibull distribution, and in 2006 it was 0.041 for the lognormal distribution. The best-fitting distribution differed between 2005 (Weibull) and 2006 (lognormal); this is due to the variation in NMHC levels, as concentrations worsened in 2006 as a result of the Sumatra forest fires. These results agree with the findings of Schrimpf (1998).

The cumulative distribution function (CDF) plots in Figs. 6 and 7 compare the theoretical and observed curves for the Weibull distribution in 2005 and the lognormal distribution in 2006, which have the highest accuracy and lowest error among the tested distributions. The theoretical and observed curves match closely in both figures.

Fig. 6 CDF plot for NMHC (2005)

Fig. 7 CDF plot for NMHC (2006)

According to the US standard, the threshold value for NMHC is 0.24 ppm (Bazaca et al. 1983; Zainal 2003). The results in Table 4 show that, in 2005, the predicted number of days exceeding the threshold was 29 (actual = 26 days), while in 2006 the predicted number was 78 (actual = 73 days).

Table 4 The probability of the NMHC to exceed certain limits

The results in Table 4 show that, in 2005, for the 50th percentile (50 % of the NMHC concentrations were less than 0.1665 ppm), the predicted number of days exceeding 0.1665 ppm was 175, whereas the actual number was 177, and the return period was 2 (that is, an exceedance of 0.1665 ppm occurs every 2 days). In 2006, for the 50th percentile (50 % of the NMHC concentrations were less than 0.2060 ppm), the predicted number of days exceeding 0.2060 ppm was 183, whereas the actual number was 177, and the return period was 1.99 (that is, an exceedance of 0.2060 ppm occurs about every 2 days).
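The source does not state how the return period was computed. A common definition, assumed here, is the reciprocal of the daily exceedance probability; with 183 predicted exceedance days in a 365-day year this gives 1.99 days, which matches the 2006 value quoted above. The function names in this sketch are hypothetical.

```python
def return_period_from_percentile(percentile):
    """Return period in days for the concentration at a given percentile."""
    exceedance_probability = 1.0 - percentile / 100.0
    return 1.0 / exceedance_probability


def return_period_from_days(exceedance_days, days_per_year=365):
    """Return period in days from a yearly count of exceedance days."""
    return days_per_year / exceedance_days


# Percentiles examined in the study; a higher percentile is exceeded less often.
for p in (50, 60, 70, 80, 90, 95, 99):
    print(p, round(return_period_from_percentile(p), 2))

# Predicted exceedance days for the 2006 median, as quoted in the text.
rp_2006 = round(return_period_from_days(183), 2)
print(rp_2006)
```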

Weather parameter contribution

The contribution of the meteorological parameters (temperature, humidity, wind speed, and wind direction) to the variations of NMHC concentration was investigated using factor analysis, with principal component extraction followed by normalized Varimax rotation. After rotation, factor loadings greater than 0.5 were considered significant in interpreting the results.

The factor analysis results show that, in 2005, temperature and humidity were the main contributors, accounting for 29.3 % of the total variance of NMHC concentrations; in 2006, they accounted for 28.3 % of the variance. Wind speed was the second factor in both 2005 and 2006, contributing 17.1 and 18.3 %, respectively, as shown in Table 5. Finally, wind direction was the lowest contributor to the variance of NMHC concentrations, which was due to the existence of two nearby sources (main roads) of traffic emissions. These results agree with the contribution of weather factors to the concentrations of CO, NO, NO2, O3, smoke, and SO2 reported by Statheropoulos et al. (1998).

Table 5 Contribution of the weather parameters to the NMHC Variance

Conclusion

The statistical distributions of the NMHC concentrations collected at the monitoring station in Penang Island were studied over the period 2005–2006. The analysis shows that the NMHC concentrations surrounding the station were well represented by the Weibull distribution in 2005 and the lognormal distribution in 2006, with accuracies of 99.6 and 99.4 %, respectively.

The predicted number of days exceeding the US standard for NMHC (0.24 ppm) was 29 in 2005 and 78 in 2006, while the actual number was 26 in 2005 and 73 in 2006. The Pearson’s correlation analysis shows a good mutual correlation (r > 0.85) between NMHC and CO concentrations, which indicates that they originate from a common source, namely vehicular emissions.

The factor analysis results showed that temperature and humidity were the main contributors to the variance of NMHC concentrations, accounting for 29.3 % in 2005 and 28.3 % in 2006. These results show the seriousness of the issue and its potential negative impacts on public health, which should motivate urgent action to reduce emissions to internationally acceptable limits.