Introduction

Over the last few decades, research attention has shifted from carbon- and sulfur-based air pollutants, such as carbon monoxide, carbon dioxide, and sulfur dioxide, toward tropospheric photochemical reactions and their capability to transform primary air pollutants into secondary air pollutants. As a result, ground-level ozone (O3), which is one of the major photochemical oxidants produced by these reactions, has gained prominence (Alghamdi et al. 2014).

At ground level, O3 is a major component of photochemical smog and is associated with negative impacts on human health, vegetation growth, and the lifetime of materials. O3 is also a greenhouse gas that contributes to global warming (Reid et al. 2008; Alghamdi et al. 2014). Various studies have reported that O3 plays a significant role in tropospheric chemistry because it is one of the principal precursors of the hydroxyl radical (OH), which controls the oxidizing power of the lower atmosphere (Duenas et al. 2004; Singla et al. 2012). Ground-level O3 is produced by a series of complex photochemical reactions involving its precursors, namely nitrogen oxides (NOx) and volatile organic compounds (VOCs), in the presence of incoming solar radiation (Seinfeld and Pandis 2006; Ghazali et al. 2010; Tsakiri and Zurbenko 2011).

Several studies have reported that O3 exhibits strong diurnal variations that are controlled by various processes, including photochemistry, physical and chemical removal, deposition, and transport (Ghazali et al. 2010; Alghamdi et al. 2014). Elevated precursor emissions from anthropogenic activities, such as transportation and industrialization, can increase O3 concentrations in ambient air. In addition to the level of precursor emissions, complex meteorological conditions contribute to the large diurnal differences as well as to seasonal and yearly variations, and they play significant roles in the variation and build-up of O3 concentrations (Ozbay et al. 2011; Gvozdic et al. 2011; Gorai et al. 2014). In most places, O3 concentrations increase with increasing solar radiation and temperature and with decreasing wind speed and relative humidity (Kovac-Andric et al. 2009; Tsakiri and Zurbenko 2011; Toh et al. 2013).

Ship emissions are becoming a prominent source of air pollutants in cities with major ports (Vutukuru and Dabdub 2008; Song et al. 2010). Song et al. (2010) stated that ship emissions are an anthropogenic source of various primary air pollutants, including nitrogen oxides (NOx), sulfur oxides (SOx), carbon monoxide (CO), and volatile organic compounds (VOCs). On a global scale, the amount of NOx emitted by international shipping is estimated to be approximately 7 Tg (Corbett and Koehler 2003). The expansion of this industry increases the production of O3 and particulate matter (PM), which degrades air quality. Thus, urgent action is needed to monitor and assess how this industry affects air quality. However, no published study in Malaysia has reported on this problem; most studies conducted in port cities have focused on water quality.

The objective of this study was to assess the diurnal variability of ground-level ozone in three port cities in Malaysia. The study aimed to provide insight into the variability of one of the criteria pollutants in Malaysia. Several multivariate techniques, such as cluster analysis (CA) and principal component analysis (PCA), were applied in the study.

Material and methods

Site descriptions

Malaysia is located near one of the busiest shipping lanes in the world. The Straits of Malacca is a narrow stretch of water (approximately 800 km long) between Peninsular Malaysia and Sumatra, Indonesia. For economic reasons, this strait is the primary shipping channel between the Indian and Pacific Oceans, handling about a quarter of the world’s trade goods. Port Klang, Port of Penang, and Port of Pasir Gudang are three major ports in Peninsular Malaysia that are situated along the Straits of Malacca. The descriptions and locations of these port cities are provided in Table 1 and Fig. 1, respectively. Klang covers an area of 626.8 km2, with a coastline of 53.7 km. The rapid growth of Port Klang has greatly contributed to the development of the Klang area. According to Haris and Aris (2013), Port Klang is the busiest port in Malaysia and the 14th busiest port in the world. In 2012, the port handled approximately 15,000 vessel arrivals and departures and approximately 169 million tons of cargo, which is significantly higher than that of any other port in Malaysia.

Table 1 Descriptions of selected port cities
Fig. 1 Locations of the port cities in Peninsular Malaysia

The Perai monitoring station, located in Seberang Perai Tengah district, was selected to represent Penang port because of its close proximity to the port. Seberang Perai Tengah covers an area of 738 km2 and had a total population of 362,820 in 2010, which accounts for 23.8 % of the total population of Penang. Penang port is one of the oldest ports in Malaysia. In 2012, it handled a total of 6650 vessel arrivals and departures and 43 million tons of cargo. Port of Pasir Gudang is one of the well-known ports of Johor, a southern state of Peninsular Malaysia. It is the fourth busiest port in Malaysia after the Ports of Klang, Penang, and Tanjung Pelepas (DSM 2012). In 2010, Port of Pasir Gudang handled about 28 million tons of cargo and received nearly 5000 international and local vessels (Ministry of Transport 2010). These numbers increased in 2012, with the total number of vessel departures and arrivals reaching 12,228. The major industries that drive the economy of Pasir Gudang are transportation and logistics, shipyard industries, petrochemical industries, and oil palm storage and distribution. DSM (2012) reported that during the 2010 census, the population of Pasir Gudang was approximately 43,000.

These cities experience a tropical climate characterized by uniform high temperature ranging from 22 to 24 °C during nighttime and from 27 to 30 °C during daytime. The mean annual rainfall in these cities is 2670 mm (Ghazali et al. 2010; Md Yusuf et al. 2010), and the relative humidity ranges from 70 to 90 %.

Measurements and instrumentations

Continuous hourly air quality data were obtained from the Air Quality Division of the Department of Environment, Ministry of Natural Resources and Environment of Malaysia. Ambient air changes in Malaysia are monitored continuously by ambient air monitoring systems that have been installed all over the country. The current status of air pollution is systematically reported by the API system on an hourly basis. The obtained secondary data are regularly subjected to standard quality control processes and quality assurance procedures (Mohammed et al. 2013). The procedures used for continuous monitoring are in accordance with the standard procedures outlined by internationally recognized environmental agencies such as the US Environmental Protection Agency (Latif et al. 2014).

Hourly O3 concentration was monitored using the Model 400E UV Absorption Ozone Analyzer (DoE 2010). The analyzer applies the Beer-Lambert law, relying on the absorption of 254-nm UV light by O3 molecules through internal electronic resonance, to measure low O3 concentrations in ambient air (Ghazali et al. 2010; Mohammed et al. 2013). Ambient NO2 and NO concentrations were measured using the Model 200A NO/NO2/NOx analyzer (Ghazali et al. 2010; Latif et al. 2014). This analyzer applies chemiluminescence detection principles to detect NO2 and NO concentrations in ambient air and has been proven to provide sensitivity, stability, and ease of use for ambient or dilution continuous monitoring (DoE 2010). A BAM-1020 (Beta Attenuation Mass Monitor) was used to monitor ambient PM10. Concentrations of SO2 and CO were monitored using the Teledyne API Model 100A/100E and the Teledyne API Model 300/300E, respectively. Meteorological parameters were monitored using the Met One 062 sensor for temperature, the Met One 083D sensor for relative humidity, the Met One 010C sensor for wind speed, the Met One 020C sensor for wind direction, and the Scientech Model UV-S-290-T for UVB measurements (Latif et al. 2014).

Data analysis

Diurnal variations of O3 concentrations, its precursors, and meteorological parameters in the selected port cities were determined from continuous hourly average monitoring records from 1 January to 31 December 2009. Daytime is defined as the time between sunrise and sunset (Clapp and Jenkin 2001), which is from 7 a.m. to 7 p.m. (Mohammed et al. 2013), whereas nighttime is the time from 7 p.m. to 7 a.m.
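The day and night split described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the study: the function names and the sample readings are hypothetical, and the choice to assign the 7 a.m. record to daytime and the 7 p.m. record to nighttime is an assumption, because the text does not say how the boundary hours were treated.

```python
from datetime import datetime

DAY_START_HOUR = 7   # 7 a.m., as defined in the text
DAY_END_HOUR = 19    # 7 p.m., as defined in the text


def is_daytime(timestamp: datetime) -> bool:
    """Return True for hours from 7 a.m. up to, but not including, 7 p.m.

    Boundary handling is an assumption of this sketch.
    """
    return DAY_START_HOUR <= timestamp.hour < DAY_END_HOUR


def split_day_night(records):
    """Split (timestamp, value) pairs into daytime and nighttime value lists."""
    day, night = [], []
    for timestamp, value in records:
        (day if is_daytime(timestamp) else night).append(value)
    return day, night


# Hypothetical hourly O3 readings in ppb (illustrative, not study data)
sample = [
    (datetime(2009, 1, 1, 6), 5.0),
    (datetime(2009, 1, 1, 7), 6.0),
    (datetime(2009, 1, 1, 14), 40.0),
    (datetime(2009, 1, 1, 19), 12.0),
]
day_values, night_values = split_day_night(sample)
```

With these sample readings, the 7 a.m. and 2 p.m. values fall in the daytime list and the 6 a.m. and 7 p.m. values in the nighttime list.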

In this study, descriptive statistics are calculated to give an overview of the variations in O3 concentrations and selected variables, and bivariate correlation analysis is then used to determine the correlations between variables at the respective cities. After that, CA is performed using hourly average O3 concentrations, followed by PCA to determine the contribution of each selected variable toward variations in O3 concentrations. All analyses in this study are performed using the Statistical Package for Social Science (SPSS) statistical software version 20.

CA is employed to examine the variations in hourly O3 concentrations. CA is a multivariate technique that is used in this study to categorize O3 concentrations into different groups with the primary purpose of data summarization (Özbay et al. 2011; Dominick et al. 2012). In this way, the objects that belong to a group are similar, whereas the objects from different groups are characteristically different. In other words, CA maximizes the similarity of cases within each cluster while maximizing the dissimilarity of hourly average O3 concentrations between groups (Pires et al. 2008; Lau et al. 2009). In this analysis, each diurnal O3 concentration is considered a separate unit before it is connected by Ward’s method and squared Euclidean distance. Ward’s method is chosen because it uses an analysis of variance approach to evaluate the distance between clusters in an attempt to minimize the sum of squares (SS) of any two clusters (Shrestha and Kazama 2007). Meanwhile, the similarity between hourly ozone concentrations is measured using squared Euclidean distance, which can be calculated using Eq. 1 (Sharma and Kumar 2006). Even though the Euclidean distance is the most familiar and commonly used distance measure, the squared Euclidean distance is chosen in this study because large distances corresponding to many dissimilar items were expected to occur in the O3 analysis. These approaches have been used by several researchers in studying air and water pollution (Shrestha and Kazama 2007; Lau et al. 2009). The classification of objects is illustrated by a dendrogram (tree diagram), which shows the measured similarity or distance between any two objects.

$$ {d}_{ij}^2=\left[{\displaystyle \sum_{k=1}^p{\left({X}_{ik}-{X}_{jk}\right)}^2}\right] $$
(1)
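To make Eq. 1 and the Ward linkage concrete, the following pure-Python sketch clusters a handful of hourly mean O3 values. It is a naive illustration under stated assumptions and not the SPSS routine used in the study: the function names and the seven sample values are hypothetical, and the merge criterion is the textbook Ward cost (the increase in within-cluster sum of squares), computed from cluster centroids with the squared Euclidean distance of Eq. 1.

```python
def squared_euclidean(x, y):
    """Eq. 1: sum over the p variables of the squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def centroid(points):
    """Mean vector of a list of equally sized points."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def ward_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the two clusters merge."""
    na, nb = len(cluster_a), len(cluster_b)
    distance = squared_euclidean(centroid(cluster_a), centroid(cluster_b))
    return na * nb / (na + nb) * distance


def ward_clustering(points, n_clusters):
    """Merge the cheapest pair of clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    merge_costs = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        merge_costs.append(cost)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merge_costs


# Hypothetical hourly mean O3 values in ppb, one variable per object
hourly_means = [[5.0], [6.0], [7.0], [20.0], [22.0], [38.0], [40.0]]
groups, costs = ward_clustering(hourly_means, n_clusters=3)
```

For these sample values the three groups separate the low, intermediate, and high hours, which mirrors how the cluster groups in the study correspond to nighttime, transitional, and peak afternoon concentrations.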

The optimum number of clusters is determined from the differences in distance values; the optimum point is where a clear demarcation between successive differences in distance is recorded. CA is then repeated with the selected number of clusters to evaluate that choice.
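One simple way to operationalize this rule is to look for the largest jump between successive merge distances in the agglomeration schedule. The sketch below assumes that reading of the rule; the function name and the distance values are hypothetical, and the study may have judged the demarcation visually from the dendrogram instead.

```python
def clusters_from_largest_jump(merge_distances, n_objects):
    """Suggest a cluster count from the largest jump in merge distances.

    merge_distances[k] is the distance at which the (k + 1)-th merge occurs.
    After k + 1 merges, n_objects - (k + 1) clusters remain, so stopping just
    before the largest jump leaves that many clusters.
    """
    if len(merge_distances) < 2:
        raise ValueError("need at least two merge distances")
    jumps = [b - a for a, b in zip(merge_distances, merge_distances[1:])]
    last_merge_before_jump = max(range(len(jumps)), key=jumps.__getitem__)
    return n_objects - (last_merge_before_jump + 1)


# Hypothetical agglomeration schedule for seven objects (six merges)
schedule = [0.5, 1.5, 2.0, 2.0, 400.0, 500.0]
suggested = clusters_from_largest_jump(schedule, n_objects=7)
```

Here the largest jump lies between the fourth and fifth merges, so the sketch suggests stopping after four merges, which leaves three clusters.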

Pearson’s coefficient (r) is used to measure the linear association, strength, and direction of the relationship between the selected variables (Elbayoumi et al. 2014) and O3 concentrations. Pearson’s r can be calculated using Eq. 2 (Özbay et al. 2011).

$$ r=\frac{{\displaystyle \sum \left(x-\overline{x}\right)\left(y-\overline{y}\right)}}{\sqrt{{\displaystyle \sum {\left(x-\overline{x}\right)}^2{\displaystyle \sum {\left(y-\overline{y}\right)}^2}}}} $$
(2)

where x and y are the selected variables and \( \overline{x}\;\mathrm{and}\;\overline{y} \) are the means of the variables.
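Eq. 2 translates directly into code. The following is a minimal sketch with a hypothetical function name and made-up temperature and O3 values; it is meant only to show the arithmetic and does not reproduce any coefficient reported in this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient, following Eq. 2."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return numerator / sqrt(sum_sq_x * sum_sq_y)


# Hypothetical paired observations (illustrative, not study data)
temperature_c = [24.0, 26.0, 28.0, 30.0, 32.0]
o3_ppb = [5.0, 10.0, 15.0, 20.0, 25.0]
r_positive = pearson_r(temperature_c, o3_ppb)
r_negative = pearson_r(temperature_c, list(reversed(o3_ppb)))
```

Because the sample O3 values rise linearly with temperature, the first coefficient is 1 and reversing the O3 series gives -1, the two extremes of the range of r.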

PCA is a multivariate technique that has gained recognition for its capability to deal with voluminous data such as air pollution records. In the present study, PCA is used to statistically explore the relationship among O3, its precursors, and selected meteorological parameters. Various researchers have reported the successful application of this technique in variable reduction and its capability to detect the most significant variables with minimum loss of the original information (Shrestha and Kazama 2007; Özbay et al. 2011; Dominick et al. 2012; Elbayoumi et al. 2014). According to Kovač-Andrić et al. (2009), the PCs are extracted in descending order of explained variance, so that the first component (PC1) represents the largest variation in the dataset. PCs are generally expressed as Eq. 3.

$$ {\mathrm{PC}}_i={l}_{1i}{X}_1+{l}_{2i}{X}_2+\dots +{l}_{mi}{X}_m $$
(3)

where \( {\mathrm{PC}}_i \) is the ith principal component and \( {l}_{mi} \) is the loading of the observed variable \( {X}_m \).

The significant variables for each component are determined based on the loading. In this study, only a factor loading that is greater than 0.4 is considered significant (Ul-Saufie et al. 2013) and the analysis is validated using 50 % of the dataset.
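The two operations described here, forming a component as a weighted sum of the variables and screening loadings against the 0.4 cut-off, can be sketched as follows. The function names and the loading values are hypothetical and are not taken from Table 7; the sketch also assumes that the cut-off applies to the absolute value of the loading, so that strong negative loadings are retained.

```python
def pc_score(loadings, values):
    """Eq. 3: PC_i = l_1i * X_1 + l_2i * X_2 + ... + l_mi * X_m."""
    if len(loadings) != len(values):
        raise ValueError("loadings and values must have the same length")
    return sum(l * x for l, x in zip(loadings, values))


def significant_variables(loadings_by_name, threshold=0.4):
    """Keep variables whose absolute loading exceeds the threshold."""
    return {
        name: loading
        for name, loading in loadings_by_name.items()
        if abs(loading) > threshold
    }


# Hypothetical loadings for one component (not values from this study)
example_loadings = {"NO2": 0.85, "NO": 0.78, "SO2": 0.21, "RH": -0.62}
kept = significant_variables(example_loadings)

# Hypothetical two-variable component evaluated on standardized values
score = pc_score([0.5, -0.5], [2.0, 1.0])
```

In this example SO2 is dropped because its loading of 0.21 falls below the cut-off, while RH is kept despite its negative sign.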

Result and discussion

In situ measurement

The results show that, as of the end of 2009, the concentrations of ozone in the study areas were still below the permissible values recommended by the Malaysian Ambient Air Quality Guidelines (MAAQG), as summarized in Table 2. The concentration of O3 at all sampling stations was generally far below the permissible maximum value for an averaging time of 1 h (100 ppb). Overall, the station at Klang exhibited the highest average concentration of O3 (20.3 ± 18.2 ppb), followed by the stations at Perai (15.4 ± 15.8 ppb) and Pasir Gudang (14.4 ± 13.1 ppb). A relatively small number of exceedances (i.e., 9 and 4 h for Klang and Pasir Gudang, respectively) were observed during the monitoring year, as illustrated in Fig. 2. The number of exceedances in Klang was higher than those in the other two cities. Aside from emissions from shipping and vehicles, the O3 concentration is high because of conducive atmospheric conditions and emissions from other industrial activities. High daytime O3 concentrations have been observed in the majority of urban stations, especially those located near Klang, such as Kajang, Gombak, Shah Alam, and Petaling Jaya, where the concentration of daytime O3 frequently surpassed the limits imposed by the MAAQG (Azmi et al. 2010; Ghazali et al. 2010; Latif et al. 2012; Banan et al. 2013). The average levels of O3 in the study areas were also within the range of O3 concentrations measured in several other towns in Malaysia, as recorded by Latif et al. (2012) and Ghazali et al. (2010).
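Counting exceedance hours against the 1-h guideline is a simple threshold test. The sketch below uses the 100 ppb limit stated above; the function name, the treatment of missing readings as `None`, and the sample values are assumptions made for illustration.

```python
O3_ONE_HOUR_LIMIT_PPB = 100.0  # MAAQG 1-h limit for O3, as stated in the text


def count_exceedances(hourly_ppb, limit=O3_ONE_HOUR_LIMIT_PPB):
    """Count hourly values strictly above the limit, skipping missing data."""
    return sum(1 for value in hourly_ppb if value is not None and value > limit)


# Hypothetical hourly O3 record in ppb with one missing value
sample_hours = [12.0, 45.5, 101.2, None, 100.0, 118.0]
n_exceedances = count_exceedances(sample_hours)
```

Only the readings of 101.2 and 118.0 ppb count here, because a value equal to the limit is not treated as an exceedance in this sketch.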

Table 2 Detailed statistics of pollutants and meteorological parameters in the study area
Fig. 2 Exceedances of the guideline limit at the port cities

Diurnal variations of O3 concentrations

The study of the diurnal variations of O3 provides valuable information on the sources and transport of O3 as well as the effects of its chemical formation and destruction. The shapes of O3 cycles are strongly affected by the levels of its precursors (NOx and VOCs) and meteorological conditions (temperature and solar radiation) (Nielsen 2004; Papanastasiou et al. 2007). The diurnal variations in O3 concentrations during the study period are illustrated in Fig. 3 with the use of box and whisker plots. This figure shows that the O3 diurnal variation of each site exhibited a similar pattern, but the magnitudes of the variations differed. The diurnal pattern of O3 for each site is characterized by a maximum concentration in the afternoon and a minimum during nighttime; this result is consistent with previous findings (Kobayashi et al. 2007; Turias et al. 2008; Jones and Kirby 2009). This variation between day and night generally coincides with the higher solar radiation intensity during the day, which is the favorable condition for powering photochemical reactions. In the photochemical reactions of O3 formation, solar radiation with a wavelength of less than 400 nm has enough energy to photolyze NO2 into NO and atomic oxygen (O) (Seinfeld and Pandis 2006; Ghazali et al. 2010).

Fig. 3 Box and whisker plots of hourly average diurnal variations of ozone concentrations
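A composite diurnal profile of the kind plotted here is obtained by grouping the hourly records by hour of day and summarizing each group. The following sketch computes only the hourly mean and the hour of the peak; the function name and the sample records are hypothetical, and the box and whisker statistics themselves are not reproduced.

```python
from collections import defaultdict
from statistics import mean


def diurnal_profile(records):
    """Average values by hour of day; records are (hour, value) pairs."""
    by_hour = defaultdict(list)
    for hour, value in records:
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


# Hypothetical (hour of day, O3 in ppb) records from several days
records = [(8, 4.0), (8, 6.0), (14, 38.0), (14, 42.0), (22, 10.0)]
profile = diurnal_profile(records)
peak_hour = max(profile, key=profile.get)
```

For these sample records the profile peaks at 14:00, in line with the afternoon maximum described in the text.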

A unimodal O3 peak is observed at the three sites, and the highest O3 levels are found in Klang, followed by Perai and Pasir Gudang, as shown in Fig. 3c. Minimum values of O3 concentrations appear during nighttime and early morning hours (near sunrise), with the lowest concentrations being consistently measured at 8 a.m. This scenario is triggered by NO titration. During morning rush hours (which normally occur from 6 a.m. to 9 a.m.), high concentrations of NO are released from vehicles and industrial activities (Jiménez-Hornero et al. 2010; Reddy et al. 2011), thereby accelerating NO titration rates in the ambient atmosphere. The increase in NO titration eventually reduces O3 concentrations, as NO titration is the most significant sink reaction for ground-level O3 (Ghazali et al. 2010; Latif et al. 2012; Banan et al. 2013; Alghamdi et al. 2014).
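The photolysis and titration steps described above are commonly summarized by the standard reaction scheme below. It is given as a sketch: M denotes any third-body molecule (such as N2 or O2) and hν a photon with a wavelength below roughly 400 nm, and this notation is an assumption of the sketch and is not taken from the cited studies.

$$ \begin{aligned} \mathrm{NO_2}+h\nu &\to \mathrm{NO}+\mathrm{O} \\ \mathrm{O}+\mathrm{O_2}+\mathrm{M} &\to \mathrm{O_3}+\mathrm{M} \\ \mathrm{NO}+\mathrm{O_3} &\to \mathrm{NO_2}+\mathrm{O_2} \end{aligned} $$

The third reaction is the NO titration step: when fresh NO emissions are high and sunlight is weak, it removes O3 faster than the first two reactions can replace it.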

The time of sunrise is a turning point of the diurnal O3 variation, as the O3 concentration rises gradually just after the sun rises and reaches its maximum levels between 2 p.m. and 4 p.m. Thereafter, O3 concentrations decrease progressively until the evening and then continue to decrease more gradually, maintaining low values overnight because of the lack of solar radiation. The diurnal plot indicates that nighttime O3 concentrations decreased significantly from 7 p.m. to 7 a.m. Thus, O3 concentrations at night are relatively low and more constant (Qian et al. 2014). These low concentrations are primarily attributed to the absence of photochemical reactions, which convert O3 precursors into O3 (Zeng et al. 2010). O3 concentrations during nighttime can also be reduced by chemical loss via NO titration and by deposition. In addition, the further reduction of O3 concentrations during nighttime is attributed to the reactions between O3 and NO2, which produce dinitrogen pentoxide (N2O5) and nitric acid (HNO3) (Tiwary and Colls 2009).
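The nighttime pathway from NO2 and O3 to N2O5 and HNO3 mentioned above is usually written as the following textbook sequence, shown here as a sketch. The nitrate radical (NO3) intermediate and the third body M are not named in the text and are included on the assumption that the standard mechanism applies; the final hydrolysis step typically takes place on wet surfaces or aerosol particles.

$$ \begin{aligned} \mathrm{NO_2}+\mathrm{O_3} &\to \mathrm{NO_3}+\mathrm{O_2} \\ \mathrm{NO_3}+\mathrm{NO_2}+\mathrm{M} &\rightleftharpoons \mathrm{N_2O_5}+\mathrm{M} \\ \mathrm{N_2O_5}+\mathrm{H_2O} &\to 2\,\mathrm{HNO_3} \end{aligned} $$

Because NO3 is rapidly photolyzed in sunlight, this sequence acts as an O3 sink mainly after sunset.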

The variability of O3 in the three port cities was further investigated by conducting a hierarchical agglomerative CA using the hourly average O3 concentrations over the period of 2009. The dendrograms in Fig. 4 illustrate the classification of the monitoring stations based on hourly O3, and the concentrations of O3 for each hour in the different cluster groups are summarized in Table 3. The results show that the ascending trend of O3 concentrations for Klang is G1 (7.18 ppb), G2 (18.65 ppb), G4 (32.38 ppb), and G3 (40.05 ppb); for Perai is G1 (6 ppb), G3 (23.5 ppb), and G2 (34.0 ppb); and for Pasir Gudang is G1 (5.54 ppb), G2 (15.78 ppb), G4 (25.4 ppb), and G3 (24.65 ppb). These results are consistent with the diurnal patterns illustrated in Fig. 3. As shown in the figure, the hours with the highest O3 concentrations were from 12 p.m. to 3 p.m. for Klang, from 1 p.m. to 4 p.m. for Perai, and from 1 p.m. to 2 p.m. for Pasir Gudang. Therefore, the Malaysian government should devise a targeted control strategy for port cities during the afternoon from 12 p.m. to 4 p.m. and during the morning rush hours to reduce the O3 precursor pollutants.

Fig. 4 Dendrogram of cluster analysis for ground-level ozone. a Klang. b Perai. c Pasir Gudang

Table 3 Summary of O3 concentrations for each cluster group

Monthly variations of O3 concentrations

The monthly variation of mean daytime and nighttime O3 concentrations during the study period is presented in Fig. 5. The highest daytime average O3 concentrations were observed between January and May in Klang and Perai, whereas the lowest concentrations were observed between June and August. These observations are consistent with the results of the regional analysis of satellite O3 data (Ernest et al. 1991; Aynsley 1999). In addition, HSE (2010) reported that the seasonal patterns of O3 concentration in China exhibited an annual maximum in March and a minimum in October. The monthly variation in Pasir Gudang exhibited a trend similar to those of Perai and Klang between March and May but behaved differently between September and November. Every year, Malaysia experiences four distinct monsoonal periods: the northeast monsoon (November to March), the southwest monsoon (June to September), the first intermonsoon (April to May), and the second intermonsoon (October to November) (Md Yusuf et al. 2010). Monthly O3 concentrations were high during January, February, March, and April (northeast monsoon and first intermonsoon). These periods are generally associated with the wet season and would be expected to have lower pollutant concentrations. However, during the northeast monsoon and first intermonsoon, UVB levels were high, ranging from 716 to 955 J/m2h, and sunlight is the essential ingredient for completing the photochemical reaction for O3 formation. At the same time, temperatures ranged from 30.6 to 33.2 °C, and temperature is strongly correlated with O3 formation. Furthermore, high concentrations of carbon monoxide were present between 8 a.m. and 10 a.m. in the three locations. CO is the principal sink of OH, the main tropospheric oxidant, which plays a key role in determining the oxidizing power of the atmosphere; its presence also accelerates the reaction, as measured by nitric oxide oxidation, and results in ozone formation. Several researchers have reported a strong correlation between CO and O3 concentrations (Ma et al. 2012; Zhang et al. 2013).

Fig. 5 Contour plots of monthly variations of ozone concentrations (2009). a Klang. b Perai. c Pasir Gudang

Association of O3 with meteorological parameters and NOx

According to the MAAQG, the prescribed limit for a 1-h averaging time of NO2 is 170 ppb. Figure 6 illustrates the box plots of NO2 and NO concentrations for the three port cities. NO2 concentrations ranged from 1 to 75 ppb for Klang, from 1 to 52 ppb for Perai, and from 1 to 62 ppb for Pasir Gudang. Klang exhibited the highest mean NO2 concentrations at 20 ppb, followed by Pasir Gudang and Perai. Furthermore, the box plot for NO2 shows occurrences of extreme values, particularly in Klang and Pasir Gudang sites. Meanwhile, the box plot for NO shows that NO concentrations ranged from 1 to 165 ppb for Klang, 1 to 114 ppb for Perai, and 1 to 173 ppb for Pasir Gudang.

Fig. 6 Box and whisker plots of NO2 and NO concentrations

The diurnal variations of the hourly average of O3 and its precursors (NO2 and NO) are illustrated in Fig. 7. O3 production was photochemically driven; thus, O3 concentrations displayed an increasing trend after sunrise and reached the maximum in the early afternoon, whereas minimum concentrations were recorded during nighttime and early morning. In the troposphere, NOx plays the most important role in the formation and destruction of O3 (Seinfeld and Pandis 2006; Alghamdi et al. 2014).

Fig. 7 Composite diurnal plots of O3, NO2, and NO concentrations

The O3 production rate increases with NOx at low NOx levels until a maximum is reached and then decreases at high NOx levels (Lin and Chen 2014). This pattern occurs because high NOx promotes the removal of OH radicals through the reaction of OH with NO2 (Gray and Schwarzenbek 2014). Meanwhile, as the sun goes down in the evening and during nighttime, the photochemical production of O3 stops, and the O3 that remains in the atmosphere is consumed by deposition (Zaatari et al. 2014) and/or by reaction with NO, which acts as a sink for O3. The decrease in O3 during the early morning hours from 6 a.m. to 8 a.m. is mainly due to the increase in traffic flow (rush hours) and fresh NO emissions.

The maximum NO2 and NO concentrations were recorded at 8 p.m. and 8 a.m. The magnitudes and trends of NO2 and NO concentrations are similar among the three cities. Ghazali et al. (2010) posited that the typical NO2 diurnal trends in Malaysia show two significant peaks, one in the early morning (9 a.m. to 10 a.m.) and one in the evening (8 p.m. to 10 p.m.), of which the second peak is lower because of emission intensity and prevailing meteorological parameters. Meanwhile, Banan et al. (2013) reported that the typical diurnal trends of NO concentrations are similar to those of NO2 concentrations, in which the relatively high concentrations observed at night and the peak concentrations in the morning are attributed to vehicle emissions.

Figure 8 shows the diurnal variation of O3 with UVB and temperature. Sunlight is the essential ingredient for completing the photochemical reaction for O3 formation. At ground level, photons with wavelengths of less than 400 nm can break NO2 molecules, forming O atoms, which later combine with O2 molecules to produce O3. Photons with wavelengths greater than 400 nm do not have sufficient energy to break NO2 molecules, whereas photons with wavelengths shorter than 280 nm are efficiently absorbed by stratospheric O3 and are prevented from entering the troposphere (Tiwary and Colls 2009). UVB radiation may vary according to location, season, time of day, and weather conditions (Lee et al. 2010). The results of this study demonstrate that O3 diurnal variations were similar to the variations of UVB and temperature. Malaysia receives sunlight from around 7 a.m. (Mohammed et al. 2013). The increase in sunlight intensity directly increases the temperature and, at the same time, promotes photochemical reactions. In the present study, the maximum UVB radiation was measured at 800 J/m2h at 2 p.m. in Pasir Gudang. At the same time, O3 and temperature reached their peaks at 29 ppb and 32.5 °C, respectively. The same trend was observed in Perai and Klang. Unlike solar radiation, temperature has an indirect effect on O3 formation. The results indicate that during daytime, temperature increases as solar radiation increases. These findings conform to the proposition of Banja et al. (2012), who concluded that temperature is strongly correlated with O3 because of light intensity. Between 7 p.m. and 7 a.m., zero UVB radiation was observed, and temperature ranged from 22 to 25 °C. At night, O3 concentrations are expected to be at their lowest because of the absence of photochemical reactions.
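The wavelength thresholds quoted above can be related to photon energy through E = hc/λ. The short sketch below converts a wavelength into energy per mole of photons using the exact SI values of the physical constants; the function name is hypothetical, and the calculation is offered only as an illustration of why shorter wavelengths are more energetic, not as a statement of the NO2 bond energy.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant, J s
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light in vacuum, m/s
AVOGADRO_PER_MOL = 6.02214076e23   # Avogadro constant, 1/mol


def photon_energy_kj_per_mol(wavelength_nm: float) -> float:
    """Energy of one mole of photons at the given wavelength, in kJ/mol."""
    wavelength_m = wavelength_nm * 1e-9
    energy_per_photon_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return energy_per_photon_j * AVOGADRO_PER_MOL / 1000.0


energy_400 = photon_energy_kj_per_mol(400.0)  # roughly 299 kJ/mol
energy_280 = photon_energy_kj_per_mol(280.0)  # roughly 427 kJ/mol
```

Photons at 280 nm therefore carry about 1.4 times the energy of photons at 400 nm, which spans the window of ground-level radiation that the text identifies as able to photolyze NO2.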

Fig. 8 Composite diurnal plots of O3 concentrations, temperature, and incoming solar radiation

Bivariate correlation and PCA

Table 4 shows the Pearson correlation matrices of the variables for Klang, Perai, and Pasir Gudang. Ozone concentrations were negatively correlated with NO, NO2, CO, PM10, and RH in the three port cities. This result was expected given that NO, NO2, and CO are known precursors of ozone, indicating that a rise in ozone concentration is associated with a drop in the levels of these pollutants. Meanwhile, PM10 is one of the most notable criteria pollutants because it can alter the photolysis rates of several trace gases. Bian and Zender (2003) claimed that high PM10 levels in ambient air can scatter solar radiation and reduce the solar radiation intensity that reaches ground level. This reduction in solar intensity suppresses photochemical reactions and diminishes O3 concentrations. The direct relationship between RH and rainfall or wet conditions contributed to the negative correlation between O3 and RH. High RH enhances O3 destruction as a result of the reduction in photochemical efficiency and the increase in wet deposition (Kovac-Andric et al. 2009; Toh et al. 2013). Furthermore, ozone concentrations were positively correlated with SO2, temperature, wind speed, and solar energy. The largest in magnitude among these factors was temperature, followed in order by solar energy and wind speed. The correlation coefficients between O3 and PM10, CO, and CO2 in the monitored cities vary in magnitude depending on changes in environmental and local conditions, including temperature and relative humidity.

Table 4 Pearson correlation matrix of O3 with precursors and meteorological parameters

However, because the signal-to-noise ratio in raw process data from environmental systems is frequently low and the environmental variables are usually highly correlated, PCA should be conducted to assess variables independently, to reduce dimensionality, and to explain the relationships among complex datasets (Kim et al. 2010). Based on spatial variations, three PCAs were constructed, one for each of the three monitoring cities. The sufficiency of the monitoring data for PCA was assessed using the Kaiser-Meyer-Olkin (KMO) and Bartlett’s tests. According to Özbay et al. (2011), these tests are applied to examine the hypothesis that the variables are uncorrelated in the population. The results of these tests are shown in Table 5. The KMO values for all cities were greater than 0.5, which indicated that the data were sufficient for PCA. Meanwhile, Bartlett’s test of sphericity showed that the selected variables were significantly (p < 0.001) related to one another and suitable for factor analysis.

Table 5 Kaiser-Meyer-Olkin (KMO) and Bartlett’s test

Before extraction using PCA, ten linear components, namely, NO2, NO, SO2, CO, PM10, O3, temperature, relative humidity, UVB, and wind speed, were selected. After extraction, two, three, and three PCs were selected for the Klang, Perai, and Pasir Gudang data, respectively. A spatial variation was observed in the loaded components of each PC and in the total variance explained by each PCA. In practice, only loadings with absolute values greater than 0.4 are selected for principal component interpretation (Abdul-Wahab et al. 2005; Ul-Saufie et al. 2013). Varimax rotation is used to gain a better understanding and interpretation of the data (Dominick et al. 2012). Relatively small factor loadings (less than 0.4) caused SO2 and RH to be suppressed from the PCA in Klang and Pasir Gudang, respectively. The retained PCs explained 63.3, 76.7, and 72.2 % of the total variation in the data for Klang, Perai, and Pasir Gudang, respectively. The number of retained PCs and the cumulative percent variance for each city are shown in Table 6. In Klang, the first PC accounted for 45.1 % of the total variation in the data (Table 6). It loaded heavily on NO2, NO, CO, and PM10 and corresponds to the general correlation between O3 and precursor pollutants. The second PC, which accounted for 18.2 % of the total variation, loaded heavily on meteorological parameters (temperature and relative humidity).

Table 6 Total variance explained by the selected variables

Furthermore, Table 7 demonstrates stronger contributions toward PC1 from meteorological parameters (47.7 and 39.5 % for Perai and Pasir Gudang, respectively). The contribution toward PC2 is stronger for pollutant parameters (NO2, NO, and CO) in Perai, accounting for 15.1 %. In Perai and Pasir Gudang, the third PC, which accounted for 13.8 and 13.5 % of the total variation, respectively, loaded heavily on SO2, PM10, and NO2. Although NO2 is known as the main precursor of O3 formation, the results show that NO2 in PC3, together with SO2, contributed 13.5 % of the O3 variation in Pasir Gudang. This uncommon result is presumably caused by the nighttime relationship between O3 and NO2 concentrations. During nighttime, NO2 acts as a sink for O3 through HNO3 production (Colls 2002).
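The cumulative percentages can be checked by adding the per-component contributions quoted in the text. The sketch below does this for Klang and Perai; the function name is hypothetical, and the inputs are the rounded percentages given above, so small discrepancies from the reported totals are to be expected.

```python
def cumulative_variance(percentages):
    """Running total of explained variance, rounded to one decimal place."""
    total = 0.0
    running = []
    for percentage in percentages:
        total += percentage
        running.append(round(total, 1))
    return running


# Per-component percentages as quoted in the text
klang_cumulative = cumulative_variance([45.1, 18.2])
perai_cumulative = cumulative_variance([47.7, 15.1, 13.8])
```

The Klang components sum to the reported 63.3 %, whereas the Perai components sum to 76.6 %, which differs from the reported 76.7 % by 0.1, presumably because the individual percentages were rounded.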

Table 7 Rotated component matrix using varimax rotation

A comparison of the three PCAs shows that the number of PCs increased from two for Klang to three for Perai and Pasir Gudang, and these three PCs explained 76.7 and 72.2 % of the variation, respectively, which was greater than the variation explained by the Klang PCs (63.3 %). Furthermore, differences between the Klang PCs and the Perai and Pasir Gudang PCs can be observed by examining how the pollutants and meteorological variables were loaded. Pollutant concentration parameters have a greater influence on O3 formation in Klang than in the other two cities. This difference is attributed to variation in the numbers of arriving and departing vessels in the three ports, among which Port Klang is the busiest in Malaysia.

The analysis is validated using 50 % of the dataset, and the results show that the differences in the percentage of total variance are very small, at 0.16, 0.01, and 0.66 % for Klang, Perai, and Pasir Gudang, respectively. The validation shows that the analysis gives consistent results even when the size of the dataset used is different.

Conclusion

This study analyzed the concentrations of O3, NO2, and NO measured in three port cities, namely, Klang, Perai, and Pasir Gudang, over 365 days in 2009. The study showed that the concentration of ozone in the three ports was still below the maximum permissible values prescribed by the MAAQG. Overall, the station at Klang exhibited the highest average concentration of O3 (20.3 ± 18.2 ppb), followed by the stations at Perai (15.4 ± 15.8 ppb) and Pasir Gudang (14.4 ± 13.1 ppb). The results of CA indicate that the diurnal cycle of ozone concentration has a midday peak (1:00 p.m. to 3:00 p.m.) and lower nighttime concentrations. The diurnal pattern of surface ozone concentration is strongly influenced by meteorological conditions and prevailing levels of precursors (NOx) and CO. This finding was also affirmed by the results of the bivariate correlation analysis and PCA, as meteorological parameters loaded heavily on the first components in Perai and Pasir Gudang. The concentrations of NO, CO, and O3 were relatively high during January to May compared with other months of the year. The highest and lowest average O3 levels were observed in February and November at 65.69 and 31.89 ppb, respectively. Overall, the data show that the ozone midday peak in the region is in many ways similar to that of other areas, although there are some significant differences due to differences in weather parameters. Further studies that include different industrial locations and daytime and nighttime ozone variations would lead to a deeper understanding of the ozone climate of the region. In addition, further investigation using an appropriate human health risk analysis is needed because of population exposure to elevated levels of ozone from January to May. Collection of VOC data, which were unavailable to this study, would also assist in data interpretation.