Introduction

Over the last several decades, heavy metal contamination associated with highways has received increasing attention because of the associated health hazards and risks (Nazzal et al. 2012). The concentrations of heavy metals and toxic elements in roadside soils and dust can provide valuable information about pollution levels in urban and industrial areas because, in most cases, such concentrations reflect the extent of the emissions of these elements from anthropogenic sources (Fergusson 1990; Harrison et al. 1981). Lead in particular is a ubiquitous environmental pollutant; its presence in soils has been extensively studied and attributed to the use of alkyl-lead compounds as antiknock additives in petrol (Gratani et al. 1992). Many studies have reported the contamination of roadside environments with various elements, especially heavy metals such as Cu, Fe, Cr, Zn, Pb, and Ni. These elements are released into the roadside environment as a result of combustion, mechanical abrasion, and normal wear and tear (Carlosena et al. 1998). Lagerwerft and Specht (1970) attribute the presence of Ni to gasoline and of Cd and Zn to tires and motor oil. Mn, like Pb, is used as a vehicle fuel additive (Loranger and Zayed 1994). Abu-Rukah (2002) noted a trend toward higher concentrations of heavy metals on streets where traffic is more likely to undergo stop-start maneuvers, such as at traffic lights.

Webster et al. (1994) applied multivariate geostatistics to provide a more objective assessment of the sources of some heavy metals in topsoil, based not only on visual inspection of the concentration maps but also on a quantitative analysis of the spatial variability of the elements and their relationships at different spatial scales. Moreover, they compared the results of principal component analysis (PCA) carried out on the concentration data with the experimental indicator variogram applied to categorical information, in order to relate the concentration of heavy metals to the geology and land use of the area.

The present study deals with important and very busy highways in and around the city of Toronto, in the province of Ontario, Canada. Specifically, major 400 series highways (401, 400, 404, and the Don Valley Parkway) within the Greater Toronto Area (GTA) are considered (Fig. 1). The objective is to assess roadside dust contamination by heavy metals in the GTA. The multivariate geostatistical approach used here helps to formulate hypotheses regarding the main sources of contamination in the roadside dust, and reveals scale-dependent variations of chemical soil properties.

Fig. 1

Map of sampling and study area in the Greater Toronto Area. Values on horizontal axis denote latitudes, and on vertical axis denote longitudes

Materials and methods

The study area

Toronto is located in southern Ontario on the northwestern shore of Lake Ontario (Fig. 1). With over 2.5 million residents, it is the fifth most populous city in North America. Its metropolitan area, with over 5 million residents, is the seventh largest urban region in North America. Toronto is at the heart of the Greater Toronto Area (GTA), and is part of a densely populated region in Southern Ontario known as the Golden Horseshoe, which is home to over 8.1 million residents—approximately 25 % of Canada’s population (Table 1; Population of Census Metropolitan areas 2006).

Table 1 Characteristics of selected highways in the Greater Toronto Area (Source: Population of Census Metropolitan Areas 2006)

The 400 series of highways, which includes the 400, 401, 404, and Don Valley Parkway (DVP) highways studied here, makes up one of the primary road networks in the south of the province and connects to numerous border crossings with the US, the busiest being the Detroit–Windsor Tunnel and the Ambassador Bridge (via Highway 401) and the Blue Water Bridge (via Highway 402). The primary highway along the southern route is Highway 401 (also called the Highway of Heroes), the busiest highway in North America and the backbone of Ontario’s road network, tourism, and economy (Thurston 1991). The highway lengths are 817.9 km for the 401, 209 km for the 400, 36.8 km for the 404, and 15 km for the Don Valley Parkway. Their daily traffic flows are 500,000 vehicles (401), 120,000–200,000 (400), 80,000–120,000 (404), and 60,000–100,000 (DVP) (Table 1), mostly trucks and microbuses (Population of Census Metropolitan Areas 2006).

Geology, terrain, soils, and land use

The bedrock geology of Ontario is variable in lithology, structure, and age, although approximately 61 % of the province is underlain by Precambrian rock of the Canadian Shield (Thurston 1991; Fig. 2). The Shield is composed mainly of felsic intrusive rocks forming the rocky Severn and Abitibi uplands (Bostock 1970).

Fig. 2

Surficial geology of Ontario (Adapted from Forest Landscape Ecology Program 1996)

To the north of the Shield, in the area generally referred to as the Hudson Bay lowlands, the bedrock is composed of carbonate sedimentary formations. The clastic and marine carbonate bedrock of southern Ontario is interrupted by the Frontenac Axis, a southern extension of the Shield, which intersects the St. Lawrence Seaway east of Kingston. The Frontenac Axis has different forest cover and land use patterns than areas to either the west or east, due to its uneven terrain and shallow acidic soils, both characteristic of the Canadian Shield (Fig. 2).

Soils are formed by the physical and chemical weathering of bedrock and glacial parent material, and are continually modified and shifted by water, wind, and gravity. Where glacial action has scoured away overlying deposits, the soils of Ontario closely reflect the underlying bedrock. Other soils reflect the tills and other morainic and lacustrine materials deposited by advancing and retreating ice sheets and their meltwater. The Canadian System of Soil Classification (Agriculture Canada 1987) is a standard series of orders and component great groups by which soils can be identified and described. Six of the soil orders in this classification are predominant in Ontario: organic and related organic cryosolic soils in northern parts of the province, brunisols in the northwest part of the Shield and south of the Shield, podzols over much of the central and southern Shield, luvisols in the Claybelt and over much of southern Ontario, and gleysols in poorly drained areas and in the Claybelt lacustrine deposits. Regosolic soils are dominant only in a thin band along the southwest shore of Hudson Bay. Figure 3 illustrates the soil orders and great groups that occur most extensively in Ontario, based on the composition within the Soil Landscapes of Canada mapping units (Agriculture and Agri-Food Canada 1996).

Fig. 3

Dominant soil orders and great groups in Ontario, based on the Soil Landscapes of Canada units (Data from Agriculture and Agri-Food Canada 1996)

From the early 1990s to the early 2000s, the total area of settlement and developed land in the GTA increased by 513 km². Meanwhile, the areas of agricultural land and naturally vegetated land decreased by 114 and 423 km², respectively. As is evident in Fig. 4, most of the land use/cover changes, like the population changes, occurred at the urban–rural fringe within the northern portion of the GTA. In this area, agricultural lands and naturally vegetated lands have been converted to new settlement or development areas.

Fig. 4

Land use maps of the Greater Toronto Area for the 2000s (Source: Chen 2010)

Collection of samples and geochemical analyses

A total of 42 road dust samples were collected from four highways in the Greater Toronto Area (GTA), Ontario, Canada: 401, 400, 404, and the Don Valley Parkway (Fig. 1). The samples were stored in self-sealing polyethylene bags, carefully labeled, and taken to the laboratory. The sampling sites were located at distances of 0–3 m from the roadside. In the present study, the contamination of sediments with particle sizes below 2 μm was investigated. This fraction was separated using the pipette method (Gee and Bauder 1986), in which aliquots are pipetted at different times and depths from a suspension of the sample in a measuring cylinder. The pipetted suspension is condensed and dried, and the mass ratio of the pipetted fraction is determined by weighing. Then, 0.5 g of the pipetted fraction was digested using 4 ml of HNO3 (65 %), 2 ml of HF (40 %), and 4 ml of HClO4 (70 %). The digested solutions were analyzed with an atomic absorption spectrometer (PYE UNICAM SP9) for lead (Pb), zinc (Zn), cadmium (Cd), nickel (Ni), chromium (Cr), copper (Cu), manganese (Mn), and iron (Fe). For quality control, all sediment samples were analyzed in triplicate and mean values were calculated. In addition, analytical blanks were run in the same way as the samples, and concentrations were determined using standard solutions prepared in the same acid matrix, to monitor the possibility of sample contamination during digestion and subsequent analysis. The absorption wavelengths and detection limits, respectively, were as follows: 228.8 nm and 0.0006 ppm for Cd; 240.7 nm and 0.007 ppm for Co; 324.7 nm and 0.003 ppm for Cu; 248.3 nm and 0.005 ppm for Fe; 279.5 nm and 0.003 ppm for Mn; 232.0 nm and 0.008 ppm for Ni; 217.0 nm and 0.02 ppm for Pb; and 213.9 nm and 0.002 ppm for Zn.

The accuracy of the atomic absorption spectrometer measurements was assessed by analyzing the standard reference material NIST SRM 1646. The statistical parameters were calculated using the SPSS (Statistical Package for the Social Sciences) software package.

PCA is used to reduce a large number of variable parameters (identified in water samples) to a small number of principal components (Güler et al. 2002; Astel et al. 2008). In essence, PCA linearly combines two or more correlated variables into one. Varimax normalized rotation was applied to the principal components to reduce the contribution of variables of minor significance, and only factors with eigenvalues greater than one were retained for consideration.

Hierarchical cluster analysis (HCA) (Güler et al. 2002; Astel et al. 2008) was used to determine whether the selected metals can be grouped into statistically distinct groups (clusters). Ward’s method was used as the amalgamation rule to obtain the hierarchical associations. The data were standardized (z scores) and the Euclidean distance was used as the similarity measure. Classification results of the HCA are generally presented in a graphical form called a dendrogram.
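The standardization, Euclidean distance, and Ward's amalgamation rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the study itself used SPSS, the function names (`zscores`, `ward_cost`, `ward_merges`) are hypothetical, and the concentrations are invented values, not the measured data. Ward's criterion is expressed here through the squared Euclidean distance between cluster centroids.

```python
import math
from statistics import mean, stdev


def zscores(values):
    """Standardize one variable to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    centroid_a = [mean(col) for col in zip(*a)]
    centroid_b = [mean(col) for col in zip(*b)]
    # Squared Euclidean distance between the two centroids.
    dist2 = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def ward_merges(vectors):
    """Agglomerate variables with Ward's criterion and return the merge history."""
    clusters = [((name,), [vec]) for name, vec in vectors.items()]
    merges = []
    while len(clusters) > 1:
        # Evaluate every pair of clusters and pick the cheapest merge.
        pairs = [
            (ward_cost(clusters[i][1], clusters[j][1]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        cost, i, j = min(pairs)
        merges.append((clusters[i][0], clusters[j][0], cost))
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical concentrations (ppm) at five sampling sites; NOT the study's data.
raw = {
    "Fe": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Mn": [2.0, 4.0, 6.0, 8.0, 10.5],
    "Cd": [5.0, 4.0, 3.0, 2.0, 1.0],
    "Zn": [10.0, 8.5, 6.0, 4.0, 2.0],
}
standardized = {metal: zscores(vals) for metal, vals in raw.items()}
merges = ward_merges(standardized)
for left, right, cost in merges:
    print(left, "+", right, f"cost={cost:.3f}")
```

Reading the merge history from first to last gives the order in which branches join in a dendrogram. With these invented values, the metals that vary together across sites are joined first, and the two resulting pairs are joined last.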

Results and discussion

Metals concentrations

Minimum and maximum concentrations, mean values, and standard deviations for each of the analyzed metals are presented in Table 2. In general, the concentrations of the various metals vary widely among the studied highways. The mean concentrations for the road dust samples are higher than their background values, suggesting that the presence of these metals in road dust around the GTA is influenced by heavy traffic flow, which is thought to be the main cause of the elevated heavy metal concentrations, although the mineralogical and chemical composition of the parent material and the different types of land use could be additional causes. Some of the elements examined are discussed in detail below.

Table 2 Statistical summary of heavy metal levels (in ppm) for the collected roadside dusts

Lead

The critical concentration of lead in soils is between 100 and 400 ppm, and the global lead concentration measured in surface soils is estimated as 25 ppm (Table 3). In the study area, the concentration of lead ranges between 32 and 378 ppm; this range is well above the corresponding value for world soils. Almost all locations show higher values than the average world soil (Alloway 1990; Table 3). The highest values are found along Highway 401 at location 2; the Don Valley Parkway exhibits the next highest values, especially at locations 36 and 38. The actual concentrations of lead along the highway sides are variable and depend on various factors, such as site, traffic factors, prevailing wind, and humidity (Audat 2000; Jiries et al. 2001).

Table 3 Average crustal abundance and average world soil concentrations (ppm)

Zinc

The concentration of zinc in the investigated highways ranges from 39 to 394 ppm (Table 2). Most of the locations exhibit higher values than the average world soil value of 90 ppm (Alloway 1990; Table 3). Location 13 along Highway 400 has the highest value of 394 ppm, followed by locations 29 and 30 along the Don Valley Parkway, which have concentrations of 340 and 342 ppm, respectively. According to Ellis and Revitt (1982) and Jiries et al. (2001), zinc may be derived from mechanical abrasion and oil leaks from vehicles, so the high concentrations in the studied highways are likely related to heavy traffic.

Cadmium

The average natural abundance of cadmium in the earth’s crust is 0.2 ppm (Wedephol 1978), but much higher and much lower values have also been cited depending on a large number of factors (Howari et al. 2004). The concentration of cadmium in the investigated samples along the selected highways ranges from 0.046 to 0.050 ppm, which is less than the concentration in average world soil as shown in Table 4 (Alloway 1990; Table 3). Location 1 along Highway 401 has the highest value of the investigated samples, at 0.0540 ppm. The presence of cadmium is believed to be associated with tire wear, as indicated by Ellis and Revitt (1982).

Table 4 Average heavy metal concentration (in ppm) in urban soils from different cities in the world

Calcium

The normal concentration in average world soil is 2,200 ppm (Alloway 1990; Table 3). The concentration in the studied roadside dust samples from the selected highways within the GTA ranges from 14,476 to 353,520 ppm. Location 25 along Highway 404 has the highest value at 353,520 ppm, followed by location 22 along Highway 404 at 215,575 ppm.

The statistics calculated for the data sets provide information about the frequency distribution of the concentrations of chemical elements in the roadside dust; the results are summarized and compared with some reference values in Table 2. Comparison of the data shows that the average concentrations of the investigated heavy metals in the analyzed samples are higher than the corresponding values for average world soils (Table 4).

Multivariate geostatistics

Pearson’s correlation coefficients of heavy metals in roadside dust of the Greater Toronto Area (GTA) are listed in Table 5. A correlation is described as statistically significant when the probability is less than 0.05 (p < 0.05) and as highly statistically significant when it is less than 0.01 (p < 0.01); when p > 0.05, the coefficient of correlation is not significant. Positive correlations are observed between Cr–Cd, Mn–Fe, and Fe–Cu, and negative correlations between Zn–Cd, Mn–Cd, Zn–Cr, Pb–Zn, and Ni–Zn, but few of them are significant at the 95 and 99 % confidence levels. Cd, Cu, Fe, Pb, Mn, and Ni are significantly positively correlated with each other, which may suggest a common origin, such as traffic flow or industrial activities. In addition, Pb exhibits a very weak positive correlation with Cd, Cu, and Fe, while Ni exhibits a very weak positive correlation with Cd, Cu, Fe, and Mn. The source of Pb may be the heavy traffic in the study area. Nickel occurs naturally in the earth’s crust and is found in all soils; in the environment, Ni is primarily found combined with oxygen or sulfur as oxides or sulfides. In the study area, Ni is attributed to natural sources, oil burning, and traffic flow. The observed correlations among the various elements can be explained by the geochemical behavior of the chemical species and/or possible anthropogenic inputs to the road dust samples.
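To make the significance thresholds concrete, the sketch below converts the two-tailed critical t values for 40 degrees of freedom (n = 42 samples) into the smallest absolute correlation coefficient that would count as significant. The critical t values are taken from standard t tables; the function names and the three-way labelling are illustrative assumptions, not part of the SPSS output used in the study.

```python
import math

N_SAMPLES = 42        # number of road dust samples (Table 5)
T_CRIT_05 = 2.021     # two-tailed critical t, df = 40, alpha = 0.05 (standard tables)
T_CRIT_01 = 2.704     # two-tailed critical t, df = 40, alpha = 0.01 (standard tables)


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_crit, n):
    """Smallest |r| that is significant for n samples, from t = r*sqrt(df/(1-r^2))."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


def classify(r, n=N_SAMPLES):
    """Label a coefficient using the 5 % and 1 % thresholds (illustrative helper)."""
    if abs(r) >= critical_r(T_CRIT_01, n):
        return "highly significant"
    if abs(r) >= critical_r(T_CRIT_05, n):
        return "significant"
    return "not significant"


r_05 = critical_r(T_CRIT_05, N_SAMPLES)
r_01 = critical_r(T_CRIT_01, N_SAMPLES)
print(f"|r| needed at 95 %: {r_05:.3f}; at 99 %: {r_01:.3f}")
```

Under these assumptions, a coefficient needs an absolute value of roughly 0.30 to be significant at the 95 % level and roughly 0.39 at the 99 % level, which gives a quick way to read a correlation table for 42 samples.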

Table 5 Pearson’s correlation coefficients between heavy metals in the road dust samples (n = 42)

PCA is applied to assist in identifying the sources of pollutants. By extracting the eigenvalues and eigenvectors from the correlation matrix, the number of significant factors and the percentage of variance explained by each of them are calculated using the software package SPSS 15. Table 6 displays the three components. The first component explains approximately 42.69 % of the total variance and is loaded heavily with Cu, Fe, and Mn. The source of this component may be industrial activity and traffic, which is consistent with the presence of various metal processing industries in the area in addition to the traffic flow. Component 2 is loaded with Cr, Pb, and Ni and accounts for 16.39 % of the total variance. Its source could be the lithology of the area together with traffic. Component 3 is correlated very strongly with Cd, which has a high loading value (0.97), and explains 14.38 % of the total variance. The source of this factor may be mainly traffic, especially trucks and cars, as well as the agricultural and cultivated lands alongside the studied highways.
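The link between the reported percentages and the eigenvalue criterion can be checked with simple arithmetic. For a PCA on a correlation matrix, the eigenvalues sum to the number of variables, so each eigenvalue equals the explained fraction times that number. The sketch below assumes that all eight analyzed metals entered the PCA; this count is an assumption, and the resulting eigenvalues are back-calculated illustrations rather than values taken from Table 6.

```python
# Assumption: eight variables (Pb, Zn, Cd, Ni, Cr, Cu, Mn, Fe) entered the PCA.
N_VARIABLES = 8

# Percentages of total variance reported in the text for the three components.
explained_pct = {"PC1": 42.69, "PC2": 16.39, "PC3": 14.38}


def eigenvalue_from_percent(pct, n_variables):
    """Eigenvalue implied by a percentage of variance for a correlation-matrix PCA."""
    return pct / 100.0 * n_variables


eigenvalues = {
    pc: eigenvalue_from_percent(pct, N_VARIABLES) for pc, pct in explained_pct.items()
}
cumulative_pct = sum(explained_pct.values())

# Components that would pass the eigenvalue-greater-than-one rule.
retained = [pc for pc, ev in eigenvalues.items() if ev > 1.0]

for pc, ev in eigenvalues.items():
    print(f"{pc}: eigenvalue ~ {ev:.2f}")
print(f"cumulative variance explained: {cumulative_pct:.2f} %")
```

Under this assumption the three components together explain about 73.46 % of the total variance, and each implied eigenvalue exceeds one.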

Table 6 Total variance explained and rotated component matrices for heavy metals

Before the cluster analysis is performed, the variables are standardized by means of z-scores; Euclidean distances are then calculated as the measure of similarity between the variables. Finally, hierarchical clustering is carried out by applying Ward’s method to the standardized data set. The results of the cluster analysis for the variables are shown in Fig. 5 as a dendrogram. The dendrogram has two larger subgroups: the first contains the variables Fe, Mn, Cu, Cr, Ni, and Pb, and the second includes Cd and Zn. The correlation coefficients in Table 5 were calculated for the chemical parameters using the weighted-pair group average based on Pearson’s coefficient, a method that is appropriate for revealing correlations. Within subgroup 1, Fe and Mn are very well correlated with each other and form a further cluster with Cu, while Cr is associated with Ni and, at a later stage, with Pb (Fig. 5). The results confirm the attribution of the metals to the three components defined by the PCA. For subgroup 2, Fe, Mn, and Cu are again associated in one cluster, while the second cluster is formed by Cr and Pb; Zn is isolated and joins the Cr–Ni and Cd–Zn cluster later. Once more, the results are in good agreement with the findings of the PCA (Fig. 5).

Fig. 5

Cluster analysis results

The t test is used to assess whether the mean heavy metal concentrations along Highway 401 are statistically different from those along Highways 400 and 404. The results show that the mean heavy metal levels along these three highways are not statistically different at the 95 % confidence level: the p values are greater than 0.05 and the t statistics are smaller than the critical t values (Tables 7, 8). The results of the t test indicate that the sources of metal pollution along these highways are similar.
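The decision rule described above, comparing the t statistic with a critical value, can be sketched as follows. The text does not state which variant of the test was run, so the pooled-variance Student's t test is an assumption here, and the two sets of concentrations are invented for illustration; they are not values from Tables 7 or 8.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df


# Hypothetical Pb concentrations (ppm) along two highways; NOT the study's data.
highway_a = [110.0, 120.0, 130.0, 140.0, 150.0]
highway_b = [120.0, 130.0, 140.0, 150.0, 160.0]

t_stat, df = pooled_t(highway_a, highway_b)

# Two-tailed critical t for df = 8 at the 95 % level, from standard t tables.
T_CRIT_DF8 = 2.306
means_differ = abs(t_stat) > T_CRIT_DF8

print(f"t = {t_stat:.2f}, df = {df}, significant difference: {means_differ}")
```

With these invented values the absolute t statistic is 1.0, below the critical value of 2.306, so the two means would be treated as not statistically different, mirroring the kind of conclusion drawn in the text.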

Table 7 Heavy metal comparison in the road dust samples of 400–401 highways using Student’s t test
Table 8 Heavy metal comparison in the road dust samples of 404–401 highways using Student’s t test

Several tests of significance are available for discriminant analysis, but only Wilks’ lambda is presented here. The smaller the lambda for an independent variable, the more that variable contributes to the discriminant function. Lambda varies from 0 to 1, with 0 meaning that the group means differ (thus the more the variable differentiates the groups) and 1 meaning that all group means are the same. The structure matrix table in SPSS shows the correlations of each variable with each discriminant function. These simple Pearson correlations are called structure coefficients, structure correlations, or discriminant loadings. When the dependent variable has more than two categories, there will be more than one discriminant function. The correlations then serve like factor loadings in factor analysis; that is, by identifying the largest absolute correlations associated with each discriminant function, the researcher gains insight into how to name each function.
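For a single variable, Wilks' lambda reduces to the ratio of the within-group sum of squares to the total sum of squares, which makes the 0-to-1 interpretation above easy to verify. The sketch below uses this univariate form with invented numbers; the function name `wilks_lambda` is hypothetical, and the multivariate statistic reported by SPSS is computed from determinants of the corresponding matrices rather than from this scalar ratio.

```python
def wilks_lambda(groups):
    """Univariate Wilks' lambda: within-group sum of squares over total sum of squares."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_within += sum((v - group_mean) ** 2 for v in group)
    return ss_within / ss_total


# Invented values for one variable measured in two groups (not the study's data).
same_means = wilks_lambda([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])      # group means equal
separated = wilks_lambda([[1.0, 1.0, 1.0], [5.0, 5.0, 5.0]])       # no within-group spread
intermediate = wilks_lambda([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])    # partly separated

print(same_means, separated, round(intermediate, 3))
```

Identical group means give a lambda of 1, groups that differ only between themselves give 0, and partly separated groups fall in between, so a smaller lambda marks a variable that discriminates the groups better.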

Once divergent chemical signatures are determined and associated with a contamination event, an objective is to determine the chemical or chemicals responsible for the divergent signature. Simple statistics (mean, min., and max.) for each chemical within each observed signature are relevant diagnostics (Anderson et al. 2009). Figure 6, a plot of the observations in the space of the discriminant functions, shows that the samples from the selected highways in the study area form four clearly separated groups, each with its own centroid. Based on the results from Tables 7, 8 and 9, Cu, Fe, and Zn are identified as the elements of the highest importance for discriminating among the four highways. For these, the lowest p values are observed (0.07). The next steps in the discriminant analysis are carried out for the elements exhibiting the highest significance in the discrimination process; the results (Table 6) show that Cu, Fe, Cd, Mn, and Pb have the highest values (0.70, 0.79, 0.76, 0.97, and 0.67, respectively).

Fig. 6

The plot of observations in the space of discriminant variables

Table 9 Characterizations of canonical functions

The primary value of discriminant analysis is its ability to determine contaminants of concern among a suite of measured chemicals, along with the most important differences between site-related and reference subsets. From the quantitative and qualitative assessments performed here for chemical contaminants, most of the elements analyzed are demonstrated to be contaminants in the study area.

Conclusions

Elevated total heavy metal concentrations are found in roadside dust and soils along selected major highways in the Greater Toronto Area. The concentrations are higher than the maximum concentrations of the corresponding elements in the average world soil. The assessment of pollution along the GTA highways reveals some significant environmental situations, in which increased heavy metal concentrations result from various processes acting at different spatial scales. The variation in the metal concentrations in the roadside dust and soils has both natural and anthropogenic origins.

The correlation coefficients show positive correlations for most of the heavy metal pairs, except Zn–Cd, Mn–Cd, Zn–Cr, Pb–Zn, and Ni–Zn. PCA yields three components with eigenvalues greater than 1, and suggests that the contamination sources are related to processing industries, traffic, and soils in the area. Cluster analysis identifies two larger subgroups. The multivariate geostatistical techniques applied support environmental studies by helping to distinguish between geological and anthropogenic causes of pollution, and by allowing hypotheses to be formulated on the probable sources of pollution.