1 Introduction

Metals in marine environments have both natural and anthropogenic origins. Since metals are not subject to bacterial degradation, they are an essentially permanent addition to the sea and, as a consequence, accumulate in the sediments (Clark 2001).

Assessing the input of metals to the environment as a result of human activity is complicated by the very large natural input (e.g. erosion of ore-bearing rocks, wind-blown dust, volcanic activity and forest fires; Clark 2001). Factors that influence metal concentrations in sediments are related to: (1) mineralogical and granulometric composition; (2) redox state of the metal; (3) pH and Eh of the environment; (4) adsorption and surface precipitation processes (Sadiq 1992).

Since metals participate in various biogeochemical mechanisms (Lee et al. 2006), they are highly mobile and can be bio-accumulated by marine organisms such as planktonic or benthic foraminifera (Samir and El-Din 2001), algae, and plants (Tranchina et al. 2005). They can thus be bio-magnified up the food chain, with the top predators receiving the largest dose of conservative substances (Clark 2001), sometimes with harmful effects for human beings (e.g. Hg poisoning in Minamata disease; see Sindermann 2006).

Industrial and human activities are increasingly concentrated near coastal areas and, as a result, metal concentrations in the coastal marine environment have increased in recent years. Constant monitoring of pollutant concentrations (including metals) in sediments is needed to evaluate the condition of the marine coastal environment. Many authors have recently focused their attention on sediments from industrialized coastal areas, finding that metal concentrations in sediments are strongly influenced by human activities and that a study of metal distributions in sediments can be a good tool to evaluate the degree of marine environmental pollution (Zonta et al. 1994; Bellucci et al. 2002; Di Leonardo et al. 2006).

In this work we present Cr, Cu, Hg, Pb and Zn concentrations measured in the mud fraction of 30 sediment samples collected in the Palermo Gulf, an area for which no systematic study is available in the literature. This bay is an area where both low and high impact human activities (such as a harbour area with dockyards) are present. Enrichment factors for the fine sediment fraction (Galloway 1979) have been calculated for each metal to evaluate whether the metal concentrations can be considered as part of the natural background or not. Extensive statistical analysis using the ratio matching technique (Anders 1972; Poulton 1989; Villaescusa-Celaya et al. 2000), hierarchical clustering, minimum spanning tree and principal component analysis has been performed on the analytical data to assess the geographical distribution of pollutants and their relationships (Villaescusa-Celaya et al. 2000; Ruiz 2001; Guillen et al. 2004).

2 Methods

2.1 Environmental and Geological Settings

Sampling area and sites are depicted in Fig. 1.

Fig. 1

Location of the sampling stations. Please note that label GP9 (GP10) is a common label for the three sampling sites on its left

The Gulf of Palermo is delimited by the Monte Pellegrino promontory on the Western side (prevalently constituted by calcareous limestones and calcareous breccias) and by Monte Catalfano on the Eastern side, dominated by dolomitic rocks (Abate et al. 1978). Four small rivers flow into the Gulf of Palermo: the Kemonia, the Papireto, the Oreto and the Eleuterio rivers.

Kemonia and Papireto, characterized by a limited flow rate, were merged, canalized and buried together with city sewage and presently flow into the eastern part of the harbor of Palermo named The Cala.

The Oreto river flows into the middle part of the Gulf and its waters carry clays coming from upper Triassic, clays and quartz from Numidian Flysch, quartz and clays from calcarenites and claystones of lower-middle Pleistocene. These waters are mixed, in the last 2 km of the river, with industrial sewage.

The Eleuterio River flows into the eastern part of the Gulf, and several domestic sewage outfalls discharge into its waters. This river flows over clays and silty clays of the Numidian Flysch, and clays and calcarenites of the lower-middle Pleistocene. The waters carrying domestic sewage are usually discharged into the gulf without any purification treatment. An important oil collector is also present in the middle part of the gulf (Bandita area). Palermo also has an important harbor area in which dockyards for ship building and repair are present.

The Modified Atlantic Water (MAW) stream runs along the northern coast of Sicily from west to east; this stream distributes the pollutants from the Palermo harbor area all the way up to Cape Mongerbino, on the eastern side. As a consequence, the distribution of metal pollutants depends on both the location of pollution sources and the direction of the marine current. It is therefore reasonable to assume that at least two areas with relevant differences in metal concentrations should exist. One area, from Cape Priola up to the area located north of the harbor, should be characterized by low anthropogenic impact, while the remaining part of the gulf should be characterized by higher metal concentrations.

2.2 Sampling Methods

Locations of sampling sites were determined with a Garmin® 12-channel GPS (Global Positioning System) receiver. A total of 24 transects (following the coast line) were investigated in the Gulf of Palermo (samples with GP label) (Fig. 1). Each sampling point is marked by the GP label, followed by two numbers: the first one (from 1 to 23) indicates the transect, the second one (from 1 to 3) indicates the sample bathymetry (1 corresponding to 10 m, and so on). Sampling stations were chosen to obtain good area coverage (see Fig. 1). A total of 30 sediment samples were collected and studied. Samples were collected from undisturbed soft-bottom marine sediment, using a 4-kg Van Veen grab (Bergin et al. 2006). About 1 kg of sediment was taken from each grab, and only the upper part of the sediment (0–5 cm) was collected; samples were stored in polyethylene bottles. Sediment samples were dried in the laboratory at 55°C to constant weight and stored in hermetically closed polyethylene bottles until analysis.

Sampling sites in the background area of the Termini Gulf (labelled GT) were chosen randomly.

2.3 Sample Treatment and Metal Analysis

The effects of grain size on metal concentrations in sediments are well documented in the literature (Rubio et al. 2000). Since our samples are characterized by very different grain sizes (see Table 1), we chose to analyze the mud fraction (Calmano and Wellerhaus 1982; Villaescusa-Celaya et al. 2000). The <63 μm fraction is the one most commonly used to compare metal concentrations in sediments from different areas (see Villaescusa-Celaya et al. 2000 and references therein). Dried sediment samples were wet sieved with distilled water on a 63 μm nylon sieve to obtain the mud fraction (<63 μm). Sieving therefore accomplished two tasks: first, it normalized the data, so that sites with sediments of different grain size composition could be compared; second, it highlighted the influence of the particles that are accepted by benthic organisms (Langston et al. 1999). The <63 μm sediment fraction was dried at 55°C for 48 h and then stored in hermetically closed polyethylene bags until measured.

Table 1 Grain size percentages of sediments collected in the Gulf of Palermo

Flame atomic absorption spectrophotometry (FAAS) analysis of “pseudo-total” metal contents was used as the main analytical technique. A quantity of 1,000 mg of sediment was digested in an open cavity microwave system (CEM Star system 2) using the following procedure: 20 ml of HNO3 65% and 10 ml of H2SO4 96% were first added to the 1,000 mg sample aliquot and heated for 5 min at 75°C. The temperature was then raised to 85°C and kept at this value for 10 min. Afterwards the temperature was raised to 95°C for 10 min, and then kept at 106°C for 7 min, at 115°C for 15 min, at 120°C for 10 min and at the mixture boiling point for 15 min (Man et al. 2004). Then, 15 ml of H2O2 30% were added and the last four temperature steps described above were repeated. All reagents were of Merck® Suprapure® grade. The term “pseudo-total” (Manta et al. 2002) reflects the incomplete dissolution of silicates at the end of the process (Cook et al. 1997). Digested samples were cooled, filtered through 0.45-μm pores, and diluted to 100 ml with water (18 MΩ cm, Smeg® WP4100/A10).

Cr, Cu, Fe, Pb and Zn concentrations were measured with a Varian® SpectrAA 220 FS flame atomic absorption spectrophotometer equipped with a deuterium background corrector. Cr, Cu and Pb were measured using the standard addition method. Zn was measured against a calibration curve obtained from external standards, after diluting the samples 1 to 11 for analysis. Hg was measured using the Varian® SpectrAA 220 FS coupled with a continuous flow vapor generator (Varian® VGA-77), with a SnCl2 solution as the reducing agent to release Hg vapor from the sample solutions. Working solutions of metals were prepared from 1,000 mg l−1 (Merck®) standard solutions of each metal. All samples were analyzed in duplicate. All glassware was soaked overnight in 10% HNO3 solution and then rinsed with distilled and deionized water.

The National Research Council Canada PACS-2 (marine sediment) was used as certified reference material to test the repeatability of the measurements and to evaluate their precision and accuracy (Table 2). For some metals (Cr and Pb) recovery was not complete because the digestion method used did not destroy silicates.

Table 2 Results, expressed as mean ± standard deviation, for ten measurements performed on certified reference material

In order to determine whether the metal concentrations measured in the fine fraction of the Palermo Gulf sediments are the result of external factors (i.e. anthropogenic input sources), we calculated the enrichment factor (Ef) for each metal using the following equation (Galloway 1979; Villaescusa-Celaya et al. 2000):

$$E_{\text{f}} = \left( \frac{[M]_m}{[M]_{\text{ref}}} - 1 \right) \times 100$$
(1)

where [M]m is the metal concentration in the sample and [M]ref is the average concentration of the same metal at sites from a reference area, assumed to be the background value. According to this equation, a value near zero or negative indicates no metal enrichment in the sample, while high positive values indicate metal enrichment in the sample with respect to the control area.

The reference concentrations of the metals for the area of study were obtained from the mean values of selected metals in the control area (GT; see Fig. 1).
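As an illustration of Eq. 1, the calculation can be sketched in a few lines of Python. The function name and the concentrations below are hypothetical and are not data from this study; the sketch only assumes that the background value is the arithmetic mean of the reference-area measurements, as stated above.

```python
from statistics import mean

def enrichment_factor(sample_conc, reference_concs):
    """Ef (%) of one sample relative to the mean of the reference-area values (Eq. 1)."""
    background = mean(reference_concs)
    if background <= 0:
        raise ValueError("reference concentration must be positive")
    return (sample_conc / background - 1.0) * 100.0

# Hypothetical concentrations in ug/g, for illustration only
reference_zn = [40.0, 50.0, 60.0]               # reference-area sites
print(enrichment_factor(350.0, reference_zn))   # sample well above background
print(enrichment_factor(45.0, reference_zn))    # sample slightly below background
```

A sample below the background mean gives a negative Ef, which is read as no enrichment.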

2.4 Statistical Analysis

Extensive statistical analysis has been performed on the metal data.

First, principal component analysis (see, for instance, Murtagh 2000) was used on the raw data in order to identify relevant contributions to metal concentration variability in sampling sites.

Next, in order to assess the geographical distribution of our results, a statistical methodology, known as ratio-matching, was used. This method allows the sorting of samples on the basis of their perturbation caused by a possible source of contaminants, and is based on the assumption of similar behaviour of the metal ratios in samples of common origin, regardless of the dilution by inert materials. This technique was developed in the early seventies (Anders 1972), and later modified (Poulton 1989), and it has been used in the evaluation of heavy metals in the fine fraction of coastal sediments (Villaescusa-Celaya et al. 2000). This technique serves to build a similarity matrix on which clustering analysis will be performed (see, for instance, Anderberg 1973).

Assuming that m analytical parameters have been measured for each of the n samples, n triangular X matrices are built with elements:

$$X_{ij}^\alpha = \frac{{C_i^\alpha }}{{C_j^\alpha }}$$
(2)

where \(C_i^\alpha \) is the concentration of parameter i in the sample α.

Next, the

$$\left( {\begin{array}{*{20}c} n \\ 2 \\ \end{array} } \right) = \frac{{n!}}{{2!(n - 2)!}}$$
(3)

triangular Y matrices are built from the ratios:

$$Y_{ij}^{\alpha \beta } = \frac{{X_{ij}^\alpha }}{{X_{ij}^\beta }}$$
(4)
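As a worked count, assuming that all 30 Palermo Gulf samples and the five metals (Cr, Cu, Hg, Pb and Zn) enter the analysis (n = 30 and m = 5, a choice made here for illustration), the number of sample pairs, and hence of Y matrices, and the number of independent metal ratios in each triangular matrix are:

$$\binom{30}{2} = \frac{30 \cdot 29}{2} = 435, \qquad \binom{5}{2} = \frac{5 \cdot 4}{2} = 10$$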

From the Y matrices a similarity Z matrix is then calculated. In the original paper (Anders 1972) the elements of the Z matrix were calculated as the proportion of the elements \(Y_{ij}^{\alpha \beta } \) meeting the following “matching criterion”:

$$\frac{1}{M} \leqslant Y_{ij}^{\alpha \beta } \leqslant M$$
(5)

where M is a parameter, of the order of unity, to be chosen on the basis of quite arbitrary considerations.
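The matching criterion of Eq. 5 can be sketched as follows. This is a minimal illustration, not the original implementation: the function name is hypothetical, the default M = 1.5 is an arbitrary value chosen here, and all concentrations are assumed to be strictly positive.

```python
from itertools import combinations

def anders_similarity(conc_a, conc_b, M=1.5):
    """Fraction of element pairs (i < j) whose ratio Y_ij satisfies 1/M <= Y_ij <= M."""
    pairs = list(combinations(range(len(conc_a)), 2))
    matches = 0
    for i, j in pairs:
        # X_ij for each sample (Eq. 2), then their ratio Y_ij (Eq. 4)
        y = (conc_a[i] / conc_a[j]) / (conc_b[i] / conc_b[j])
        if 1.0 / M <= y <= M:
            matches += 1
    return matches / len(pairs)

# Hypothetical samples: the second is the first diluted by a factor of 2
print(anders_similarity([10.0, 20.0, 40.0], [5.0, 10.0, 20.0]))
```

Because only ratios between elements are compared, uniform dilution of a sample by inert material leaves the result unchanged.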

Here we follow the modifications proposed by Poulton (1989) and calculate the elements of the symmetric n × n similarity Z matrix as:

$$Z_{\alpha \beta } = \left( {\begin{array}{*{20}c} m \\ 2 \\ \end{array} } \right)^{ - 1} \sum\limits_{i = 1}^m {\sum\limits_{j = i + 1}^m {\frac{1}{{\left| {\ln \left( {Y_{ij}^{\alpha \beta } } \right)} \right| + 1}}} } $$
(6)

The elements of the Z matrix range between 0 and 1, 0 for totally different samples and 1 for equal samples (when all the ratios in the Y matrices are equal to 1 the logarithmic term disappears and the summation simply gives the binomial coefficient). Clustering analysis can now be performed on the above defined Z similarity matrix in order to classify sampling sites.
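A minimal sketch of Eqs. 2, 4 and 6 is given below. Function names and sample values are hypothetical, and strictly positive concentrations are assumed so that all ratios and logarithms are defined.

```python
import math
from itertools import combinations

def poulton_similarity(conc_a, conc_b):
    """Z for one pair of samples (Eq. 6): mean over i < j of 1 / (|ln Y_ij| + 1)."""
    pairs = list(combinations(range(len(conc_a)), 2))
    total = 0.0
    for i, j in pairs:
        x_a = conc_a[i] / conc_a[j]          # Eq. 2, sample alpha
        x_b = conc_b[i] / conc_b[j]          # Eq. 2, sample beta
        total += 1.0 / (abs(math.log(x_a / x_b)) + 1.0)   # Eq. 4 inside Eq. 6
    return total / len(pairs)

def similarity_matrix(samples):
    """Symmetric n x n matrix Z with ones on the diagonal."""
    n = len(samples)
    z = [[1.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        z[a][b] = z[b][a] = poulton_similarity(samples[a], samples[b])
    return z

# Hypothetical samples (three parameters each), for illustration only
samples = [[10.0, 20.0, 40.0], [5.0, 10.0, 20.0], [10.0, 20.0, 400.0]]
for row in similarity_matrix(samples):
    print(row)
```

In this example the second sample is a uniform dilution of the first, so their similarity is 1, while the third sample differs in one parameter and scores lower.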

Among the possible clustering algorithms, hierarchical ones are a popular approach for extracting the information contained in a similarity matrix. Consider a set of n objects and suppose that a similarity measure between pairs of elements is defined. The similarity measures can be arranged in an n × n similarity matrix, such as the Z matrix defined above. Hierarchical clustering methods organize the elements in nested clusters. The result of the procedure is a rooted tree, or dendrogram, giving a quantitative description of the clusters thus obtained. It is worth noting that hierarchical clustering methods can also be applied to distance matrices, rather than to similarity ones.

A large number of hierarchical clustering procedures can be found in the literature. For a review about the classical techniques see for instance Anderberg (1973). In this paper we focus our attention on the average linkage cluster analysis (ALCA) in order to obtain the dendrogram graphically describing the hierarchical organization of the sampling sites.
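A plain-Python sketch of average linkage clustering is shown below. It is not the software used in this study. It works on a distance matrix, and the conversion from similarity to distance as 1 − Z in the usage example is an assumption made here for illustration, as are the function name and the matrix values.

```python
def average_linkage(dist):
    """Agglomerative average linkage clustering on a symmetric distance matrix.

    Returns the merges in order, each as (members_a, members_b, height).
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # mean pairwise distance between the elements of the two clusters
                d = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                d /= len(clusters[p]) * len(clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((clusters[p], clusters[q], d))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return merges

# Hypothetical 4 x 4 similarity matrix, converted to distances as 1 - Z
Z = [[1.0, 0.9, 0.2, 0.1],
     [0.9, 1.0, 0.3, 0.2],
     [0.2, 0.3, 1.0, 0.8],
     [0.1, 0.2, 0.8, 1.0]]
dist = [[1.0 - z for z in row] for row in Z]
for members_a, members_b, height in average_linkage(dist):
    print(members_a, members_b, round(height, 3))
```

The sequence of merges and their heights is the information that a dendrogram displays graphically.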

An additional analysis tool, the average linkage minimum spanning tree (ALMST), recently introduced (Tumminello et al. 2007), was also used to describe the topological organization of the sampling sites. The ALMST is a generalization to the average linkage of the usual minimum spanning tree (MST; West 2000), which is based on single linkage.

Given a fully connected network, a widely used subgraph of the complete network is the minimum spanning tree (MST), which is the spanning tree of shortest length. The MST construction protocol is closely related to single linkage cluster analysis (SLCA; Gower and Ross 1969). The number of links retained in the MST is n − 1 for a system of n elements, and the tree is a connected graph. In the construction procedure for the MST such links are naturally identified by making use of the fact that the distance between two clusters is defined in a unique way by the distance between the closest elements belonging to the two clusters.
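The ordinary MST can be sketched with Prim's algorithm, as below. This illustrates the single linkage MST only, not the ALMST of Tumminello et al. (2007); the function name and the distance values are hypothetical.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a full symmetric distance matrix.

    Returns the n - 1 retained links as (i, j, distance).
    """
    n = len(dist)
    in_tree = {0}                      # start from an arbitrary node
    links = []
    while len(in_tree) < n:
        # shortest link joining a node in the tree to a node outside it
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        links.append((i, j, dist[i][j]))
        in_tree.add(j)
    return links

# Hypothetical distance matrix for four sites
dist = [[0.0, 0.1, 0.8, 0.9],
        [0.1, 0.0, 0.7, 0.8],
        [0.8, 0.7, 0.0, 0.2],
        [0.9, 0.8, 0.2, 0.0]]
for link in minimum_spanning_tree(dist):
    print(link)
```

Each step adds the closest outside element to the growing tree, which is why the MST mirrors single linkage clustering.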

When using ALCA, it is not straightforward to associate a spanning tree with the dendrogram. This is because in ALCA the distance between clusters is defined as the mean distance among the elements of the clusters and therefore, in general, the distance between clusters is not the distance between any pair of elements of the set. The technique cited above (Tumminello et al. 2007) makes it possible to extract a topological tree, named ALMST, associated with the ALCA. By construction, the ALMST has the same number of links as the MST.

3 Results and Discussion

Metal concentrations for the Palermo Gulf samples are reported in Table 3. These data reflect the degree of pollution expected from the distribution of pollution sources (Fig. 1). Zn, Cu, Pb and Hg concentrations are higher in the area of the Gulf where human activities are present. The highest Cu and Zn concentrations (698.0 and 751.8 μg g−1, respectively) were measured in GP09.1; this sampling point is located inside the Cala area, into which the domestic sewage coming from the city of Palermo flows. Furthermore, this is an enclosed zone in which water circulation is restricted, so the rate of water exchange is very low (see Fig. 1). Cu and Zn show great variability within the Gulf [with a coefficient of variation (CV) of 155.1 and 85.1, respectively]. Other sampling sites characterized by high levels of Cu and Zn are those inside the harbour and near the dockyards (GP08.3, GP09.2, GP09.3, GP10.1, GP10.2 and GP10.3). The Cu and Zn concentrations measured in samples from this area are responsible for the high CV values. It is worth noting that Cu and Zn are used in marine paints as anti-fouling agents.
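The text does not define the coefficient of variation explicitly. A minimal sketch, assuming it is the sample standard deviation expressed as a percentage of the mean, is given below; the function name and the data are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV in percent: 100 * sample standard deviation / mean (assumed definition)."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical concentrations in ug/g, for illustration only
print(coefficient_of_variation([10.0, 20.0, 30.0]))
```

Under this definition a CV above 100 means that the standard deviation exceeds the mean, as happens when a few sites have very high concentrations.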

Table 3 Metal concentrations, in μg g−1, measured in the <63 μm sediment fraction, in the Palermo Gulf samples

The highest Cr concentration (87 μg g−1) was found at GP12.3, located close to the Oreto River mouth. This might be due to the chemical behaviour of this element, which in an oxidizing environment may be adsorbed onto clays (Richard and Bourg 1991) or could form CrO3, which rapidly precipitates (Morel et al. 1975). On the other hand, Cr shows relatively homogeneous concentrations in the sediments studied (CV = 38.6).

Hg levels are high in the area between stations GP07 and GP13 (see Fig. 1), where anthropogenic input is present (domestic sewage, harbour, dockyards and the Oreto River runoff). This element shows high variability throughout the Gulf (CV = 98.5). Levels of pseudo-total metal concentration found in the sediments of the Gulf of Palermo, in the area covered by transects GP07 to GP13, are generally high compared with those of natural Sicilian sites (Tranchina et al. 2004; Tranchina et al. 2005), and also compared with those measured at the other stations sampled inside the gulf.

Lead concentrations are generally low in sediments from the North-Western and North-Eastern areas of the Palermo Gulf and increase in its central part (CV = 80.6). The low concentrations of Pb in surface sediments can be attributed to a decrease in Pb input to the environment caused by reduced use of this element. As demonstrated by a previous study (Tranchina et al. 2005) carried out in Sicilian coastal marine environments, the main source of Pb in those ecosystems was atmospheric Pb emission from leaded gasoline combustion. Since the main source of lead has been removed, Pb levels in recent sediments of the area have decreased too (Rizzo et al., submitted for publication).

In order to evaluate the degree of metal contamination of the sampling sites, we calculated the enrichment factors (Ef; see the previous section) for the selected metals. Mean metal concentrations in shale are well known (Turekian and Wedepohl 1961) but, owing to the geochemical variability of the rocks outcropping in each area, it is very difficult to establish [M]ref values for sediments in the Mediterranean Sea. We calculated the Efs using as [M]ref a mean value calculated, for each metal, from values measured in the fine fractions of sediments collected in the Gulf of Termini (GT), located far from pollution sources. Results obtained for the Termini Gulf are reported in Table 4. These data exhibit a smaller variability than those from the Palermo Gulf: the CV values here are in the range 28.8–58.5.

Table 4 Metal concentrations, in μg g−1, measured in the <63 μm sediment fraction, in the Termini Gulf samples

The Efs calculated for Cr, Cu, Hg, Pb and Zn at each site are reported in Fig. 2. As expected, the enrichment factors calculated for sample GP09.1 are the highest (with the exception of Cr) and range from 600% for Zn to 3,400% for Hg. Moreover, Efs decrease as the distance from this site increases. Sites located in the Western and Eastern parts of the Gulf generally have lower Ef values than those located in the central part of the Gulf.

Fig. 2

Enrichment factors for the metals under consideration, for each sampling site

For a better comparison among areas (see Table 3) we calculated the mean Ef values for each metal, grouping the sites according to their geographical position. Results are shown in Fig. 3. The harbor area is enriched in all metals (with the exception of Cr), with mean Ef values of about 800, 1,600, 250 and 200 for Cu, Hg, Pb and Zn, respectively. The enrichment decreases steeply from this area to the Western one (NW) and gradually towards the Eastern ones, down to background Ef values in the North Eastern area (NE). The Oreto River area is the most enriched in Cr; the other areas (HA, CA, NE) show similar mean Ef values (around 35% over the background value), and only the North West area (NW) shows levels of this element comparable to background ones.

Fig. 3

Mean enrichment factors for the metals under consideration, averaged over the North-Western, Harbour, Oreto River, Central and North-Eastern areas. See Table 3 for details

Principal component analysis has been carried out on the raw data shown in Table 3. This kind of analysis allows reducing the number of variables to be considered for a complete description of the data variability. This is accomplished through the diagonalization of the metal correlation matrix. The matrix eigenvectors are the new variables, while the ratio of the single eigenvalue to their sum represents the relative amount of the total variability it is able to describe.

The first two (out of five) principal components are found to describe about 95% of the total variability (80% for the first one, and 15% for the second one). Moreover four of the original variables (the metal concentrations) are found to have similar projection values on the first two principal components, while Cr stands alone from this point of view. This suggests that Cr pollution sources are different from those of the other elements.
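The percentages above can be related to the eigenvalues themselves. Since the correlation matrix of the p = 5 metals is the one diagonalized, its trace equals p, so the eigenvalues sum to 5 and the fraction of variability described by component k is:

$$f_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_k}{p}, \qquad \lambda_1 \approx 0.80 \times 5 = 4.0, \qquad \lambda_2 \approx 0.15 \times 5 = 0.75$$

The eigenvalues quoted here are back-calculated from the rounded percentages and are only indicative.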

In Fig. 4 we show the positions of the sampling sites in the plane defined by the first two principal components. Ranges are (−3.2, 3.2) for the first component on the horizontal axis and (−2, 2) for the second component on the vertical axis. Site GP09.1, the site with the highest concentrations of most metals, is not shown in this plot because its coordinates lie outside the selected ranges. The meaning of the different symbols used in this plot is explained below. It is worth noting that sites GP11.3, GP12.3 and GP13.3 are located close to the Oreto River mouth and show the highest ratios of Cr concentration to those of the other elements. Sites located at negative values of both components are those showing the lowest metal concentrations.

Fig. 4

Sampling site distribution in the plane defined by the first two principal components. Different symbols refer to different clusters shown in the dendrogram of Fig. 6

As mentioned in the previous section, the ratio matching technique was used to calculate a Z similarity matrix. The distribution of the similarity coefficients is shown in Fig. 5. This distribution has a mean value of about 0.7, which indicates a high level of similarity among sites. Moreover, the right tail of the distribution, i.e. the one on the side of higher similarity values, is fatter than the left one.

Fig. 5

Distribution of the elements of the similarity matrix, as obtained by the ratio matching technique

In Fig. 6 we show the dendrogram resulting from ALCA (see previous section) performed on the Z similarity matrix. It gives quite a reliable representation of both the geographical distribution of the sites and the similarity of their metal concentration ratios. First of all, the GP09.1 site does not cluster with any other site. This is expected because it is located at the outlet of the main city sewer, in an area with low water circulation and exchange. In some cases, geographically close sites tend to cluster together if they also have high similarity values. This can easily be seen in the dendrogram by observing the clustering of sites from GP09.2 to GP10.3. On the other hand, clustering can also occur among sites which are geographically separated but have similar metal concentration ratios and, presumably, similar levels of pollution. This is shown by the large cluster formed by sites from GP01.3 to GP23.3. Finally, we also observe clustering among sites from GP20.3 to GP24.2, which are geographically close, have high similarity values and are also characterized by low levels of pollution. Here the geographical aspect seems to be predominant, also because these sites share similar geochemical features. The order of sampling sites provided by the dendrogram can be used to re-analyze the similarity matrix Z. In Fig. 7 we show a map of the sampling area and sites. To help visualize the chemical characteristics of the sampling sites, the site symbols now differ according to the values of the similarity coefficients, following the hierarchical clustering shown in Fig. 6. The classification emerging from the dendrogram has also been used to label the sampling sites in Fig. 4, where different symbols refer to different clusters.

Fig. 6

Dendrogram, as obtained from average linkage cluster analysis, of the sampling stations in the Palermo Gulf

Fig. 7

Sampling area and sites depicted according to similarity matrix and clustering of Fig. 6

The above results are also confirmed by the ALMST shown in Fig. 8. The shape of the ALMST resembles a main stem which splits into two branches. The main stem contains sites which are all close to the harbour area. The sampling sites appearing on the right branch are geographically located at the two ends of the Palermo Gulf, quite far away from the harbour area, which presumably is the most polluted area in our set. The sampling sites appearing on the left branch are also located away from the harbour area, but at a smaller distance than the sites on the other branch. Therefore the sites on the left branch are presumably under the influence of the pollution coming from the harbour.

Fig. 8

Spanning tree of the sampling sites, as obtained from the average linkage cluster analysis

In particular, starting from the bottom of the tree, the samples in this part correspond to the stations close to the harbour area, which are the ones with the largest metal concentrations (GP09.1, GP08.3, GP09.2 and GP09.3). Moving up, we find the sites GP07.3 and GP08.3, belonging to the industrial harbour. From the point of view of the ALMST they lie on the border between the previous group and the group of samples located in the central part of the Gulf (GP10.1 to GP12.3). Further up we find a ramification with samples of the central part (GP99.3 to GP14.3) which, on the left, ends with stations of the NW part of the Gulf (GP05.3 and GP04.3). On the right side of the base of the Y (GP99.3), stations of the NW part of the Gulf are also grouped (GP06.3, GP01.3, GP02.3 and GP03.3). The tree ends with two branches, both in the SE part of the Gulf: GP23.3 up to GP22.3 and GP19.3 up to GP20.3.

The ALMST therefore provides a clear picture of the relative roles played by geographical location and pollution in the topological organization of the sampling sites within clusters.

4 Conclusions

An extensive study of metal (Cr, Cu, Hg, Pb and Zn) concentrations in surface soft-bottom sediments of the Gulf of Palermo has been carried out. The results of the measurements show that accumulation areas are present and that they are consistent with the distribution of contamination sources: stations near and inside the harbour and stations near the Oreto River mouth are the most polluted. For each metal considered, enrichment factors have been calculated using as background values the mean concentrations measured in sediments collected in an area far from pollution sources (Gulf of Termini). They reveal the existence of pollution gradients in the Gulf of Palermo. Statistical methodologies, namely principal component analysis and cluster analysis performed on the similarity matrix obtained by the ratio matching technique, provided further insight into the description of the sampling area in terms of both geographical and pollution distribution, particularly through the similarities in metal concentration ratios.