Introduction

There is an increasing demand from national and international institutions for new techniques and methodologies to assess marine contamination and to develop and implement criteria that protect areas of potential environmental risk. In this regard, the international OSPAR Commission (OSPAR 2000) and the International Council for the Exploration of the Sea (ICES 2002) reported the limitations of traditional monitoring programs and highlighted the need for an integrative assessment in which chemical contamination is related to the observed biological effects. In Spain, guidelines are provided for the management of dredged material in harbours, with special emphasis on the importance of evaluating its biological impact (CEDEX 1994; DelValls et al. 2004). In addition, the European Water Framework Directive (WFD; EU WFD 2000) aims to establish a new legislative framework for Community action in order to manage, use, protect and restore continental, transitional (estuaries) and coastal waters in Europe (Borja 2005; Quevauviller 2007). In this context, the integration of biological effect measurements with traditional chemical methods is considered increasingly important for the proper assessment of environmental quality and for the management of contaminated materials (Borja et al. 2004; Chapman and Anderson 2005; Fernández et al. 2008).

The present work is part of a collaborative project in which the research team is developing methodologies to measure marine environmental impact by means of chemical, biochemical and ecotoxicological tools. This study presents the results of the chemical and ecotoxicological analyses carried out on harbour sediments. Chemical data have been integrated with toxicity data from three types of bioassays: the standard screening Microtox® test (with the marine bacterium Vibrio fischeri), the 10-day acute survival test with marine amphipods (Corophium sp.) and the 48-h sea urchin larvae embryogenesis success test (Paracentrotus lividus). The organisms used for the toxicity tests were selected for their ecological relevance and representativeness in the Atlantic littoral, as well as for their sensitivity to several types of organic and inorganic microcontaminants (Bellas et al. 2005; Fernández and Beiras 2001). In addition, the bioassays applied in this study are routinely used for the characterisation of dredged material and for decision-making in harbours of the Basque Country (Belzunce et al. 2008).

The results have been integrated by (1) a weight of evidence approach based on a tabular matrix, which has been demonstrated to be a practical, reliable and predictive tool for assessing sediment quality (e.g. Chapman 2007; Chapman and Anderson 2005), and (2) a multivariate analysis (factorial analysis and principal component analysis (FA-PCA)) that establishes statistical associations between variables (Casado-Martínez et al. 2009; DelValls and Chapman 1998; Hunt et al. 2001; Morales-Caselles et al. 2007; Riba et al. 2004b).

This study provides an integrated assessment of sediment quality in harbour areas. Three ports of the Spanish Atlantic littoral that present a high degree of contamination from different sources have been selected as a case study: the Harbour of Vigo, located in the Ria of Vigo (Northwestern Spain), and the Harbours of Bilbao and Pasajes (Northern Spain), sited in the Nervión and Oiartzun estuaries, respectively. These areas are strongly affected by industrial, sport and leisure activities (Belzunce et al. 2001).

The main objectives of this work are (1) to apply integrative tools to assess sediment quality and (2) to evaluate their potential for use in the management of harbour areas on the Spanish Atlantic littoral.

Materials and methods

Sediment sampling campaigns

Three sampling campaigns were carried out in the harbours of Bilbao (2007), Vigo (2008) and Pasajes (2008), and six sampling stations were chosen at each site, covering a contamination gradient. The position of each station is shown in Fig. 1.

Fig. 1 Map showing the sampling stations in three harbour areas on the Spanish Atlantic coast: Vigo, Bilbao and Pasajes

Surface sediments were collected by a Van Veen grab and the redox potential was measured on board by an Orion platinum electrode potentiometer. Sediment subsamples were collected in polyethylene bottles for granulometry, organic matter content, metal concentrations determination and for toxicity bioassays and in glass jars for the analysis of organic compounds (PAHs and PCBs).

Physico-chemical characterisation

Granulometry

Dried sediment samples (60°C, 24 h) were run through a column of sieves and the percentages of gravel (>2 mm), sand (2 mm–63 μm) and mud (<63 μm) were calculated (Holme and McIntyre 1971). Samples with a high content of fine-grained sediment were analysed with a Beckman-Coulter LS™ 13 320. Particle size distribution was interpreted using the GRADISTAT software (Blott and Pye 2001).
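The size classes above can be illustrated with a short sketch. It assumes that the input is a list of (sieve aperture in μm, retained mass in g) pairs, with the pan recorded as aperture 0; the function name and the input layout are hypothetical and are not taken from the original procedure.

```python
# Size-class boundaries taken from the text: gravel > 2 mm, mud < 63 um.
GRAVEL_MIN_UM = 2000.0
MUD_MAX_UM = 63.0


def grain_size_percentages(retained):
    """Return the % of gravel, sand and mud from sieve data.

    retained: iterable of (aperture_um, mass_g) pairs (hypothetical layout);
    material retained on a sieve is assumed to be at least as coarse as
    its aperture, and the pan is given aperture 0.
    """
    totals = {"gravel": 0.0, "sand": 0.0, "mud": 0.0}
    for aperture_um, mass_g in retained:
        if aperture_um >= GRAVEL_MIN_UM:
            totals["gravel"] += mass_g
        elif aperture_um >= MUD_MAX_UM:
            totals["sand"] += mass_g
        else:
            totals["mud"] += mass_g
    total_mass = sum(totals.values())
    if total_mass <= 0:
        raise ValueError("total retained mass must be positive")
    return {name: 100.0 * mass / total_mass for name, mass in totals.items()}
```
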

Organic matter content

The organic matter content was determined as the percentage loss on ignition at 450°C for 5 h (Kristensen and Andersen 1987).
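As a minimal sketch, loss on ignition can be expressed as the mass lost during combustion relative to the dry mass. The function and parameter names are hypothetical, and any crucible tare correction is assumed to have been applied beforehand.

```python
def loss_on_ignition_percent(dry_mass_g, ashed_mass_g):
    """Percentage of dry sediment mass lost on ignition.

    dry_mass_g: sample mass after drying, before ignition.
    ashed_mass_g: sample mass after ignition (450 C for 5 h in the text).
    """
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    return 100.0 * (dry_mass_g - ashed_mass_g) / dry_mass_g
```
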

Chemical concentrations determination

Metal concentrations (Cd, Cu, Zn, Pb and Hg) were determined on the <63-μm fraction. After acid digestion (HCl and HNO3, 2:1, v/v), metals were analysed by atomic absorption spectroscopy using flame and graphite furnace. The method followed to determine the concentrations of PCBs (PCB-28, PCB-52, PCB-101, PCB-118, PCB-138, PCB-153 and PCB-180) and PAHs (naphthalene, acenaphthylene, acenaphthene, fluorene, phenanthrene, anthracene, fluoranthene, pyrene, benzo[a]anthracene, chrysene, benzo[a]pyrene and dibenzo[a,h]anthracene) is described elsewhere (Bartolomé et al. 2005). The accuracy of the analysis was validated using certified reference materials: marine sediments PACS-2 for metals and NIST 1944 for organic compounds.

Toxicity bioassays

In order to analyse the potential toxicity of sediments, a battery of bioassays was performed: the solid phase screening Microtox® test, based on the bioluminescence inhibition of the marine bacteria V. fischeri (e.g. Environment Canada 2002); the whole-sediment 10-day amphipod (Corophium sp.) survival test (e.g. Casado-Martínez et al. 2006a; Environment Canada 1992; OSPAR 2005; Schipper et al. 1999) and the 48-h sea urchin larvae (P. lividus) embryogenesis success test in elutriates (e.g. Beiras 2002; Casado-Martínez et al. 2006c).

Following Casado-Martínez et al. (2006b), samples displaying EC50 values lower than 1,000 mg L−1 in the Microtox® test were considered toxic. The acceptability criterion for the bioassays was set at ≥80 %, both for amphipod survival in whole sediments (Whiteman et al. 1996) and for the embryogenesis success of sea urchin larvae in elutriates (Cesar et al. 2004; Marin et al. 2007). A sample was considered toxic when (1) there was a statistically significant difference between the control and the test sample (ANOVA, Bonferroni post hoc; α = 0.05) and (2) the difference was higher than 20 % (e.g. Casado-Martínez et al. 2006a, c; Chapman and Anderson 2005; DelValls et al. 2003). The Shapiro–Wilk and Bartlett's tests were applied to assess normality and homogeneity of variance, respectively. Statistical analyses were carried out with Statgraphics® Plus 5.0.

Moreover, after the conclusion of amphipod and sea urchin bioassays, ammonia concentration, measured as Total Ammonia Nitrogen, of overlying waters and elutriates, respectively, was determined by an ion selective analyser (Orion 920Aplus model), following the APHA-AWWA-WPCF (1989) and Thermo Electron Corporation (2003a, b) recommendations.
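The toxicity criteria described above can be summarised in a short sketch. Two assumptions are made here that the text does not state: the 20 % difference is taken as the difference in percentage points between the control and the sample response, and the outcome of the ANOVA is supplied as a ready-made boolean. All function names are hypothetical.

```python
MICROTOX_EC50_LIMIT_MG_L = 1000.0  # samples below this EC50 are toxic
MIN_ACCEPTABLE_RESPONSE_PCT = 80.0  # acceptability criterion (>= 80 %)
MIN_DIFFERENCE_PCT = 20.0           # required difference against control


def microtox_is_toxic(ec50_mg_l):
    """Microtox criterion: EC50 lower than 1,000 mg/L."""
    return ec50_mg_l < MICROTOX_EC50_LIMIT_MG_L


def response_is_acceptable(response_pct):
    """Acceptability criterion: survival or embryogenesis success >= 80 %."""
    return response_pct >= MIN_ACCEPTABLE_RESPONSE_PCT


def bioassay_is_toxic(control_pct, sample_pct, significant):
    """Both conditions must hold: significance AND a difference above 20 %.

    significant: result of the statistical comparison against control
    (computed elsewhere). The difference is assumed to be in percentage
    points.
    """
    difference = control_pct - sample_pct
    return bool(significant) and difference > MIN_DIFFERENCE_PCT
```

Under these assumptions, a sample that differs from its control by more than 20 % without statistical significance is not classified as toxic, and neither is a significantly different sample with a smaller difference.
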

Integration of lines of evidence

Tabular matrix

Sediment contamination and toxicity data were gathered together in a decision-making tabular matrix based on the original proposed by Chapman (1990) and Chapman et al. (1996) and later modified by DelValls and Chapman (1998) and Riba et al. (2004b). The potential pollution of sediments was determined by comparison with the Sediment Quality Guidelines (SQGs) proposed by Long et al. (1995) and Buchman (2008) for metals (single values of five metals) and organic compounds (single values of 12 PAHs and the total value of the sum of seven PCBs). The SQGs reported by Riba et al. (2004a) were also used for comparison, but only for metals (single values of five metals) and ∑PCBs (total value of the sum of seven PCBs), as they were not calculated for PAHs. Accordingly, chemical concentrations were compared with effects range median (ERM; Long et al. 1995), probable effects level (PEL; Buchman 2008) and highly polluted benchmark (HPB; Riba et al. 2004a) values, which refer to the thresholds above which adverse biological effects are potentially expected. In addition, the results of the battery of bioassays were included and sediment samples were classified according to the toxicity score (TS; number of bioassays that were toxic): non-toxic (NT; TS = 0), low toxic (LT; 0 < TS ≤ 1), moderately toxic (MT; 1 < TS ≤ 2) and highly toxic (HT; 2 < TS ≤ 3).
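The two ingredients of the tabular matrix, the toxicity score and the list of contaminants exceeding a guideline, can be sketched as follows. The function names and the dictionary layout are hypothetical, and any guideline values passed in are the caller's responsibility; no published ERM, PEL or HPB values are reproduced here.

```python
# Toxicity classes as defined in the text, keyed by toxicity score (TS).
TOXICITY_CLASSES = {0: "NT", 1: "LT", 2: "MT", 3: "HT"}


def toxicity_class(microtox_toxic, amphipod_toxic, sea_urchin_toxic):
    """Return (TS, class), where TS is the number of toxic bioassays."""
    score = sum(
        [bool(microtox_toxic), bool(amphipod_toxic), bool(sea_urchin_toxic)]
    )
    return score, TOXICITY_CLASSES[score]


def exceedances(concentrations, guideline):
    """Names of contaminants whose concentration is above the guideline.

    concentrations, guideline: dicts keyed by contaminant name
    (hypothetical layout); contaminants without a guideline value,
    such as PAHs in the HPB set, are skipped.
    """
    return sorted(
        name
        for name, value in concentrations.items()
        if name in guideline and value > guideline[name]
    )
```

For instance, a sample that is toxic in the Microtox® and sea urchin bioassays but not to amphipods gives `toxicity_class(True, False, True)`, that is, TS = 2 and class MT.
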

Multivariate analysis approach

In order to evaluate the associations among variables, a multivariate factor analysis (FA) with a principal component analysis (PCA) as extraction procedure was conducted (DelValls and Chapman 1998). This methodology aims to derive a reduced number of new variables, named ‘factors’, as linear combinations of the original variables. The correlations between the original variables and factors, given by coefficients termed ‘factor loadings’, are the basis to identify the associations among the measured variables. The FA-PCA was applied over the three harbour areas, Bilbao (BI), Vigo (VI) and Pasajes (PA), with six sampling stations in each estuary and 14 variables (i.e. Cd, Cu, Hg, Pb, Zn, ΣPAH and ΣPCB concentrations, organic matter and mud content, ammonia concentration in amphipod and sea urchin bioassays, amphipods mortality, abnormal development of sea urchin larvae and Microtox®). For the extraction of factors, contaminant concentrations were log transformed and percentages were arcsine-root transformed. The axes were orthogonally rotated in order to maximise the variance of the factors (Varimax normalised) whilst minimising the variance around them (e.g. Choueri et al. 2010). In order to establish significant associations between variables, a 0.40-factor loading cut-off was used, which corresponds to an associated explained variance of over 65 % (DelValls and Chapman 1998). The multivariate analysis and Pearson’s correlations were performed using Statgraphics® Plus 5.0.
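The procedure described above can be sketched with NumPy. This is an illustrative reconstruction and not the Statgraphics® routine: the logarithm base (10) is an assumption, the PCA is extracted from the correlation matrix, and the varimax rotation is the plain (raw) version without Kaiser normalisation. All function names are hypothetical and the demonstration data are synthetic.

```python
import numpy as np


def log_transform(concentrations):
    """Base-10 logarithm of concentrations (base is an assumption)."""
    return np.log10(np.asarray(concentrations, dtype=float))


def arcsine_root_transform(percentages):
    """Arcsine of the square root of a proportion, for data given in %."""
    proportions = np.asarray(percentages, dtype=float) / 100.0
    return np.arcsin(np.sqrt(proportions))


def pca_loadings(data, n_factors):
    """Unrotated loadings and explained variance fractions.

    data: array of shape (n_stations, n_variables).
    Loadings are eigenvectors of the correlation matrix scaled by the
    square root of their eigenvalues.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    explained = eigvals[order] / eigvals.sum()
    return loadings, explained


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal raw varimax rotation of a loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = (rotated ** 2).sum(axis=0)
        target = rotated ** 3 - rotated @ np.diag(column_ss) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion - criterion < tol * new_criterion:
            break
        criterion = new_criterion
    return loadings @ rotation


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synthetic = rng.normal(size=(18, 6))  # 18 stations, 6 made-up variables
    raw, explained = pca_loadings(synthetic, n_factors=3)
    rotated = varimax(raw)
    print(np.abs(rotated) >= 0.40)  # loadings passing the 0.40 cut-off
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is the same before and after rotation; only the distribution of the loadings among the factors changes.
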

Results

Physico-chemical characterisation

The physico-chemical parameters of sediments have been summarised in Table 1. In general, outer stations were characterised by high sand content (BI-0, 96.5 %; PA-0, 98.6 %) and fine-grained sediments dominated in those stations located in the inner part of estuaries, associated with high organic matter content. Besides, the maximum negative redox potential values were recorded in the inner stations (BI-2, VI-2 - VI-5 and PA-2), being indicative of reduced sediments.

Table 1 Physico-chemical parameters of sediments of the three harbour areas

Concerning chemical concentrations in sediments, both metallic and organic compound contaminations seem to follow the same trend; the inner stations, characterised by fine-grained and reduced sediments, presented the highest contamination: in Bilbao, the station BI-2 presented the highest concentrations for all metals, only exceeded by BI-3 (1.07 mg kg−1), BI-4 (3.01 mg kg−1) and BI-5 (1.04 mg kg−1) stations for cadmium; in Vigo, VI-2 station accounted for the highest contamination levels of all the chemical compounds, except for Cu (VI-1, 751.3 mg kg−1); in Pasajes, in general PA-2 was the most contaminated station (for Cu, Hg, Zn, ΣPAHs and ΣPCBs), followed by PA-4 (for Cd and Pb).

Toxicity bioassays

The results of the Microtox® test are collected in Table 2 and the data from the amphipod and sea urchin bioassays are shown in Fig. 2. In Bilbao, all the stations except BI-0 would be considered potentially toxic according to the Microtox® test. However, these sediment samples did not cause toxicity to amphipods (Fig. 2a) and only BI-2 and BI-5 were toxic to sea urchin larvae (Fig. 2b). BI-1 also met one of the conditions to be considered toxic in the sea urchin bioassay (i.e. statistically significant difference against control). In Vigo, the only station that was toxic in the whole battery of bioassays was VI-2. Additionally, VI-1 and VI-4 were toxic in the Microtox® test. Regarding the amphipod bioassay, VI-5 presented a mortality rate more than 20 % higher than the control sample; however, this difference was not statistically significant. Similarly, in the sea urchin bioassay, none of the samples apart from VI-2 were toxic, although all of them met one of the conditions to be considered toxic (i.e. statistically significant difference against control). In Pasajes, only the PA-0 station was non-toxic in the Microtox® test. PA-2 and PA-5 were toxic in the three types of bioassays, while PA-3 was toxic to amphipods. As observed in the samples from Vigo, at PA-0, PA-1, PA-3 and PA-4 one of the conditions to be considered toxic in the sea urchin bioassay (i.e. statistically significant difference against control) was met.

Table 2 Results of the Microtox® test in sediments from sampling stations
Fig. 2 Results of the whole-sediment 10-day amphipod (Corophium sp.) survival test (a) and the 48-h sea urchin larvae embryogenesis success bioassay (b). Different bar colours denote the sediment samples analysed in the same batch. Each sample has been compared against the control of its batch, which is represented by the same bar colour. The 20 % difference against the control sample is marked by lines: the solid line represents the first batch (light grey), the dashed line the second batch (dark grey) and the dotted line the third batch (black). Asterisks indicate a statistically significant difference against control (ANOVA, α = 0.05)

Integration of lines of evidence

Tabular matrix

Originally, Long and Morgan (1990) assessed the potential toxicity of sediments based on the occurrence of adverse biological effects due to the exposure to toxicants. This concept was further developed by Chapman (2000), including different lines of evidence for decision-making. Following the weight of evidence approach, described by Chapman (2000), the results of the battery of bioassays and potential toxicity of sediments, based on comparison of contaminant concentrations with SQGs, were integrated in a tabular matrix (Table 3). Global toxicity of sediments was derived from the results of the battery of bioassays (see ‘Tabular matrix’).

Table 3 Integration of the results in a decision-making tabular matrix

Only BI-0 was NT for all the bioassays, which is in accordance with the low contaminant loads in this sediment (no contaminant over ERM, PEL or HPB). Low toxicity was observed at two stations in Bilbao (BI-3 and BI-4), at three stations in Vigo (VI-0, VI-3 and VI-5) and at one station in Pasajes (PA-0). BI-3, BI-4 and PA-0 did not present any chemical compound over ERM. In contrast, BI-3 presented contaminants over PEL (two metals and one PAH) and over HPB (one metal), BI-4 showed one metal over HPB and PA-0 one metal over PEL. The stations from Vigo presented two groups of contaminants over ERM, one group over HPB (it should be noted that there are no HPB values for PAHs) and three groups over PEL (with the exception of VI-0, with only one group of contaminants over PEL).

Samples classified as MT were BI-1, BI-2 and BI-5 in Bilbao, VI-1 and VI-4 in Vigo and PA-1 and PA-4 in Pasajes. At these stations, excluding BI-1 (no contaminant over HPB) and BI-5 (no contaminant over ERM and PEL), at least one group of contaminants was over the ERM, PEL or HPB values. In contrast to BI-5 (no contaminant over ERM and PEL), VI-1 and VI-4 were characterised by high levels of contaminants over the ERM (i.e. three metals, one PAH and ΣPCBs in VI-1 and one metal, one PAH and ΣPCBs in VI-4) and PEL (i.e. four metals, eight PAHs and ΣPCBs in both cases) values. In Bilbao, no station was classified as HT. On the other hand, VI-2 in Vigo and PA-2, PA-3 and PA-5 in Pasajes were classified as HT. VI-2 and PA-2 presented organic (ΣPAHs and ΣPCBs) and metallic compounds over the ERM values, while at PA-3 and PA-5 the ERM threshold was only exceeded for metals and ΣPCBs. The four stations showed ΣPCBs over HPB, but only at PA-2 was one metal over HPB. In contrast, when compared with PEL, the four stations (VI-2, PA-2, PA-3 and PA-5) presented organic (ΣPAHs and ΣPCBs) and metallic compounds over those values.

Multivariate analysis approach

A multivariate Factorial Analysis was performed to establish associations between the biological endpoints and chemical concentrations. The PCA extraction of the original variables generated three factors, which together accounted for 80 % of the variance of the original data set. The loadings of each variable within these factors, after varimax rotation, are represented in Table 4.

Table 4 Sorted rotated factor loadings (varimax rotation) of the original variables in the principal three factors derived from the factorial analysis (FA-PCA)

The first factor (F1) accounted for 50 % of the variance and combined the results of the three bioassays (amphipod mortality, abnormal development of sea urchin larvae and Microtox®) with chemical concentrations (Cu, Hg, Pb, Zn, ΣPAHs, ΣPCBs) and organic matter (OM) content. The second factor (F2) accounted for 15.6 % of the variance and represented OM content and ammonia concentration in the amphipod and sea urchin bioassays, associated with amphipod mortality and abnormal development of sea urchin larvae. The third factor (F3) accounted for 14.1 % of the variance and was a combination of chemical concentrations (Cd and Pb), mud and OM content and Microtox® results. The relevance of the factors for each station is shown in Fig. 3. In the Bilbao estuary, at station BI-2, the observed amphipod mortality and abnormal development of sea urchin larvae were associated with contaminant concentrations (positive score of F1), while at BI-1, BI-4 and BI-5 they were linked to OM content and ammonia concentration (positive score of F2). At BI-1, BI-2, BI-3 and BI-4, F3, which associates Microtox® results with OM and mud content and with concentrations of Cd and Pb, presented the highest score.

Fig. 3 Estimated factor scores of the principal three factors (F1, F2 and F3), derived from the factorial analysis (FA-PCA), at each of the three studied areas: Bilbao (BI), Vigo (VI) and Pasajes (PA)

In Vigo, F1 had a positive score at all the stations except VI-0, while the associations of F2 were only relevant at VI-2. At VI-1 and VI-4, the associations described by F3 were also important. The high score of F1 at station VI-2 is remarkable and highlights the importance of the association between contaminant concentrations and toxicity endpoints at this station.

In Pasajes, F1 presented a positive score at PA-2, PA-3, PA-4 and PA-5, while F3 only showed a positive score at PA-3 and PA-4. Moreover, PA-2 was the only station that showed a high positive score of F2, which means that the biological effects at this station (amphipod mortality and abnormal development of sea urchin larvae) were mainly associated with OM content and ammonia concentration.

The significant relationships between variables were further studied by means of a correlation matrix. The correlation coefficients between variables and biological endpoints are given in Table 5. Amphipod mortality was significantly (p < 0.05) correlated with chemical concentrations (Zn, r = 0.53; ∑PAHs, r = 0.48; ∑PCBs, r = 0.57), ammonia concentration (r = 0.51) and especially with abnormal development of sea urchin larvae (r = 0.84). The results of the sea urchin bioassay were correlated with chemical concentrations (Cu, r = 0.48; Zn, r = 0.57; ∑PAHs, r = 0.56; ∑PCBs, r = 0.48), OM content (r = 0.56), ammonia concentration (r = 0.66) and with the other two bioassays (amphipod mortality, r = 0.84; Microtox®, r = −0.50). Bioluminescence of marine bacteria in the Microtox® bioassay was negatively correlated with chemical concentrations (Cd, r = −0.49; Cu, r = −0.64; Hg, r = −0.68; Pb, r = −0.76; Zn, r = −0.65; ∑PAHs, r = −0.60), mud (r = −0.68) and organic matter (r = −0.57) content and abnormal development of sea urchin larvae (r = −0.50).

Table 5 Correlation coefficients between chemical, physico-chemical variables and biological endpoints
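The significance of a Pearson coefficient can be checked with the usual t statistic, t = r·√(n − 2)/√(1 − r²). The sketch below assumes that each correlation is based on n = 18 stations (six per harbour) and a two-tailed test at α = 0.05, neither of which is stated explicitly in the text; the critical value 2.120 is the tabulated Student's t for 16 degrees of freedom. The function names are hypothetical.

```python
import math

# Two-tailed critical t for alpha = 0.05 and 16 degrees of freedom
# (n = 18 is an assumption: six stations in each of three harbours).
T_CRIT_DF16 = 2.120


def t_from_r(r, n):
    """t statistic of a Pearson correlation r computed from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest |r| that is significant for the given critical t."""
    df = n - 2
    return t_crit / math.sqrt(df + t_crit ** 2)


def is_significant(r, n, t_crit):
    """True when |t| exceeds the critical value."""
    return abs(t_from_r(r, n)) > t_crit
```

Under these assumptions the critical coefficient is about 0.47, so that, for example, `is_significant(0.48, 18, T_CRIT_DF16)` holds while a coefficient of −0.23 does not reach significance.
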

Discussion

In the present study two lines of evidence have been studied, contaminant loads and toxicity in sediments, in three harbour areas of diverse characteristics and contamination sources. The results obtained have been integrated with the aim of establishing associations between contaminants present in the sediment and the toxicity endpoints. This has been attained by means of a tabular matrix and a multivariate factorial analysis (FA-PCA).

The stations presenting the highest metallic and organic contamination were VI-2, which is located inside a harbour, BI-2, which is characterised by a low local flushing time (Grifoll et al. 2011), and PA-2, which is placed in the inner part of the estuary and characterised by a low water renovation rate (Montero et al. 2011; Solaun et al. 2009). The range of metal concentrations measured at the stations of Vigo was similar to that reported by Quelle et al. (2011) in the Ria of Vigo, although higher concentrations of Cu, Hg and Zn were measured in this study. In the case of Bilbao, the reported metal values are in accordance with those found by Fdez-Ortiz de Vallejuelo et al. (2010) in the Nervión estuary. Montero et al. (2011) found metal concentrations in the Oiartzun estuary comparable with the results reported in this study. It is widely accepted that sediments are good indicators of anthropogenic impacts on coastal and estuarine environments since they act as a sink for contaminants, providing time-integrated information on contamination in an area (Viguri et al. 2007). Accordingly, different approaches have been tried to link toxicity to contaminant concentrations in sediments. The most applied approach evaluates potential toxicity by comparison with numerical chemical values (Belzunce et al. 2004; Burton 2002), such as ERM, PEL and HPB. Over these thresholds, a contaminant would be suspected to cause adverse biological effects (Buchman 2008; DelValls et al. 2004; MacDonald et al. 1996; Morales-Caselles et al. 2007; Riba et al. 2004a). In order to determine how effectively the potential toxicity of sediments can be concluded from comparison with SQGs, a battery of bioassays has been applied, which provides direct information about the toxicity and potential bioavailability of single contaminants and of mixtures of contaminants (Cesar et al. 2009; Choueri et al. 2010). These lines of evidence have been integrated in a tabular matrix (Table 3).

The most applied approach is the comparison of numerical values with the ERM and PEL values calculated by Long et al. (1995) and Buchman (2008), respectively. However, Baumard et al. (1998) and Fernández et al. (2008) highlighted that the SQGs calculated by Long et al. (1995) could be excessively high for our region. Therefore, the SQGs calculated by Riba et al. (2004a) for the Atlantic coast of Spain have also been included in this study. It can be observed that, even though SQGs have been calculated based on bioassay results, the thresholds reported by different authors are fairly different, which indicates organism- and site-specific differences. In this regard, the comparison with the SQGs calculated by Riba et al. (2004a) could overcome the site-specific differences, as they refer to the Spanish Atlantic coast. There seems to be a correspondence between this set of SQGs (HPB) and the global toxicity (Table 3) at low toxicity levels. However, based on this set of SQGs it is not possible to discriminate between the contamination levels of stations presenting different global toxicity: LT (e.g. VI-3, no metal but ∑PCBs over HPB), MT (e.g. VI-4, no metal but ∑PCBs over HPB) or HT (e.g. VI-2, no metal but ∑PCBs over HPB). This could be explained by the high HPB values reported for metals. Furthermore, there are no HPB values for PAHs, which are known to be major contaminants in harbours, as is the case in our study areas. Therefore, the applicability of the HPBs is limited to areas not affected by PAHs. Regarding the SQGs calculated by Long et al. (1995) and Buchman (2008), in general, those stations characterised by a high number of metals and organic compounds over the ERM or PEL values are also classified as HT. This is especially the case for the PEL approach, as the four stations classified as HT (PA-2, PA-3, PA-5 and VI-2) presented the three groups of contaminants over PEL. When evaluated by means of the ERM approach, ERM values were exceeded by the three groups of contaminants at VI-2 and PA-2, but at PA-3 and PA-5 they were only exceeded for one or two metals and ∑PCBs. The mismatch between the conclusions derived from the application of the ERM or PEL approach and the toxicity results is higher at LT or MT levels (e.g. Hübner et al. 2009). For example, BI-5 was MT, as this sediment was found to be toxic when tested by the Microtox® and the sea urchin bioassays. However, at this station no compound showed values higher than the ERM or PEL thresholds. In contrast to BI-5, VI-1 (MT) and VI-4 (MT) were characterised by high levels of contaminants over the ERM (i.e. three metals, one PAH and ΣPCBs in VI-1 and one metal, one PAH and ΣPCBs in VI-4) and PEL (i.e. four metals, eight PAHs and ΣPCBs in both cases) values. In the case of those stations identified as LT, there is a high variability. Based on the ERM approach, three stations (BI-3, BI-4 and PA-0) did not present any compound over this threshold and the other three (VI-0, VI-3 and VI-5) presented ∑PCBs and another contaminant (one metal or PAH) over ERM. This variability is more notable when applying the PEL approach; PA-0 (LT) presented only one metal over this threshold, while VI-3 (LT) presented three metals, two PAHs and the ∑PCBs over the thresholds. These incongruities could be related to the fact that empirical SQGs do not address the bioavailability concept efficiently (e.g. Burton 2002; DelValls et al. 2004) and only account for single chemicals without considering the interactions among contaminants (Choueri et al. 2010). Moreover, the complexity of deriving potential toxicity from comparison with numerical values increases when different sets of SQGs are applied.

An alternative to overcome these shortcomings is the application of methodologies that statistically integrate both lines of evidence (Casado-Martínez et al. 2009). Accordingly, an FA-PCA multivariate analysis was performed in order to link the toxicity bioassays to the analytical chemistry results (Table 4). The PCA reduces the complexity of the study by creating a lower number of factors that gather all the variables. These factors are a linear representation of the original variables and provide a description of the structure of the data set with a minimum loss of information (DelValls and Chapman 1998). Thus, from the factorial analysis of the three estuaries, all the variables were reorganised into three principal factors that accounted for 80 % of the total variance in the original data set. The associations derived from the multivariate analysis were consistent with the correlation matrix of all the variables (Table 5). It can be observed that the inhibition of bioluminescence in the Microtox® test is significantly correlated with chemical concentrations (i.e. Cu, Hg, Pb, Zn and ∑PAHs in F1 and Cd and Pb in F3) and with OM (F1 and F3) and mud (F3) content. The relationship between Microtox® results and OM and mud content has been previously reported (Benton et al. 1995; Ringwood et al. 1997). It is difficult to establish differences among stations based on this test, as it showed toxicity at the majority of the stations. Besides, Microtox® is a screening bioassay and it has been suggested that this bioassay alone may not be representative of the full impact of a given pollutant in an ecosystem (Brohon et al. 2001).

In the case of amphipod mortality and abnormal development of sea urchin larvae, the results of these bioassays are significantly correlated with chemical concentrations (Zn, ∑PAHs and ∑PCBs, also Cu for sea urchin in F1) and ammonia concentration (F2) and also with OM content (F2) in the case of the sea urchin bioassay. According to EPA’s guidelines, it is considered that ammonia interferes with the amphipod bioassay when its concentration in water is over 20 mg L−1 (Ferretti et al. 2000). In the case of the sea urchin bioassay this value is lower and different EC50 can be found in the literature; 2.7 (Cesar et al. 2002) and 5.7 mg L−1 (Arizzi Novelli et al. 2003). The only station that presents ammonia levels over those values, both in amphipod and sea urchin bioassays, is PA-2, which is in accordance with the relevance of Factor 2 in this station (Fig. 3). Factor 2 is also represented in Bilbao harbour stations, especially in BI-5, as ammonia concentration in those stations is also relatively high (Table 1). These results highlight the importance of considering ammonia concentration when interpreting sediments toxicity, as it could act as a confounding factor (His et al. 1999).
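A simple screening of ammonia as a confounding factor could be sketched as below. The thresholds are the values quoted above; using the lower of the two sea urchin EC50 values as the screening limit is a conservative assumption made here, not a rule stated in the text, and the function and key names are hypothetical.

```python
AMPHIPOD_AMMONIA_LIMIT_MG_L = 20.0       # interference threshold in the text
SEA_URCHIN_AMMONIA_EC50_MG_L = (2.7, 5.7)  # EC50 values quoted in the text


def ammonia_may_confound(amphipod_tan_mg_l, sea_urchin_tan_mg_l):
    """Flag bioassays in which ammonia may contribute to the toxicity.

    Inputs are Total Ammonia Nitrogen concentrations (mg/L) in the
    overlying water (amphipods) and in the elutriate (sea urchin).
    The sea urchin limit is the lower EC50 (assumption).
    """
    sea_urchin_limit = min(SEA_URCHIN_AMMONIA_EC50_MG_L)
    return {
        "amphipod": amphipod_tan_mg_l > AMPHIPOD_AMMONIA_LIMIT_MG_L,
        "sea_urchin": sea_urchin_tan_mg_l > sea_urchin_limit,
    }
```
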

Concerning the biological endpoints, the amphipod bioassay and Microtox® are not significantly correlated (r = −0.23), while the sea urchin bioassay is significantly correlated with the other two bioassays (r = 0.84 with amphipods and r = −0.50 with Microtox®). The differences in toxicity between these bioassays suggest that contaminants act differently depending on the type of organism. This could be related to the routes of exposure of each organism to contaminants and highlights the importance of testing different matrices (i.e. solid phase and elutriates) for a reliable assessment of potential toxicity (Macken et al. 2008).

Regarding the results of the multivariate analysis, at four stations the three factors showed negative scores (Fig. 3). This is explained by the lack of toxicity or the low toxicity level at these stations, which are classified as NT (BI-0) or LT (VI-0 and PA-0). In the case of PA-1, which is MT, the negative scores of the factors indicate that the observed toxicity is not explained by the associations of variables described by these three factors. Toxicity at that station could be explained by contaminants that have not been measured in this study. Additionally, F1 (which links the contaminant concentrations with the biological endpoints) is better represented in the Vigo and Pasajes estuaries, which present high contaminant concentrations and associated toxicity. In general, in these estuaries, the stations characterised by the predominance of F1 present MT to HT levels: VI-1 (MT), VI-2 (HT), VI-3 (LT), VI-4 (MT), PA-2 (HT), PA-3 (HT) and PA-5 (HT) (Fig. 3). This highlights that, at high levels of contamination, observed toxicity is mainly explained by chemical concentrations in sediments. In contrast, F2 and F3, which associate the biological endpoints with ammonia concentration and with mud and organic matter content, respectively, are better represented in the Bilbao estuary. Samples from Bilbao are characterised by lower contaminant concentrations (i.e. ERM values were only exceeded at BI-1 and BI-2, for one and three metals, respectively) and by toxicity levels, mainly given by Microtox®, from NT to MT. Therefore, as demonstrated in this study, at lower chemical levels the importance of confounding factors (i.e. ammonia concentration, organic matter and mud content) should be considered when evaluating the potential toxicity of sediments.

Based on the results obtained in the present study, it can be concluded that deriving the potential toxicity of sediments from the comparison of single contaminants with SQGs is a simplistic approach, as it does not take into account the interactions between contaminants (e.g. antagonism, synergism and additive effects) or bioavailability, which can be the main causes of the potential toxicity (Burton 2002; Chapman and Mann 1999; Choueri et al. 2010). This could explain the mismatch observed in this study between potential toxicity, based on comparison with SQG values, and global toxicity, based on the results of bioassays. These findings highlight that SQGs can be used as a line of evidence (e.g. Burton 2002; Hübner et al. 2009) but that they should not be used alone for regulatory purposes (e.g. DelValls et al. 2004), as decisions based only on the results of the SQG approach provide incomplete conclusions. In contrast, the multivariate analysis seems to be a robust tool to associate biological endpoints with chemical concentrations. It has enabled the analysis of contaminant interactions and, together with the calculation of correlation coefficients, has made it possible to establish the associations of variables that explain the observed toxicity. However, in future work the inclusion of a higher number of stations as well as of other lines of evidence (e.g. biomarkers) should be considered. This would help in the interpretation of the results, as it would allow more reliable associations to be established between contaminant concentrations and toxicity to the biota.

Conclusions

In summary, based on the multivariate analysis, it has been concluded that at high levels of contamination, observed toxicity is mainly explained by chemical (metals and organic compounds) concentrations in sediments. This is in accordance with the results of the SQG approach, as those stations characterised by a high number of metals and organic compounds over the ERM and PEL values are generally also classified as HT. In contrast, the FA-PCA analysis has demonstrated the importance of confounding factors (i.e. ammonia concentration, organic matter and mud content) at lower chemical levels. The ERM approach, on the other hand, only accounts for chemical concentrations, without considering the interaction between contaminants and the effect of confounding factors. This explains why the mismatch observed between the conclusions derived from the application of this approach and the toxicity results is even higher at LT or MT levels.

Additionally, this study has demonstrated the effectiveness of the multivariate FA-PCA analysis for the integration and interpretation of different lines of evidence in areas affected by different sources of contamination. With this tool it has been possible to assess sediment quality, and it has proved useful for the management of harbour areas. This approach could also be applied to the risk assessment of dredged sediments, providing necessary information to stakeholders for the management of harbour activities.