1 Introduction

Urban environments feature strong imprints of human activity. They are centers of industrial growth and economic production, from which contaminants like heavy metals have been intensively emitted since the beginning of the Industrial Age (Crutzen 2002; Lyons and Harmon 2012). In recent times, this contamination has been continuously surveyed by air quality monitoring programs (Ellermann et al. 2012), and it is decreasing in many industrialized countries (von Storch et al. 2003). In contrast, past environmental burdens can only be assessed indirectly. In particular, lake sediment strata have proven to be a useful tool for contamination investigations, as they serve as final sinks for these contaminants in aquatic environments and thus function as historical archives (Haworth and Lund 1984; Last and Smol 2004). Both urban and remote lacustrine locations contain distinct anthropogenic impacts (Renberg 1986; Chillrud et al. 1999; Walraven et al. 2014).

X-ray fluorescence (XRF) core scanning has become widely used to assess sedimentary element levels, especially since the early 2000s (Croudace et al. 2006). It provides an effective technique to generate large datasets covering major and trace elements (Weltje and Tjallingii 2008; Comero et al. 2011; Löwenmark et al. 2011). On the one hand, iron (Fe), titanium (Ti), and silicon (Si) as well as alkali metals such as rubidium (Rb) and potassium (K) are prominent in clay minerals, which serve as ideal scavengers for heavy metals due to their large relative surface areas (Eisma and Irion 1988). On the other hand, elements like copper (Cu), zinc (Zn), lead (Pb), chromium (Cr), nickel (Ni), and cadmium (Cd) characterize anthropogenic imprints. They are mainly emitted by high-temperature processes such as fossil fuel combustion and as part of material preservation applications (Nriagu 1979). Manganese (Mn) has redox-sensitive properties that can mark interactions at the water-sediment interface (Hallberg 1991; Davison 1993). Calcium (Ca) can be considered a salinity indicator, as it is the fifth most abundant element in seawater and could reveal the occurrence of carbonate precipitation in the sediment columns.

For the associated data interpretation, multivariate statistics have proven to be a beneficial tool, for example to group variables of similar characteristics and properties (Selig and Leipe 2008; Templ et al. 2008; Hansson et al. 2013). Principal component analysis (PCA) is a powerful method used in chemometrics which provides an overview of complex multivariate data, revealing relations between variables and samples (Bro and Smilde 2014). Its application to environmental data is well documented, revealing basic relationships in element data and providing data association and dimension reduction (Zitko 1994; Zupan et al. 2000; Passos et al. 2010; Gredilla et al. 2012). Furthermore, it serves for pattern recognition, for outlier detection, and for data classification and trend delineation (Wold et al. 1987; Olsen et al. 2010; Comero et al. 2011; Quinn and Keough 2013).

In this study, a PCA of 13 variables from XRF core scanning of sediments from a lake in the center of Copenhagen is presented in order to detect the features and nature of heavy metal contamination in an urban environment. A number of cores were retrieved from lakes in the former defense work system, and cores from one site, the Botanical Garden, proved to be particularly well preserved (Fig. 1). This lake in the heart of Copenhagen was therefore chosen as the investigation site, as its sediments could provide detailed information since its establishment in the seventeenth century. The Danish capital has been a center for trading in the Baltic region since the Middle Ages and was therefore exposed to a large variety of power and goods at an early stage. Since a historical contamination reconstruction is intended, underlying chemical patterns are highlighted and linked to their sources.

Fig. 1. Maps showing (a) North-Western Europe, (b) Copenhagen city center, and (c) the coring sites in the lake of the Botanical Garden of Copenhagen

2 Material and Methods

Two sediment cores were retrieved from the central part of the lake within the Botanical Garden of Copenhagen, Denmark (55° 41ʹ 10ʺ N, 12° 34ʹ 28ʺ E). The cores BH27 (145 cm long) and BH28 (138 cm long) were taken circa 30 m from each other using a rod-operated piston corer (Livingstone 1955; Smol 2008). The corer consisted of a tube sealed with a piston which was attached to an extension rod. This construction was lowered through the water column onto the sediment surface. Subsequently, the tube was pushed through the sediment column until a coarse, sandy bottom was reached. The coring was done in winter 2011, and sediment recovery was achieved through holes in the ice-covered lake. The examined water body once belonged to a late medieval moat as part of the defense-wall system established around AD 1650 and surrounding the old town (Skaarup et al. 1998). It was turned into a lake by partial terrain leveling along with urban expansion since the mid-nineteenth century. There are no historical or sedimentological data indicating that the lake has been dredged, but the establishment of the Botanical Garden between 1872 and 1874 (Skaarup et al. 1998) appears to have induced an elevated input of minerogenic material, leaving behind a chronological time-marker (see later). Today, the freshwater lake has a surface area of approximately 7000 m² and a maximum water depth of about 2 m. Lake-water temperature varies with air temperature, so that the surface of the water body freezes in winter.

The cores were permanently stored at 4 °C. The sediment tubes were split in halves and analyzed at the Department of Geological Sciences at Stockholm University, Stockholm, Sweden, using an Itrax™ X-ray fluorescence (XRF) core scanner (COX Analytical Systems 2011). This nondestructive technique provided datasets of major elements and trace elements as well as line-scan images and radiographs (XRG; line-scanned) of the sediment cores (Croudace et al. 2006). For the measurements, the core halves were covered with a thin polyethylene film. The XRF core scanner was used in combination with a 3 kW molybdenum (Mo) tube operating at 55 kV and 50 mA. Elements of environmental interest in particular were detected in this way due to their relatively low detection limits (Croudace et al. 2006). The measurements were acquired at a 1000-μm increment with an exposure time of 200 ms.

The obtained datasets were auto-scaled by subtracting each variable's overall average (\( \overline{x} \)) from its values and dividing by its standard deviation (\( \sigma \)), which resulted in datasets with a new mean of zero (\( {\overline{x}}_a \)) and a standard deviation of 1 (\( {\sigma}_a \)) (Eq. 1). This form of data rescaling made all variables of all associated cores intercomparable on the one hand and reduced the influence of extreme values on the PCA results on the other hand (Zitko 1994; Quinn and Keough 2013; Bro and Smilde 2014).

$$ \frac{x-\overline{x}}{\sigma}\Rightarrow {\overline{x}}_a=0;\ {\sigma}_a=1 \tag{1} $$
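The auto-scaling of Eq. 1 can be sketched in a few lines of Python with NumPy. This is only an illustration and not the software used in the study: the function name `autoscale`, the example counts, and the choice of the sample standard deviation (`ddof=1`) are assumptions.

```python
import numpy as np

def autoscale(data):
    """Column-wise auto-scaling: subtract the mean, divide by the standard deviation."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)  # sample standard deviation (assumption)
    return (data - mean) / std

# Hypothetical counts for three elements at five depths (illustrative only).
raw = np.array([
    [120.0, 15.0, 300.0],
    [135.0, 18.0, 280.0],
    [110.0, 22.0, 310.0],
    [150.0, 30.0, 260.0],
    [140.0, 25.0, 290.0],
])
scaled = autoscale(raw)
```

After scaling, every column has a mean of zero and a standard deviation of one, so that elements with very different count rates enter the PCA with equal weight.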

Light intensity was calculated as well in order to facilitate the detection of matrix differences in the sediment columns. From the monochromatic radiograph images, a maximum of 256 shades of gray were read by automatically allocating one value between 0 (for black) and 255 (for white) to every pixel.
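One way to turn such gray values into a depth profile is to average the pixels of each image row. The sketch below assumes exactly that; the text does not state how pixel values were aggregated, so the row-wise mean, the function name `light_intensity_profile`, and the tiny example image are all hypothetical.

```python
import numpy as np

def light_intensity_profile(radiograph):
    """Mean gray value (0 = black, 255 = white) for each pixel row along the core."""
    img = np.asarray(radiograph)
    if img.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    if img.min() < 0 or img.max() > 255:
        raise ValueError("gray values must lie between 0 and 255")
    return img.astype(float).mean(axis=1)

# Hypothetical 4 x 3 radiograph: rows follow depth, columns span the core width.
radiograph = np.array([
    [0, 0, 0],
    [100, 110, 120],
    [200, 210, 220],
    [255, 255, 255],
], dtype=np.uint8)
profile = light_intensity_profile(radiograph)
```

The resulting profile can then be auto-scaled in the same way as the element data.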

Variable selection for the principal component analysis was based on environmental interest and geochemical properties. Here, multivariate statistical methods have some significant advantages over univariate techniques, as they account for a group of variables that jointly influence the data variability (Borůvka et al. 2005). Even though environmental geochemical data usually contain strong correlations, PCA contributes with its exploratory approach and decreases the data dimensionality while retaining most of the original information. On the basis of the chosen variables, e.g., chemical elements, PCA provided their weights gathered in compound variables, the principal components (PCs), which best explained the data variation (Bro and Smilde 2014). These components are independent in their contribution to the explained variation, as they are orthogonal (Bro and Smilde 2014). The first PC explained the largest variation in the datasets in a certain multidimensional direction, and every subsequent component accounted for additional information. A scree plot showed how much of the original variance in the dataset was explained by each PC. On this basis, the number of principal components to be retained in the PCA model was chosen. A score plot was then developed, which structured the representation of the original samples in the new multidimensional space, whereas the loadings plot highlighted the contribution of the variables to the principal components. In general, high loadings of elements express their importance for the PC. Score and loadings plots always have to be interpreted together. Excluding sections with textural differences from the PCA models fundamentally improved their descriptive power (Reid and Spencer 2009).
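The quantities named above (explained variance for the scree plot, scores, and loadings) can be obtained from a singular value decomposition of the auto-scaled data. The following is a minimal sketch under stated assumptions, not the software used in the study: the function name `pca`, the random example data, and the convention of expressing loadings as variable-component correlations are assumptions.

```python
import numpy as np

def pca(scaled):
    """PCA of an auto-scaled (samples x variables) matrix via SVD."""
    scaled = np.asarray(scaled, dtype=float)
    n_samples = scaled.shape[0]
    u, s, vt = np.linalg.svd(scaled, full_matrices=False)
    eigenvalues = s**2 / (n_samples - 1)         # variance captured by each PC
    explained = eigenvalues / eigenvalues.sum()  # fractions shown in a scree plot
    scores = u * s                               # samples in the new PC space
    loadings = vt.T * np.sqrt(eigenvalues)       # variable-PC correlations
    return scores, loadings, explained

# Hypothetical data: 50 samples, 4 variables, two of which co-vary.
rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 4))
raw[:, 1] += 2.0 * raw[:, 0]
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
scores, loadings, explained = pca(scaled)
```

With this convention, each loading equals the correlation between a variable and the scores of a component, so it lies between −1 and 1 and can be compared across variables.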

3 Results

The deposits of the lake within the Botanical Garden of Copenhagen are muddy. In both cores, two distinct layers are interstratified in the sediment columns, referred to as M1 and M2, indicating changes in the accumulation regime (Fig. 2; XRG panels). Both substrates were visually recognized by a change in material color from grayish-/brownish-green to yellowish-/greenish-gray. For BH27, M1 occurred between 61 and 71 cm, whereas M2 appeared between 89 and 99 cm. In core BH28, these layers did not appear quite as thick with M1 between 62 and 71 cm and M2 at around 86.5 cm.

Fig. 2. Radiograph image (XRG) and auto-scaled data for light intensity by depth for cores (a) BH27 and (b) BH28. Value 0 marks the scaled data average

Textural changes were also identified on the basis of light intensity, as it is a direct function of material density. Light intensity was determined from the individual radiograph images. Figure 2 shows these radiographs and the related auto-scaled light intensity for cores BH27 and BH28. Auto-scaled light intensity values steadily increased from the bottom to the top of core BH27 (Fig. 2a). This course was interrupted by a distinct decrease between 99- and 92-cm depth. At around 38-cm depth, values shifted from the measurement mean to 1.6 σ and continued at this level up to the core surface. In the case of BH28 (Fig. 2b), auto-scaled light intensity revealed a distinct maximum at 86.5 cm. Strongly varying material density differences were recognized up to a depth of 68 cm. From there on, values steadily increased to the sediment surface.

Figure 3 depicts the auto-scaled element data for core BH27 over depth. Silicon, K, Ti, Fe, and Rb values showed a major deviation from their overall progression at the core bottom as well as at the depths between 89 and 99 cm. This tendency was also obvious in the Cd trend to a moderate degree.

Fig. 3. Auto-scaled data of XRF measurements over depth for core BH27. Value 0 marks the scaled data average, whereas every tick mark represents one standard deviation

Auto-scaled element data for core BH28 over depth is depicted in Fig. 4. Also for this core, Si, K, Ti, Fe, and Rb values showed a major shift from their overall course at the core bottom as well as at the depths between 83 and 89 cm. This tendency was associated with a decrease in Cr values at these depths.

Fig. 4. Auto-scaled data of XRF measurements over depth for core BH28. Value 0 marks the scaled data average, whereas every tick mark represents one standard deviation

The primary objective of the study was to examine whether PCA could help to visualize changing contamination patterns over core depth. Therefore, an initial PCA on the entire core (full core model) was run to identify, on an elemental basis, the textural changes that also became visible in the light intensity data. Furthermore, general contaminant features were to be accentuated. Subsequently, a second PCA (top core model) was carried out covering only the top part of the cores in order to avoid the influence of marked textural differences on the PCA result. The distinction of these core sections was based on the element and light intensity profiles; the lower parts of the cores were then excluded, and the data columns were auto-scaled again without these data (Wold et al. 1987; Quinn and Keough 2013). In this way, more homogeneous core segments were considered. The top core PCA model was then calculated, covering the same variables but only for the upper, homogeneous part of the cores.
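The restriction to the top core can be sketched as follows. The important detail is that the retained samples are auto-scaled again on their own, so that the excluded layers no longer influence the column means and standard deviations. The function name `top_core_autoscale`, the 1-cm sampling step, and the synthetic data are assumptions for illustration; the 85-cm limit is the one reported for BH27.

```python
import numpy as np

def top_core_autoscale(depth_cm, data, max_depth_cm):
    """Keep samples at or above max_depth_cm, then auto-scale that subset alone."""
    depth_cm = np.asarray(depth_cm, dtype=float)
    data = np.asarray(data, dtype=float)
    keep = depth_cm <= max_depth_cm
    subset = data[keep]
    scaled = (subset - subset.mean(axis=0)) / subset.std(axis=0, ddof=1)
    return depth_cm[keep], scaled

# Hypothetical record: 1-cm steps down to 145 cm with three variables.
depth = np.arange(0.0, 146.0, 1.0)
rng = np.random.default_rng(1)
data = rng.normal(size=(depth.size, 3))
data[(depth >= 89) & (depth <= 99)] += 5.0  # stand-in for a minerogenic layer
top_depth, top_scaled = top_core_autoscale(depth, data, max_depth_cm=85.0)
```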

Principal component 1 of the top core model of BH27 already accounted for 60 % of the dataset's variation, whereas PC1 of the full core model captured 46 %. The explained variance for PC2 increased from the full core model to the top core model as well, from 18 to 20 %. At least three principal components would be required in order to reach about 80 % explained variance for the full core model, whereas only two were required in the top core model.

The first PC of the full core model of BH28 captured only 40 %, whereas PC1 of the top core model accounted for 47 %. Principal component 2 captured 26 % in the full core model and 29 % after data restriction. Again, the explanatory strength increased from the full core model to the top core model. For the full core model to reach a level of explained variance equivalent to that of the top core model, at least three principal components would be required here as well.
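The percentages reported above can be summed to compare how much variance the first two components capture in each model. The short sketch below only restates the values given in the text; the dictionary layout is an arbitrary choice.

```python
# Explained variance (%) of PC1 and PC2 as reported in the text.
reported = {
    ("BH27", "full core"): (46, 18),
    ("BH27", "top core"): (60, 20),
    ("BH28", "full core"): (40, 26),
    ("BH28", "top core"): (47, 29),
}

cumulative = {key: pc1 + pc2 for key, (pc1, pc2) in reported.items()}

for (core, model), total in cumulative.items():
    print(f"{core} {model}: PC1 + PC2 = {total} %")
```

Two components thus capture 64 % (full core) versus 80 % (top core) for BH27, and 66 % versus 76 % for BH28.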

The loadings of the chosen chemical variables on the principal components are presented in Fig. 5. For both cores, BH27 (Fig. 5a) and BH28 (Fig. 5b), the first PC was strongly related to elements referring to clay minerals like Si, K, Ti, Fe, and Rb. In general, relations were strengthened from the full core to the top core models. In the case of BH27, minerogenic elements reached mean loadings of 0.91 in the full core model and 0.96 in the top core model. Cadmium also showed a strong relation to this PC, although it is rather considered a contamination indicator. For BH28, clay mineral-related elements had a mean loading of 0.88 for the full core model and only 0.82 for the top core model. The full core model of BH27 described the contaminants with the use of the first three PCs. This was reduced to two principal components in the top core model. The strongest loadings were obtained for Cu, Zn, and Pb with an average of 0.82.

Fig. 5. Loadings of variables (weights) with the first three principal components for full core and top core PCA models of (a) core BH27 and (b) core BH28

The PCA models for BH28 always captured the contaminants in the second principal component, although with loadings shifting from positive to negative and vice versa. The image appeared mirrored, which was attributed to a switch of algebraic signs from the full core to the top core PCA model of this dataset due to the associated outlier removal (Bro et al. 2008). Disregarding this arbitrariness in sign conventions (Bro et al. 2008), the amplitude of the loadings increased for Cu, Zn, and Pb. In addition, PC2 carried a relatively high negative loading for Cd of −0.81.
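The arbitrariness of the sign can be illustrated numerically: flipping the sign of one component in both the scores and the component directions leaves the product of the two, and therefore the model, unchanged. The sketch below uses random data and an SVD-based PCA purely as an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
raw = rng.normal(size=(30, 3))
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

u, s, vt = np.linalg.svd(scaled, full_matrices=False)
scores = u * s
directions = vt.T

# Flip the sign of PC2 in both the scores and the component directions.
flipped_scores = scores.copy()
flipped_directions = directions.copy()
flipped_scores[:, 1] *= -1
flipped_directions[:, 1] *= -1

original = scores @ directions.T
mirrored = flipped_scores @ flipped_directions.T
same_model = np.allclose(original, mirrored)
```

Only the orientation of the axis changes, which is why a mirrored loadings plot carries the same information as long as scores and loadings are read together.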

Figure 6 compiles the scores for PC1 and PC2 of cores (a) BH27 and (b) BH28 over depth. A layer of textural difference was recognized by high positive scores in the full core model of core BH27. It extended over 10 cm from 89 to 99 cm, reaching values above 3 σ (PC1). This part was also recognized in the auto-scaled element data of the full core (Fig. 3). Below this part, several other depths (e.g., 123–135 cm) could be addressed in the same way. In general, a decreasing trend for PC1 was obvious from the core bottom to the sediment surface, which was interrupted by the corresponding M2 section and a layer with elevated values around 63-cm core depth (M1). The textural variation also became obvious in PC2. Its course started at around −2 σ and drastically increased to the mean at 130 cm. Interrupted by M2, this level was kept until around 60 cm. From there, PC2 values increased to a maximum of 1.4 σ at 31.5 cm. The values then returned to the mean level all the way up to the sediment surface. The continuing analysis comprised only depths from 0 to 85 cm in order to avoid the influence of the described elevated values on the top core PCA model.

Fig. 6. Principal component scores by depth for the full core model of (a) BH27 and (b) BH28

In the case of BH28, the first principal component featured a distinct maximum at 86.5 cm with 4.6 σ, which could be traced in PC2 as well. Principal component 2 of the full core model had negative loadings of the associated elements (see Fig. 5b) and therefore associated negative scores. In general, the depicted inverse scores were linked to concentration maxima. Values remained around 1 σ until about 68 cm, from where they moved toward their minimum of −2.2 σ at 30.5 cm. They then returned to the mean toward the sediment surface. The depths from 0 to 80 cm were included in the top core model of BH28.

4 Discussion

The datasets analyzed in detail comprised two sediment cores from the lake of the Botanical Garden of Copenhagen. The rod-operated piston corer was pushed as deep as possible into the sediment until a coarse, sandy bottom was reached. This indicated that the complete sequence of lacustrine deposits was retrieved.

Even though the two sediment cores featured macroscopic differences as seen on the XRG images (Fig. 2), their PCA models revealed similar patterns, highlighting clay minerals by the first principal component and heavy metal contamination by the second one (Fig. 5). However, variable behavior as depicted in the calculated loadings appeared to differ between the individual cores. The variance explained by the individual PCs varied between the models, although the variables showed the same tendencies. This was due to different sedimentological features as depicted in light intensity (Fig. 2; XRG panels). The full core models highlighted textural variation and covered all depths and their associated measurements. The top core models had the advantage of being restricted to a homogeneous core section only and thus maintained a high level of explained variance with fewer dimensions. It was important to carry out an initial PCA on the full datasets, as it revealed differences that disproportionately shifted the model's depiction, and it helped to reduce the dimensionality even further. Along with this, density features of the investigated material as represented by light intensity helped to restrict the model and improve its explanatory strength.

The thickness of the minerogenic layers (M1 and M2) could easily lead to an overestimation or underestimation of variables due to their associated chemical properties. Here, M2 in particular stood out due to high values for PC1. Therefore, it was necessary to reduce the number of considered objects for further analysis. Other studies showed that a normalization of the data to a lithogenic element (Löwenmark et al. 2011) or to grain size (Reid and Spencer 2009) improved model recovery. Results for ideal normalization elements like aluminum (Al) were disregarded due to the use of a Mo tube in the XRF analysis. Light elements in particular could be biased by the absorption effect of water along with the use of a polyethylene film covering the core during the measurement (Kido et al. 2006; Tjallingii et al. 2007). In addition, surface roughness and density could affect XRF analysis (Croudace et al. 2006). In order to minimize these effects, the PCA result was instead controlled by choosing only the homogeneous top sections of the sediments in the top core PCA models on the basis of element and light intensity data. For comparison, Reid and Spencer (2009) obtained a similar loading recovery when normalizing their environmental data by the fine sediment fraction (<63 μm).

A distinct trend over depth was visible when looking at the score plot in combination with the loadings plot for the top core model of core BH27 (Fig. 7). Samples starting from the M2 layer, around 85 cm, showed a tendency toward the natural component. This process peaked at depths of around 60 to 70 cm and was related to the M1 layer samples. Lead dominated the pattern at depths above this level. At around 35 cm, this trend appeared to be superimposed by other contaminants, namely Cu and Zn. The sediment surface (0–20 cm) was dominated by the influence of redox-sensitive Mn, which marked the water-sediment boundary.

Fig. 7. The top core PCA model of core BH27 including (a) score and (b) loadings plot

A similar picture to Fig. 7 was drawn from the top core model of BH28 (Fig. 8). Depths from 60 to 70 cm were directly associated with the first component (PC1), while subsequent samples started to be dominated by the second component (PC2). Heavy metals like Cu, Zn, and Pb had high levels especially between 30 and 50 cm. Sediment surface depths could only be associated with redox-sensitive Mn to a minor degree.

Fig. 8. The top core PCA model of core BH28 including (a) score and (b) loadings plot

The explanatory power of the PCAs was increased by limiting the models to certain depths, that is, by moving from the full core model to the top core model. In order to distinguish the variables of highest importance for a principal component, we set an overall loadings threshold of r = 0.8. This procedure facilitated the characterization of the compound variables. Thereby, the first PC could be addressed as the natural component, since it had the strongest relationships with elements that refer to clay minerals and aluminosilicates. Silicon mainly originated from weathering processes, in line with the high loading for Ti (0.95, BH27) (Olsen et al. 2010). Rubidium and K showed strong loadings in both PCA models, as they originate from the clay mineral fraction in particular (Vasskog et al. 2012). The second PC could be characterized as the contamination component, as it contained high loadings of heavy metals which are linked to anthropogenic influence.
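Applying such a threshold amounts to a simple filter on the loadings matrix. The sketch below is illustrative only: the function name `dominant_variables` and the loadings values are hypothetical, and comparing the absolute loading with the threshold (so that strongly negative loadings also count) is an assumption.

```python
import numpy as np

def dominant_variables(names, loadings, component, threshold=0.8):
    """Names of variables whose absolute loading on one PC reaches the threshold."""
    column = np.asarray(loadings, dtype=float)[:, component]
    return [name for name, value in zip(names, column) if abs(value) >= threshold]

# Hypothetical loadings for five elements on PC1 and PC2 (illustrative values only).
names = ["Si", "Ti", "Cu", "Zn", "Mn"]
loadings = np.array([
    [0.93, 0.10],
    [0.95, 0.05],
    [0.20, 0.85],
    [0.15, -0.88],
    [0.30, 0.40],
])
natural = dominant_variables(names, loadings, component=0)
contamination = dominant_variables(names, loadings, component=1)
```

In this made-up example, Si and Ti would characterize the first component and Cu and Zn the second, while Mn would fall below the threshold on both.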

A third principal component was of little importance in the top core models, since two PCs already explained ~80 % of the variance in the dataset. The Kaiser criterion (eigenvalues > 1) is often discussed as a threshold for the number of principal components used in a model (Comero et al. 2011). However, since two principal components covered a large share of the dataset variation, two-component models were considered, as they facilitated a two-dimensional visualization. Moreover, the third PC did not feature strong loadings except for Mn or Si after the models were limited to the top core sections.
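For comparison, the Kaiser criterion mentioned above can be evaluated from the eigenvalues of the correlation matrix. This is a minimal sketch on synthetic data, not a re-analysis of the cores: the function name `kaiser_count` and the four synthetic variables, built as two strongly correlated pairs, are assumptions.

```python
import numpy as np

def kaiser_count(data):
    """Number of PCs whose eigenvalue of the correlation matrix exceeds 1."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # descending order
    return int((eigenvalues > 1.0).sum()), eigenvalues

# Hypothetical data: four variables forming two strongly correlated pairs.
rng = np.random.default_rng(3)
base = rng.normal(size=(200, 2))
noise = 0.1 * rng.normal(size=(200, 4))
data = np.column_stack([base[:, 0], base[:, 0], base[:, 1], base[:, 1]]) + noise
n_keep, eigenvalues = kaiser_count(data)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 means that a component explains more variance than a single auto-scaled variable would on its own.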

It was possible to capture trends in the contamination sources with the PCA models, since element contribution varied over depth. Temporal shifts in Pb and Zn contamination of floodplain sediments in the south-eastern Czech Republic were observed by Matys Grygar et al. (2012). In our PCA models, the influence of one contamination-related variable decreased in favor of another, as depicted in the shift between Pb and Zn (Figs. 7 and 8). Copenhagen has gone through a transition from a city influenced by the Industrial Revolution in the eighteenth century to one that is today branded “green,” due to its intended extended use of renewable energies (Danish Ministry of Climate, Energy and Building 2013; European Environment Agency 2014). Especially during the period from the 1920s to the early 1990s, Pb was extensively used as an additive to gasoline (tetra-ethyl lead) and was widely emitted to the atmosphere (Nriagu and Pacyna 1988). Environmental legislation, such as the phasing out of Pb in gasoline and the improvement of fly ash filters in incineration plants (European Council 1975, 1987; von Storch et al. 2003), supplied the basis for decreasing stresses on human health as well (Molin Christensen and Holst 1988). As Zn levels remained relatively high, source contribution patterns must have changed. Fossil fuel combustion may no longer be its dominating origin, as the use of galvanized metal products and fertilizers increased at the same time. Additionally, Zn-containing compounds are common constituents of hygienic products and agrochemicals and serve as wood preservatives (Bhattacharya et al. 2002; Nicholson et al. 2003).

5 Conclusion

The application of principal component analysis (PCA) proved to be a helpful tool for interpreting XRF core scanning datasets for the presented lacustrine cores. The number of variables of the presented datasets was considerably decreased from 13 measured variables to only two principal components while still capturing about 80 % of the original variance in the datasets. This aspect facilitated data presentation and interpretation. Limiting the PCA models to only the homogeneous top part of the cores, avoiding deeper layers with different texture, markedly reduced the number of components needed in the PCA models.

The first and second principal components were related to sedimentological features and contamination burdens, respectively. Clay minerals dominated some parts of the core matrices and controlled the first principal component. Heavy metals, in contrast, dominated the second principal component. Surprisingly, even the course of contamination trends was revealed with the help of the score plots, highlighting the transition of contamination loads over core depth and time in both locations.