1 Introduction

Bonn-OSINet was established in Ostend, Belgium, during one of the Bonn Agreement’s annual meetings in 2005. Spilled oil from the “Tricolor”, a Norwegian vehicle carrier that sank in the English Channel in 2002, had reached the coasts of Belgium, France, and the Netherlands, and whereas oil strandings were actually expected, laboratories from these countries were not able to prove any connection with the ship’s oil, at that stage. High cleaning costs were claimed, and not being able to provide proof that the stranded oil originated from the “Tricolor” resulted in problems with reclaiming the costs of that cleaning.

“No match” was found between source and spill samples. Possible reasons are that the wrong combinations of spill and source samples were available for comparison and/or that experience in oil spill identification was insufficient. The results could have been much more useful as evidence if the laboratories testing the samples had known each other, cooperated, and exchanged samples. Thus, the Bonn Agreement decided in 2005 that laboratories involved in oil spill identification in the Bonn Agreement area should cooperate in the future.

It proposed that a forum of Bonn experts on oil spill identification should be created, with Dr. Gerhard Dahlmann (Germany) as convenor. Recommended by the workshop, the aim of the forum was to provide mutual assistance in difficult cases, to promote quality assurance in oil spill identification (especially through ring tests, development of common reference materials (CRMs), and sample exchanges) and consider the possibility of a common database of oil sources [2].

From the very beginning, Bonn-OSINet proved to be very useful because in a new attempt to identify stranded oil in the Tricolor case in 2006, combined sets of spill and comparison samples could be analyzed. While some differences between spill and comparison samples were found, these could without doubt be attributed to the weathering of the spill samples.

Based on the evidence that oil had leaked from the Tricolor, now combined with the analytical results for the stranded oil, the Netherlands, Belgium, and France received 1.8, 2.0, and 0.54 million €, respectively, from insurance companies in 2007.

Cooperation and mutual assistance might require that all participating laboratories analyze and compare oil samples in the same way. A common method makes such cooperation much easier. The final conclusion about the relation between samples should be as objective as possible. Every participant should be able to trace back every conclusion produced by other laboratories step by step.

1.1 Difficulties in the Development of a New Method

Producing a common method for oil spill identification posed a big challenge, particularly because of the great variability of oil spill cases and the many different circumstances in which chemical comparisons of oil samples are required. Publications about chemical investigations in oil spill cases are rare, and they always focus on bigger cases [3–5]. Bigger cases consume more resources, require more time, and allow deeper investigations into the composition and the compositional changes of oil in the environment. Thus, in those cases, analytical errors can be determined more precisely, and experiments can even be carried out to verify the findings observed on field samples [5].

Modern, so-called “unconventional” statistics, such as pattern recognition or PCA, can only be used for result evaluation in bigger cases, where a large number of samples is taken, and more than one source is suspected [6, 7].

Large oil spill cases occur only rarely. Smaller ones do not generally receive the same level of attention and are also not described in literature. However, hundreds of oil spills are still found every year in European waters, although their numbers have decreased over recent years [8–11].

Participating laboratories had different experiences and preferences: they had worked with different kinds of oil, e.g., crude oils in the oil platform area of the middle and northern North Sea, heavy fuel oils (HFOs) on the highly frequented shipping lanes in the southern North Sea, or light fuel oils in inland waters. In addition, the preconditions in the laboratories varied, as most laboratories had only very few cases per year in which oil samples had to be compared with suspected source samples. In these laboratories, analytical instruments were mainly used for other purposes, e.g., marine environmental monitoring. The instruments were thus available for only a very limited time, and the analytical parameters had to be changed whenever oil samples had to be analyzed.

1.2 Intercalibrations and Participation

Since OSINet was established in 2005, annual ring tests – Round Robins – have been conducted to increase the knowledge and experience of laboratories and to improve the quality of analytical data. Each Round Robin dealt with different kinds of problems, which appear when spilled oil has to be compared with oil from suspected sources (Table 1). If these problems could not be resolved conclusively during those tests, further experiments were conducted by different participants for clarification. This often resulted in scientific publications [12–16].

Table 1 Intercalibrations (Round Robins), number of participants, and main topics

The method has been continuously tested and improved over the last years. Since 2005, the number of OSINet participants has grown from six members of the Bonn Agreement area to about 50 scientists from 27 laboratories from 20 countries all over the world.

Summary reports of the Round Robins can be found on the Bonn Agreement website, section Bonn-OSINet [17].

2 Methodology

2.1 General Principle and GC–FID

The principle of oil spill identification is based on the fact that petroleum consists of many thousands of different organic compounds. It is simply neither practical nor possible to analyze and compare all of them. Therefore OSINet decided to analyze the samples by means of gas chromatography with flame ionization detection (GC–FID) and low-resolution gas chromatography–mass spectrometry coupling (GC–MS), in order to compare the general compound patterns and to measure a range of specified compounds. Both analytical techniques are adequately available in laboratories and are precise enough for a large range of compounds and compound groups.

It is always easier to ascertain what it is not than what it is. Thus, the general concept for comparing oil samples consists of looking for differences. A fuel oil cannot be identical with a lubricating oil, for example. Such a difference can easily be identified by simple GC–FID. Preliminary investigations by GC–FID are thus of great value, where comparison samples are taken from several different compartments of a suspected ship, which contain different types of oil.

Further characteristics of the samples can be identified by GC–FID, such as roughly the concentration of oil in the spill samples, the shape of the “unresolved complex mixture” (UCM), the shape of the envelope of the n-alkanes, or the relation of the branched chain alkanes pristane and phytane. If “obvious” differences of samples are detected, even between samples of the same type, investigations may be terminated here.

However, samples should not be declared nonmatching if there is even the slightest doubt that they really differ. In such cases, samples must be further analyzed using the more complex GC–MS.

One has to keep in mind here that the composition of spill samples may have changed because the reduction of compounds due to weathering begins as soon as oil is released into the environment.

2.2 GC–MS

By means of GC–MS, a great multitude of compounds may be found in oils. Huge collections of oil compounds, which have proved especially useful in bigger oil spill cases, are available in literature [18]. Generally, knowledge about the classification of oils and their differentiation is derived from geochemistry, because information about the source and maturity of detected oil is required in oil exploration [19–21]. Detailed classifications are achieved by means of compound concentrations and relations of compound groups.

Time and resources will almost always be too limited to determine all compounds that can be measured. But which are the most important?

Empirically, i.e., gained from field and laboratory experiments, intercalibrations, and oil incidents, where the source oils were known, a minimum set of compounds is chosen, which must be used in every investigation (normative compounds). Since not all compounds are present in every oil type, this set is adapted to the special type of oil involved in a given oil spill incident. Examples of additional compounds which may also be useful are then provided (informative compounds).

Thus, distinct sets of (semi)quantitative concentrations of oil compounds have to be determined and have to be compared between spill samples and samples from suspected sources.

Concentrations are identical, i.e., not discernible, if their differences do not exceed the repeatability limit of the analytical method. If they are all identical, the proof is given that the findings of the visual inspections are actually true, i.e., the samples are identical without any scientific doubt.

In every oil spill case, double measurements are used for verifying the procedure. These repeated measurements are used to find out, whether the precision of the analytical method is adequate, and whether all compounds of the oil involved can actually be determined precisely enough. If not, it is justified to exclude those compounds from the given sets.

2.3 Weathering

When a spill sample is compared with a suspected source sample in this way, however, differences in concentrations of compounds may not only be present due to analytical error.

Differences in compositions may also appear because of weathering processes, contamination, and inhomogeneous distributions of oil. All these problems can even appear at once.

In order to avoid “false-negative” conclusions, the responsible analyst has to be acquainted with these difficulties and must be able to show indisputably that they have not falsified the results. This means that the proof has to be given that every single observed difference is not derived from the fact that spill and suspected source sample consist of different oils. In other words, the proof has to be given that it is possible that a spill originates from a suspected source, despite those differences.

Spilled oil samples and suspected source oil samples can never be identical because the composition of oil changes, as soon as it is released into the environment. Volatile compounds evaporate immediately, for example, and their concentrations decrease rapidly. Correspondingly, the concentrations of less volatile compounds increase. Thus “identity” can no longer be determined. The composition of the oil has changed.

2.4 A Nice Analogy

Sometimes jurists do not follow the argument that differences between spilled oil and suspected source oil are caused by weathering processes. The reason might be that it is not easy to refute an expert report in which “identity” is concluded, and a conclusion like this often does not leave much space for the consideration of evidence. “How can you be sure that the oils were not different from the beginning?” is thus a common question in court trials.

Others even make use of the definition of the word “identity,” which means that every measured characteristic of two samples must be the same. A modification of this definition, e.g., “no differences, except those which are caused by weathering of the spill sample,” is simply not accepted.

A good response in this case might be the following analogy: if the sun is shining, and I spend the whole day outside in the sun, my face will have turned red in the evening and might be tanned the next day. Thus, a special characteristic of my person has changed, but I am still the same person.

Weathering processes follow distinct rules: just as a longer stay in the sun will lead to a deeper red color of my face, the effects of evaporation will be more severe the longer an oil spill stays in the environment. A further parallel can be found with regard to the strength of the irradiation from sunlight.

2.5 Partial Weathering Plots (PW Plots) and Mere Evaporation

Generally, lower boiling oil compounds are more affected by evaporation than higher boiling ones, which means that their concentration is more reduced. However “more or less” is a very vague term to use, while actually the proof is needed that a reduction by evaporation has taken place.

The very first steps for presenting this proof are undertaken in the NORDTEST method NTChem001 [22]: PW plots are mentioned here, which may generally show how compounds and compound classes decrease by different weathering processes of the spill sample. Here, their concentrations in the spill sample are plotted against their concentrations in the suspected source sample. Corresponding ratios are given in percentages, and if not affected by weathering processes, all values must fall on a straight 100% line. However, in NTChem001 [22], only a very general description of the PW plots is given, and the compounds, which have to be measured, are not given in detail, nor is there any information given with regard to the analytical uncertainty of the method.

Objectivity is highly increased, when the kind and number of compounds to be used for the PW plots are prescribed. Thus a set of “normative” compounds is given in CEN/TR 15522–2:2012 [1], and all of them have to be determined always and in any case. If a compound of this set is excluded from the PW plot, it has to be explained why, e.g., because it is too low in concentration and/or cannot be measured precisely in repeated measurements. The latter is proved by those PW plots, where double measurements are used. The double analysis of a sample will theoretically result in data points at 100%. In practice, however, inherent variance of the instrument and data handling causes variations around the 100% value (see Fig. 1).

Fig. 1 PW plots of double measurements of the spill sample of RR2013. The standard deviation of the data points in the left graph is 2.8 and in the right graph 7.4

It makes no difference if semiquantitative values are used here. In practice, instead of using absolute concentrations, it is even more convenient to normalize the concentrations of the compounds to the concentration of a stable, higher boiling compound, which is not easily affected by weathering processes, e.g., hopane (Eq. 1).

$$ \text{Ratio of compounds} = 100 \cdot \frac{s_1/\mathrm{hop}_1}{s_2/\mathrm{hop}_2}, $$
(1)

where s is the concentration of the compound, hop is the concentration of hopane in the same sample, and the indices 1 and 2 correspond to the spill and the suspected source sample, respectively.
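As a minimal sketch of Eq. 1, the following Python snippet computes the hopane-normalized ratio for every compound found in both samples. The function names, the dictionary layout, and the peak values are assumptions made for illustration; they are not taken from CEN/TR 15522–2:2012 or from any OSINet software.

```python
def pw_ratio(spill, spill_hopane, source, source_hopane):
    """Eq. 1: hopane-normalized spill/source ratio of one compound, in percent."""
    if min(spill_hopane, source, source_hopane) <= 0:
        raise ValueError("hopane and source values must be positive")
    return 100.0 * (spill / spill_hopane) / (source / source_hopane)


def pw_table(spill_peaks, source_peaks, reference="hopane"):
    """Return {compound: ratio in %} for compounds present in both samples."""
    common = (spill_peaks.keys() & source_peaks.keys()) - {reference}
    return {
        name: pw_ratio(spill_peaks[name], spill_peaks[reference],
                       source_peaks[name], source_peaks[reference])
        for name in sorted(common)
    }


# Invented peak heights in arbitrary units (sample 1 = spill, sample 2 = source).
source = {"hopane": 200.0, "phytane": 400.0, "n-C17": 800.0}
spill = {"hopane": 100.0, "phytane": 180.0, "n-C17": 100.0}

ratios = pw_table(spill, source)
for name, value in ratios.items():
    print(f"{name}: {value:.1f} %")
```

With these invented numbers, phytane lies at 90% and n-C17 at 25%, so only n-C17 would need an explanation by weathering. Because each compound is divided by hopane from the same sample, semiquantitative peak heights can be used in place of absolute concentrations.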

Error limits (yellow, ±2 st. dev.; red, ±20% of the ratio) are included, representing the maximal accepted analytical error based on a st. dev. of 7.5% which is allowed in repeated measurements and between the non-weathered part of matching samples. Several Round Robin tests have shown that these limits can normally be reached. The comparison of double measurements is done in the same way as the comparison of a spill with a suspected source sample and forms an integral part of the method.
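The check of a double measurement against the 7.5% limit can be sketched as follows. This is an illustration under stated assumptions, not the normative procedure: the standard deviation is taken here as the sample standard deviation of the PW ratios, and a compound is flagged when it lies more than two standard deviations (2 × 7.5 = 15 percentage points) away from 100%. The function name and the example values are hypothetical.

```python
from statistics import stdev

MAX_STDEV = 7.5  # percent; limit quoted in the text for repeated measurements


def check_duplicates(ratios, max_stdev=MAX_STDEV, k=2.0):
    """Summarise a PW plot of a double measurement given as {compound: ratio in %}.

    Assumption: outliers are compounds further than k * max_stdev from 100 %.
    """
    values = list(ratios.values())
    spread = stdev(values)
    outliers = sorted(
        name for name, ratio in ratios.items()
        if abs(ratio - 100.0) > k * max_stdev
    )
    return {"stdev": spread, "acceptable": spread <= max_stdev, "outliers": outliers}


# Invented ratios of a sample measured twice.
duplicate = {
    "n-C17": 98.0,
    "pristane": 101.0,
    "phytane": 103.0,
    "C2-phenanthrenes": 122.0,
    "hopane-30ab": 100.0,
}

result = check_duplicates(duplicate)

# Excluding the poorly repeatable compound, with a stated reason, and re-checking.
cleaned = {k: v for k, v in duplicate.items() if k not in result["outliers"]}
result_cleaned = check_duplicates(cleaned)
```

In this invented example one compound cannot be measured repeatably; once it is excluded, the remaining ratios scatter well within the limit. This mirrors the rule that a normative compound may only be left out when the exclusion is justified, e.g., by the double measurement.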

Concentrations of compounds affected by weathering processes are lower in the spill sample than in the original oil, and the amount of reduction can directly be measured in the PW plots. If the error limits of a compound are exceeded, this has to be explained.

If evaporation is assumed to have caused this reduction, it has to be shown that the amount of reduction of every higher boiling compound corresponds with the amount of reduction of every lower boiling compound.

Compounds evaporate, when their boiling points are exceeded, and when a nonpolar column is used in GC, compounds are mainly separated according to their boiling points.

Based on these principles, simulated distillation by GC is widely used for characterizing oil products in petroleum industry. Consequently, if a spill sample is affected by evaporation, a similar S-shaped evaporation curve must appear, when the concentrations of the oil compounds of the spilled oil, divided by the concentrations of these compounds in the suspected source oil, are plotted against their retention times (see Fig. 2).
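A rough plausibility test for such an evaporation pattern can be sketched in a few lines. The check below only asks whether the ratios rise with retention time and never fall back by more than a tolerance; it does not fit an S-shaped curve, and the tolerance of 15 percentage points as well as all data values are assumptions chosen for illustration.

```python
def follows_evaporation_trend(points, tolerance=15.0):
    """Check whether PW ratios are consistent with evaporation alone.

    points: iterable of (retention_time_min, ratio_percent).
    Returns True if, going from early to late retention times, no ratio
    drops more than `tolerance` below the highest ratio seen so far.
    """
    running_max = float("-inf")
    for _, ratio in sorted(points):
        if ratio < running_max - tolerance:
            return False
        running_max = max(running_max, ratio)
    return True


# Invented ratios: low boilers lost completely, high boilers near 100 %.
evaporated = [(10, 0), (15, 5), (20, 30), (25, 70), (30, 95), (40, 101), (50, 98)]

# Invented ratios without any order, as expected for unrelated oils.
scattered = [(40, 150), (42, 40), (45, 110), (48, 95)]

print(follows_evaporation_trend(evaporated))  # True
print(follows_evaporation_trend(scattered))   # False
```

A compound that fails this test is not proof of a nonmatch by itself; it marks a deviation that has to be explained, for example by one of the other weathering processes.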

Fig. 2 HFO evaporated/distilled at 400° for 4 h (comparison of sample 6 with sample 1 in Round Robin 2007)

One has to keep in mind here that, compared to the ratios produced from compounds detected by the same mass fragment [1], an additional error is introduced. This error is connected with the sensitivity of the MS for different masses, which changes differently with time (this is why instruments must be recalibrated from time to time). MS-PW plots are therefore best produced from data acquired in consecutive runs, so samples must be analyzed in a single batch run.

In Fig. 2, evaporation was tested. Here, it is a fact that sample 6 is derived from source 1 because the samples originate from an experiment: source 1 was evaporated, which yielded sample 6.

Consequently, in real cases it can be concluded without any scientific doubt that a spill sample is derived from a suspected source sample if a PW plot such as that in Fig. 2 is found.

2.6 Additional Weathering Processes

The effects of weathering on oil are cumulative. In the Round Robin test in 2010, a crude oil was artificially biologically degraded: oil spiked with a fertilizer was left on seawater for several weeks.

In Fig. 3 one can clearly find the region of evaporation (area of red columns) and the region of bacterial degradation (purple columns). Whereas the n-alkanes are heavily affected by bacterial degradation, the isoprenoids, i.e., norpristane (nor), pristane (pr), and phytane (phy), are not. N-C17, for example, is reduced by about 45% by bacterial degradation and by about 25% by evaporation. The very even reduction of the lower boiling aromatics (Methyl-phenanthrenes and Methyl-dibenzothiophenes at about 30 min) is mainly caused by dissolution, whereas all higher boiling aromatics and all biomarkers are not affected at all.

Fig. 3 Spilled crude oil from a fertilizer experiment (comparison of sample 3 with sample 1 in Round Robin 2010)

In order to keep oil off the beaches, a crude oil spill that had been discharged from a platform in the Nigerian oilfields was heavily treated with dispersants. Nevertheless, it could be proved that the oil reached the shore (Round Robin 2012, sample 2). Apart from weak evaporation and bacterial degradation, only methylanthracene (MA) was heavily degraded, by about 55%, through photooxidation (Fig. 4) (cf. [5]).

Fig. 4 Identification of spilled crude oil on the Nigerian coast after a bigger accident in the Nigerian offshore fields (sample 3 of Round Robin 2012, MA)

It is confirmed that sample 4 from Round Robin 2011 originated from the sunken tanker “Erika” because samples from this site were continuously taken over the years. But the accident had happened more than 10 years ago. Very severe weathering can be seen in the PW plot of Fig. 5: all compounds up to the mid-boiling aromatics have disappeared. In addition, also higher boiling aromatics and even distinct biomarkers are severely affected by dissolution, photooxidation, and even bacterial degradation. In this case, the source was known (“Erika”). In an unknown case, it might be difficult to prove that every reduction, i.e., every deviation from the 100% line, was caused by weathering effects. Hardly anything can be found in literature about such highly weathered samples and the degradation of biomarkers. All OSINet members agreed that in this case the conclusion should be reduced to a “probable match” because the number of still matching ratios is too low.

Fig. 5 Identification of spilled HFO from the sunken tanker “Erika” on the French coast more than 10 years after the accident had happened (sample 4 of Round Robin 2011)

2.7 PW Plots with Weathering Indication

For the correct interpretation of weathering, it might be useful to know which of the compounds mentioned in CEN/TR 15522–2:2012 [1] are affected by the different weathering processes. In Table 2, the weathering behavior of these compounds is indicated by stable (very resistant), bio(degradation), solub(ility), and photo(oxidation). Some of the PAHs have no indication. They are not stable enough to be indicated as stable and are also not specifically sensitive for one of the weathering effects.

Table 2 Weathering sensitivity of compounds mentioned in the CEN/TR 15522–2:2012 [1] (bold: stronger effect)

The information given in Table 2 is used to create PW plots with indication of weathering. An example is shown in Fig. 6. Artificially biodegraded heavy fuel oil (HFO) from the Erika spill (RR2011) is compared with the original HFO. A small amount of oil has been weathered by Cedre (Fr) at room temperature for 2 months on seawater with a fertilizer in the dark in a large open beaker constantly mixed with a magnetic stirrer.

Fig. 6 The original HFO (source 1) compared with artificially biodegraded HFO from the Erika spill (RR2011, spill 1)

In Fig. 6, an S-shaped evaporation line is drawn through the compounds indicated as stable in Table 2.

Markers for biodegradation are the n-alkanes, which have been almost completely removed here. In the PW plot of Fig. 6, they are represented by C17 and C18, which can both be found at about 1% between 25 and 30 min. The branched alkanes pristane (Pri) and phytane, however, are more robust against biodegradation and can be found close to the evaporation line at 21% and 36%, respectively, between 25 and 30 min. Figure 6 shows that, besides the alkanes, the PAHs are also reduced in this experiment. All the biomarkers (sesquiterpanes, hopanes, steranes, and aromatic steranes), however, were unaffected and can be found on the evaporation line or at 100%.
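The comparison of an n-alkane with its neighbouring isoprenoid can be written down as a simple rule. The sketch below uses the ratios quoted above for Fig. 6; the function name and the margin of 15 percentage points are assumptions for illustration, and the second data set is invented to represent a sample in which all four compounds lie on the evaporation line.

```python
def biodegradation_suspected(alkane_ratio, isoprenoid_ratio, margin=15.0):
    """Flag biodegradation when an n-alkane is reduced far more strongly than
    the isoprenoid eluting next to it (both given as PW ratios in percent).

    Both compounds have similar boiling points, so evaporation alone should
    reduce them to a similar extent; `margin` is an assumed tolerance.
    """
    return isoprenoid_ratio - alkane_ratio > margin


# PW ratios read from the description of Fig. 6 (biodegraded HFO).
biodegraded = {"n-C17 vs pristane": (1.0, 21.0), "n-C18 vs phytane": (1.0, 36.0)}

# Invented ratios for a sample affected by evaporation only.
evaporated_only = {"n-C17 vs pristane": (30.0, 32.0), "n-C18 vs phytane": (41.0, 43.0)}

flags_bio = {pair: biodegradation_suspected(*r) for pair, r in biodegraded.items()}
flags_evap = {pair: biodegradation_suspected(*r) for pair, r in evaporated_only.items()}
```

Both pairs are flagged for the biodegraded sample and neither for the evaporated one. The same pattern can be extended to the other sensitivity classes of Table 2, provided the class of each compound is known.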

Figure 7 shows a comparison of source 1 with spill 2 of the RR2011 samples. A small layer of oil from source 1 was applied to the surface of a tile. To simulate an oil-contaminated rock, the tile was positioned outside on a wall, which was directed to the south, for 3 months. It was inundated by seawater at high tide. The main weathering effects to be expected were evaporation, photooxidation, and dissolution.

Fig. 7 The original HFO (source 1) compared with artificially weathered HFO from the Erika spill (RR2011, spill 2)

The evaporation line shows evaporation up to a retention time of 40 min. C17, pristane, C18, and phytane are all on the evaporation line, indicating that biodegradation has not occurred. The compounds specific for dissolution are mainly in the range of complete evaporation except 2- and 1-methylphenanthrene. These can be found slightly below the evaporation line at a retention time of about 30 min. The triaromatic steranes (red dots between 45 and 50 min) have been reduced to about 70% by photooxidation.

Finally, it is useful to give an impression of what the PW plots of actually nonmatching samples generally look like. This is given on the right side of Fig. 8, where the PW plot points are simply scattered and do not follow any rule. Figure 8 clearly shows the difference between a heavily weathered, but (probably) matching, sample on the left side and a nonmatching sample on the right side. Both are from RR2011.

Fig. 8 Left: comparison between the original HFO (source 1) and a spill sample collected after 10 years. Right: comparison between source 1 and HFO from the Prestige spill artificially biodegraded for 2 months

On the left, the samples are the same as the samples of Fig. 5, which has already been discussed. However, the informative compounds are added here together with the information about the weathering sensitivity of the compounds as given in Table 2. The most stable compounds are on the evaporation line or close together at 100% (black squares).

On the right, a biodegraded HFO from the Prestige spill is compared with the original HFO from the Erika. The biodegradation was carried out in the same way as for spill 1 (see Fig. 6). The PW plot simply shows scattering without a pattern: the stable compounds at retention times of 40–50 min range between 40% and 150%. Additionally, the TAS can be found at 250–300%. It is impossible to draw an evaporation line, and a nonmatch has to be concluded.

3 COSIweb

All examples of sample comparisons given above can be found in the online database and evaluation system COSIweb (Computerized Oil Spill Identification, web based). This system, which can easily be accessed using any browser, includes samples from many major accidents (among others, “Macondo”, “Erika”, “Prestige”, “Tricolor”, “Baltic Carrier”) but also hundreds of different crude oils and many oil products and waste oils from real spill cases.

COSIweb has two functions:

  • Searching for unknown samples by means of all or selected compound ratios as given in CEN/TR 15522–2:2012 [1] (statistical comparison)

  • Comparing two samples by producing all the means needed for coming to a conclusion according to CEN/TR 15522–2:2012 [1]

One of its most distinctive features is the automatic detection and measurement of all relevant peaks from raw GC and GC–MS data (Fig. 9). Gas and mass chromatograms basically consist of x and y values (representing time and intensity in this case), which can be exported by means of any acquisition software. As soon as these raw data files are uploaded into COSIweb, all relevant peaks are found and named. Their heights above baseline are measured, and compound ratios (“diagnostic ratios”) are produced automatically for comparison.
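To illustrate the idea of measuring a peak height above baseline from exported x and y values, a minimal sketch is given below. It is not the COSIweb algorithm, which is not described here: the retention windows are supplied by hand, the baseline is assumed to be a straight line between the first and last point of each window, and the diagnostic ratio is assumed to be a simple quotient of two heights. All names and data are hypothetical.

```python
def peak_height(times, intensities, start, end):
    """Height of the tallest point in [start, end] above a straight baseline
    drawn between the first and last data point of that window.

    `times` must be sorted in ascending order.
    """
    window = [(t, y) for t, y in zip(times, intensities) if start <= t <= end]
    if len(window) < 3:
        raise ValueError("window needs at least three data points")
    (t0, y0), (t1, y1) = window[0], window[-1]
    slope = (y1 - y0) / (t1 - t0)
    return max(y - (y0 + slope * (t - t0)) for t, y in window)


# Invented chromatogram excerpt: retention time in minutes, detector intensity.
times = [30.0, 30.1, 30.2, 30.3, 30.4, 31.0, 31.1, 31.2, 31.3, 31.4]
signal = [10.0, 12.0, 60.0, 14.0, 12.0, 12.0, 13.0, 38.0, 15.0, 14.0]

height_a = peak_height(times, signal, 30.0, 30.4)  # first peak
height_b = peak_height(times, signal, 31.0, 31.4)  # second peak
diagnostic_ratio = height_a / height_b

print(f"A = {height_a:.1f}, B = {height_b:.1f}, A/B = {diagnostic_ratio:.2f}")
```

A production system additionally has to locate and name the peaks itself, cope with shifting retention times, and handle co-eluting compounds; the sketch only shows why plain x and y exports are sufficient input for such a measurement.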

Fig. 9 Hopanes automatically detected, named, and measured by COSIweb (above, with zoomed area below)

All of this is done within seconds. COSIweb thus saves both time and resources.

COSIweb is hosted by the BSH and is freely available to all OSINet members. At the time of writing, it includes data from 16 laboratories from all over the world. A username and password are required to participate in COSIweb. To demonstrate the capabilities and reliability of the system, a special guest account has been set up: the system, available at http://cosi.bsh.de:8080/CosiWeb/, can be accessed freely and tested by entering the word “guest” (without quotation marks) as both username and password.

4 Conclusion

Information about the development of the common method CEN/TR 15522–2:2012 [1] is presented together with examples of PW plots as one of the highlights of this method. The method itself is much more comprehensive and provides much more detail about different oil products and possibilities for their comparison than can be presented here. The interested reader is encouraged to study the method itself. Although this method is written as a guideline, laboratories should collect experience through practice.

Information about different oil spill cases and examples of how others have analyzed and compared oil samples can be found in the online database and evaluation system COSIweb, which can easily be accessed with a web browser. This system might also help users to learn the procedures of the method. It provides many examples of how analytical GC and GC–MS results should appear. All means for sample comparison and for drawing a final conclusion about the connection between two samples are produced here automatically and within minutes. This includes overlays of chromatograms and mass chromatograms, the measurement of chromatographic peaks, and the production and comparison of peak ratios as well as PW plots. Thus, this system saves much time and resources. Using this system must also be regarded as the strongest form of cooperation among laboratories, as raw data of chromatograms and mass chromatograms uploaded from anywhere in the world can be treated and evaluated as if they had been produced in one's own laboratory. Samples can be used by all participating laboratories as soon as they are included in the database.