Introduction

Diatoms are increasingly used for freshwater environmental assessment world-wide (Stevenson et al., 2010; Stevenson & Smol, 2015). Thus, a lot of information is gathered that could be used for new studies of diatom ecological traits, diatom biodiversity patterns and diatom biogeography. However, this is only possible if datasets can be merged easily. Even though diatom indices have been proven to be quite robust for the assessment of ecological status (Kelly et al., 2009; Kahlert et al., 2012; Kelly & Ector, 2012; Almeida et al., 2014), diatom taxa lists often cannot be compared directly. The reason for this is the lack of harmonization of diatom identification and counting techniques among regions or countries, which leads to inconsistent diatom datasets unless some form of taxa harmonization is applied before analysis (Vyverman et al., 2007; Stevenson et al., 2010; Kelly et al., 2014). Thus, to prevent dataset inconsistencies, some form of standardized quality assurance (QA) is imperative (Kahlert et al., 2009; Dreßler et al., 2014).

QA is a way of systematically comparing results (here diatom taxonomic composition data, further called diatom counts or diatom datasets) to a standard, with an integrated monitoring of the involved processes and an associated feedback loop. QA may be a part of quality control but should not be mistaken for it. QA aims at error prevention, whereas quality control focuses on the control of the final results. To improve consistency between diatom datasets from different laboratories, QA of taxonomic methods should be broad, including identification exercises, workshops, regular internal training and education programs (Kelly, 2013). QA may even include diatom sample auditing by accredited diatom taxonomists or photographic taxa documentation (Kelly, 2013; pers. comm. Mertens, Werner). In general, QA should help to first identify and then prevent error sources, such as environmental variability, workforce variability (e.g. in training or adherence to methodology), technical errors and equipment failure (specified in ISO/IEC, 2005).

Inconsistencies in diatom counts have many sources. However, taxon identification uncertainties often contribute more to overall inconsistency between datasets than other error sources (Kelly, 1997; Prygiel et al., 2002; Alverson et al., 2003; Van der Molen & Verdonschot, 2004; Lavoie et al., 2005; Besse-Lototskaya et al., 2006; Kahlert et al., 2009). Even though all uncertainty sources should be minimized as much as possible, we think that agreement on consistent diatom identification will be the greatest challenge in the dataset harmonization. Diatom identification exercises are very demanding, because there are over a thousand freshwater diatom taxa in Europe (Guiry & Guiry, 2015), and diatom taxonomy is developing fast (Medlin & Kaczmarska, 2004; Mann, 2010; Mann & Vanormelingen, 2013; Zimmermann et al., 2014). Therefore, we consider it particularly important to spread and exchange knowledge about best practice of diatom taxonomic identification exercises.

Different countries designate identification exercises in many ways. Some of the terms used are ring test, harmonization exercise, intercalibration, proficiency testing and intercomparison. In a ring test, a reference institute sends replicate samples to the participants, who analyse them and report the results within a given time frame. Then, the reference institute statistically analyses, evaluates and interprets the results. The term harmonization exercise emphasizes the need for taxonomic consensus between laboratories and countries (e.g. Kahlert et al., 2009). Intercalibration is the process that ensures that several laboratories produce compatible data (Taylor, 1987). The term proficiency testing highlights the fact that the skills are actually tested in a type of exam. Finally, intercomparison means the mutual comparison of laboratories and is often used as a synonym for proficiency testing. A diatom identification exercise might combine two or more features of the above-mentioned terms. In any type of diatom identification exercise, diatom samples are analysed in the home laboratory according to the established routines, and the results of the diatom counts are then sent for central evaluation. Often, but not always, the exercise is followed by a workshop where the participants discuss the results of the exercise.

In Europe, diatom counts and their QA are an essential part of implementing the European Water Framework Directive (The European Parliament and the Council of the European Union, 2000). However, the existing guidance of the European Committee for Standardization (CEN, 2012) is only general with respect to the design of QA and does not specifically mention diatom count QA. Consequently, each European member state has developed different national or regional QA measures for diatom counts, complicating comparisons and information exchange. In addition, European QA measures are often only published in grey (non-peer-reviewed) literature and often not in English.

This paper provides critical information for diatom researchers and managers to reach greater consistency in QA/harmonization studies. First, we present and compare information on the implementation of diatom count QA in Europe, to identify advantages and drawbacks of each approach. Second, we summarize problematic groups of taxa, highlighted by European identification exercises. And third, based on the above, we suggest a design for diatom identification exercises in order to provide consistent diatom count QA.

Overview and comparison of diatom count QA in Europe

Information on diatom count QA was collected in two steps. First, all European countries were asked to answer a questionnaire focusing on diatom identification exercises (in 2012, with an update in 2014) and we received 16 answers, which are summarized in Table 1. Second, additional information was extracted from grey literature (listed in Table 2) and personal communications.

Table 1 European quality assurance (QA) and quality control measures for diatom counts used in bio-monitoring
Table 2 Publicly available references of various intercalibration exercises in Europe

Diatom counts in Europe are conducted on many different administrative levels, from local authorities to regional and national monitoring institutes in each country, but also for research projects, by water authorities, research institutions or universities. These counts are done by a mixture of people including staff at central or regional water authorities, consultants, researchers and graduate students. Generally, the number of counts for routine monitoring ranges from approximately a hundred samples per year in Estonia to “several thousand” samples per year in the UK, Germany, France and Spain. Consequently, the European countries have different needs for diatom QA: a country with several large monitoring programs conducted at different administrative levels or a setup of analysts with varying degrees of expertise and experience will have a more urgent need for diatom QA than a country where a small number of samples is processed by a single analyst under the administration of a single authority.

As all countries strive for diatom count QA, most countries participate in identification exercises as a part of QA. Many countries conduct their own national identification exercises, but small countries with few experts usually participate in neighbouring countries’ identification exercises. These exercises are organized by the water authorities or alternatively by the diatom analysts (consultants or researchers), or in combination. The advantages of the organization by a national authority are a long-term approach including basic funding and a good acceptance of the outcomes of the identification exercises. On the other hand, an organization of diatom count QA by diatom analysts (consultants or researchers) directly provides the necessary expertise on diatoms. Researchers usually ensure the inclusion of the newest taxonomical views and may provide advice on how to separate species under the light microscope, while consultants ensure that the view of environmental assessment requirements and practical questions are taken into account.

The effort invested in the identification exercises differs considerably, varying in frequency as well as in work load and costs. The frequency ranges from annually through biennially to once every 3 or 4 years. An exception is the UK/Ireland test, which is organized as a continuous procedure, sending one diatom slide to the participating laboratories approximately every second month. Some countries carry out infrequent exercises or have only carried out a single one so far. The work load for the participants might include sampling, often the preparation of permanent diatom slides and always their counting according to standard protocols, varying from one to seven slides on each occasion. In some countries (France, Italy and the Netherlands), the calculation of diatom indices is also included in the exercise. Sometimes very strict protocols with time constraints are followed (Italy). However, most often experts have no time constraints for sample counting, and they count the samples in their own laboratories according to given standard protocols. Results are then sent in for central evaluation. Participation fees range from free to 550€. There are costs for the organization of identification exercises, so in case of free participation, costs are covered by other sources, often by the organizing authorities. Obviously, a high effort provides intensive training, and frequent exercises ensure that the analysts’ knowledge is up to date. On the other hand, if the invested efforts are considered to be too high, there is a risk of non-participation due to lack of time and funding.

The number of participants in these exercises ranges from 5 to 60, which again highlights the different approaches between countries. A high number of participants enables the training of many analysts at the same time. Furthermore, the participants’ fees cover the costs of the identification exercise. However, high participant numbers also make organization more difficult. Additionally, if the results of the exercise are discussed in a workshop connected to the identification exercise, low participant numbers are preferred in order to enable valuable discussions. If harmonized solutions to taxonomic problems are the goal of such a workshop, low participant numbers are definitely necessary. A workshop is, however, not always a given part of the identification exercises. An identification exercise might only consist of a number of samples to be counted and the comparison of the results of the participating laboratories in a report. Still, when the initial comparison is followed by a workshop and agreements on how to handle difficult taxa are attained, between-analyst variation can be reduced (Kahlert et al., 2009). Evaluations of NorBAF (Nordic-Baltic Network for Benthic Algae in Freshwater, including Sweden, Finland and the Baltic countries) also showed that the workshop was the most appreciated part of the intercalibration exercise, where results were eagerly discussed and clarifications could be made.

Most countries hand out certificates in connection with the identification exercises, reflecting the intention of the exercises. Diatom identification exercises can take the form of a stringent test with time restriction (Italy; Martone et al., 2012), of more informal meetings coupled to formal certificates after passing a minimal threshold of correctly identified taxa (as in the NorBAF exercise, Kahlert & Albert, 2005), or focus on reflective learning instead of examinations (UK and Ireland; Kelly, 2013). Certificates may thus only confirm participation without information on performance, or they may be used to certify that a participant has met a quality criterion such as in the tests of NorBAF, Germany and Hungary. The threshold can be based on the similarity of the participant's diatom taxa list to the list established by the auditor(s). In Hungary, an additional threshold based on the diatom ecological index values is calculated and compared with the audit. In the UK/Ireland ring test, only the index value threshold represents the limit of acceptable variation, but as the focus is on learning, there are no certificates (Kelly, 2013). Obviously, the choice of the auditor or expert (or group of experts) is important. An agreement must be reached on diatom taxonomy to achieve harmonized results. Expert opinions on taxonomy often vary, which makes this a sensitive issue. UK/Ireland solved this issue by appointing some of their most experienced experts to take turns on the responsibility for different samples; additionally, several experts count each slide to capture both the natural variability and the variability among experts for that slide. All countries that conduct identification exercises also produce different types of written reports of the outcomes.

As diatom identification exercises are part of diatom count QA, it is commonly expected that a laboratory accredited for diatom analyses ensures the analyst’s skills via participation in them (see for example, The Swedish Board for Accreditation and Conformity Assessment (Swedac), 2011), and the responsible water authority might require the use of accredited laboratories based on ISO/IEC 17025 (ISO/IEC, 2005) for diatom counts in monitoring programs. However, we are lacking information on what exactly is required to achieve the accredited status of a diatom expert, partly because the accreditation rules for analysts are not well formulated and also not well communicated. Often, the accreditation rules and any in-house inspections focus on the technical routines of the accredited laboratory. In contrast, the requirements for diatom identification and counting skills are seldom specified in the accreditation.

Common quality problems and problematic taxa groups

Common quality problems encountered during the diatom identification exercises in Europe include (1) mistaking one taxon for another due to insufficient use of given taxonomic details for identification, leading to differences in the final diatom list, and in water quality index calculations if those taxa have different ecological requirements (Austria [Mauthner-Weber, 2001–2012], GER, UK/Ireland, Table 2); (2) identification problems due to imprecise taxonomic literature, i.e. ambiguous species’ descriptions and documentation in the current identification literature (FR, GER, NL, NorBAF, UK/Ireland, Table 2) or not using mandatory identification literature (GER, NL, NorBAF, Table 2); and (3) overlooking of small taxa (FR, GER, NL, NorBAF, UK/Ireland, Table 2).

Identification problems occurred within similar taxa complexes in all exercises, despite geographical differences (Table 3). Most of these taxonomic groups are also common or abundant in European waters and are thus relevant for environmental assessment. Consequently, solutions must be found to ensure QA of these common taxa. For example, the identification of single taxa from the Achnanthidium minutissimum complex was problematic in most identification exercises (Table 3). Sweden uses ecomorphotypes based on valve width, thereby avoiding the need to differentiate on a species or variety level (Kahlert et al., 2007, 2009), whereas other countries try to clarify existing taxa concepts of this complex (e.g. Dreßler et al., 2014). These different solutions for problematic taxa illustrate the necessity of information exchange in order to achieve compatible diatom taxa datasets among countries.

Table 3 Taxa identified as problematic in intercalibration exercises of various European countries from publicly available sources (Table 2)

Suggestions on the design of identification exercises

QA of diatom identification and counting, at least in some countries, is still in its infancy, with little consensus on best practice across Europe. Countries with low or unclear qualification requirements for diatom analysts in particular risk employing the cheapest and thus likely the least qualified analyst. This might lead to low quality of the resultant diatom dataset. Overall, we would like to recall that it has been suggested that laboratory quality control should make up 10–20% of the effort spent on routine analyses (Cheeseman & Wilson, 1978). In general, we see a need for national/regional identification exercises and for additional European-wide identification exercises, in combination with improved information flow (Table 4).

Table 4 Suggestions on design and performance of national or regional identification exercises for QA of diatom counts, and on European-wide identification exercises, based on best practice experiences

National or regional identification exercises aim to ensure QA of diatom counts on national/regional levels by addressing regional peculiarities and using the national language to ensure good communication with the participating routine technicians (Table 4). The exercise should occasionally include sampling and sample preparation, but should focus on taxa identification and quantifiable results, coupled to a workshop for education and training (Table 4). A summary of the workshop outcome should be published in the national language. Ideally, the results should also be published in English on a common homepage. National/regional identification exercises should be organized by a combination of water authorities, researchers and consultants (Table 4).

National/regional identification exercises should occur once a year or continuously, with a fee as low as possible. In our experience, 15–20 participants is the maximum that still enables valuable discussions in a workshop. The target group would be the staff involved in diatom counting on a national/regional level including consultants, researchers and technicians (Table 4). When organized in a concentrated form (and not continuously), two samples per exercise have proven feasible to keep the effort manageable. The content should vary, reflecting different ecological settings and thus covering the broad spectrum of diatom taxa (Tables 3, 4). The use of several auditors or an expert panel is preferred over single auditors (Table 4), as possible errors and problematic taxa groups can then be agreed upon. Auditors should discuss inconsistent counting results openly and flexibly to harmonize their view of counting, prior to an analysis of participants’ results.

Participant performance should be quantified and noted in a certificate (Table 4). For example, a Bray-Curtis similarity level of >60% typically indicated good agreement of diatom counts between analyst and auditor in previous studies (Kelly, 2001, confirmed by GER, HU, NorBAF, Table 2). Using this threshold, it is possible to reduce the index variation between analysts and auditors to the level of variation among replicate samples (Kahlert et al., 2009, and unpublished NorBAF report of 2011).
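As an illustration of how such a similarity threshold could be evaluated, the following minimal Python sketch computes the Bray-Curtis similarity between an analyst's and an auditor's count of the same slide. It is a sketch under stated assumptions, not the procedure of any particular exercise: the function name, the conversion of raw counts to relative abundances before comparison, and the example counts with their placeholder taxon labels are all hypothetical. Only the 60% threshold is taken from the text above.

```python
from typing import Dict


def bray_curtis_similarity(analyst: Dict[str, float],
                           auditor: Dict[str, float]) -> float:
    """Return the Bray-Curtis similarity (0-100 %) of two taxa lists.

    Each argument maps a taxon name to the number of valves counted.
    Assumption: counts are converted to relative abundances (%) first,
    so that slides with different total counts remain comparable.
    """
    def to_percent(counts: Dict[str, float]) -> Dict[str, float]:
        total = sum(counts.values())
        if total <= 0:
            raise ValueError("a taxa list must contain at least one counted valve")
        return {taxon: 100.0 * n / total for taxon, n in counts.items()}

    a = to_percent(analyst)
    b = to_percent(auditor)
    # Sum the abundance shared by both lists, taxon by taxon.
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in set(a) | set(b))
    # Bray-Curtis similarity: 2 * shared abundance / total abundance of both lists.
    return 100.0 * 2.0 * shared / (sum(a.values()) + sum(b.values()))


# Invented example counts (400 valves each); taxon labels are placeholders.
auditor_count = {"taxon A": 200, "taxon B": 100, "taxon C": 100}
analyst_count = {"taxon A": 160, "taxon B": 120, "taxon D": 120}

similarity = bray_curtis_similarity(analyst_count, auditor_count)
passed = similarity > 60.0  # threshold quoted in the text
print(f"Bray-Curtis similarity: {similarity:.1f}% (passed: {passed})")
```

In this invented example the analyst records one taxon under a different name than the auditor, yet the similarity is 65% and thus still above the threshold. This shows why a list-based threshold is usefully complemented by a workshop discussion of the individual deviating taxa.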

However, while national/regional identification exercises may reduce variation within a country (or region), they can potentially perpetuate systematic errors among countries and regions (Van de Vijver & van Dam, 2010; Kahlert et al., 2012). We therefore propose a European-wide communication of QA of diatom counts to inform about national and regional workshop results. A European-wide identification exercise would ensure an information flow among countries, enabling communication of best practice and harmonization of diatom data, which is particularly important for water bodies with international catchments or international projects or when investigating organism ecology, biodiversity and biogeography. We suggest a frequency for European tests of every second year, connected to a diatom meeting. This setup would enable the incorporation of the outcomes into regional identification exercises in the year in between and would keep the workload manageable (Table 4). European identification exercises and workshops should be held in English for communication reasons. We suggest the target group to be the coordinators or auditors of the national identification exercises to ensure cost-effective workshops with constructive discussions. The outcomes of the European identification exercises should then be discussed in the regional and national workshops (Table 4).

Identification exercises at European level should include the practical counting of diatoms. Only then are relevant problems among participants discovered. The focus should be on the main problems identified by the national/regional identification exercises. Quantifiable results should also be included to enable the comparison of participant results.

A report summarizing the results is necessary for archiving and communication for both the national/regional and the European identification exercises. We suggest also publishing the results as a WikiProject (Wikipedia, 2015a) about “Diatom identification for applied use” linked to the website of the International Society for Diatom Research (ISDR). This form of publication would ensure that not all the workload is on the organizers and would enable an active, open and democratic participation of diatomists, current updates and the sharing of practical solutions. A WikiProject would not depend on a single funding source or single authority; it would be curated by the established Wikipedia community and would thus hopefully be carried by many shoulders and be long-lived. We are not aiming at starting taxonomical sites for diatom species; such efforts are already under way (Maddison & Schulz, 2007; Spaulding et al., 2010; Wikipedia, 2015a, b, c, d, e). Instead, we are thinking of webpages for diatomists working with identification for environmental assessment, where we would link to the national identification exercise reports and results, and create sites for the issues that cause the main problems in this field. Here, we would publish agreed solutions and conventions to handle difficult diatom groups (such as given in Table 3). For further improvement of all national and international tests, we additionally recommend routine evaluations of the identification exercises by the participants.

Conclusions

In summary, we believe that diatom count QA can and should be improved by reciprocal knowledge transfer between regions and countries. Furthermore, we think it is essential to clearly formulate the requirements and qualifications necessary for an accredited diatomist. These qualifications should include a description of required diatom identification skills. We suggest regular participation in identification exercises as part of every diatom count QA. We recommend performing national (or regional) identification exercises with a quantifiable outcome, a threshold to be attained and certificates issued. An identification exercise should be followed by a workshop where common problems are discussed and solved. We furthermore recommend pan-European identification exercises to ensure communication of best practice and, in cases of international catchments or projects, a harmonization of regional and national diatom identification. We also recommend open discussions among all people involved in diatom counting to achieve taxa identification harmonization. Discussions should also include the ecology and potential indicator values of diatom taxa for monitoring purposes. Last, solutions and discussions should be made public as soon as possible, incorporated into a WikiProject linked to the ISDR for fast publication.