Introduction

Pollination by bees is essential for the sustainability of many sectors of agricultural production; therefore, maintaining the health of bee colonies is of great economic importance. Bee colonies are susceptible to a number of infestations and diseases including varroa and foulbrood, but worldwide there are relatively few drugs legally permitted for control [1]. Although some countries, e.g. the USA, have authorised the use of the antibiotic tylosin to treat American foulbrood [2], the EU has yet to authorise antimicrobial-based veterinary medicines for the treatment of bees. Between 2007 and 2012, there were, however, approximately 100 alerts for honey and royal jelly reported on the European Commission's Rapid Alert System on Food and Feed [3]. Of these, some of the most frequent types of antimicrobials detected were sulfonamides (e.g. sulfadiazine, sulfathiazole, sulfamethoxazole), aminoglycosides (streptomycin), fluoroquinolones (e.g. ciprofloxacin) and chloramphenicol.

Whilst there are no EU maximum residue limits (MRLs) for any antimicrobials in honey, a minimum required performance limit (MRPL) for chloramphenicol has been set at 0.3 μg kg−1 [4]. In addition, the EU Reference Laboratories have published a number of ‘recommended concentrations’ (RCs) in honey ranging from 20 μg kg−1 for tylosin to 50 μg kg−1 for sulfonamides [5]. These RCs are related to the screening target concentration (STC). The STC is the concentration at which a screening test categorises the sample as ‘screen positive’ (potentially non-compliant) and prompts a confirmatory test by, for example, liquid chromatography–tandem mass spectrometry (LC-MS/MS). For analytes for which MRLs have not been established, the ‘STC should wherever possible be at or less than the recommended concentrations as described in the CRL Guidance Paper’ [6]. For other non-authorised analytes (veterinary medicines) in honey, e.g. fluoroquinolones, the STC is at the discretion of the laboratory conducting the work. For this study, the STCs used were as follows: 50, 20, 50 and 10 μg kg−1 for sulfathiazole, tylosin A, ciprofloxacin and chloramphenicol, respectively.

Concern about the occurrence of residues of veterinary medicines in honey has resulted in the need for rapid methods to monitor for antibiotic residues from a variety of classes including sulfonamides, macrolides, fluoroquinolones and tetracyclines. Whilst a sensitive and rapid method based on dipstick technology already exists for tetracyclines in honey (TETRASENSOR® kit, Unisensor, Belgium) [7], and there are a number of commercial ELISA products for streptomycin and tylosin [8], there is a dearth of rapid methods which offer a wide coverage of the antimicrobial classes of interest.

In 2007, the European Commission Framework Programme (FP7) commissioned a new project called Contaminants in Food and Feed: Inexpensive Detection for Control of Exposure (Conffidence, www.conffidence.eu). The aim of the project was to develop and validate faster and more cost-efficient methods for the detection of a wide range of chemical contaminants/residues in both food and feed. One of the outcomes from this project is a multiplex dipstick assay (Bee4sensor® [9, 10]) that can be employed for the simultaneous screening of honey for sulfonamides, fluoroquinolones, tylosin and chloramphenicol. Results from this method can be read in two ways:

  (a) Visually, in which the colour intensities of the individual test lines on the dipstick (for sulfonamides, fluoroquinolones, tylosin A and chloramphenicol) are compared to the intensity of a control line

  (b) Instrumentally, in which a Readsensor® is used to generate a numerical value (ratio) for the colour intensity

Initially, this new method was subjected to a within-laboratory validation according to EU requirements (Commission Decision 2002/657/EC [11]) for a wide range of analytes from both the sulfonamide and fluoroquinolone classes of antimicrobials, as well as tylosin and chloramphenicol. Subsequently, to ensure robustness of this new method, a small-scale inter-laboratory validation (ILV) exercise was undertaken using ‘representative analytes’ [6] of sulfathiazole, ciprofloxacin, tylosin and chloramphenicol. This paper reports the results of this external validation study by seven control/industry laboratories. This validation trial was conducted in two stages. The laboratories familiarised themselves with the method by analysing known positive and negative control samples, before testing a number of randomly coded blind samples, containing spiked and incurred residues of the target antimicrobial compounds.

Materials and methods

Materials and reagents

The analytical standards of sulfathiazole, tylosin A, ciprofloxacin (hydrochloride) and chloramphenicol were purchased from QMX Laboratories (Thaxted, UK). Chemicals and solvents were of analytical grade. Deionised double-distilled (or equivalent) water was used throughout.

The multi-sensor (Bee4sensor®) kits for honey, microwells and dipsticks (lot 110505), were provided by Unisensor s.a. (Liège, Belgium) and used during the inter-laboratory validation. Bulk buffer solutions were prepared separately to cater for all the participants of the trial.

Stock and working standard preparation

Standard stock solutions of 1,000 μg ml−1 were made in methanol and kept at −20 °C for up to 6 months. Intermediate solutions of 100 or 10 μg ml−1 were prepared on a monthly basis and also kept at −20 °C, while mixed spiking solutions were prepared on a weekly basis and stored at 4 °C.

Instrumentation

The Heatsensor® equipment and the Readsensor® (for instrumental reading) were supplied to each laboratory, by Unisensor s.a. (Liège, Belgium). This Readsensor® version (Firmware 2.2.13) can be used in stand-alone mode or connected to a computer. In this study, the device was used in computer mode and was operated with specially designed software (either LF control, version 3.01.01; or LF Studio, version 3.3.4). To ensure correct performance of the instrumental system, the Readsensor® equipment was calibrated by each participant in the ILV immediately before each part of the study with a ‘reference check & control dipstick’ as recommended by the provider.

Samples

Blind test material no. 1 (‘blank’)

A ‘blank’ honey was produced by stirring together portions of acacia, woodland, orange blossom and Australian honeys (5:5:5:4 ratio), each previously tested negative using an ISO 17025-accredited LC-MS/MS procedure, for 24 h. After mixing, aliquots (15 g) were potted and stored at ambient temperature.

Blind test material no. 2 (‘0.5 × STC’)

The following protocol was employed to produce a honey containing sulfathiazole, tylosin A, ciprofloxacin and chloramphenicol at 25, 10, 25 and 5 μg kg−1, respectively. To 1,000 g of ‘blank’ honey were gradually added 0.25 ml of 100 μg ml−1 sulfathiazole and ciprofloxacin solutions, 0.1 ml of 100 μg ml−1 tylosin A and 0.5 ml of 10 μg ml−1 chloramphenicol, whilst the honey was being stirred continuously and then for a further 24 h at room temperature. Aliquots (15 g) were potted and stored at −20 °C.

Blind test material no. 3 (‘1 × STC’)

To produce a honey containing 50/20/50/10 μg kg−1 of sulfathiazole/tylosin A/ciprofloxacin/chloramphenicol, the following protocol was employed. To 1,300 g of ‘blank’ honey were gradually added 0.65 ml of 100 μg ml−1 sulfathiazole and ciprofloxacin solutions, 0.26 ml of 100 μg ml−1 tylosin A and 0.13 ml of 100 μg ml−1 chloramphenicol, whilst the honey was being stirred continuously and then for a further 24 h at room temperature. Afterwards, aliquots (15 g) were prepared and stored using the same procedure as for test material no. 2.
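The nominal concentrations of test materials no. 2 and no. 3 follow directly from the spiking volumes, the standard concentrations and the honey masses given above. The short sketch below reproduces that arithmetic; the function and variable names are illustrative only, and it assumes that the mass of the added spiking solution is negligible relative to the mass of honey.

```python
def spiked_concentration(volume_ml, standard_ug_per_ml, honey_g):
    """Nominal analyte concentration (ug/kg) after spiking.

    Assumption: the spiking solution does not change the honey mass.
    """
    mass_added_ug = volume_ml * standard_ug_per_ml
    return mass_added_ug / (honey_g / 1000.0)


def nominal_levels(spikes, honey_g):
    """Map each analyte to its nominal concentration in ug/kg."""
    return {
        name: round(spiked_concentration(vol, conc, honey_g), 3)
        for name, (vol, conc) in spikes.items()
    }


# (volume in ml, standard concentration in ug/ml), as given in the text
half_stc_spikes = {
    "sulfathiazole": (0.25, 100),
    "tylosin A": (0.1, 100),
    "ciprofloxacin": (0.25, 100),
    "chloramphenicol": (0.5, 10),
}
full_stc_spikes = {
    "sulfathiazole": (0.65, 100),
    "tylosin A": (0.26, 100),
    "ciprofloxacin": (0.65, 100),
    "chloramphenicol": (0.13, 100),
}

material_2 = nominal_levels(half_stc_spikes, honey_g=1000)  # '0.5 x STC'
material_3 = nominal_levels(full_stc_spikes, honey_g=1300)  # '1 x STC'

for name in material_2:
    print(name, material_2[name], material_3[name])
```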

Other test materials (incurred)

Test material no. 4 (three honey samples containing known tylosin A concentrations) and test material no. 5 (incurred honeys with varied concentration of sulfonamides, ciprofloxacin and chloramphenicol blended with ‘blank’) were also provided, but for quality control purposes only, and results are neither presented nor discussed.

Homogeneity testing of test materials

Ten aliquots of each test material (2, 3, 4 and 5) were extracted in duplicate and analysed by LC-MS/MS, following ISO/IEC 17025:2005-accredited procedures, validated to Commission Decision 2002/657/EC. These data showed that the test materials were homogeneous (data not presented), according to ISO 17043. The concentrations of the various analytes in incurred materials 4 and 5 varied between 3.8 and 93.6 μg kg−1.

Design of inter-laboratory study

This study was conducted in two stages at the beginning of 2012. Each laboratory was supplied with two protocols, one for the procedure of Bee4sensor® multiplex method and one for the protocol of the trial. During the pre-trial, the seven participating labs from five European countries had the opportunity to familiarise themselves with the method, analysing two honey samples of known residue concentration (‘blank’ and ‘1 × STC’).

For the main trial, six replicate samples of ‘blank’, ‘0.5 × STC’ and ‘1 × STC’ honey were analysed blind (i.e. all sample pots supplied to the participating laboratories were allocated a random code) over 3 days, together with two quality control samples (test material 4 and 5) on each day. This mini-trial was therefore designed to provide 42 data points per concentration level for the subsequent statistical evaluation.

Test protocol and interpretation of results

For the analysis of a honey sample, two aliquots (A and B) of 2.5 g each are required. The procedure used in this trial was as described in the kit insert. Briefly, one aliquot (A) is dissolved using acid hydrolysis, whereas the other aliquot (B) is dissolved in water. After liquid/liquid partitioning both aliquots with ethyl-acetate, the organic layers are evaporated to dryness under nitrogen. After reconstitution in a buffer, aliquots A and B are combined and applied to the well of a Bee4sensor® test kit for 5 min at 40 °C, as one sample extract. A dipstick is then incubated in this prepared well for 10 min at 40 °C (using the Heatsensor® device).

The Bee4sensor® kit is an indirect competitive-based assay in a dipstick format. The dipstick has five green capture lines, one for the control and one each for chloramphenicol (CAP), fluoroquinolones (QUINO), tylosin A (TYL) and sulfonamides (SULFA), which turn red (see Fig. 1) when in contact with the rising liquid.

Fig. 1 Example of multiplex dipstick test lines for a blank honey sample and honey spiked at 0.5 × STC

For each test sample, the dipstick was assessed both visually and instrumentally via the Readsensor®. In both approaches, the colour intensity of the four capture lines of the analytes (test lines) is compared to the control capture line.

In the visual assessment, samples that ‘screen positive’ (potentially non-compliant) will have an absent or a reduced/weaker red colour (compared to the control line) at the test line. In the case of compliant (or ‘negative’) samples, the test line will show an increased red colour compared to the control line. If the control line fails to become visible, the result of the dipstick is recorded as invalid.

During the instrumental assessment, the intensity of the developed red colour at each of the capture lines is measured, and the results are expressed as the ratios of these measurements (test line versus control line). The software method was programmed with a set of threshold ratios to enable objective interpretation of the colour intensity: honey samples were reported (a) as free of antibiotics (i.e. ‘compliant’) with a ratio >1.1, (b) as containing trace antibiotic residues (‘low positive’) with a ratio of 0.9 to 1.1, and (c) as containing higher concentrations of antibiotic (‘positive’) with a ratio <0.9. The visual assessment of the dipsticks was performed after the instrumental reading, but without looking at the instrumental results to avoid any bias.
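The decision rule applied to the ratios can be written in a few lines. The sketch below is an illustration only and is not the Readsensor® software: the function names and the example intensities are hypothetical, the boundaries 0.9 and 1.1 are assumed to fall in the ‘low positive’ band, and treating a missing control line as an invalid reading carries the visual rule over to the instrumental case by assumption.

```python
THRESHOLD_COMPLIANT = 1.1  # ratio above this: 'compliant'
THRESHOLD_POSITIVE = 0.9   # ratio below this: 'positive'


def classify_ratio(ratio):
    """Classify one test-line/control-line ratio."""
    if ratio > THRESHOLD_COMPLIANT:
        return "compliant"
    if ratio >= THRESHOLD_POSITIVE:
        return "low positive"  # 0.9 to 1.1, boundaries included (assumed)
    return "positive"


def read_dipstick(test_intensities, control_intensity):
    """Classify every test line of one dipstick.

    test_intensities: dict of line name -> measured colour intensity.
    """
    if control_intensity <= 0:
        # Assumption: no control signal means the dipstick is invalid.
        raise ValueError("control line not visible: dipstick invalid")
    return {
        line: classify_ratio(value / control_intensity)
        for line, value in test_intensities.items()
    }


# Hypothetical intensities, for illustration only
example = read_dipstick(
    {"CAP": 130.0, "QUINO": 100.0, "TYL": 60.0, "SULFA": 120.0},
    control_intensity=100.0,
)
print(example)
```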

Results and discussion

Visual assessment

Data from the qualitative results (visual assessment) were used to estimate an upper confidence limit for the average false-negative and false-positive probabilities across laboratories. Visual assessment gave 100 % negative results (n = 42) for ‘blank’ honeys and 100 % positive results (n = 42) for test samples containing analyte at 100 % of the screening target concentration for all analytes. Hence, the average false-positive and false-negative probabilities for visual assessment (at 100 % STC) are estimated to be, with 95 % confidence, no higher than 7 % for all analytes. This is estimated using a published approach [12], whereby if 42 independent observations are made without error, then the underlying rate of errors may be 0 % and is, with 95 % confidence, less than $1 - 0.05^{1/42} \approx 0.069$, i.e. about 7 %. One sample out of 42 containing chloramphenicol at 50 % of the target concentration was assessed to be negative. A more robust assessment of the method performance was conducted using the numerical data obtained using the Readsensor® (see below).
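The 7 % bound can be checked numerically. If n independent observations are all correct, the largest error rate p still compatible with that outcome at a given confidence level satisfies (1 − p)^n = 1 − confidence. The sketch below evaluates this expression; the function name is illustrative and not taken from reference [12].

```python
def upper_error_rate_bound(n, confidence=0.95):
    """Upper bound on the error rate when n independent
    observations are made with zero errors.

    Solves (1 - p)**n = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


bound = upper_error_rate_bound(42)
print(f"{bound:.1%}")
```

With n = 42 the bound rounds to 7 %, the value quoted above; halving the number of error-free observations would roughly double it.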

Instrumental assessment

Readsensor® results from the 42 measurements obtained over 3 days by the seven laboratories are shown in Tables 1, 2 and 3. These results were analysed (analysis of variance, random effects model with ‘days’ nested within laboratories) to produce one-tailed beta-expectation tolerance intervals [13, 14]. One-tailed beta-expectation tolerance intervals were used to provide the response ratio (a) above which 95 % of results would be expected to lie for samples that do not contain the analyte and (b) below which 95 % of results would be expected to lie for samples that contain the analyte at 50 and 100 % of the screening target concentration (Table 4). No results were excluded from the analysis.

Table 1 Readsensor® results for unfortified (blank) honey
Table 2 Readsensor® results for honey fortified at 50 % of the screening target concentration
Table 3 Readsensor® results for honey fortified at 100 % of the screening target concentration
Table 4 Analysis of Readsensor® results

The intervals were estimated as follows. For each concentration, the reproducibility standard deviation $S_R$ was estimated by:

$$ {S}_R=\sqrt{S_{\mathrm{lab}}^2+{S}_{\mathrm{day}}^2+{S}_r^2} $$
(1)

where $S_{\mathrm{lab}}^2$ and $S_{\mathrm{day}}^2$ were the between-lab and between-day components of variance, and $S_r^2$ was the error variance. The standard error associated with the mean (across laboratories and days) was estimated by:

$$ {S}_{\mathrm{mean}}=\sqrt{\frac{S_{\mathrm{lab}}^2}{n_{\mathrm{lab}}}+\frac{S_{\mathrm{day}}^2}{n_{\mathrm{day}}}+\frac{S_r^2}{n_r}} $$
(2)

where $n_{\mathrm{lab}}$ and $n_{\mathrm{day}}$ were the number of laboratories and days (seven labs; 21 days: 3 days per laboratory) and $n_r$ was the total number of measurements (42).

Then, for samples that did not contain the analyte, the lower limit of the 95 % beta-expectation tolerance interval ($y_L$) was estimated by:

$$ {y}_L=\overline{y}-{t}_{df,0.95}\sqrt{S_R^2+{S}_{\mathrm{mean}}^2} $$
(3)

For samples that contained the analyte at 50 and 100 % of the screening target concentration, the upper limit of the 95 % beta-expectation tolerance interval ($y_U$) was estimated by:

$$ {y}_U=\overline{y}+{t}_{df,0.95}\sqrt{S_R^2+{S}_{\mathrm{mean}}^2} $$
(4)

where $t_{df,0.95}$ is the 95th percentile of the t distribution with $df$ degrees of freedom. The number of degrees of freedom, $df_R$, associated with the reproducibility standard deviation $S_R$, was estimated by the Welch–Satterthwaite equation:

$$ d{f}_R=\frac{{\left({S}_{\mathrm{lab}}^2+{S}_{\mathrm{day}}^2+{S}_r^2\right)}^2}{\frac{S_{\mathrm{lab}}^4}{d{f}_{\mathrm{lab}}}+\frac{S_{\mathrm{day}}^4}{d{f}_{\mathrm{day}}}+\frac{S_r^4}{d{f}_r}} $$
(5)

where $df_{\mathrm{lab}}$, $df_{\mathrm{day}}$ and $df_r$ are the numbers of degrees of freedom associated with the estimates of the between-lab, between-day and residual error components of variation, respectively. Table 4 shows how beta-expectation tolerance intervals were calculated for each concentration of each analyte.
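As an illustration of how Eqs. 1 to 5 combine, the sketch below computes the reproducibility standard deviation, the standard error of the mean, the Welch–Satterthwaite degrees of freedom and the two tolerance limits from a set of variance components. It is a minimal sketch and not the authors' analysis code. The function names and the numerical inputs are hypothetical. The degrees of freedom used in the example (6, 14 and 21) are an assumption based on a balanced nested design with seven laboratories, three days per laboratory and two results per day. The t quantile is passed in as an argument because a fractional number of degrees of freedom requires a statistics library (for example `scipy.stats.t.ppf(0.95, df)`).

```python
import math


def reproducibility_sd(var_lab, var_day, var_r):
    """Eq. 1: reproducibility standard deviation S_R."""
    return math.sqrt(var_lab + var_day + var_r)


def standard_error_mean(var_lab, var_day, var_r, n_lab=7, n_day=21, n_r=42):
    """Eq. 2: standard error of the overall mean."""
    return math.sqrt(var_lab / n_lab + var_day / n_day + var_r / n_r)


def welch_satterthwaite(var_lab, var_day, var_r, df_lab, df_day, df_r):
    """Eq. 5: effective degrees of freedom of S_R."""
    numerator = (var_lab + var_day + var_r) ** 2
    denominator = (
        var_lab ** 2 / df_lab + var_day ** 2 / df_day + var_r ** 2 / df_r
    )
    return numerator / denominator


def tolerance_limits(mean, s_R, s_mean, t_quantile):
    """Eqs. 3 and 4: lower and upper one-tailed tolerance limits."""
    half_width = t_quantile * math.sqrt(s_R ** 2 + s_mean ** 2)
    return mean - half_width, mean + half_width


# Hypothetical variance components for one analyte at one concentration
var_lab, var_day, var_r = 0.020, 0.005, 0.010

s_R = reproducibility_sd(var_lab, var_day, var_r)
s_mean = standard_error_mean(var_lab, var_day, var_r)
df_R = welch_satterthwaite(var_lab, var_day, var_r, df_lab=6, df_day=14, df_r=21)

# t_quantile is a placeholder; in practice it would be looked up for df_R
y_L, y_U = tolerance_limits(mean=1.6, s_R=s_R, s_mean=s_mean, t_quantile=1.8)
print(round(s_R, 3), round(s_mean, 3), round(df_R, 1), round(y_L, 2), round(y_U, 2))
```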

Beta-expectation tolerance intervals were inverted numerically to provide an estimate of the expected false-positive and false-negative probability associated with a range of different critical values for response ratios. This was done by adjusting the probability associated with the t value in Eqs. 3 and 4 until the value of $y_L$ (for false positives) or $y_U$ (for false negatives) matched the value of the critical level that was being assessed. The relation between the critical value that is chosen to make detection decisions and the expected false-positive and false-negative probabilities is shown in Figs. 2, 3, 4 and 5.
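The inversion can be illustrated with a short sketch. Rather than adjusting the probability iteratively, the version below solves Eq. 3 or Eq. 4 for the t value that places the limit at the critical ratio and then evaluates the t distribution at that point; this is a closed-form shortcut that should give the same answer, and it is an assumption about an equivalent route rather than the procedure actually used. The Student t cumulative distribution is obtained by simple numerical integration so that the example needs only the standard library; all function names and the example inputs are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution (df may be fractional)."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """CDF by composite Simpson integration from 0 to |t| (steps even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def false_positive_probability(critical, mean_blank, s_R, s_mean, df):
    """Expected fraction of blank results falling below the critical ratio."""
    t = (mean_blank - critical) / math.sqrt(s_R ** 2 + s_mean ** 2)
    return 1.0 - t_cdf(t, df)


def false_negative_probability(critical, mean_spiked, s_R, s_mean, df):
    """Expected fraction of spiked results falling above the critical ratio."""
    t = (critical - mean_spiked) / math.sqrt(s_R ** 2 + s_mean ** 2)
    return 1.0 - t_cdf(t, df)


# Hypothetical summary statistics, for illustration only
for critical in (0.9, 1.0, 1.1):
    fp = false_positive_probability(critical, 1.6, 0.19, 0.06, df=12)
    fn = false_negative_probability(critical, 0.4, 0.15, 0.05, df=12)
    print(critical, round(fp, 4), round(fn, 4))
```

Raising the critical value increases the expected false-positive probability and lowers the expected false-negative probability, which is the trade-off plotted in Figs. 2 to 5.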

Fig. 2 Relation between expected false-positive probability, false-negative probability and the critical value for reader response for sulfathiazole

Fig. 3 Relation between expected false-positive probability, false-negative probability and the critical value for reader response for tylosin

Fig. 4 Relation between expected false-positive probability, false-negative probability and the critical value for reader response for ciprofloxacin

Fig. 5 Relation between expected false-positive probability, false-negative probability and the critical value for reader response for chloramphenicol

Data from Table 4 and Figs. 2, 3 and 4 clearly demonstrate that the instrumental (Readsensor®) measurement of the Bee4sensor® multiplex test is capable of detecting sulfathiazole and tylosin at or below a concentration of one half of the EU recommended concentrations [5], i.e. 25 and 10 μg kg−1, respectively. In addition, ciprofloxacin could be reliably detected at 25 μg kg−1. Furthermore, data previously obtained within our laboratory demonstrated that the method is also capable of detecting a wide range of sulfonamides (n = 10) and fluoroquinolones (n = 7) at similar concentrations (data not shown) to the ‘representative analytes’ that were selected for use in this inter-laboratory validation exercise. Unfortunately, the Bee4sensor® kit was not sufficiently sensitive to detect chloramphenicol at the EU MRPL of 0.3 μg kg−1 [4]. The method did, however, reliably detect this analyte at 5 μg kg−1 and would therefore be a valuable tool to give an early warning of honey containing higher concentrations. The detection of higher concentrations of chloramphenicol may be of more use to the honey industry when testing individual hives or smaller lots rather than a blended bulk product, as it is well known that high residue concentrations (between 200 and 20,000 μg kg−1) can be found in honey taken directly from the hive for over 100 days after dosing [15]. It may also be of use to enforcement authorities in times of emergency when many samples need to be tested within a short timescale. In this case, the multi-sensor could be used to quickly identify honeys containing the highest concentration of this banned antimicrobial, before subjecting any suspect (or ‘screen positive’) sample to confirmatory analysis by LC-MS/MS [11].

Conclusions

The results produced by the dipstick, determined by visual observation or instrumental measurement of the four test lines versus the control line, were comparable.

The programmed set of ratios, to enable objective interpretation of the colour intensity with the software method, was deemed suitable to generate consistent Readsensor® results. The chosen critical values/threshold ratios were confirmed by the statistical evaluation of the data obtained from the analysis of blind samples at 0, 50 and 100 % of the EU recommended concentrations [5]. The inter-laboratory validation of the multi-sensor, with seven European laboratories, therefore demonstrated that the Bee4sensor® kit is both robust and capable of meeting the target concentrations set within the EU for sulfonamides (sulfathiazole), fluoroquinolones (ciprofloxacin) and tylosin. The method is also suitable for detecting chloramphenicol in honey at concentrations ≥5 μg kg−1.