Introduction

Pharmaceuticals are emerging environmental micropollutants that have received increasing attention worldwide in recent years (Europe [1–4], America [5, 6], Australia [7], Asia [8], and Africa [9]). Their continuous introduction into the environment, bio-recalcitrance, and intrinsic ability to interfere with organisms concern the scientific community because of their potential ecotoxic effects, toxicity towards humans, and the selection of antibiotic resistance in bacteria [10–12]. Although observed concentrations in the environment, ranging from nanograms per liter to micrograms per liter, are for most pharmaceuticals below the acute toxicity lowest observed effect concentrations (LOECs), there is only limited knowledge about chronic effects, toxicity of transformation products, and mixture toxicity [13]. Concentration levels in wastewater effluents and effluent-influenced surface waters approaching chronic toxicity LOECs have been observed recently for some specific pharmaceuticals such as carbamazepine, clofibric acid, diclofenac, fluoxetine, propranolol, salicylic acid, and oxazepam, and increased tetracycline and sulphonamide antibiotic resistance was observed in wastewater effluents [13–15]. No legislative framework currently exists in the European context defining allowable concentrations for these potentially harmful pharmaceuticals in the environment. However, the European Commission recently introduced the watch list for emerging substances in the aquatic environment (Directive 2013/39/EU), which includes the pharmaceutical diclofenac.

The growing interest in screening and quantification of this diverse group of pharmaceuticals in all kinds of environmental samples requires multi-residue analytical techniques. Full-spectrum high-resolution mass spectrometers (HRMS) such as time-of-flight (TOF) and Orbitrap instruments are therefore a promising alternative to the current state-of-the-art triple quadrupole tandem mass spectrometry (MS/MS) instruments. The latter typically perform a target analysis on a predefined, limited set of target compounds and thereby depend on the availability of standards. In contrast, the full-spectrum HRMS approach has shown the potential to analyze and identify, based on accurate mass, a virtually unlimited number of analytes and offers the ability to perform suspect screening and target quantification simultaneously [9, 16–24]. In suspect screening using full-spectrum HRMS instruments, there is no a priori need for standards because the acquired chromatograms are searched for the exact ion masses of a predefined list of suspect compounds within a certain mass tolerance [20]. In a next stage, confirmation of the found suspects with analytical standards based on retention time and/or fragment ions is possible, and target quantification can be performed through validation of only the limited set of confirmed compounds.

Recently, different accurate mass-based screening strategies were developed and applied for suspect screening of pharmaceuticals and other micropollutants in surface waters [9, 16, 25, 26]. However, avoiding numerous false-negative findings while reducing the number of false-positive findings is still a main challenge, and the performance and optimization of such screening strategies have not yet been systematically investigated.

Apart from multi-residue screening, achieving quantification of trace amounts is a second challenge in environmental analysis. Usually, samples must be preconcentrated using an enrichment step such as solid-phase extraction (SPE), and a clean-up of interfering matrix compounds is necessary to enhance the method’s performance limits. However, a recent review discussed the applicability of large-volume injection (LVI) as an alternative for the widely applied but labor-intensive SPE techniques for trace analysis of environmental matrices thereby speeding up the analytical procedure [27].

Hence, the aim was to investigate and improve the potential of large-volume injection–ultra performance liquid chromatography (LVI-UPLC) in combination with quadrupole time-of-flight (QTOF) HRMS for both fast screening and target quantification of traces of pharmaceuticals. An optimized and validated novel analytical method for a broad variety of multi-class pharmaceuticals is presented, hereby aiming to screen and quantify traces of pharmaceuticals in drinking and surface water. To reach these goals, prior to this research, we investigated and optimized the determination of the accurate mass for qualitative analysis, the construction of extracted ion chromatograms for quantification, and the calculation of the decision limit and detection capability for the validation in HRMS applications [28].

Starting from this knowledge, a suspect screening strategy was developed applying a novel signal-intensity-dependent mass error tolerance aiming at the detection of 69 pharmaceuticals without the a priori availability of standards, thereby keeping the false-negative rate at 5 % and simultaneously minimizing the number of false-positive findings (section on “Development of the signal-intensity-dependent suspect screening model”). In a second part, both spiked and unspiked drinking and surface water samples were analyzed, and the results of a full validation for target quantification of the 69 pharmaceuticals are presented (section on “Validation for target quantification”). Finally, the results of both the suspect screening and target quantification study on one drinking water sample and five Belgian surface water samples are presented (section on “Application in surface and drinking water”). These results are to be interpreted as a first application of the new method and a proof of concept without aiming to set up an extended monitoring campaign. The applicability, advantages, and limitations of large-volume injection ultra-performance liquid chromatography (UPLC) in combination with full-spectrum HRMS for rapid screening are discussed in the section “Evaluation of large-volume injection UPLC and HRMS for rapid screening and quantification: pros and cons.” A comprehensive scheme representing the workflow for this study is presented in Fig. 1.

Fig. 1

Comprehensive scheme representing the workflow for the development of the signal-intensity-based screening approach and for its application and evaluation prior to target quantification

Experimental section

Chemicals

The 69 pharmaceutical standards and their respective suppliers are listed in Table S1. They comprise a representative set from both an analytical and an environmental point of view, including some highly consumed pharmaceuticals such as paracetamol, ibuprofen, and diclofenac, and covering a broad range of physical–chemical properties. The molar masses of the studied pharmaceuticals range between 151 and 1,240 Da, and they have a wide range of log Kow values (e.g., −2.8 for iohexol; 4.2 for diclofenac; 4.7 for fluoxetine). Methanol, acetonitrile, and formic acid were purchased from Biosolve (Valkenswaard, The Netherlands) and NaOH from Merck (Darmstadt, Germany). Deionized water was produced using Q-Gard2 cartridges in a MilliQ-water system (Millipore, USA).

Individual stock solutions of the pharmaceuticals were prepared on weight basis and dissolved in 10 mL of solvent (used solvents are listed in Table S1) to a final concentration of 1 mg mL−1. Daily, a standard mix of the pharmaceuticals was prepared at a concentration of 10 μg L−1 in deionized water. Standard and matrix-matched calibration curves were prepared by serial dilution of the standard mix to a final concentration of 5, 1, 0.5, 0.1, 0.05, and 0.01 μg L−1 in deionized, drinking, and surface water, respectively.

Sampling and sample pretreatment

Drinking water was taken from a drinking water production center (Antwerpse Waterwerken) in Rumst, Belgium. Five surface water samples were collected in prerinsed amber glass bottles at five different locations along the river Maas and the Albert channel, Belgium, and stored at 4 °C in the dark for no longer than 24 h prior to analysis. For the method validation, a drinking water sample and a surface water sample from the Albert channel, Belgium, were stored for a 1-month period and used for all the validation experiments. Prior to standard addition, surface water samples were filtered through 1.5 μm glass microfiber filters (934-AH, Whatman), and subsequently, 0.1 % and 0.02 % (v/v) formic acid was added to all samples for analysis in electrospray positive and negative ion mode, respectively.

Instrumental analysis

The analyses were performed using a Waters Acquity UPLC system (Waters, Milford, USA) equipped with an autosampler (CT2777 Sample Manager, Waters, Milford, USA) with a 250 μL loop for large-volume injection and coupled to a Xevo G2 QTOF mass spectrometer with an orthogonal electrospray ionization (ESI) probe (Waters Corporation, Manchester, UK). Chromatographic separation was achieved with an Acquity UPLC HSS T3 150 × 2.1 mm column with 1.8 μm particle size supplied by Waters (Milford, USA) operated at 50 °C.

Briefly, for analysis in electrospray positive ion mode, the mobile phase used was (A) water/acetonitrile 98:2 (v/v) with 0.1 % formic acid and (B) acetonitrile with 0.1 % formic acid. In electrospray negative ion mode, the mobile phase used was (A) water/acetonitrile 98:2 (v/v) with 0.01 % formic acid and (B) acetonitrile. The elution gradient for both modes started with 1 min isocratic at 3 % B at a flow rate of 450 μL min−1 followed by a linear increase to 98 % B in 11 min. Subsequently, the flow rate was increased to 600 μL min−1, and initial conditions were recovered in 3 min. The total time for the chromatographic analysis was 19 min. The first 1.6 min of the eluent was diverted to the waste to prevent clogging of capillaries or build-up of salts on optics in the mass spectrometer. The sample injection volume was 250 μL.

The QTOF mass spectrometer was operated at a resolving power of 20,000 at full width at half maximum acquiring profile data over an m/z range of 50–1,200 Da. Data were acquired in MSE mode in which two acquisition functions with a low (LE function) and a high (HE function, ramped from low to high) collision energy acquire alternating parent and fragment ions, respectively. Leucine enkephalin was used as lock mass for the mass calibration and was continuously infused via the lock mass ESI probe. Vergeynst et al. [28] gave a more detailed overview of the chromatographic and mass spectrometric conditions. The data station operating software was Masslynx version 4.1 (Waters).

Development of the suspect screening methodology

Investigation of the relation between the accurate mass error and the ion’s signal intensity

For screening and accurate mass determination, the chromatograms were converted to centroid data by the automated peak detection algorithm provided with the Masslynx software version 4.1 (Waters) [28]. Extracted ion chromatograms (XICs) were constructed utilizing an optimized mass window width of 50 ppm (exact mass ± 25 ppm [28]) around the exact masses of the [M+H]+ and [M–H]− ions for the positive and negative ion mode, respectively, and the accurate mass attributed to a chromatographic peak was determined as the averaged mass over seven consecutive centroid scans around the chromatographic peak apex. The obtained data were subsequently copied to an Excel spreadsheet for further data treatment.
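The window and averaging rules described above can be illustrated with a minimal Python sketch. The function names and the input layout (one centroid per scan, already restricted to the XIC window) are assumptions made for illustration, and an unweighted mean over the seven scans is assumed; the actual processing was done with Masslynx and a spreadsheet.

```python
def ppm_window(exact_mass, half_width_ppm=25.0):
    """Return the (low, high) m/z bounds of an XIC window of exact mass +/- half_width_ppm."""
    delta = exact_mass * half_width_ppm * 1e-6
    return exact_mass - delta, exact_mass + delta


def mass_error_ppm(accurate_mass, exact_mass):
    """Relative deviation of the measured (accurate) mass from the exact mass, in ppm."""
    return (accurate_mass - exact_mass) / exact_mass * 1e6


def apex_accurate_mass(scans, n_scans=7):
    """Average the centroid m/z over n_scans consecutive scans centred on the peak apex.

    scans: list of (mz, intensity) tuples in acquisition order (hypothetical layout).
    Assumption: plain arithmetic mean; fewer scans are used if the apex lies near an edge.
    """
    apex = max(range(len(scans)), key=lambda k: scans[k][1])
    half = n_scans // 2
    selected = scans[max(0, apex - half):min(len(scans), apex + half + 1)]
    return sum(mz for mz, _ in selected) / len(selected)
```

For a hypothetical ion at m/z 200.0, `ppm_window(200.0)` spans roughly 199.995 to 200.005, which corresponds to the 50 ppm total window width.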

To develop the screening strategy, a model describing the relation between the accurate mass error and the ion’s signal intensity has been defined and calibrated. Therefore, a surface water sample spiked with analytical standards (0.01, 0.05, 0.1, 0.5, 1, and 5 μg L−1) of a sub-selection of 44 pharmaceuticals (Table S1) was analyzed and used as training dataset. The 44 pharmaceuticals covered the whole mass range from 151 to 1,240 Da (Table S1) and represented the desired wide signal intensity range from 100 to >100,000 arbitrary units (a.u.). The model calibration was performed using the R 2.14.1 (www.r-project.org) software.

Retention time and fragments for confirmation

Analytical standard mixtures of five to ten pharmaceuticals with a concentration of 10 μg L−1 in deionized water were injected and analyzed in electrospray positive and negative modes for the determination of the retention time (tR) and for the identification of the most abundant fragment ion for confirmation. The pharmaceuticals in the standard mixtures were selected so that their peaks were well separated in the chromatogram. A mass difference of at least 18 Da was preferred for the selection of a fragment ion to avoid the non-specific loss of water or ammonia. The retention time and most representative fragment ion are presented in Table S1. The selected ionization mode for each pharmaceutical was the ionization mode for which the lowest instrumental decision limit (section “Instrumental validation”) was obtained.

Validation strategy for target quantification

For quantification purposes, the extracted ion chromatograms were generated and manually integrated from the raw profile data utilizing an optimized mass window width of 50 ppm [28]. The validation was performed taking the EU Commission Decision 2002/657/EC as a guideline. The method validation was performed for drinking and surface water, and deionized water was used for the instrumental validation. Only peaks deviating not more than 2.5 % from the retention time listed in Table S1 are considered (2002/657/EC).

Instrumental validation

For the intraday and interday instrumental validation, five repetitions of a standard calibration curve (blank, 0.01, 0.05, 0.1, 0.5, 1, and 5 μg L−1 in deionized water) were performed on 1 day and on 5 days in a time period of 2 weeks, respectively. Instrumental repeatability and reproducibility are expressed as the relative standard deviation (RSD) of the integrated peak areas of five repeated injections of analytical standards on 1 and 5 days, respectively. The instrumental decision limit (CCα) and instrumental detection capability (CCβ) are determined from the repeatability data, following the methodology recently proposed by Vergeynst et al. [28]. Linearity is tested based on the F-test for lack of fit in the regression (2002/657/EC) for the standard calibration curve under repeatability conditions (n = 5 for each concentration level). If non-linearity is concluded, linearity is tested again after contracting the working range by omitting the highest concentration level.

Calibration and quantification

Daily external calibration was performed to account for the interday variability of the analytical sequence. The parameters of the standard calibration curve are estimated by weighted least squares with the weights for the squared residuals estimated as the reciprocal of the squared concentration (1/x²).
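As an illustration of this weighting scheme, the following sketch fits a straight line by weighted least squares with weights 1/x². It is a minimal example under stated assumptions, not the software used in the study: the function names are hypothetical, and the blank (x = 0) is assumed to be left out of the fit because its weight would be undefined.

```python
def weighted_linear_fit(concentrations, peak_areas):
    """Fit area = intercept + slope * concentration with weights w = 1 / concentration**2.

    Assumption: all concentrations are > 0 (the blank is excluded from the fit).
    Returns (intercept, slope).
    """
    weights = [1.0 / (x * x) for x in concentrations]
    sw = sum(weights)
    swx = sum(w * x for w, x in zip(weights, concentrations))
    swy = sum(w * y for w, y in zip(weights, peak_areas))
    swxx = sum(w * x * x for w, x in zip(weights, concentrations))
    swxy = sum(w * x * y for w, x, y in zip(weights, concentrations, peak_areas))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    intercept = (swy - slope * swx) / sw
    return intercept, slope


def quantify(peak_area, intercept, slope):
    """Invert the calibration line to obtain a calculated concentration."""
    return (peak_area - intercept) / slope
```

With 1/x² weights, each calibration level contributes according to its relative rather than its absolute residual, so the low levels near the decision limit are not dominated by the 5 μg L−1 standard.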

For quantification in both drinking and surface water samples, matrix effects have to be determined. Therefore, the calculated concentrations of a matrix-matched calibration curve were plotted as a function of the theoretical concentration (n = 5 per concentration level, reproducibility conditions). The slope of this curve equals the extent of the matrix effects. A slope = 1 (expressed as 100 %) is obtained when no matrix effects are present, and slopes >1 and <1 represent signal enhancement and suppression, respectively. When quantifying pharmaceuticals in drinking and surface water (section on “Target quantification”) the calculated concentrations were corrected for the matrix effects, which were determined on samples collected at the same locations.
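The matrix-effect slope and the subsequent correction can be sketched as follows. The source does not state how the slope was fitted, so an ordinary least-squares line with intercept is assumed here, and the function names are hypothetical.

```python
def matrix_effect_slope(theoretical, calculated):
    """Slope of calculated vs. theoretical concentration (ordinary least squares, assumed).

    A slope of 1.0 (100 %) means no matrix effect; < 1 suppression, > 1 enhancement.
    """
    n = len(theoretical)
    mean_x = sum(theoretical) / n
    mean_y = sum(calculated) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(theoretical, calculated))
    sxx = sum((x - mean_x) ** 2 for x in theoretical)
    return sxy / sxx


def correct_for_matrix_effect(calculated_concentration, slope):
    """Divide the externally calibrated concentration by the matrix-effect slope."""
    return calculated_concentration / slope
```

For example, a slope of 0.5 (50 %, i.e., signal suppression) would turn a calculated concentration of 0.25 μg L−1 into a corrected value of 0.5 μg L−1.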

Method validation

The method validation for both the drinking and surface water consisted of a matrix-matched calibration curve (unspiked, 0.01, 0.05, 0.1, 0.5, 1, and 5 μg L−1), which was repeated on 5 days within 2 weeks. Daily, a standard calibration curve (blank, 0.01, 0.05, 0.1, 0.5, 1, and 5 μg L−1 in deionized water) was analyzed for external calibration. Each series was followed by a blank assay to prevent cross-contamination. The method reproducibility expresses the precision over 5 days as the RSD of the calculated concentrations after calibration. The method decision limit (CCα) and method detection capability (CCβ) were determined as explained in the section “Instrumental validation,” considering that here, the peak areas are replaced by calculated concentrations in the matrix sample. The mass error was determined for all the compounds in deionized, drinking, and surface water at the concentration level corresponding to the respective CCαs and CCβs, and at 5 μg L−1. The mass error was determined under reproducibility conditions (n = 5), and its precision was determined by calculating the 95 % confidence limit (1.96 × standard deviation).

Results and discussion

Large-volume injection ultra-performance liquid chromatography

The applied gradient allowed sufficient retention and separation of the targeted analytes on an Acquity UPLC HSS T3 column. The 69 analytes elute within a retention time ranging from 1.94 to 11.49 min. A chromatogram of an analytical standard is presented in Fig. 2. During the optimization process, particular efforts were made to improve the chromatography of early eluting analytes, which can be affected by the LVI. The length of the initial isocratic hold was increased to 1 min, which allowed better column focusing and improved the peak shape of fast eluting compounds. The injected solvent, water, which has an elution strength lower than that of the starting gradient, enabled sufficient retention and good-quality peak shapes. Addition of formic acid (0.1 % and 0.02 % in ESI-positive and ESI-negative mode, respectively) to the samples improved the peak shapes (reduced double peaks, sharper peaks, less tailing) for early eluting (tR < 6 min) compounds (sulfadiazine, sulfamerazine, sulfamethoxazole, and salicylic acid).

Fig. 2

Extracted ion chromatogram (XIC 50 ppm) of an analytical standard (5 μg L−1) showing the good peak shape of 23 selected analytes having retention times distributed over the whole chromatographic analysis. Relative signal intensities are used for the y-axis to show the peak shape of both low and high signal intensity peaks. The absolute signal intensity (a.u.) is given for each peak between brackets

Development of the signal-intensity-dependent suspect screening model

In a first stage, in order to create the training dataset, the obtained chromatograms of the spiked surface water samples were searched for the exact masses of a sub-selection of 44 pharmaceuticals (Table S1). The mass error tolerance was initially set at ±25 ppm so that all the peaks present in the constructed XICs were found, and a reasonably low minimal signal intensity (i.e., chromatographic peak height) of 100 a.u. was chosen to avoid the detection of numerous noise peaks. For the given set of pharmaceuticals, the lowest concentration corresponding to a signal intensity of at least 100 a.u. in surface water is given in Table S1. The aim was to investigate to what extent the mass error tolerance could be narrowed while assuring a false-negative rate of 5 % and avoiding numerous false-positive findings. To label a peak as confirmed, its retention time cannot deviate more than 1.96 × standard deviation, i.e., within the 95 % confidence interval, from the retention time listed in Table S1 (\( \mathrm{deviation}\kern0.3em {t}_{\mathrm{R}}\le 1.96\cdot {\sigma}_{t_{\mathrm{R}}} \)). The sub-selection of 44 pharmaceuticals provided enough data for the model development, and the resulting training dataset consisted of a total of 208 observations (208 traces with a signal intensity >100 a.u.).

The variability of the accurate mass error obtained with the applied TOF-MS was shown to strongly decrease with increasing signal intensity, in agreement with Vergeynst et al. [28] and Wolff et al. [29]. The variability of the mass error (ME) and the log-transformed signal intensities (i) are inversely related: \( \mathrm{variability}\; ME\sim \frac{1}{ \log (i)} \) (Fig. 3). Hence, the variability of the mass error was modeled as \( ME\cdot \log (i)\sim N\left(0,{\sigma}^2\right) \), and a value of 10.96 was obtained for the standard deviation (σ) after fitting the model to the training dataset. The modeled variability showed good normality as evaluated from a Q–Q plot (Figure S1) from which a good fit within the two first quantiles can be concluded.
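The model above can be fitted in several ways, and the source does not detail the procedure used in R. As a hedged sketch, the maximum-likelihood estimate of σ for a zero-mean normal model is the root mean square of the products ME · log(i); a base-10 logarithm is assumed here, and the function name and input layout are hypothetical.

```python
import math


def estimate_sigma(observations):
    """Estimate sigma in ME * log10(i) ~ N(0, sigma^2).

    observations: iterable of (mass_error_ppm, signal_intensity) pairs with intensity > 1.
    Assumption: zero-mean model, so sigma is the root mean square of ME * log10(i).
    """
    scaled = [me * math.log10(i) for me, i in observations]
    return math.sqrt(sum(z * z for z in scaled) / len(scaled))
```

For instance, a peak with a mass error of 5 ppm at 100 a.u. and one with −2.5 ppm at 10,000 a.u. both give |ME · log(i)| = 10, so these two observations alone would yield σ = 10.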

Fig. 3

The variability of the mass error decreases inversely with the log-transformed signal intensity. The training dataset (closed circles) is used for the calculation of the 95 % confidence limits (section on “Development of the signal-intensity-dependent suspect screening model”). Screening results of one drinking water and five surface water samples of the confirmed (plus signs) and non-confirmed (open circles) suspects based on the retention time are also presented (section on “Application of the suspect screening methodology”)

This model made it possible to draw the 95 % confidence limits of the mass error as a function of the signal intensity, for which holds that \( \left| ME\right|\le \frac{1.96\cdot \sigma }{ \log (i)}\kern0.3em \left(\mathrm{ppm}\right) \) with σ = 10.96. Of the 208 confirmed observations in the training dataset, eight fell outside the 95 % confidence limits, resulting in an effective false-negative rate of 4 %. As an important outcome of the newly developed screening model, a mass error tolerance of 10.7, 7.2, and 5.4 ppm will be applied for the positive conclusion of peaks with signal intensities of 100, 1,000, and 10,000 a.u., respectively, or, in other words, the observations should fall within the 95 % confidence limits in Fig. 3 to be restrained.
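The tolerance rule can be written as a short function. This is an illustrative sketch with hypothetical names; the base-10 logarithm is inferred from the reported tolerances, which are reproduced only when log is taken to base 10.

```python
import math

SIGMA = 10.96          # fitted standard deviation reported in the text
Z_95 = 1.96            # two-sided 95 % quantile of the standard normal distribution
MIN_INTENSITY = 100.0  # minimal signal intensity (a.u.) used for screening


def mass_error_tolerance_ppm(intensity, sigma=SIGMA):
    """Signal-intensity-dependent mass error tolerance, 1.96 * sigma / log10(i), in ppm."""
    if intensity < MIN_INTENSITY:
        raise ValueError("peaks below the minimal signal intensity are not screened")
    return Z_95 * sigma / math.log10(intensity)


for i in (100, 1_000, 10_000):
    print(i, round(mass_error_tolerance_ppm(i), 1))
# prints:
# 100 10.7
# 1000 7.2
# 10000 5.4
```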

Validation for target quantification

Instrumental validation

The results of the instrumental validation are given in Table 1 (instrumental CCα) and Table S2 (intraday repeatability, interday reproducibility, instrumental CCβ, and linear range). For a majority of the compounds, the RSDs of the instrumental intraday repeatability and interday reproducibility are more or less constant in the upper concentration range and increase for concentrations close to the instrumental CCα. This is illustrated in Fig. 4 for diclofenac. The standard deviation increases linearly with the concentration, whereas the RSD increases at lower concentrations, i.e., close to CCα and CCβ, and levels off at higher concentrations. These findings are in agreement with CMA 6A [30] and confirm the validity of the applied weighted least squares methodology (1/x² weighting) for the linear calibration (section on “Calibration and quantification”). For more information on the weighted least squares theory, the reader is referred to Neter et al. [31]. The intraday variability was better than 20 % for most of the analytes over the whole concentration range. Higher interday RSDs, with some values >40 %, are noticed for concentrations at or close to the instrumental CCα; they occur when no trace was found for at least one out of the five repeated injections (e.g., clenbuterol, cyclophosphamide, fluoxetine, furazolidone, and ketoprofen). Daily external calibration is performed to take interday instrumental variations into account.

Table 1 Parameters of the instrumental and method validation indicating the performance of the analytical method for surface water
Fig. 4

The standard deviation (closed circles) on the integrated peak area measured under reproducibility conditions increases proportionally with the concentration of diclofenac in deionized water. The relative standard deviation (open circles) shows a steep decrease at low concentrations (<0.5 μg L−1) and subsequently levels off

The instrumental decision limits ranged from 2.5 to 125 pg injected for all compounds. For some compounds, the peak intensity indicates that even CCαs lower than 2.5 pg (0.01 μg L−1) could be reached with the TOF-MS used in this study, but this could not be confirmed because the lowest tested concentration level was 0.01 μg L−1. Comparable (i.e., between tenfold higher and tenfold lower) instrumental detection limits (IDLs) were found in literature for multi-residue methods for the analysis of the same pharmaceuticals using TOF and triple quadrupole mass spectrometers (Fig. 5) [18, 19, 32–37]. On the other hand, up to 100-fold lower instrumental detection limits were found for quadrupole linear-ion trap tandem mass spectrometers [3].
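The amounts on column quoted above follow from the 250 μL injection volume. As a worked unit check (simple arithmetic, not an additional result):

```latex
\[
m_{\mathrm{inj}} = c \cdot V_{\mathrm{inj}}
= 0.01\ \mu\mathrm{g\,L^{-1}} \times 250\ \mu\mathrm{L}
= 2.5 \times 10^{-12}\ \mathrm{g}
= 2.5\ \mathrm{pg}
\]
```

By the same conversion, the upper value of 125 pg injected corresponds to a concentration of 0.5 μg L−1.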

Fig. 5

Comparison of the instrumental (IDL, picograms on column) and method (MDL, concentration in matrix) detection limits of multi-residue methods for pharmaceuticals in drinking and surface water using SPE and online-SPE combined with different MS instruments. Only the pharmaceuticals being the same as those used in this research are considered, and the number of corresponding compounds is given (n). The boxplots show the minimal and maximal values, and the 25, 50, and 75 % percentiles

For a majority of the compounds, linearity was demonstrated for a range of at least two orders of magnitude (i.e., up to 1 or 5 μg L−1). However, for nine compounds, a significant deviation from linearity was observed, and the linear range was limited to 0.1 or 0.5 μg L−1. These results suggest that linear ranges of two orders of magnitude are to be expected for most compounds with the utilized TOF-MS, which is in general at least one order of magnitude less than the linear range of triple quadrupole and quadrupole linear-ion trap tandem mass spectrometers. These findings are in agreement with those of other authors [19, 33, 36, 38] using TOF-MS.

Method validation

The results for the method validation for surface and drinking water are given in Table 1 and Table S3, respectively. Retention time deviations between the analytes in matrix and analytical standards were <2.5 % for all analytes. The interday reproducibility for drinking and surface water barely increased compared with the RSDs of the instrumental intraday repeatability (Table S2), indicating that the daily external calibration was effective to reduce the day-to-day variability. At a concentration of 0.5 μg L−1, the average RSDs of the interday reproducibility are 14 and 15 % for surface and drinking water, respectively, which is only a small increase compared with the instrumental intraday repeatability of on average 9 %.

The method decision limits (CCα) and detection capabilities (CCβ) ranged from 0.01 μg L−1 to concentrations higher than 5 μg L−1 for drinking and surface water. CCαs lower than 0.1 and 0.5 μg L−1 were obtained for 35 (50 %) and 51 (74 %) out of the 69 compounds in surface water and for 30 (43 %) and 52 (75 %) out of the 69 compounds in drinking water. For some chemically related pharmaceuticals, such as the iodinated X-ray contrast media, poorer performance limits were typically obtained. Their short retention times (e.g., iohexol in Fig. 2), and therefore the highly aqueous composition of their elution solvent, which negatively influences the electrospray ionization efficiency, may explain these poorer performance limits. Poorer performance limits for some outlying compounds such as the iodinated X-ray contrast media are, however, to be expected in multi-residue methods for a broad variety of compounds.

Other authors [3, 16, 33, 34, 39–44] reported 10 to 100 times lower method detection limits (MDL) for the same compounds using triple quadrupole MS, quadrupole linear-ion trap tandem MS, and Orbitrap and time-of-flight high-resolution mass spectrometry (TOF-HRMS) (Fig. 5). These authors all applied (online) SPE as enrichment step to improve the method performance limits and reached 100- to 1,000-fold preconcentration factors (Table S4). By applying SPE and online SPE, the total amount of substance injected increased by a factor of 2 to 80 (volume injected × preconcentration factor, assuming 100 % recovery) compared with a 250-μL large-volume injection without SPE enrichment, which may explain the less favorable performance limits obtained in this study.

Matrix effects are a known drawback related to the use of ESI sources in liquid chromatography-MS. Co-eluting organic and inorganic matrix compounds can induce signal suppression or, less frequently, enhancement and therefore affect the sensitivity of the analytical method, lead to decreased reproducibility, or affect linearity [45]. Calculated matrix effects ranged from 58 to 310 % for drinking water and from 15 to 242 % for surface water for all compounds. Similar values for signal suppression and signal enhancement were found in literature even when applying an SPE clean-up step [16, 37, 40]. These results confirm that, as stated by Busetti et al. [27], widely applied clean-up strategies such as SPE are less effective in removing interfering matrix compounds than commonly thought in multi-residue analysis of water samples, where washing protocols are rather simple. Even without SPE, introduction of highly polar organic and inorganic (salts) compounds into the MS can be avoided by starting the chromatographic gradient with aqueous eluent, which can be diverted to the waste by installing a post-column valve [28]. This ‘wash step’ is highly recommended in LVI-LC [27]. The chromatographic gradient used in our method started with a 1-min isocratic step using a mixture of aqueous solvent with 4.94 % acetonitrile. The 69 analytes that were targeted in this study eluted between 1.94 and 11.49 min. Therefore, the first 1.6 min of the eluent could be diverted to the waste without compromising the screening capability of our method even for the most polar compounds (e.g., iohexol, log Kow −2.8) of our suspect set. However, even after washing the sample, it is necessary to correct for matrix effects for appropriate quantification.
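The 4.94 % acetonitrile content of the starting eluent follows from the mobile-phase compositions given in the section “Instrumental analysis” (97 % A containing 2 % acetonitrile, and 3 % B), assuming that volumes are additive and neglecting the formic acid content of B:

```latex
\[
\varphi_{\mathrm{ACN}} = 0.97 \times 2\,\% + 0.03 \times 100\,\%
= 1.94\,\% + 3\,\% = 4.94\,\%
\]
```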

The mean mass error (Table 2) was independent of both the matrix of the sample and the concentration level and between −0.5 and 0.5 ppm. However, the variability clearly rose at low concentrations: The 95 % confidence limits doubled at CCα and CCβ compared with 5 μg L−1. At 5 μg L−1, the 95 % confidence limit of the mass error was about 5 ppm for all matrices, which is a typical value that can be found in literature for TOF mass spectrometers.

Table 2 Mean mass error and precision (n = 5 observations × 69 pharmaceuticals) of the TOF-MS at the CCα and CCβ of each of the compounds and at 5 μg L−1 for deionized, drinking, and surface water

Application in surface and drinking water

Application of the suspect screening methodology

The developed suspect screening strategy was applied on one drinking water sample and five surface water samples. First, the obtained chromatograms were screened for the presence of peaks having a minimal signal intensity (i) of 100 a.u. and a mass error for which holds that \( \left| ME\right|\le \frac{1.96\cdot \sigma }{ \log (i)}\kern0.3em \left(\mathrm{ppm}\right) \) with σ = 10.96. Second, the resulting restrained peaks were tentatively confirmed when their retention time deviates not more than 1.96 × standard deviation, i.e., within the 95 % confidence interval, from the retention time listed in Table S1.
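The two screening steps can be summarized in a short sketch. The `Peak` record, the function names, and the returned labels are hypothetical and serve only to illustrate the decision logic; the reference retention time and its standard deviation would come from Table S1, and a base-10 logarithm is assumed.

```python
import math
from dataclasses import dataclass

SIGMA_PPM = 10.96      # fitted model parameter reported in the text
Z_95 = 1.96            # two-sided 95 % quantile of the standard normal distribution
MIN_INTENSITY = 100.0  # minimal signal intensity in a.u.


@dataclass
class Peak:
    mass_error_ppm: float
    intensity: float       # chromatographic peak height in a.u.
    retention_time: float  # in min


def passes_mass_filter(peak):
    """Step 1: minimal intensity and signal-intensity-dependent mass error tolerance."""
    if peak.intensity < MIN_INTENSITY:
        return False
    tolerance = Z_95 * SIGMA_PPM / math.log10(peak.intensity)
    return abs(peak.mass_error_ppm) <= tolerance


def confirmed_by_retention_time(peak, reference_rt, rt_sd):
    """Step 2: retention time within 1.96 standard deviations of the reference value."""
    return abs(peak.retention_time - reference_rt) <= Z_95 * rt_sd


def screen(peak, reference_rt, rt_sd):
    """Classify a candidate peak for one suspect compound."""
    if not passes_mass_filter(peak):
        return "rejected"
    if confirmed_by_retention_time(peak, reference_rt, rt_sd):
        return "confirmed"
    return "restrained, not confirmed"
```

A peak of 1,000 a.u. with a mass error of 6 ppm passes the first step (tolerance about 7.2 ppm), whereas the same peak with a mass error of 8 ppm is rejected.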

In the drinking water sample, four pharmaceutical compounds (bisoprolol, enoxacin, propranolol, and propyphenazone) were restrained by the signal-intensity-dependent suspect screening and subsequently confirmed based on the retention time. In the five surface water samples, 30 pharmaceuticals (105 hits) were restrained by the screening strategy in at least one sample and confirmed based on the retention time (Table 3).

Table 3 Results from the suspect screening and target quantification for the five surface water samples

As an additional confirmation, the signal-intensity-dependent screening strategy (minimal signal intensity of 100 a.u. and \( \left| ME\right|\le \frac{1.96\cdot \sigma }{ \log (i)}\kern0.3em \left(\mathrm{ppm}\right) \) with σ = 10.96) was also applied for searching the HE chromatograms for the presence of fragment ions of the respective parent ions. A fragment ion was confirmed when its retention time was within a window of 0.05 min around the retention time of its found parent ion’s peak. For 14 compounds in the five surface water samples, the fragment ions were also restrained. However, the sensitivity of the instrument in the MSE approach (HE function with ramped collision energy) seems not to be sufficient to obtain a signal intensity i > 100 a.u. for fragment ions of a wide range of analytes at real environmental concentrations. Confirmation based on fragments was only possible for 32 of the 105 hits. In Figure S2, LE and HE chromatograms for atenolol and metoprolol are presented, illustrating cases where confirmation based on fragments was successful and not successful, respectively.

Evaluation of the screening performance

To evaluate the performance of the applied suspect screening strategy, all peaks found in the surface water samples within a wider mass error tolerance of ±25 ppm were considered. For these peaks, the confirmed (+) and non-confirmed (o) peaks based on the retention time are presented in Fig. 3. The peaks restrained by the suspect screening (157 hits related to 37 different suspect compounds) fall within the 95 % confidence limits. The signal-intensity-based screening performed well, with a false-negative rate (i.e., peaks not restrained by the suspect screening but confirmed by retention time) of 4.6 %. Out of the 157 restrained hits, 52 hits could not be confirmed by retention time and were thus labeled as false-positive hits. Taking into account that these 52 hits were restrained in five samples analyzed for 69 pharmaceuticals, the false-positive rate is about 15 % (i.e., 52/(5 × 69)). This should not be confused with the false discovery rate, which is the number of false-positives (52 hits) divided by the total number of positives (157 hits) and amounts to about 33 %. For a less biased comparison of future (improved) screening methods, we prefer the false-positive rate since this parameter is, in contrast to the false discovery rate, independent of the contamination level of the sample, i.e., the number of contaminants truly present in the measured samples (in this case, 157 − 52 = 105 hits).
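
The arithmetic behind the two rates can be reproduced with the short sketch below. The variable and function names are illustrative, and the last line uses a purely hypothetical scenario (twice as many true hits, same number of false positives) to show why the false discovery rate depends on the contamination level whereas the false-positive rate, as defined in the text, does not.

```python
n_samples = 5          # surface water samples
n_suspects = 69        # pharmaceuticals on the suspect list
restrained_hits = 157  # hits restrained by the suspect screening
false_positives = 52   # restrained hits not confirmed by retention time

true_positives = restrained_hits - false_positives  # 105 confirmed hits

# False-positive rate as defined in the text: per sample/compound combination.
false_positive_rate = false_positives / (n_samples * n_suspects)


def false_discovery_rate(fp: int, tp: int) -> float:
    """False positives divided by all positives (false + true)."""
    return fp / (fp + tp)


fdr_observed = false_discovery_rate(false_positives, true_positives)

# Hypothetical scenario: a more contaminated sample set with twice the true hits.
fdr_hypothetical = false_discovery_rate(false_positives, 2 * true_positives)

print(f"FPR = {false_positive_rate:.1%}, FDR = {fdr_observed:.1%}, "
      f"FDR (hypothetical) = {fdr_hypothetical:.1%}")
```

In the hypothetical scenario the false discovery rate drops to about 20 % although the screening itself is unchanged, while the false-positive rate stays at about 15 %.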

The importance of the signal-intensity-based mass error becomes clear when a more general and often used mass error tolerance of ±5 ppm is applied instead. In that case, the false-negative rate would amount to 19 % of the compounds confirmed by retention time. These false-negatives had signal intensities below 800 a.u., which is in the lower intensity range of Fig. 3. The use of a signal-intensity-based mass error is therefore of utmost importance in multi-residue screening at trace levels.
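
To make the comparison concrete, the sketch below evaluates the intensity-dependent tolerance next to the fixed ±5 ppm tolerance. It assumes that log(i) in the screening criterion is the base-10 logarithm; the text does not state the base, but this reading is consistent with the fixed tolerance being the stricter one at low intensities. The function and variable names are illustrative.

```python
import math

SIGMA_PPM = 10.96    # sigma of the mass error model, as given in the text
Z_95 = 1.96          # two-sided 95 % coverage factor
FIXED_TOL_PPM = 5.0  # commonly applied fixed mass error tolerance


def tolerance_ppm(intensity: float) -> float:
    """Intensity-dependent tolerance 1.96*sigma/log10(i) in ppm (base 10 assumed)."""
    return Z_95 * SIGMA_PPM / math.log10(intensity)


# Intensity at which both tolerances coincide (solve 1.96*sigma/log10(i) = 5).
crossover_intensity = 10 ** (Z_95 * SIGMA_PPM / FIXED_TOL_PPM)

for i in (100, 800, 10_000, 100_000):
    wider = tolerance_ppm(i) > FIXED_TOL_PPM
    print(f"i = {i:>7} a.u.: tolerance = {tolerance_ppm(i):.2f} ppm, "
          f"wider than fixed 5 ppm: {wider}")
```

Under this assumption, the intensity-dependent tolerance is about 7.4 ppm at 800 a.u. and only falls below 5 ppm at intensities of roughly 2 × 10^4 a.u., so low-intensity peaks with mass errors between 5 ppm and the intensity-dependent limit are lost when the fixed tolerance is used.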

Target quantification

Target quantification was performed on one drinking water sample and five surface water samples following the validated analytical method. In the drinking water sample, no traces exceeding the decision limit were measured. The presence of the four pharmaceutical compounds restrained and confirmed by the suspect screening could not be validated because their concentrations were below the decision limit. To quantify concentrations relevant to drinking water, decision limits about 100-fold lower are required [3, 33].

In the five surface water samples, detection and/or quantification was achieved for 17 pharmaceutical compounds in at least one of the five samples at concentrations ranging from 17 ng L−1 to 3.1 μg L−1 (Table 3). For five compounds (atenolol, caffeine, ibuprofen, roxithromycin, and sotalol), the concentration exceeded 100 ng L−1 at least once.

All the detected and/or quantified observations in the five surface water samples were also found by the suspect screening except for three hits. Furazolidone, which was detected twice at a concentration above the decision limit, was in neither case restrained by the suspect screening strategy due to its low signal intensity (<100 a.u.), indicating that the signal intensity limit of 100 a.u. might be too stringent in some cases, so that compounds that are truly present are not restrained. In addition, one quantified observation of sulfamethoxazole was not restrained by the screening because its mass error was too large.

Only a few studies have reported concentrations of pharmaceuticals in Belgian surface waters. Loos et al. [46] conducted an EU-wide survey of pharmaceuticals that included the river Scheldt, Belgium. Wille et al. [47, 48] detected and quantified eight pharmaceuticals in seawater (1–855 ng L−1) and marine organisms from the Belgian coastal zone, of which five were also identified in this study (atenolol, carbamazepine, propranolol, salicylic acid, and sulfamethoxazole). Although for most pharmaceuticals the concentration levels found in surface waters in this study are similar to those found in other European countries [13], only a few studies have revealed the occurrence of alkylating agents (cyclophosphamide and ifosfamide) in surface waters [49].

Evaluation of large-volume injection UPLC and HRMS for rapid screening and quantification: pros and cons

Large-volume injection proved to be an important advantage of the presented rapid analytical screening and quantification technique. Good and stable (\( \mathrm{deviation}\kern0.3em {t}_{\mathrm{R}}\le 1.96\cdot {\sigma}_{t_{\mathrm{R}}} \)) chromatography was obtained in a 19 min UPLC separation, and the analytical method requires no sample pretreatment (except for filtering the sample). This is in contrast with most published analytical methods for the analysis of micropollutants in surface water, which apply laborious and time-consuming off-line SPE enrichment steps. Moreover, sample enrichment techniques such as SPE preconcentrate compounds selectively, and, as highlighted by Chitescu et al. [16], achieving acceptable recoveries for all compounds is unlikely in multi-residue applications. On the other hand, SPE enables a clean-up of the sample, which can be important to prevent contamination of the LC system and to reduce matrix effects for heavily polluted or salty samples. However, the drinking and surface water samples analyzed in this research did not affect the LC system, and acceptable matrix effects were calculated. Further investigation is needed, however, to establish the potential of LVI-based analysis for a broader variety of environmental (waste)water samples.

Omitting selectivity through sample preparation for multi-residue screening is very relevant to assure a more reliable suspect screening. However, it should be noted that, as shown in Fig. 5, poorer method performance limits are obtained compared with other analytical methods using HRMS instruments due to the lower amount of analyte injected as a result of both the injection volume and the (online) SPE preconcentration factor. Consequently, validation of LVI-based screening methods is necessary to assure that sufficiently low performance limits are obtained for a broad variety of contaminants. Chitescu et al. [17] recently discussed how low method performance limits should be for multi-residue monitoring of micropollutants in surface waters to ensure sufficient protection of the environment. Although the environmental impact of pharmaceuticals is still far from fully understood, a screening threshold of 100 ng L−1, derived from ecotoxicity data, is mentioned for pharmaceuticals in surface water [16], which is similar to the 100 ng L−1 limit for pesticides in drinking water, as regulated by the EU Council Directive 98/83/EC. The method developed in this study showed the potential to detect 35 (50 %) pharmaceuticals at a concentration of 100 ng L−1 or lower, and for 51 (74 %) pharmaceuticals a decision limit of 500 ng L−1 or lower is reached. Although this is a promising result, more work is needed to further improve the sensitivity in order to be able to screen at a level of 100 ng L−1 or lower for a broad range of contaminants. Additionally, improved sensitivity is needed for unequivocal confirmation based on fragments and their ion ratio (2002/657/EC) at environmentally relevant concentrations because, in this study, fragment ions were found in the HE chromatogram for only 32 out of the 105 hits.

A second important advantage of the developed suspect screening strategy is that there is no a priori need for analytical standards. For confirmation of the suspect screening results, only analytical standards of the restrained compounds are necessary, and if the aim is also quantification, validation of only the restrained and confirmed compounds is sufficient for a reliable quantification. Considering the five surface water samples, mass traces related to 37 different suspect compounds were restrained in the chromatograms (section on “Evaluation of the screening performance”). Analyzing only these 37 compounds as analytical standards would allow confirmation based on the retention time. Finally, 30 out of the 37 compounds were confirmed by retention time. This means that for 10 % of the suspect compounds (7 out of 69), only false-positive hits occurred. The application of the developed screening approach prior to target analysis thus has the advantage that fewer analytical standards (37 instead of 69) are needed and that the workload for the validation can be reduced (30 instead of 69 compounds).

Conclusions

This paper investigates a novel methodology for suspect screening towards a broad variety of multi-class pharmaceuticals in surface water based on a newly developed analytical method combining 250 μL LVI-UPLC and QTOF-HRMS.

The signal-intensity-dependent accurate mass error in TOF-MS is taken into account in the screening model, yielding a false-positive rate of 15 % and improving the false-negative rate from 19 % (when a fixed mass error tolerance of 5 ppm is used) to 5 %. The elaborated screening methodology is shown to be a reliable and effective approach to be coupled with subsequent target quantification of the restrained analytes. Screening of five Belgian river water samples revealed the occurrence of 30 out of a list of 69 suspect pharmaceuticals, while the validated quantitative method enabled the detection of 17 pharmaceuticals in a concentration range of 17 ng L−1 up to 3.1 μg L−1.

Overall, LVI-UPLC combined with full-spectrum HRMS is a rapid and promising complement to the widely applied combination of SPE and MS/MS for screening and quantification of micropollutants in environmental waters. Therefore, further research on and application of LVI in combination with the newest-generation, more sensitive full-spectrum HRMS instruments are encouraged.