Introduction

In 2002, the World Health Organization (WHO) defined endocrine-disrupting chemicals (EDCs) as follows: “An endocrine disruptor is an exogenous substance or mixture that alters function(s) of the endocrine system and consequently causes adverse health effects in an intact organism, or its progeny, or (sub)populations” (WHO-World Health Organization 2002). One well-known mode of action is receptor binding and the subsequent alteration of protein synthesis (Fig. 1). Since then, numerous studies and opinions have been published on the topic, and research indicates with very high probability that EDCs cause adverse effects on human health and on wildlife populations via diseases or developmental disorders (Balabanic et al. 2011; Colborn et al. 1993; Kidd et al. 2007; WHO-World Health Organization 2002). Not only wildlife is affected by EDCs; several human health issues are also hypothesized to be associated with exposure to such chemicals. Major routes of human uptake of EDCs are drinking water, diet, skin, and air (Toppari et al. 1996). Possible human health endpoints linked to EDCs are the decrease in sperm quantity and quality observed since the 1940s (Carlsen et al. 1992, 1995); an increase in testicular, prostate, and breast cancers; and alterations in pituitary and thyroid gland functions (Toppari et al. 1996; Snyder et al. 2003). However, whether these human health effects are in fact caused by EDCs in the environment is controversial among scientists. Although the existence of EDCs is evident and the presence of endocrine disruptors in drinking water has been reported by several researchers from different countries (Hecker and Hollert 2011), direct evidence of an effect on human reproductive health is not compelling. Too little is known about the effects of chronic exposure of humans, which occurs at far lower levels than that of aquatic species (Toppari et al. 1996; Tyler et al. 1998; Safe 2000; Snyder et al. 2003). Given this background and the ubiquitous distribution of EDCs in the environment, e.g., pesticides (Mnif et al. 2011), flame retardants (Legler and Brouwer 2003), cosmetics (Caliman and Gavrilescu 2009), or additives in plastics (Rubin 2011), regulation of the production and application of known (and potential) EDCs is inevitable (Hecker and Hollert 2011).

Fig. 1 Different kinds of receptors for hormones. On the left, thyroid and steroid hormones bind to nuclear receptors and directly affect protein synthesis via gene expression. Membrane receptors influence cellular function indirectly via second messenger systems after binding non-steroidal hormones (WHO-World Health Organization and UNEP-United Nations Environment Programme 2013)

At this point, the European Union (EU) member states will review the criteria proposed by the European Commission (2016) to identify and regulate EDCs. The current discussion focuses on introducing the “endocrine mode of action”, instead of an endpoint for an adverse effect, as the basis for classifying a substance as an endocrine disruptor. The endocrine mode of action is not necessarily an (eco)toxicological hazard in itself. While the identification (and regulation) of carcinogenic, mutagenic, and reproductive toxicants under EU law can be achieved by means of animal studies, EDC criteria require human data on health effects (Trasande 2016), especially when it comes to the assessment of drinking water-relevant substances. This would delay the identification and regulation of each chemical by several years and could consequently lead to adverse outcomes for our offspring and the environment. Furthermore, given the thousands of established chemicals and the substances newly developed every year (and all of their metabolites), the resources available to assess these products are already at their capacity limit. For example, in 2015, 322 million tons of hazardous chemicals were produced in the EU (Eurostat 2017).

Hence, there is a strong need for a paradigm shift in the assessment of chemicals in general and of EDCs in drinking water in particular. Fast, easily manageable, and inexpensive test systems are required to handle the huge number of chemicals, and ideally there should always be a direct reference to human health issues. In addition, not only should precisely defined hazards be addressed with concrete thresholds, but risk assessment for toxicologically unknown substances should also be promoted.

The health-related indicator value (HRIV) concept developed by the German Federal Environment Agency (UBA) addresses all these aspects in a precautionary in vitro approach to assess toxicologically known and unknown single substances in drinking water with regard to different endpoints: genotoxicity, neurotoxicity, and subchronic/chronic effects (Umweltbundesamt 2003). Endocrine disruption, as a major issue for environmental quality and human health, should be included in this concept as an additional mode of action. Here, we propose a set of well-known but enhanced in vitro bioassays as a hierarchical test battery for the assessment of EDCs in drinking water with regard to adverse human health effects (Grummt et al. 2013). This will provide a comprehensive practical guideline for water providers, public authorities, and industrial companies to secure the use of drinking water and its resources.

Materials and methods

Chemicals

All test substances (Table 1) were selected from a list of drinking water-relevant chemicals established by RheinEnergie AG, Cologne, Germany, and were received from the German Federal Environment Agency (UBA, Bad Elster, Germany). They were chosen according to their water solubility (poor to good) and their predicted ER-binding affinities (non-binder to good binder) to cover a wide range of characteristics for evaluating the performance of the bioassays. All substances were dissolved in DMSO (Rotipuran®, ≥ 99.8%, p.a., Carl Roth GmbH + Co. KG, Karlsruhe, Germany). All experiments included media (negative) and solvent controls.

Table 1 Prepared concentrations of the test chemicals, solubility in water, and predicted ER-binding affinities

Cell lines and organisms

U2OS cells were purchased from BioDetection Systems (BDS), Amsterdam, Netherlands. H295R cells were obtained from the American Type Culture Collection (ATCC). Potamopyrgus antipodarum specimens were obtained from the Institute for Ecology, Evolution, and Diversity, Goethe University, Frankfurt a.M., Germany. The BDS cell lines were cultivated according to the provider’s protocol 083b (BioDetection Systems 2012). H295R cells were cultured according to the standard operating procedure “Culturing of the H295R human adrenocortical cell line (ATCC CRL-2128)” (Hecker et al. 2011) and the OECD Guideline 456 “H295R Steroidogenesis Assay” (OECD 2011). The snails were cultured according to Schmitt et al. (2011) in 20-l glass aquaria filled with reconstituted water at 16 ± 1 °C with a light:dark cycle of 16:8 h in a climate room. A population density of 100 snails per liter was not exceeded. Once a week, part of the water was exchanged with fresh reconstituted water.

Prediction of ER-binding affinities

Estrogen receptor (ER)-binding affinities (database OASIS) were simulated using the (Q)SAR Toolbox 3.3.5 provided by the Organisation for Economic Co-operation and Development (OECD 2015). Simulations were performed for parent compounds, as well as for measured metabolites and metabolites predicted by the liver metabolism simulator of the toolbox (Serafimova et al. 2007). The performance of the model was evaluated in 2016 (Bhhatarai et al. 2016) with more than 1800 compounds; it showed a sensitivity of > 75% and a specificity of > 86%. Results of the model were qualitatively compared with experimental data from the bioassays.
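For orientation, the two performance measures quoted for the (Q)SAR model can be computed from a confusion matrix of predicted versus experimentally observed binders. The following is a minimal sketch; the function name and the counts are hypothetical and are not data from the cited evaluation.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: experimental binders predicted as binders / as non-binders
    tn/fp: experimental non-binders predicted as non-binders / as binders
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one binder and one non-binder")
    sensitivity = tp / (tp + fn)  # share of true binders that were found
    specificity = tn / (tn + fp)  # share of true non-binders that were found
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A model with high specificity but moderate sensitivity, as reported above, is expected to miss some true binders, which is one reason the predictions were compared with bioassay data.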

ER-/AR-CALUX® assay

Hormones or hormone-like substances bind to the respective receptor (ER/AR) and trigger a specific response in the cell through expression of the genes controlled by the corresponding responsive element in the nucleus. Receptor-mediated estrogenic or androgenic effects are highly specific in U2OS cells, since these are the only available receptors after genetic modification. The assay thus sensitively shows the potential of substances to bind to estrogen or androgen receptors.

A 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay according to Blaha et al. (2004) was performed to reveal cytotoxic effects of the substances and to define test concentrations at which cell viability was at least 80%.
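The selection of test concentrations from MTT data can be sketched as follows. This is a simplified illustration that assumes viability is expressed as the mean MTT signal relative to the mean solvent control; the function name and all readings are hypothetical.

```python
from statistics import mean


def viable_concentrations(readings, solvent_control, threshold=0.80):
    """Return the concentrations whose mean signal is >= threshold x control.

    readings: dict mapping concentration -> list of replicate MTT signals
    solvent_control: list of replicate MTT signals of the solvent control
    """
    control_mean = mean(solvent_control)
    if control_mean <= 0:
        raise ValueError("solvent control signal must be positive")
    selected = []
    for conc, values in sorted(readings.items()):
        viability = mean(values) / control_mean
        if viability >= threshold:
            selected.append(conc)
    return selected


# Hypothetical absorbance readings (arbitrary units), not study data.
control = [1.00, 0.98, 1.02]
readings = {
    0.1: [0.97, 0.99, 1.01],   # about 99% viability
    1.0: [0.85, 0.90, 0.88],   # about 88% viability
    10.0: [0.55, 0.60, 0.62],  # about 59% viability, excluded
}
usable = viable_concentrations(readings, control)
print(usable)
```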

U2OS human bone osteosarcoma cells were stably transfected with a human ERα/AR receptor (pSG5-neo-hERα/AR) and an estrogen/androgen-responsive luciferase reporter gene (pEREtata-Luc/pAREtata-Luc, Sonneveld et al. 2005). These modified cell lines are sensitive and highly responsive to (anti)estrogenic/androgenic compounds (Sonneveld et al. 2005; van der Burg et al. 2010). When an estrogen/androgen-active agent binds, the cells produce the enzyme luciferase. The luciferase activity is detected by measuring light emission after addition of the substrate luciferin. U2OS cells are not capable of expressing CYP450 for metabolic activity (Brinkmann et al. 2014).

The chemically activated luciferase expression (CALUX) assays were performed according to the standard operating procedure P-BDS-085d by BDS (2013). In brief, cells were seeded in a white 96-well plate at a density of 1 × 10⁴ cells per well. After 24-h incubation, the cells were exposed in triplicate to test substances, positive control (17β-estradiol, E2), and solvent control for 24 h at 37 °C and 5% CO2. With U2OS cells, additional tests were performed using a specifically designed S9 mixture (see Table 2) to simulate metabolism. Test substance, medium, and S9 mixture were incubated in brown glass vials (45 × 14.7 mm, VWR International) overnight at 37 °C and 5% CO2 before the solution was applied to the cells. For the assay readout, the exposure medium was removed, 50 μl of PBS and 50 μl of the substrate SteadyLite® (Perkin Elmer, USA) were added, and luciferase activity was measured with a luminescence photometer (Infinite M200, Tecan, Crailsheim, Germany). Mean luminescence values of test substances and E2 reference were corrected for the solvent control response (= 0). All values were then divided by the maximal induction response of the E2 concentrations (= 1) to obtain scaled values between 0 and 1 (Villeneuve et al. 2000). On the basis of these calculations, the lowest observed effect concentrations (LOEC) were derived.
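The scaling of luminescence values described above (solvent control = 0, maximal E2 induction = 1) can be illustrated with a short sketch. The function name and all luminescence values are hypothetical; the statistical derivation of the LOEC is not reproduced here.

```python
def scale_responses(sample_rlu, solvent_rlu, e2_reference_rlu):
    """Scale mean luminescence values to the range defined by the controls.

    sample_rlu: dict mapping sample label -> mean relative light units (RLU)
    solvent_rlu: mean RLU of the solvent control (scaled to 0)
    e2_reference_rlu: mean RLU of each E2 reference concentration;
        the largest solvent-corrected value is scaled to 1
    """
    max_induction = max(e2_reference_rlu) - solvent_rlu
    if max_induction <= 0:
        raise ValueError("E2 reference shows no induction above solvent control")
    return {
        label: (rlu - solvent_rlu) / max_induction
        for label, rlu in sample_rlu.items()
    }


# Hypothetical mean RLU values, not study data.
scaled = scale_responses(
    sample_rlu={"substance A": 6000.0, "substance B": 1000.0},
    solvent_rlu=1000.0,
    e2_reference_rlu=[1200.0, 5000.0, 21000.0],
)
print(scaled)
```

In this example, substance A reaches 25% of the maximal E2 induction, while substance B is indistinguishable from the solvent control.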

Table 2 Mixture of co-factors and S9 fraction

AR-CALUX was performed with U2OS (AR) cells in the same way as described above for estrogenic effects, except that the positive control was dihydrotestosterone (DHT).

Recombinant yeast estrogen screen

The recombinant yeast estrogen screen (RYES) was conducted externally by Incos Bote Cosmetic GmbH, Alzey, Germany. The test protocol is based on Routledge and Sumpter (1996). In brief, yeast cells (Saccharomyces cerevisiae) are genetically modified with a human estrogen receptor coupled to a reporter gene (LacZ). If an estrogen-active substance binds to the receptor, an enzyme (beta-galactosidase) is produced that converts a dye and induces a color reaction. This change can be measured and correlated with the receptor-binding potential of the test substance.

H295R steroidogenesis assay

H295R cells are able to express the genes for all enzymes of steroidogenesis and can therefore produce all steroid hormones (Fig. 2) found in the adult adrenal cortex and the gonads. The purpose of this assay is to detect alterations in E2 and T production caused by inhibition or induction of the enzymes of steroid hormone synthesis.

Fig. 2 Schematic representation of the steps involved in steroid hormone synthesis and the tissue localization of the reactions within the adrenal gland, redrawn from Hilscherova et al. (2004). Marked genes (green) are investigated in the present study. CYP cytochrome P450, HSD hydroxysteroid dehydrogenase, DHEA dehydroepiandrosterone

The exposure of H295R cells was conducted according to the standard operating procedure “Exposure of H295R human adrenocortical carcinoma cells to assess the effects of chemicals on testosterone and 17β-estradiol production” (Hecker et al. 2007, 2011). In brief, cells were seeded in 24-well plates at a density of 300,000 cells/ml and incubated for 24 h at 37 °C and 5% CO2. Test substances were applied in seven concentration steps per substance with additional quality controls (blanks, solvent, forskolin, and prochloraz) and incubated for 48 h. The number of concentration steps could vary because of cytotoxic effects and solubility in DMSO. The medium was removed and frozen at −80 °C for the measurement of hormone concentrations, while cell viability was checked using an MTT assay according to Blaha et al. (2004); viability had to be ≥ 80%. To quantify hormone production by the cells during the exposure period, hormones were extracted from the medium twice with diethyl ether and quantified by enzyme-linked immunosorbent assay (ELISA). Hormone analysis was conducted using commercial test kits according to the manufacturer’s manual (Cayman Chemical Company, Ann Arbor, MI, USA). Estradiol concentrations were determined using the “Estradiol EIA Kit” (E2, Item No. 582251), while testosterone concentrations were determined using the “Testosterone EIA Kit” (T, Item No. 582701). Hormone concentrations per well were derived from the E2 and T standard curves. Results were normalized to the mean non-specific binding (NSB) control value of each test plate and expressed as fold changes relative to the DMSO control. For gene expression experiments, cells were exposed in 6-well plates, washed with phosphate-buffered saline (PBS), and centrifuged in medium. RNA was purified using the RNeasy® Plus Mini Kit (Qiagen GmbH, Hilden, Germany) according to the manufacturer’s protocol.
Purified RNA was quantified using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). For each sample, 1 μg of purified RNA was converted into first-strand cDNA using the iScript™ cDNA Synthesis Kit (Bio-Rad Laboratories GmbH, Munich, Germany). Primer sequences (Table 3) were designed using the software Primer3 at http://primer3.wi.mit.edu (Rozen and Skaletsky 2000), or previously published sequences were used (Hilscherova et al. 2004; Zhang et al. 2005). β-Actin was used as the reference gene.

Table 3 Forward (F) and reverse (R) sequences, amplicon sizes, annealing temperatures, and GC contents of primers used in quantitative RT-PCR

Quantitative real-time polymerase chain reaction (qPCR) was performed with a reaction mixture containing 2 × concentrated Power SYBR® Green PCR master mix (Applied Biosystems), gene-specific primers, and nuclease-free water. The reaction was run on an Applied Biosystems StepOnePlus™ Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). Target gene expression (see Fig. 2) was quantified with two different methods: a standard curve method and the comparative CT method according to Simon (2003). Target gene expression in cells exposed to test substances was always expressed as fold change compared to the DMSO control.
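As an illustration of the comparative CT approach, the sketch below uses the common 2^(−ΔΔCT) form, which assumes that the amount of product doubles in every cycle. Whether the variant described by Simon (2003) applies an additional efficiency correction is not reproduced here; the function name and all CT values are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Fold change of a target gene relative to the control (2^-ddCT).

    Each CT is first normalized to the reference gene (here: beta-actin)
    measured in the same sample, then compared with the DMSO control.
    Assumes 100% amplification efficiency for target and reference gene.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_treated - delta_control
    return 2.0 ** (-ddct)


# Hypothetical CT values: the target appears one cycle earlier after exposure.
fold = fold_change_ddct(
    ct_target_treated=24.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(fold)  # 2.0, i.e., a twofold induction relative to the DMSO control
```

A fold change above 1 indicates induction and a value below 1 indicates reduced expression, matching the way the gene expression results are reported relative to the DMSO control.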

Reproduction toxicity assay with Potamopyrgus antipodarum

The experiment was conducted according to Schmitt et al. (2013). Adult female snails (n = 10) were exposed via the water phase in 1-l glass beakers, with lids containing a hole for aeration, for a period of 28 days in a semi-static renewal system in which the complete medium and test substance were replaced twice a week. Test substances were applied in four concentration steps, alongside a solvent control and artificial water (negative control), each in four replicates. After exposure, the snails were narcotized in 2.5% magnesium chloride hexahydrate (MgCl2 × 6 H2O) dissolved in deionized water for a minimum of 45 min and a maximum of 90 min before their shells were opened with tweezers and the snails were dissected under a stereomicroscope. Endpoints were mortality (which should be less than 20%) and the abundance of embryos/eggs in the brood pouch according to their developmental state (shelled/non-shelled, see Fig. 3) in comparison to the negative control. An increase in non-shelled embryos and eggs is interpreted as an estrogenic effect: the snails are parthenogenetic and are stimulated to produce more offspring. A decrease indicates androgenic or anti-estrogenic effects. The breeding stock of P. antipodarum was built up with snails obtained from the small river Inde near Stolberg, Germany. Collecting sites distant from sewage treatment plant outlets and factories were chosen. The snails were acclimated in a 10-l glass aquarium with one part of water collected from the river and one part of reconstituted water under standard conditions (16 ± 1 °C, light:dark cycle 16:8 h). After 6 weeks, they were moved to a 20-l glass aquarium filled only with reconstituted water.

Fig. 3 Reproduction assay with P. antipodarum. Embryos and eggs after exposure. 1 shelled embryo, 2 non-shelled embryo, and

The following graphs, curves, and statistics were created with GraphPad Prism 5 (GraphPad Software, Inc., La Jolla, CA, USA).

Results

CALUX

The MTT cell viability assay was performed to determine suitable test concentrations with a cell viability of at least 80% in comparison to the solvent (DMSO) control (data not shown). Quality control criteria for binding to estrogen receptors (ER)/androgen receptors (AR) were derived from the E2/DHT standard curve. For an acceptable curve fit, a correlation coefficient of at least 0.98, a minimum induction factor of 5 (ER)/20 (AR), and an EC50 between 1.9 × 10⁻¹² M and 1.9 × 10⁻¹¹ M (ER)/between 1.4 × 10⁻¹⁰ M and 1.4 × 10⁻⁹ M (AR) had to be reached.

The calculation for the curve fit was: \( y=\frac{a_0}{1+{\left(\frac{x}{a_1}\right)}^{a_2}} \)

\( y \): response in relative light units (RLU)

\( x \): concentration in pM E2 in well

\( a_0 \): maximum response

\( a_1 \): EC50 of the curve

\( a_2 \): slope of the curve
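The curve-fit function given above can be evaluated with a few lines of code. The sketch below only evaluates the function for given parameters; it does not perform the fit itself, and the parameter values are hypothetical. Note that, in this parameterization, a response that rises with concentration corresponds to a negative slope parameter.

```python
def calux_curve(x, a0, a1, a2):
    """Evaluate y = a0 / (1 + (x / a1) ** a2).

    x: concentration (same unit as a1, e.g., pM E2 in the well)
    a0: maximum response (RLU), a1: EC50, a2: slope parameter
    """
    if x <= 0 or a1 <= 0:
        raise ValueError("concentration and EC50 must be positive")
    return a0 / (1.0 + (x / a1) ** a2)


# Hypothetical parameters: maximum 100 RLU, EC50 = 5 pM, slope = -1.
params = dict(a0=100.0, a1=5.0, a2=-1.0)
at_ec50 = calux_curve(5.0, **params)   # half of the maximum response
low = calux_curve(0.5, **params)       # well below the EC50
high = calux_curve(50.0, **params)     # well above the EC50
print(at_ec50, round(low, 2), round(high, 2))
```

At x = a1 the term in parentheses equals 1, so the response is always half of the maximum, which is why a1 can be read directly as the EC50.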

None of the tested substances showed effects in the ERα CALUX without the addition of S9 mixture, although the QSAR prediction and RYES results indicated a positive result for some substances. When the S9 mixture was added, five substances showed an increase in ER activation, namely benzo(a)pyrene, 3,3′-dichlorobenzidine, 2,4-dichlorophenol, 3,4-dichloroaniline, and atrazine (Table 4). The LOECs were in the same concentration range as the RYES LOECs.

No substance showed effects in the AR-CALUX with the U2OS cell line (data not shown).

H295R steroidogenesis assay

The MTT cell viability assay was performed to determine suitable test concentrations with a cell viability of at least 80% in comparison to the solvent (DMSO) control (data not shown). Quality control criteria for the induction and inhibition of estradiol (E2) and testosterone (T) production followed OECD Guideline no. 456 (2011). Induction by 10 μM forskolin should be ≥ 7.5 times the solvent control for E2 and ≥ 1.5 times the solvent control for T. Inhibition by 3 μM prochloraz should be ≤ 0.5 times the solvent control for both E2 and T. The median induction by 10 μM forskolin was 19.74-fold for E2 and 2.52-fold for T. The median inhibition by 3 μM prochloraz was 0.28-fold the solvent control for E2 and 0.06-fold for T (data not shown).
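The quality control check described above amounts to comparing the fold changes of the two control substances with the stated limits. A minimal sketch, using the limits and median values reported in this section (the function and variable names are illustrative only):

```python
# Limits relative to the solvent control, as stated in the text above.
FORSKOLIN_MIN = {"E2": 7.5, "T": 1.5}    # induction by 10 uM forskolin
PROCHLORAZ_MAX = {"E2": 0.5, "T": 0.5}   # inhibition by 3 uM prochloraz


def passes_qc(forskolin_fold, prochloraz_fold):
    """Return True if both control substances meet the criteria for E2 and T."""
    induction_ok = all(
        forskolin_fold[h] >= FORSKOLIN_MIN[h] for h in ("E2", "T")
    )
    inhibition_ok = all(
        prochloraz_fold[h] <= PROCHLORAZ_MAX[h] for h in ("E2", "T")
    )
    return induction_ok and inhibition_ok


# Median fold changes reported in this study.
qc_ok = passes_qc(
    forskolin_fold={"E2": 19.74, "T": 2.52},
    prochloraz_fold={"E2": 0.28, "T": 0.06},
)
print(qc_ok)  # True
```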

Perfluorooctanoic acid, tris(1-chloro-2-propyl) phosphate, sulfamethoxazole, diclofenac, and diatrizoic acid showed no significant concentration-dependent effect on estradiol and testosterone production. Atrazine induced estradiol dose-dependently from 4.37-fold (at 2.62 mg/l) to 17.7-fold (at 65.5 mg/l). Testosterone was slightly induced, 1.4-fold (at 13.1 mg/l). 2,4-Dichlorophenol (DCP) also induced E2 dose-dependently from 1.6-fold (at 0.18 mg/l) to 2.49-fold (at 4.5 mg/l), whereas testosterone production was not altered significantly. In contrast to these two substances, tributyltin oxide (TBTO) inhibited E2 production concentration-dependently from 0.55-fold (at 3.64 × 10⁻⁶ mg/l) down to 0.29-fold (at 9.11 × 10⁻⁴ mg/l). T production was decreased from 0.74-fold (at 3.64 × 10⁻⁶ mg/l) to 0.68-fold (at 9.11 × 10⁻⁴ mg/l). Benzo(a)pyrene induced E2 and T at one concentration each: a 1.57-fold increase of E2 at the highest concentration of 10⁻¹ mg/l (p < 0.01) and a significant 1.25-fold increase of T at the lowest concentration (10⁻⁵ mg/l).

The effects on hormone production were used as a screening step to select active substances for the assessment of gene expression (Fig. 5a–c). Based on these results, atrazine, TBTO, and diclofenac were selected for gene expression experiments: atrazine as an inducer of hormone production, TBTO as an inhibitor of hormone production, and diclofenac as a substance with no pronounced effects on hormone production but of high environmental relevance.

Atrazine and TBTO significantly altered mRNA expression of the genes CYP19A1 and CYP11A1, in agreement with the hormone production results: atrazine induced CYP19A1 up to 2.47-fold and CYP11A1 up to 1.38-fold at the highest concentration. TBTO reduced mRNA expression of the same genes down to 0.24-fold and 0.62-fold, respectively. Diclofenac caused no dose-dependent responses.

Reproduction toxicity test with Potamopyrgus antipodarum

17α-Ethinylestradiol (EE2) as a reference compound for estrogenic activity, carbamazepine (CBZ) as a pharmaceutical substance, and benzo(a)pyrene (BaP) as an industrial chemical were tested with the Potamopyrgus assay. The BaP test was invalid due to algae growth on the snails and thin, easily breakable shells. Only concentrations that caused no more than 20% mortality were evaluated.

Carbamazepine showed no clear dose-dependent effects on the production of embryos in the brood pouch of female snails, although there was a significant increase of unshelled (and total) embryos at a concentration of 200 ng/l. EE2 altered the production of embryos in different ways. While it caused a significant increase of unshelled embryos up to a concentration of 25 ng/l, the abundance dropped at concentrations of 50 and 100 ng/l. Shelled embryos were affected in a dose-dependent manner.

Discussion

Due to improved chemical analysis methods, more and more substances and their by-products and metabolites can be identified and quantified in water bodies and other compartments. As a consequence, the demands on regulation and on laboratories to assess these countless chemicals are increasing as well. Endocrine-disrupting chemicals (EDCs) in particular are a focus of public authorities, academia, and public interest because of their ubiquitous occurrence, their effects at low concentrations, and their adverse outcomes not only for individuals but for populations as well. Hence, a fast, reliable, and easy-to-handle approach for water suppliers, industry, and authorities is needed. The assessment of known and unknown substances in drinking water aims to protect the health of humans and the environment.

For chemically identified single substances, the chosen group of bioassays shows good results in detecting the endocrine-disrupting (estrogenic and androgenic) effects of micropollutants of different solubility, mode of action, and metabolic state. The selected substances are drinking water relevant, and the proposed test strategy targets sensitive, human health-related endpoints. If effects are detected in a single assay, the substance moves on to the HRIV concept for setting precautionary values. Based on the results obtained in our study, we propose an HRIV between HRIV0 (0.01 μg/l) and HRIV1 (0.1 μg/l) (Dieter 2014) due to the lowest effect concentrations of TBTO (0.03 μg/l; Figs. 4f and 5b). In combination with the test batteries in the modules genotoxicity and neurotoxicity, a new holistic approach to the risk assessment of micropollutants is possible. In a next step, anti-estrogenic or anti-androgenic effects could be added to the system. Figure 7 gives an overview of our proposed hierarchical test battery.

Fig. 4 a–i Changes in 17β-estradiol (E2, white) and testosterone (T, black) production by H295R cells relative to the DMSO control (= 1, marked by the dotted line), observed after exposure to test chemicals (a–i) for 48 h in two independent replicates (with three technical replicates each). Boxes represent the 25th and 75th percentiles, lines in the boxes the median. Whiskers represent the range from minimum to maximum values which are within 1.5 times the range between the 25th and 75th percentile. Outliers from this range are represented as dots. Statistical significance analyzed by Kruskal-Wallis one-way ANOVA on ranks with Dunn’s post-hoc test. *P < 0.05, **P < 0.01

Fig. 5 a–c Changes in expression of CYP11A1 (1), CYP17A1 (2), CYP19A1 (3), and 3βHSD2 (4) mRNA in H295R cells after 48 h of exposure to atrazine (a), TBTO (b), and diclofenac (c) relative to the DMSO control in two independent exposure experiments with three internal replicates each. Cycle threshold (CT) values were normalized to β-actin. Boxes represent the 25th and 75th percentiles, lines in the boxes the median. Whiskers range from minimum to maximum values which are within 1.5 times the range between the 25th and 75th percentile. Dots indicate outliers from this range. Statistical significance analyzed by Kruskal-Wallis one-way ANOVA on ranks with Dunn’s post-hoc test. *P < 0.05, **P < 0.01

Fig. 6 Total number of embryos, unshelled embryos, and shelled embryos for exposure to carbamazepine (CBZ) and ethinylestradiol (EE2). Statistical significance analyzed by one-way ANOVA with Dunnett’s multiple comparison test (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001)

Fig. 7 Hierarchical test strategy for in vitro and in vivo assessment of endocrine effects for micropollutants in drinking water

There are further concepts for the assessment of toxicologically unknown substances, but the HRIV concept has some specific advantages. In drinking water, these substances usually appear temporarily and regionally in the production process, so no long (bio-/environmental) monitoring studies with epidemiological assessment or long-range follow-up studies (Goldman and Koduru 2000) are needed. The threshold of toxicological concern (TTC) concept provides a similar approach, but it is designed for food components and additives (Kroes et al. 2005), so the bioavailability, polarity, solubility, and concentrations of the substances differ from those of substances in drinking water. Furthermore, the HRIV concept provides users (water suppliers, public health agencies) with a holistic and hierarchical test battery that is standardized, quick, and easy to handle.

Before starting the bioassay test battery, a QSAR analysis of the substance should be carried out. This can provide information on possible active metabolites. However, we also found some mismatches between predictions and actual effects. Sulfamethoxazole (SMX) and atrazine are predicted to have strong ER-binding metabolites, and SMX is predicted to be a strong ER-binding parent substance, but no effects occurred in the receptor-mediated ER-CALUX and RYES, even with S9 addition. QSAR is therefore useful for a first impression, but the method still has its limitations.

Therefore, as a first step, we recommend sensitive in vitro test systems with human cell lines and, if necessary, the addition of metabolically activating (S9) mixtures. The CALUX assays with U2OS cells (no metabolic competence) clearly showed that, although parent substances like benzo(a)pyrene or 2,4-dichlorophenol caused no effects, their metabolites can activate the estrogen receptor once metabolism is induced with an S9 mixture. Since S9 fractions are cytotoxic, preliminary tests should be done to find the mixture that best combines high metabolic activity with tolerable cytotoxic effects on the cells (> 80% viable cells). For U2OS cells, the S9 fraction should be applied with 1.75% of co-factor solution, but this can vary depending on the cells used. Furthermore, S9 seems to dissolve estrogen-active substances out of the microplates, which causes effects in the negative control (and every other batch). Hence, we recommend incubating the test substance and S9 mixture in separate glass vials and subsequently exposing the cells after 24 h (data not shown, in prep.). Further testing is needed on that issue. Although S9 causes some handling issues, in combination with the U2OS cells it is, in our opinion, a better system than metabolically competent cell lines like T47D-luc or yeast. U2OS cells are human cells and express only one specific receptor (ERα); hence, extrapolation to the human body is more plausible than with yeast cells, and there is no cross-talk or interference with other receptors as in T47D-luc cells. Of course, limitations must be kept in mind with regard to which metabolites are formed and in which proportions, in contrast to endogenous enzyme activity (Mersch-Sundermann et al. 2004). It is important to use structure-activity relationship tools to check for possible metabolic activation of substances, so that S9 mixtures or analogous solutions can be applied if necessary.
LOECs for CALUX (+S9) and RYES (yeast with metabolic capabilities) were in the same concentration range, so both assays can be given equal weight. For pure parent compounds, CALUX without S9 is the tool to use. These results are similar to those of a study by Kunz et al. (2017).

It is also important to cover different modes of action with the assays (Maletz et al. 2013). Tributyltin oxide, for example, causes no effect in the CALUX assay, which targets receptor-mediated effects, but the H295R steroidogenesis assay, which targets alterations in hormone synthesis, shows significant inhibiting effects for TBTO. The mode of action of TBTO identified in the present study, which leads to a decrease in hormone production, was an effect on two cytochrome P450 enzymes: CYP11A and CYP19A gene expression was significantly decreased (0.59-fold and 0.23-fold, respectively, at 9.11 × 10⁻⁴ mg/l). These findings indicate that TBTO inhibits the first and rate-limiting step in steroid biosynthesis, the conversion of cholesterol to pregnenolone, and further inhibits the aromatization of androgens to estrogens catalyzed by the aromatase CYP19A (Fig. 2), both leading to decreased hormone production. These observed endocrine-disrupting effects of TBTO are consistent with the broad range of reports on TBT compounds in the literature, where the occurrence of imposex in gastropod mollusks due to TBT originating from antifouling paints on ships is most frequently mentioned (Matthiessen 2013; Matthiessen and Gibbs 1998). In the case of diclofenac, which was selected as a test substance because it is one of the most frequently detected pharmaceutically active compounds in the water cycle (Ternes 1998), no or inconclusive results were obtained, while this substance showed effects in the ERα CALUX with S9. This response may therefore be an artifact, an assumption supported by another study that likewise found no changes in E2 and T production in H295R cells after exposure to up to 20 mg/l diclofenac (Ji et al. 2010). Metabolic effects or different modes of action could explain these differences between the in vitro tests.
Results for hormone concentration (ELISA) and gene expression (RT-qPCR) always correlated well and can be further used to explore the pathway of alteration within steroidogenesis. For example, the major mechanism of action of atrazine in the steroidogenic pathway appeared to be the effect on CYP19A expression. The gene CYP19A encodes the enzyme aromatase cytochrome P450, which catalyzes the conversion of testosterone to estradiol (Sanderson et al. 2002; Higley et al. 2010). Furthermore, a 1.44- to 1.68-fold induction of 3βHSD2, which also correlated positively with E2 production, was found in the present study. Hence, CYP19A is not the only point in the steroid pathway affected by atrazine; reactions catalyzed by 3β-hydroxysteroid dehydrogenase may be influenced as well. These include the production of androstenedione, the intermediate for T production, as well as progesterone, which may further lead to alterations in aldosterone and cortisol production (Fig. 2).

Following fast in vitro screening for endocrine effects, chronic bioassays should be performed in a second step to check for long-term effects at the population level at lower concentrations. Given the aim of quick toxicological responses for incompletely assessed substances that appear temporarily and locally in the water cycle, this is reasonable for substances that gave positive results in the initial in vitro screening. For substances with no effects in the first in vitro screening steps, chronic in vivo assays could be given lower priority. Nevertheless, chronic effects in general are assessed in a later step of the HRIV concept by literature research, where literature is available (Dieter 2014). Comparative studies of the hormonal systems of mollusks and vertebrates have shown similar structures, so that effects on endpoints such as P450 activation or receptor binding should be transferable (Duft et al. 2007; Stange et al. 2012). Ethinylestradiol, the positive control for estrogenic effects, showed the expected increase in new, unshelled embryos (Jobling et al. 2003), although embryo abundance dropped at higher concentrations (> 50 ng/l), which can be explained by brittle shells of the female snails. The results for carbamazepine are not dose-dependent but show a slight tendency toward an estrogenic effect (increase in unshelled embryos), as also shown by Nentwig (2006). The reproduction test with Potamopyrgus antipodarum was chosen because it is standardized as OECD Guideline no. 242 (OECD 2016) and because the hormone systems of snails and vertebrates are quite similar with regard to the expression of hormone receptors (deFur 2004, Duft et al. 2007; Stange and Oehlmann 2012; Stange et al. 2012). The reproduction test has thus proven to be a valuable addition to the in vitro test battery.

Taking TBTO as an example, the substance and its metabolites are predicted by QSAR to be non-binders to the ER (Table 4), so S9 mixture for metabolic activation is not obligatory. The H295R assay shows dose-dependent effects (Figs. 4f and 5b), whereas AR-CALUX, ERα-CALUX, and RYES show no effects (Table 4). With at least one positive effect, an HRIV assessment can be done (Dieter 2014), and the substance could be defined as an endocrine-disrupting compound. The next step of the battery, the in vivo bioassay, can additionally be performed for in vivo confirmation. In the case of TBT(O), the reproduction assay was performed previously by Rupert et al. (2017) and yielded EC50 values between 0.04 and 0.19 μg/l. With the positive H295R results and the confirmation in vivo, the HRIV can be set as proposed between 0.01 and 0.1 μg/l.
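The decision logic of this worked example can be summarized in a short sketch. It is an illustrative simplification, not the authors' implementation: all names are hypothetical, the rule that S9 becomes obligatory whenever QSAR predicts ER binding is an assumption (the text only states that S9 is not obligatory for predicted non-binders), and the priority labels "high" and "low" merely paraphrase the recommendation to give chronic in vivo assays lower priority after negative in vitro screening.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# HRIV range proposed in the text for the endpoint "endocrine effects" (µg/l).
HRIV_ENDOCRINE_UG_L: Tuple[float, float] = (0.01, 0.1)


@dataclass
class ScreeningOutcome:
    s9_obligatory: bool                 # assumption: follows the QSAR prediction
    positive_assays: Tuple[str, ...]    # in vitro assays that showed an effect
    hriv_assessment_possible: bool      # at least one positive in vitro effect
    hriv_range_ug_l: Optional[Tuple[float, float]]
    in_vivo_priority: str               # hypothetical labels: "high" or "low"


def screen_substance(qsar_predicts_er_binding: bool,
                     in_vitro_effects: Dict[str, bool]) -> ScreeningOutcome:
    """Apply the tiered logic: QSAR, then in vitro battery, then in vivo priority."""
    positives = tuple(name for name, hit in in_vitro_effects.items() if hit)
    has_positive = len(positives) > 0
    return ScreeningOutcome(
        s9_obligatory=qsar_predicts_er_binding,
        positive_assays=positives,
        hriv_assessment_possible=has_positive,
        hriv_range_ug_l=HRIV_ENDOCRINE_UG_L if has_positive else None,
        in_vivo_priority="high" if has_positive else "low",
    )


# TBTO as described in the text: QSAR non-binder, only H295R positive.
tbto = screen_substance(
    qsar_predicts_er_binding=False,
    in_vitro_effects={"H295R": True, "AR-CALUX": False,
                      "ERα-CALUX": False, "RYES": False},
)
print(tbto)
```

For TBTO, the sketch reports that S9 is not obligatory, that H295R is the only positive assay, and that an HRIV assessment in the proposed range is possible, which mirrors the reasoning in the paragraph above.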

Table 4 Results for the test substances in U2OS cells without (− S9) and with (+ S9) S9 mixture, compared with RYES results obtained at Incos Bote Cosmetic GmbH, Alzey, Germany. Values are the lowest observed effect concentrations (LOEC: EEQ ≥ LOQ (0.7 ng/l)) in milligrams per liter; "no effect" denotes values < LOD (limit of detection). QSAR predictions for estrogen receptor-binding affinities are based on the OASIS database (parent compound; metabolites (based on experimental data or calculated on simulated data, e.g., structure models); n.a. (not assessed))

Conclusion

The combination of receptor-mediated bioassays, a steroidogenesis assay, and a chronic in vivo reproduction assay proved suitable for detecting estrogenic and androgenic effects of a broad range of drinking water-relevant substances with different modes of action. These tests are well known and internationally evaluated, so they should be readily accepted and easy to apply in the daily work of water suppliers and public health organizations. The aims of the study, to propose a hierarchical test battery for assessing whether toxicologically unknown substances are EDCs and to establish “endocrine effects” as an additional endpoint in the HRIV concept (0.01–0.1 μg/l), were achieved.