1 Introduction

The field of toxicology is challenged with tens of thousands of synthetic chemicals that are on the market and released to the environment with little or no safety information. These synthetic chemicals span an enormous range of physicochemical properties, which makes it nearly impossible to address the backlog of unknown bioactivity using traditional toxicology approaches. Refocusing toxicity testing from high-cost, low-throughput mammalian models to alternative systems has been underway for some time. The key is to use a model, preferably a whole animal model, that can rapidly detect different modes of bioactivity and thus uncover molecular response pathways. Alternative model selection is guided by the 3Rs (Replacement, Reduction, and Refinement), which aim to minimize the use of animals in advancing science. Current efforts to address this need are being led by multiple federal agencies such as the US Environmental Protection Agency National Center for Computational Toxicology (NCCT), National Toxicology Program, National Center for Advancing Translational Sciences (NCATS), and US Food and Drug Administration (FDA). Together, the federal agencies developed a partnership termed “Tox21” to test a large set of compounds (>10,000) that previously had little or no toxicological data.

The Tox21 research initiative relied on more than a hundred different in vitro, high-throughput assays that were target-specific and mechanism-based to conduct toxicity pathway profiling, followed by intensive computational approaches to interpret the findings. The first two phases of Tox21 provided the opportunity to optimize the HTS assays and test >10,000 compounds, from which it was learned that the biological coverage of the assays and the relevance of the data were lacking. In phase 3, high-content assays were refined to include nonmammalian whole animal models, e.g., worms and zebrafish.

EPA-NCCT developed the ToxCast program to assess a large number of chemicals in a diverse set of in vitro assays [1]. The main difference from the Tox21 program was that ToxCast focused on evaluating a significantly smaller set of chemicals in a larger, biologically relevant assay space. ToxCast Phase 1 consisted of ~300 well-studied chemicals with existing toxicity information evaluated in ~600 in vitro HTS assays, which allowed multidimensional signatures to be built to predict animal toxicity, using the traditional toxicity data to gauge accuracy. In the next phase, an additional ~700 chemicals with little to no traditional toxicity data were tested in the same assays. Phase 3 is currently ongoing, and consists of 1001 chemicals that will be evaluated in a refined set of assays identified in Phase 2 to provide insight on the mode of action of the chemicals.

There are several potential limitations in collecting in vitro data and then translating it to human hazard potential. The most obvious is that an in vitro assay queries only a small fraction of biological space; thus, many different cell types and assays are used, which increases the cost of chemical assessments. The net effect is that these large in vitro efforts may result in data with substantial uncertainty regarding translation to higher levels of organization. Whole animal models that span embryonic development express and present nearly all the potential cell types and gene products that are interacting in concert. This unique life stage offers an ideal time to determine if a test chemical has the inherent structure to interact with and perturb any of the carefully orchestrated signaling events necessary for normal development. If sufficient perturbations occur, the normal developmental plan will be disrupted, resulting in a chemical-induced phenotype. The purpose of screening in zebrafish is thus to more rapidly identify chemically induced phenotypes. To increase throughput, most of these phenotypes are simply visual assessments of major organ defects [2], but they have also been extended to include motion-tracking behavioral assays in 96-well plate format [3, 4]. Using this general approach, developing zebrafish can therefore be considered a sensitive biosensor of chemical activity.

2 Experimental Considerations for Optimal Zebrafish Use in High-Throughput Screens

As a biosensor amenable to high-throughput screens, the model is still only as good as the techniques and assays employed. In one effort at standardization, the OECD Fish Embryo Acute Toxicity Test (FET) [5] was formulated to assess the acute toxicity of chemicals to embryonic zebrafish. Newly fertilized eggs are exposed to test chemicals for 96 h, with renewals every 24 h. Six concentrations and a control are tested in 24-well plates with 20 embryos per concentration, and every 24 h, four apical endpoints are recorded: (1) coagulation of fertilized embryos, (2) lack of somite formation, (3) lack of tailbud detachment from the yolk sac, and (4) lack of a heartbeat. From these measured endpoints, the concentration associated with 50 % lethality (LC50) is computed. However, this approach is laborious, has a limited number of endpoints, and may be less sensitive because the chorion is intact for approximately the first half of the assay duration (48 h of development). Less apparent, but essential, details for adoption of a standardized protocol are also lacking. These include standardization of the number of embryos per well, multi-well plate format and plastic type, as well as an optimized statistical analysis. The OECD standardization is thus a guideline for moderate-throughput chemical screening in the developmental zebrafish.
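As a rough illustration of how an LC50 can be read from FET-style mortality counts, the sketch below interpolates linearly on log10(concentration) between the two test concentrations that bracket 50 % mortality. This is a deliberately simple stand-in, not the statistical procedure prescribed by the OECD guideline; the function name and the example counts are hypothetical.

```python
import math

def lc50_log_interpolation(concentrations, dead, total):
    """Estimate LC50 by linear interpolation of mortality on log10(concentration).

    concentrations: ascending test concentrations (result uses the same units)
    dead, total: dead and exposed embryo counts at each concentration
    Returns None when 50 % mortality is not bracketed by the tested range.
    """
    fractions = [d / n for d, n in zip(dead, total)]
    points = list(zip(concentrations, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            # Relative position of 50 % between the bracketing mortality fractions
            w = (0.5 - f_lo) / (f_hi - f_lo)
            log_lc50 = math.log10(c_lo) + w * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_lc50
    return None

# Hypothetical data: six concentrations (in uM) with 20 embryos each
conc = [1, 2, 4, 8, 16, 32]
dead = [0, 1, 5, 15, 19, 20]
print(lc50_log_interpolation(conc, dead, [20] * 6))
```

A fitted concentration-response model would normally be preferred when confidence limits are needed; the interpolation only shows where the number comes from.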

The goal of high-throughput screening using a whole animal is to evaluate chemicals more rapidly and at a lower cost so the results can be utilized more immediately. To implement and harness the power of the zebrafish, there are design considerations that can help to make this model amenable to efficient HTS (Table 1). These considerations are detailed in the following sections.

Table 1 Experimental factors to consider

2.1 Optimizing and Harmonizing Dose Volumes

The small size of the embryonic zebrafish allows for the use of multi-well plates. Multi-well plates differ in the number of wells per plate (i.e., 6, 12, 24, 48, 96, 384) and consequently in the maximum volume per well. The number of wells per plate is an important factor to consider: an individual well of a 6-well plate can hold up to 16.8 mL, whereas a well of a 96-well plate holds less than 300 μL [6]. Well diameter ranges from as large as 34.8 mm down to only 6.4 mm in a 96-well format. The plate type and well format used for HTS could certainly influence the outcome of zebrafish behavioral assays that have a swimming activity readout. The OECD guidelines for a FET require the use of 24-well plates with at least 2 mL of chemical solution. However, others have reported successful use of 96-well plates [2, 7], including a robust behavioral component. The volume typically used in the 96-well plate exposures by our group is 100 μL, which strikes a balance between minimal test chemical consumption and maximal water volume to support development and swim activity by 5 days post fertilization (dpf).

2.2 Chorion Status

The chorion, an acellular matrix surrounding the embryo, can be removed enzymatically as early as 4 h post fertilization (hpf) [2]. When reared at 28 °C an embryo hatches out of its chorion between 48 and 72 hpf [8]. In our view, for HT screening, it is ideal to remove the chorions to maximize chemical bioavailability as the chorion can impact chemical partitioning and toxicity [9, 10] despite the observation that the chorion has pores which are approximately 0.6–0.7 μm [11]. The presence of the chorion therefore is a critical concern when evaluating the bioactivity of chemicals and its presence could result in an increased false negative rate.

2.3 Exposure Paradigms

With embryonic zebrafish developing so rapidly, it is challenging to initiate exposures immediately after fertilization. It is also difficult to identify viable fertilized embryos until approximately 3–4 hpf; thus, many assays begin their exposures at varying life stages. We have found that exposure commencement in the 6–8 hpf window with embryos dechorionated at 4 hpf is readily achievable in a HTS environment. The 6–8 hpf embryos are mature enough to withstand gentle handling while still at a sufficiently early life stage for assessments of important developmental events. Exposure commencement at 24 hpf is certainly less technically challenging, but is too late, as many critical stages of tissue differentiation and primary organogenesis would be missed [12].

Once viable embryos are in the well, there are important considerations in determining the best way to deliver chemicals and to encourage solubility in the test medium.

2.3.1 Liquid Handler/Manual Pipetting/Digital Dispensing

With advancements in liquid handling technology, test chemical delivery to the experimental chamber has evolved. The utility of manual pipetting has always been constrained by the need to work with relatively large volumes (>5 μL) for consistency in serial dilutions and dispensing into multi-well plates. Indeed, the need for serial dilution is solely a function of volume constraints with traditional pipetting, and serial diluting itself is an effective means of propagating error. Moreover, plastic (polypropylene) pipette tips are suspected to adsorb hydrophobic and low surface tension chemicals, further increasing uncertainty about how much compound was actually delivered to the test chamber [13]. Enter the age of digital dispensing platforms, which use different approaches such as inkjet technology [14] or acoustic vibration [15] as the motive force to rapidly dispense single droplet streams of test solution. Droplet size is currently as small as 13 pL with one commercial platform [14, 16] and will almost certainly be driven lower by improving technology and growing market demand. The net result of sequentially transferring such small volumes of millimolar range solutions is that serial diluting is obviated. Reproducibility of chemical delivery dramatically increases while chemical sorption losses and pipettable volume excesses shrink. Because the dispensing is software-controlled, additional advantages such as complex mixtures and randomized plate layouts become trivial to execute, and a traceable record of all dispense events is automatically stored. Of course there are limitations. Entry level pricing is $25K, consumable costs are not trivial, and efficient mixing, taken for granted with manual pipetting of traditional volumes, requires additional steps to execute in the picoliter realm.
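The arithmetic behind direct dispensing without serial dilution can be sketched as follows. The example assumes a 13 pL droplet (the figure cited above), a 100 μL well, and a hypothetical 10 mM stock; the function name and return format are illustrative and do not correspond to any vendor's software.

```python
DROPLET_PL = 13.0  # smallest droplet volume cited in the text, in picoliters

def direct_dispense_plan(stock_mM, target_uM, well_uL, droplet_pL=DROPLET_PL):
    """Plan a direct dispense of stock solution into a well of medium.

    Returns the droplet count, dispensed volume (nL), achieved concentration
    (uM), and vehicle content (% v/v), assuming the dispensed volume adds to
    the volume already in the well.
    """
    stock_uM = stock_mM * 1000.0
    if target_uM >= stock_uM:
        raise ValueError("target concentration must be below stock concentration")
    well_pL = well_uL * 1e6
    # Mass balance: stock_uM * v = target_uM * (well_pL + v), solved for v
    ideal_pL = target_uM * well_pL / (stock_uM - target_uM)
    droplets = max(1, round(ideal_pL / droplet_pL))
    dispensed_pL = droplets * droplet_pL
    final_pL = well_pL + dispensed_pL
    return {
        "droplets": droplets,
        "dispensed_nL": dispensed_pL / 1000.0,
        "achieved_uM": stock_uM * dispensed_pL / final_pL,
        "vehicle_percent": 100.0 * dispensed_pL / final_pL,
    }

# Hypothetical case: 10 mM stock dosed to 10 uM in a 100 uL well
print(direct_dispense_plan(stock_mM=10, target_uM=10, well_uL=100))
```

In this hypothetical case roughly 100 nL of stock is delivered as several thousand droplets, and the vehicle stays near 0.1 % of the well volume, so each concentration can be dosed straight from one stock.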

2.3.2 Continuous vs. Static Exposure

Static and continual renewal regimens have both emerged in the zebrafish chemical screening literature. When we consider that the embryonic zebrafish has a functioning liver by 48 hpf and phase I and II metabolism similar to humans [17, 18], it is apparent that the renewal regimen can profoundly influence the outcome. For instance, continual renewal reduces concerns about chemical lability over the typical 5-day experiment. But continual renewal maintains a high and possibly nonrepresentative exposure to the parent compound with the potential for false positives, and necessitates potentially confounding embryo handling to remove and replenish the test solution. There are tools available that facilitate the process, such as 96-well plate inserts that lift all embryos at the same time into a new plate [7], but these require large exposure volumes to ensure the entire embryo is covered, and when embryos are transferred into a new solution tray, chemical carryover and cross contamination are a concern. The availability and cost of these inserts and plates somewhat reduce their practicality.

Static exposure has the advantages of minimizing test chemical consumption, manipulation, and labor costs. A disadvantage is that, with little or no prior knowledge of the individual chemistries in a large library, false negatives will occur because the concentrations of some labile compounds may be reduced by the time their potential targets are developmentally expressed. There is a gray area between the pros and cons. For instance, zebrafish complete organogenesis within 48 hpf [8], so the test chemical is potentially undergoing metabolism and degradation early in the experiment. Parent compound, metabolites, and byproducts are thus players throughout the experiment. In a continual renewal, this complexity is removed daily, potentially missing toxicity associated, especially, with degradation. But one could similarly argue that a static exposure is biased toward, or even confounded with, the toxicity of metabolites and degradation products, obscuring the original goal of characterizing the parent toxicity. What we do know is that the chemical bioactivity associated with each regimen often differs. Our position is that the choice is a matter of throughput. Static exposure supports HTS while a continual renewal regimen supports moderate-throughput screening.
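The trade-off between the two regimens can be made concrete with a toy model. The sketch below assumes the parent compound is lost by simple first-order kinetics with a hypothetical half-life and that each renewal restores the nominal concentration; real losses (metabolism, sorption, volatilization) are chemical-specific and are not captured here.

```python
import math

def parent_concentration_profile(c0, half_life_h, duration_h=120,
                                 renewal_interval_h=None, step_h=12):
    """Parent compound concentration over time under first-order loss.

    c0: nominal concentration at dosing
    half_life_h: assumed (hypothetical) half-life of the parent compound
    renewal_interval_h: None for a static exposure, else hours between renewals
    Returns a list of (hours since dosing, concentration) pairs.
    """
    k = math.log(2) / half_life_h
    profile = []
    for t in range(0, duration_h + 1, step_h):
        # Time since the most recent dosing or renewal
        elapsed = t if renewal_interval_h is None else t % renewal_interval_h
        profile.append((t, c0 * math.exp(-k * elapsed)))
    return profile

# Hypothetical labile compound: 10 uM nominal, 24 h half-life
static = parent_concentration_profile(10.0, 24.0)
renewed = parent_concentration_profile(10.0, 24.0, renewal_interval_h=24)
for (t, c_static), (_, c_renewed) in zip(static, renewed):
    print(f"{t:4d} h  static {c_static:6.2f}  daily renewal {c_renewed:6.2f}")
```

Under these assumptions the static exposure falls to a small fraction of nominal by the end of the experiment, while daily renewal keeps the parent compound between half and full nominal concentration, which is the contrast the two regimens are argued over.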

2.4 Chemical Solubility

To increase solubility, chemicals are typically delivered in a biocompatible solvent such as dimethyl sulfoxide (DMSO) or ethanol. Embryonic zebrafish develop phenotypically normal to 5 dpf in ≤1 % ethanol or DMSO [19]. The most common solvent by far for HTS is DMSO, though its hygroscopic nature is especially troublesome because rapid absorption of water by DMSO can accelerate degradation and precipitation of test compounds [20]. Keeping compounds in DMSO dry is emerging as a major consideration around library storage and handling, and new commercial platforms to monitor sample hydration and more efficiently store samples under a dry atmosphere are readily searchable online. Once dispensed into an aqueous environment where the vehicle is only 1 % of the bath composition, maintaining solubility, and hence bioavailability, of hydrophobic test chemicals can be a challenge. We have empirically determined that for PAHs, possibly a worst-case solubility scenario, gentle mixing (235 rpm) of the exposure plate on an orbital mixer overnight routinely shifts the concentration response downward by five- to tenfold. This is relative to the same exposure mixed thoroughly for 15 s at 235 rpm immediately after digital dispensing, then left undisturbed overnight. After confirming that the overnight mixing motion had no effect on zebrafish developmental morphology or behavioral endpoints, we have instituted the overnight mixing for all HTS exposures in our lab.

2.5 Experimental Duration

We have benchmarked much of the discussion so far to a 5-day (120 hpf) screen. The current OECD guideline for the zebrafish embryo test is to conduct the experiment until 96 hpf. At this stage, embryos have hatched, have completed organogenesis, are metabolically highly active, and are about 3.7 mm in length [8]. In many countries, the 96 hpf embryo is not considered a “living organism,” and in the European Union (EU) it is not regulated as an animal used for experimental and other scientific purposes. For these reasons, the FET test is terminated at 96 hpf. However, it is possible to go until 120 hpf, where larval behavioral endpoints offer more information with which to detect chemical bioactivity. Visual observation of morphology changes is also easier because the larva is bigger. Thus, the heart and circulation are easily observed, and the brain, eye, snout, and jaw are more distinguished [8].

2.6 Endpoint Definitions

There are numerous developmental endpoints that can be adversely affected by chemical exposure in a HTS. We routinely assay 22 visual endpoints and 3 behavioral endpoints that we have determined provide a rich profile of chemical bioactivity and insights into potential mechanisms of toxicity [3]. The OECD guideline stipulates only four endpoints that primarily bin outcomes as either normal, dead, or abnormal. Endpoint scoring can range from simple presence/absence binary data to a scale of severity scores for each endpoint. Severity scores offer the advantage of tracking dose-response effects more accurately, but are inherently more variable across multiple scorers [7]. To circumvent this, one approach is to automate, as much as practical, the measurements of basic descriptors (e.g., length and width) [21] or to restrict the focus by screening in a transgenic reporter line for abnormalities in just the labeled tissue (e.g., vascular system: [22]). We have found that simple presence/absence data for as many endpoints as practical strikes the best balance among data quality, throughput and cost-efficiency [3, 4, 23, 24].

2.7 Statistical Analysis

The field of toxicology has established several standardized metrics of toxicity, such as the concentration that caused 50 % lethality in the embryos (LC50), maximum concentration causing no mortality within the test period (NOEC), minimum concentration causing 100 % mortality (LC100), and lowest concentration that causes a significant effect when compared to control (LOEC). These readouts are calculated by probit analysis, logistic regression models, geometric means, and ANOVAs. In addition, for HTS data, the lowest effect level (LEL) is used to describe the lowest concentration that induced any effect. The LEL is similar to the LOEC, but it is computed with a Fisher’s exact or binomial test [3] for each endpoint rather than overall. The test should be applied to each endpoint because the endpoints are highly correlated, which makes it difficult to discern the LOEC, but not the LEL. If more endpoints are used for high-throughput zebrafish screening, the current readouts established in the OECD guidelines are inappropriate because they do not account for the fact that as lethality increases with concentration, the number of viable embryos to evaluate diminishes, which prohibits the use of logistic regression models and probit analysis. The most appropriate statistical test for binary data is Fisher’s exact test or a binomial test. An alternative is a summation of the endpoints (including mortality), and fitting a regression model to the data to compute the OECD guideline readouts. A limitation of this aggregation method is determining what weight each endpoint should receive. Regardless of the statistical method, the current readouts used for zebrafish HTS can be optimized for greater relevance and efficiency to better handle the large volume of data.
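A per-endpoint LEL of the kind described above can be sketched with a one-sided Fisher's exact test written against the Python standard library. The significance threshold of 0.05, the absence of any multiple-comparison correction, the function names, and the example counts are all assumptions made for illustration; they are not taken from the cited method [3].

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table
    [[a, b], [c, d]] = [[treated affected, treated unaffected],
                        [control affected, control unaffected]].
    Returns P(treated affected >= a) under the hypergeometric null.
    """
    n_treated, n_control = a + b, c + d
    affected = a + c
    denom = comb(n_treated + n_control, affected)
    upper = min(affected, n_treated)
    return sum(comb(n_treated, k) * comb(n_control, affected - k)
               for k in range(a, upper + 1)) / denom

def lowest_effect_level(concs, affected, evaluated,
                        ctrl_affected, ctrl_evaluated, alpha=0.05):
    """Lowest concentration whose incidence for one endpoint exceeds control.

    evaluated holds the number of animals that could be scored at each
    concentration, so it may shrink as mortality rises.
    """
    for conc, x, n in sorted(zip(concs, affected, evaluated)):
        p = fisher_one_sided(x, n - x, ctrl_affected, ctrl_evaluated - ctrl_affected)
        if p < alpha:
            return conc
    return None

# Hypothetical counts for a single endpoint, 32 control animals with 1 affected
concs = [0.1, 1, 10, 100]
affected = [1, 3, 9, 20]
evaluated = [32, 32, 32, 30]
print(lowest_effect_level(concs, affected, evaluated, 1, 32))
```

Running the same function once per endpoint yields one LEL per endpoint, which sidesteps the need to weight correlated endpoints against each other.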

2.8 Data Management

HTS generates “big data” that are essentially impossible to curate or share without electronic database storage and management through a lab information management system (LIMS). The system should track assay plates by barcodes permanently associated with plate layout, test chemicals, experimental dates and results, and other useful metadata such as hi-res digital images of the animals. A database enables immediate and nearly effortless checks on test chemical scheduling, data integrity, and reproducibility. Moreover, the necessary organizational constraints of well-constructed database tables automatically ensure that third party downstream reduction and analysis tools can easily read and process the data.
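One way to picture the barcode-centered organization described above is a minimal relational layout. The sketch below uses Python's built-in sqlite3 module; the table names, column names, barcode, chemical identifier, and date are all hypothetical and do not describe any particular LIMS.

```python
import sqlite3

SCHEMA = """
CREATE TABLE plate (
    barcode  TEXT PRIMARY KEY,
    run_date TEXT NOT NULL
);
CREATE TABLE well (
    barcode          TEXT NOT NULL REFERENCES plate(barcode),
    position         TEXT NOT NULL,
    chemical_id      TEXT NOT NULL,
    concentration_uM REAL NOT NULL,
    PRIMARY KEY (barcode, position)
);
CREATE TABLE observation (
    barcode  TEXT NOT NULL,
    position TEXT NOT NULL,
    endpoint TEXT NOT NULL,
    affected INTEGER NOT NULL CHECK (affected IN (0, 1)),
    PRIMARY KEY (barcode, position, endpoint),
    FOREIGN KEY (barcode, position) REFERENCES well(barcode, position)
);
"""

def open_db(path=":memory:"):
    """Create the tables and enforce foreign keys so orphan rows are rejected."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

# Hypothetical plate with two wells and one binary endpoint
conn = open_db()
conn.execute("INSERT INTO plate VALUES (?, ?)", ("P0001", "2016-01-15"))
conn.executemany("INSERT INTO well VALUES (?, ?, ?, ?)",
                 [("P0001", "A1", "CHEM-1", 0.0), ("P0001", "A2", "CHEM-1", 10.0)])
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)",
                 [("P0001", "A1", "mortality", 0), ("P0001", "A2", "mortality", 1)])

# Incidence per concentration, ready for downstream statistics
rows = conn.execute("""
    SELECT w.concentration_uM, SUM(o.affected), COUNT(*)
    FROM observation AS o JOIN well AS w USING (barcode, position)
    WHERE o.endpoint = 'mortality'
    GROUP BY w.concentration_uM ORDER BY w.concentration_uM
""").fetchall()
print(rows)
```

Because every observation must point to a well on a registered plate, the constraints themselves do part of the data integrity checking, and analysis tools can pull incidence tables with a single query.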

3 Conclusions and Future Perspectives

As the use of zebrafish for HTS grows in popularity, the field will benefit greatly from continuing to embrace the developmental zebrafish as a vertebrate biosensor of chemical bioactivity while shedding the limited view that it is just a fish model for developmental toxicity. High-throughput screens are still very young in this model, but have already given us, and will continue to give us, great discoveries. The considerations and recommendations outlined here come primarily from a toxicological perspective, but they are equally applicable to therapeutics discovery. Going forward, embedding high-throughput transcriptomics into vertebrate HTS will be the next major change in this field. The technology is there, but the cost is still high enough that inclusion of transcriptomics is often a choice rather than a given. The continually declining cost and increasing power of the technology may change that soon. Once these efforts are widely integrated, truly predictive toxicology will be a reality.