1 Introduction

Fragment screening methods have evolved in the last decade from a serendipitous observation of solvent molecules in crystal structures to a new technology for generating ligand binding information in drug discovery [1–5]. In contrast to methods that screen random compound libraries, such as high-throughput screening (HTS), fragment screening uses a compound library whose members are selected to follow three basic rules: a molecular weight of less than about 300 Da, no more than three hydrogen bond donors or acceptors, and a computed partition coefficient (clogP) of less than three [4]. Additional selection criteria might be added, such as no more than three rotatable bonds and a polar surface area of less than 60 Å². The small size of the fragments and their limited potential for forming diverse interactions lead to a higher degree of binding promiscuity. These properties lead to a number of advantages. Compared to HTS, the screening effort can be limited to hundreds or a few thousand compounds to explore the chemical space of a binding site. Optimization of a hit or lead towards a drug molecule benefits from favorable physicochemical properties and low chemical complexity. Also, the ligand efficiency, defined as the binding energy per nonhydrogen atom [6, 7], is typically higher for fragments than for HTS hits.
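The selection rules and the ligand efficiency described above can be illustrated with a minimal sketch. The class and function names, the temperature of 298.15 K and the reading of the hydrogen-bond rule as "at most three donors and at most three acceptors" are assumptions made for illustration; only the thresholds themselves are taken from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Compound:
    name: str            # hypothetical identifier
    mol_weight: float    # molecular weight in Da
    h_donors: int        # hydrogen bond donors
    h_acceptors: int     # hydrogen bond acceptors
    clogp: float         # computed partition coefficient
    heavy_atoms: int     # number of nonhydrogen atoms


def passes_basic_rules(c: Compound) -> bool:
    """Apply the three basic fragment rules quoted in the text."""
    return (
        c.mol_weight < 300.0
        and c.h_donors <= 3
        and c.h_acceptors <= 3
        and c.clogp < 3.0
    )


def ligand_efficiency(kd_molar: float, heavy_atoms: int,
                      temp_k: float = 298.15) -> float:
    """Binding free energy per nonhydrogen atom in kcal/mol.

    Uses delta_G = R * T * ln(K_D); the sign is flipped so that a
    larger value means more binding energy per atom.
    """
    gas_constant = 1.987e-3  # kcal/(mol*K)
    delta_g = gas_constant * temp_k * math.log(kd_molar)
    return -delta_g / heavy_atoms


fragment = Compound("fragment-like", 214.0, 2, 3, 1.8, 14)
print(passes_basic_rules(fragment))
# A 1 mM fragment with 14 heavy atoms versus a 1 uM hit with 35 heavy atoms
print(round(ligand_efficiency(1e-3, 14), 2), round(ligand_efficiency(1e-6, 35), 2))
```

In this invented comparison the weakly binding fragment still has the higher ligand efficiency, which is the effect the text refers to.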

The disadvantage of the fragment screening approach is the psychological and technological hurdle to synthetic chemistry efforts with ligand binding affinities in the micromolar (μM) to millimolar (mM) range, including the higher error in their determination. Another consequence of the low binding affinity can be the lack of functional activity of such compounds in cellular and in vivo assays of the initial hits, and the requirement of sometimes significant chemistry effort to synthesize compounds that show such properties. The potential lack of selectivity of small compounds (promiscuity) is in our experience not a problem and selectivity is quickly achieved during lead optimization.

In this review, we highlight the importance of biophysical methods such as surface plasmon resonance (SPR), NMR, isothermal titration calorimetry (ITC) and others for the fragment screening approach. These methods are used as a primary filter to select compounds that are more likely to be visualized by X-ray crystallography than compounds taken directly into an X-ray-based primary screen. Advances in sensitivity and throughput, especially in SPR methods, have enabled evaluation of several thousand compounds in a few days or weeks, and many examples of successful identification of new binding motifs have been reported. Some challenges remain, such as deviations in buffer conditions between the methods, and addressing them offers scope for further improvement of the procedures.

2 Biophysical Methods for Fragment Screening

It has been well recognized that the application of several screening technologies in parallel, followed by diligent analysis of the data on the basis of the strengths and limits of the respective methods, is crucial for the identification of novel chemical scaffolds with high potential for generating new therapeutic agents [8, 9].

An enzymatic or ligand displacement assay as generally used in HTS campaigns seems to be the most straightforward approach in the identification of biochemically active fragments and there is a wide variety of such assays used by different companies [10]. The lack of sensitivity of such assays for the characterization of fragment binding in routine HTS settings demands alternative methods, despite the success in investigating particular protein targets where more sensitive biochemical assays could be established [11].

In this review, an approach is described that overcomes these difficulties using biophysical methods. The advantages and limitations of the various technologies are discussed in some detail in order to give guidance for the selection of the most appropriate methods for fragment screening [12]. The main focus, however, is on a detailed description of how to use SPR-based methods.

2.1 Biophysical Methods for Detection of Ligand Binding: An Overview

Among biophysical methods, high-throughput crystallography is the most elegant way to detect ligand binding since it provides direct structural information [13]. Several technological innovations such as improvements in expression systems and methodology of cloning and expression; advances in robotics, liquid handling and miniaturization; improvements in working with large cocktails of test compounds; and increased efficiency in data collection, processing, and analysis have made this a realistic and practical proposition. Application, however, is limited to targets for which a robust crystallographic system is available that allows the production of large numbers of diffraction quality crystals for soaking or cocrystallization experiments. The essential prerequisites of this technology are discussed by Davies and Tickle [14].

All other methods for label-free binding studies can be assigned to essentially two classes. In one class, the binding event is measured in homogeneous solution (homogeneous assays); in the other, it is measured at a solid–liquid interface with one of the binding partners immobilized on the solid phase (heterogeneous assays). The class of homogeneous assays is based on detection technologies such as NMR [8, 15–18], mass spectrometry [19, 20], ITC [21, 22], thermal shift assays (also called ThermoFluor) [23–25], and backscattering interferometry [26, 27]. Among these, NMR is the most widely used technique in fragment screening [13, 28]. For all the other technologies, application to the assessment of fragment binding has been demonstrated, but only limited data are available on the screening of large fragment libraries.

The strength of NMR-based technologies is the ability to use changes in one or more NMR parameters, including chemical shift (¹H, ¹⁵N, ¹³C), anisotropy measurements, transverse and longitudinal relaxation of the ligand or protein, cross relaxation in the protein–fragment complex, or cross relaxation between the fragment and protein-bound water molecules. Zartler and Huaping [28] emphasize that the type of NMR method selected for a screening effort depends on several factors (size of the target protein and quality of spectra, protein consumption, number of measurements planned etc.) but that the first and foremost factor should be the type of information that is expected from the experiment. They identified five different data types:

  1. Does the ligand bind (Yes/No)?
  2. Which ligand is binding (from a mixture)?
  3. How is the ligand binding?
  4. Where is the ligand binding?
  5. What is the structural and dynamic implication of binding?

The large number of experimental NMR methods can be subdivided into two main classes: ligand-observed and protein-observed methods. The ligand-observed experiments are differentiated by the type of magnetization and by how the pulse-sequence delays are set. The two main experiments that work without isotope labeling are STD [28, 29] and WaterLOGSY [8, 15, 30]. The ligand-observed experiments deliver data about whether a ligand binds or not, about the identity of the bound ligand and also (with certain limitations) how, i.e., in what orientation, the ligand binds to the target [28]. Protein-observed methods are limited to targets with a molecular weight of less than 30–40 kDa due to line width and relaxation considerations. Protein-observed methods frequently require spectrum simplification through ¹⁵N and ¹³C labeling. Assignment of resonance lines to amino acids is advantageous because it allows determination of structural constraints. The investment in time and effort required for full assignment of resonances can be reduced by selective isotope labeling of one or more types of amino acid. Typically, these protein-observed experiments are heteronuclear single quantum coherence (HSQC) experiments using ¹⁵N or ¹³C as the heteronucleus [31]. The first example of protein-based screening was structure–activity relationship (SAR) by NMR [17]. Since then, automated data evaluation methods have been developed to handle large sets of heteronuclear correlation spectra [32, 33].

The class of heterogeneous assays includes detection technologies that are based on optical transducers such as SPR [34], guided mode reflectance filter [35], and white light interferometer devices. All the optical devices are able to detect either a small change in the refractive index [36] or a change of the thickness of an adlayer occurring upon binding of molecules to their surface. Although examples of the use of all these technologies to monitor small ligand binding have been presented at scientific meetings, the only application to fragment screening reported in scientific journals has been for the SPR-based systems from Biacore [37–40], FujiFilm [41], and SensiQ [42]. The limited feasibility of the methods to work with fragment-sized compounds results from special limitations and challenges. Since the refractive index change, and hence the response, scales with the molecular mass of the ligand, the technology has to be pushed to its detection limit. Consequently, immobilization strategies must be developed that lead to high densities of active biomolecule on the surface. Due to the low affinity of the fragments, screening has to be performed at high concentrations, which makes the method susceptible to unspecific binding and false positive hits. The use of suitable control proteins is highly recommended to circumvent this problem. Hämäläinen et al. [37] suggested for thrombin fragment screening the use of a blocked thrombin as control for unspecific binding as well as proteins like serum albumin and carbonic anhydrase as further control proteins. Nordström et al. [38] worked with an active site mutated matrix metalloproteinase, MMP-12, as a control protein to identify fragments that interact specifically with the active site of the protein. Perspicace et al. [39] used the zymogen form of chymase as a control protein in which an N-terminally attached small proregion is bound to the active site and blocks the protein. An additional method to validate active site binding of a ligand is a competition assay with known active site binders [39].

2.2 Choice of Assay Methods (Criteria for Selection)

Combining complementary technologies is beneficial for identifying and reconfirming new chemical scaffolds that can be exploited in a fragment-based approach. For cost and efficiency reasons it is advantageous to select one leading method as a workhorse for the primary screen and to use the other method solely for hit confirmation. Some of the main aspects to be considered in such a method selection are discussed below in more detail. The most prominent methods used in label-free interactions analyses are listed in Table 1 with their respective properties. Potential weaknesses highlighted here need not disqualify a certain method, but should indicate that the impact of any potential issues must be carefully considered in the application of the technique to fragment screening.

Table 1 Comparison of various biophysical methods

2.2.1 Statistical Assay Control

Screening is about making a decision on the interaction of a particular compound with a biological target. Independent of the read-out technology used for screening, the data on which such a decision is based are subject to variability and hence uncertainty. However, the degree of uncertainty can be estimated by applying statistical tools. These statistical criteria are useful to monitor during all stages of a screening workflow because they help to assess the reliability, reproducibility and sensitivity of a given assay and hence deliver experimental facts for judging its quality. Finally, the statistical tools can be used in data analysis to distinguish, on statistical grounds, between positive and negative signals in a screen. Along this line, the reproducibility and robustness of SPR-like assays can be tested with the same tools as biochemical HTS assays [37, 39]. There are technologies, however, such as NMR, ITC and ThermoFluor, for which such statistical assay controls could not be defined because the evolution of the response is not independent of the molecule under investigation.

2.2.2 Material Consumption

Although fragment screening involves testing a relatively low number of compounds compared to HTS, material consumption and costs are an important argument in technology selection. Cost considerations should include all disposables (plates, tips, sensor chips etc.) as well as the biological material consumed. For example, producing protein in the quantity and quality required for methods with low sensitivity such as NMR, ITC or ThermoFluor can limit the number of compounds tested in a fragment screening effort or the applicability of the method overall. Methods with higher sensitivity due to the high density of the immobilized biological target and, in addition, opportunities for regeneration (such as the SPR-based methods) have a clear advantage in this respect, since the protein, once immobilized, can be regenerated and reused for many experiments. This is not the case for methods working in homogeneous solution, nor for the SPR-based technology from Corning. The intrinsic sensitivity of the Corning technology is the same as for other SPR-based methods; however, the higher demand for protein results from the set-up in disposable microtiter plates, which requires freshly immobilized protein in every experiment.

2.2.3 Throughput

Fragment screening involves testing many candidate ligands for binding and therefore requires a robust and reliable approach for data acquisition as well as data evaluation. ThermoFluor and the SPR method from Corning are the methods with the highest potential throughput. Both technologies are based on 384-well plates, and the high degree of parallelization allows several tens of thousands of binding experiments to be carried out per day. At the other end of the scale is ITC, with a much lower degree of automation and throughput. All the other techniques have a throughput of several hundred to a few thousand binding experiments a day, which is sufficient to deal with the several thousand fragment molecules that are typical for such libraries.

2.2.4 False Positive Susceptibility

All assay technologies are susceptible to false positives that are caused by the imperfection of the in vitro model system. Compared to biochemical or functional assays, direct binding assay technologies add unspecific binding as a possible cause of false positives. If properly designed (exclusion of pH and salt effects), protein-observed NMR is probably the technology with the lowest susceptibility to unspecific binding. Ligand promiscuity, i.e., aggregation of ligands in solution, gives responses in STD and WaterLOGSY experiments similar to those caused by protein binding [8]; such artifacts can be eliminated by performing control experiments in the absence of target protein. Label-free methods in which the binding event occurs at a solid/liquid interface are highly susceptible to false positive hits, since any deposition of material at this interface will lead to a positive response if no special measures are taken. Unspecific binding can be accounted for in such methods by parallel immobilization of reference proteins [36, 37, 39, 43]. A reference protein can be either an unrelated protein (for example carbonic anhydrase) or, better, the identical protein target with a blocked binding site. Blockage of the active site of an enzyme or of the anticipated drug binding site can be achieved by several methods, such as introducing mutations that destroy the binding site or binding an irreversible inhibitor.

2.2.5 Modification of Target

The reliability of the biological system under investigation is extremely important. Consequently, assay methods are preferred in which none of the interacting partners has to be modified by labeling or immobilization. Immobilization can severely modify the protein with respect to structure, flexibility and consequently activity. Assay development must therefore include selecting an adequate immobilization procedure that does not modulate protein activity and thoroughly checking the integrity of the target protein with measurements using positive controls.

2.2.6 Dynamic Range

Dalvit [8] described dynamic range limitations for certain technologies that could lead to higher numbers of false negatives. He argues that with many of the technologies protein/fragment interactions can be detected only at fragment concentrations close to the equilibrium dissociation constant K_D. For these technologies, for instance SPR, the observed response is directly proportional to the ratio L_T/K_D, with L_T being the fragment concentration in solution. In SPR experiments with fragments, the lower limit of L_T/K_D leading to a detectable response is probably in the order of 0.2. This limitation is of concern for very low affinities, when K_D is significantly higher than the solubility limit of the fragments. In this respect, the ThermoFluor method is probably the most restricted, because the low affinity of the fragments might not stabilize the protein strongly enough to produce a detectable shift in the protein melting temperature [12]. By contrast, it has been shown that NMR methods can have a higher dynamic range. In WaterLOGSY experiments, binding was still observed even when the ratio L_T/K_D was as low as 0.07 [44].
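To make the consequence of these detection limits concrete, the following sketch converts a minimum detectable ratio of fragment concentration to K_D into the weakest affinity that can still be observed at a given solubility limit. It assumes simple 1:1 binding with the fragment in excess over the protein, so that the occupied fraction of sites is L_T/(L_T + K_D), which approaches L_T/K_D at low concentrations. The function names and the 1 mM solubility limit are illustrative assumptions; the ratios 0.2 and 0.07 are the values quoted above.

```python
def fractional_occupancy(ligand_conc: float, kd: float) -> float:
    """Occupied fraction of binding sites for 1:1 binding (ligand in excess)."""
    return ligand_conc / (ligand_conc + kd)


def weakest_detectable_kd(solubility_limit: float, min_ratio: float) -> float:
    """Largest K_D for which L_T/K_D still reaches min_ratio at the solubility limit."""
    return solubility_limit / min_ratio


solubility = 1e-3  # assumed solubility limit of 1 mM, for illustration only

for method, ratio in (("SPR", 0.2), ("WaterLOGSY", 0.07)):
    kd_limit = weakest_detectable_kd(solubility, ratio)
    occupancy = fractional_occupancy(solubility, kd_limit)
    print(f"{method}: K_D up to {kd_limit * 1e3:.1f} mM, "
          f"occupancy at the limit {occupancy:.3f}")
```

Under these assumptions a ratio of 0.2 corresponds to roughly 17% site occupancy and a ratio of 0.07 to roughly 6.5%, which illustrates why a lower detectable ratio extends the accessible affinity range.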

2.3 The SPR Based Binding Assay for Screening

2.3.1 The Hardware

Instruments for SPR-based experiments are commercially available from several vendors (see Table 1). The set-up of an SPR-based binding assay given in this review relates to the use of a Biacore A100 instrument, which achieves higher measurement throughput by parallelization. It enables parallel testing of four ligands independently in four flow-through channels. Each channel provides the possibility to immobilize four proteins in parallel. For example, one channel allows the measurement of the wt protein and 1–3 reference proteins to eliminate false positives due to unspecific binding or to characterize specificity of binding. The sensor chip most often used is the so-called CM5 sensor for covalent immobilization of the target via amide coupling chemistry. Recently a CM7 sensor was launched with a much higher binding capacity for protein immobilization. The CM5 and the CM7 sensors are both equipped with a carboxymethyldextran adlayer [45]. Alternatively, sensors with immobilized Ni-chelator or streptavidin have been used if the protein to be immobilized contains the appropriate affinity tag, a poly-histidine sequence [46–48] or biotin [49, 50].

2.3.2 The Immobilization Strategy

The most frequently applied method to immobilize soluble proteins on the sensor chip surface is amine coupling. Covalent binding is achieved by activation of carboxylic acid groups on the surface of a CM7 (CM5) sensor and subsequent linkage of these activated carboxylic acids via the amino acid side chains of lysine of the protein. No upfront biomolecular engineering or chemical modification of the protein is necessary. Amine coupling is probably the method by which the highest density of immobilized protein can be achieved. The irreversible covalent coupling makes the set-up extremely robust with respect to leakage of protein, however, the random immobilization is often seen as a disadvantage because of the potential for loss of active protein [51]. Another disadvantage is the lack of feasibility of regenerating a sensor chip after adhesion of undesired compounds (promiscuous binders) as it has been demonstrated that such adhered substances can significantly influence the outcome of follow-up binding experiments in a screening effort.

In this respect reversible capturing of histidine tagged proteins has clear advantages because it enables full regeneration of the surface (removal of protein and ligand) and reconstruction with fresh histidine-tagged protein after each binding experiment [48]. However, such strategies require larger amounts of protein.

2.3.3 Assay Quality

The quality of an SPR-based direct binding assay can be described in a similar manner to an HTS assay by measures that characterize the robustness and the reproducibility of the assay. In order to determine the reproducibility of a screen, a set of compounds is tested in replicate. It is important that all experimental steps of a given screen, such as sample preparation, injection mode, washing procedures and data evaluation, are included. The statistical data of the correlation (for instance slope and standard error) are indicative of the reproducibility of the data [39].
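A replicate correlation of this kind can be sketched as an ordinary least-squares fit of the second measurement against the first. The function name and the example responses are hypothetical, and taking "standard error" to mean the residual standard error of the regression is an assumption, since the text does not specify which standard error is meant.

```python
import math


def replicate_correlation(run1, run2):
    """Fit run2 = slope * run1 + intercept by least squares.

    Returns (slope, intercept, residual_standard_error).
    """
    n = len(run1)
    if n != len(run2) or n < 3:
        raise ValueError("need two equally long runs with at least 3 points")

    mean_x = sum(run1) / n
    mean_y = sum(run2) / n
    sxx = sum((x - mean_x) ** 2 for x in run1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(run1, run2))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Scatter of the second run around the fitted line (n - 2 degrees of freedom)
    residual_ss = sum((y - (slope * x + intercept)) ** 2
                      for x, y in zip(run1, run2))
    std_error = math.sqrt(residual_ss / (n - 2))
    return slope, intercept, std_error


# Invented responses (RU) of five compounds measured twice
first = [2.0, 10.0, 25.0, 40.0, 61.0]
second = [3.0, 9.0, 27.0, 38.0, 60.0]
print(replicate_correlation(first, second))
```

A slope close to 1 together with a small residual standard error relative to the response range would indicate reproducible measurements.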

The Z′ factor introduced by Zhang is a well-accepted measure for the robustness of HTS screens. It is calculated according to (1):

$$ Z' = 1 - \frac{3\sigma_{\text{s}} + 3\sigma_{\text{b}}}{R_{\text{s}} - R_{\text{b}}}. $$
(1)

In this equation the indices s and b denote the standard deviation (σ) or the average response (R) of the positive (s) and the negative (b) control. R_s is determined at a saturating concentration of the positive control. With certain limitations, the Z′ factor can be used for expressing the robustness of an SPR-based fragment screen. One such limitation is the molecular weight dependency of SPR responses. Since SPR responses depend on the molecular weight of the compound, Z′ factors are relevant measures of robustness only if they are determined for control compounds that have a molecular weight comparable to the average molecular weight of the compounds to be tested in a screen. Low Z′ factors result from high standard deviations for the negative and/or the positive controls. This can often be improved by optimizing the running buffer, the regeneration and washing conditions, and the sample preparation steps. Another source of small Z′ values is a low density of active protein on the sensor surface, which is reflected in small saturation responses for the positive controls. In this case the assay might be improved by better purity and activity of the target protein or by an alternative immobilization procedure. It has been discussed that molecular weight dependent Z′ factors can be used to determine the minimum molecular weight and the percentage of compounds of a given library for which statistically relevant data can be expected in that screen [39].
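Equation (1) translates directly into a few lines of code. The sketch below uses sample standard deviations of replicate control injections; the function name and the control responses are invented for illustration and are not data from the screens discussed here.

```python
import statistics


def z_prime(positive_controls, negative_controls):
    """Z' factor according to equation (1).

    positive_controls: responses of the positive control at saturation
    negative_controls: responses of the negative control (e.g. buffer)
    """
    r_s = statistics.mean(positive_controls)
    r_b = statistics.mean(negative_controls)
    sigma_s = statistics.stdev(positive_controls)
    sigma_b = statistics.stdev(negative_controls)

    window = r_s - r_b
    if window <= 0:
        raise ValueError("positive control must respond above the negative control")
    return 1.0 - (3.0 * sigma_s + 3.0 * sigma_b) / window


# Invented replicate responses in RU
positives = [58.0, 60.0, 62.0]
negatives = [-1.0, 0.0, 1.0]
print(round(z_prime(positives, negatives), 2))  # 0.85
```

The example also shows the two levers named in the text: a larger saturation response widens the signal window in the denominator, while lower scatter of either control shrinks the numerator.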

2.3.4 The Screening Cascade

The screening cascade in SPR based fragment screening contains a series of assays that enable the application of different filter criteria for the selection of true positive binders. An overview on the most commonly used filters is given in Table 2.

Table 2 Overview of selection filters and the respective assay types

2.3.5 Single Concentration Affinity Filter

The measured responses at the given concentration should be located in a window that is defined by the average responses and the respective standard deviation of negative and positive controls. The lower limit of a positive response is usually taken as three times the standard deviation of a negative control. The upper limit of such a window is less well defined as many of the compounds show over-stoichiometric binding at high concentration. Nonoptimal behavior with respect to stoichiometry does not per se disqualify compounds as interesting binders that could appear as positives by X-ray crystallography. It has been suggested to differentiate between nonstoichiometric binders and “superstoichiometric” binders (>5 times the saturation response of positive control) which are disqualified for follow up work [52].
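The response window described above can be expressed as a simple classification. This is a minimal sketch: the category labels, the function name and the assumption that the negative control averages zero after referencing are illustrative choices, while the two thresholds (three standard deviations of the negative control, and five times the saturation response of the positive control) are taken from the text.

```python
def classify_response(response: float,
                      negative_sd: float,
                      positive_saturation: float,
                      negative_mean: float = 0.0,
                      super_factor: float = 5.0) -> str:
    """Sort a single-concentration response into the affinity window."""
    lower_limit = negative_mean + 3.0 * negative_sd
    if response <= lower_limit:
        return "negative"
    if response > super_factor * positive_saturation:
        # Disqualified for follow-up work according to the text
        return "superstoichiometric"
    if response > positive_saturation:
        # Not disqualified per se, but flagged for closer inspection
        return "over-stoichiometric"
    return "positive"


for ru in (2.0, 20.0, 100.0, 400.0):
    print(ru, classify_response(ru, negative_sd=1.0, positive_saturation=60.0))
```

No molecular weight normalization is applied in this sketch; in practice the saturation response of the positive control would have to be related to the size of the tested compound before the upper limits are meaningful.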

2.3.6 Promiscuity Filter

The term promiscuous binders has recently been applied to a class of compounds that often show up in high-throughput screens as false positive hits due to their ability to inhibit a broad spectrum of different protein classes. Investigation of promiscuous binding in solution indicates that such compounds form soluble or colloidal aggregates that envelop the protein. It was recently demonstrated that such promiscuous binding can easily be identified in time-resolved SPR experiments, and a number of mechanisms for the inhibition of protein function were suggested. The classification scheme presented in that work can be used during the evaluation of single-concentration data to rapidly characterize and eliminate such compounds [52].

2.3.7 Specificity Filters

In SPR technology, any adsorption of compounds at the sensing surface will lead to a signal response and the observed signal is a superposition of specific binding to the desired binding sites on the target biomolecule and nonspecific binding to any place on the surface of the biomolecule or anywhere on the surface of the sensor. Special care is required to design an experimental setup that can distinguish between specific and nonspecific binding in order to deselect compounds that lack specific binding properties to the site of interest. Most of the approaches are based on preparing reference channels by immobilizing proteins that are structurally related to the target, but have a blocked active site or binding site of interest disabling specific binding of the analyte.

Blockage of the binding site can be achieved by site-directed mutagenesis, i.e., by impairing or modifying the targeted site of a given protein via the exchange of one or several essential amino acids [43, 51]. Another possibility is the use of a covalent irreversible inhibitor [36, 37] or an inactive form of the active protein (a zymogen) as reference protein [39].

Another possibility to filter for specific binding is a competition experiment with compounds known to bind to the binding site of the target [39, 43]. In this case the binding experiments have to be performed with (1) the pure test analyte, (2) the reference compound and (3) a mixture of both substances. Generally, the compound concentrations in the mixture are the same as those in the solutions that contain analyte and reference alone. In the case of noncompetitive binding (different binding sites of analyte and reference compound) the sensor signal of the mixture is the sum of the sensor signals that were measured for the two compounds alone. In the case of competitive binding (binding of the analyte and reference compound to the same binding site) the resulting signal of the mixture is of intermediate strength between the two signals measured for the compounds alone. The expected signal for the mixture can be calculated by taking into consideration the fractional occupancies of the binding site by the competitor and the test analyte [39]. They can be derived by applying the law of mass action under the assumption that the concentration of the compounds in solution is not changed upon binding (this assumption is true for experiments in a flow system).
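The expected mixture signals for the two cases can be sketched from the law of mass action. The code below assumes 1:1 binding, constant free concentrations (as stated above for a flow system) and a saturation response for each compound that is known from separate experiments. All names and numbers are illustrative and are not taken from reference [39].

```python
def occupancy_alone(conc: float, kd: float) -> float:
    """Fractional occupancy of a site by a single compound."""
    return conc / (conc + kd)


def occupancies_competitive(conc_a, kd_a, conc_b, kd_b):
    """Fractional occupancies when A and B compete for the same site."""
    a = conc_a / kd_a
    b = conc_b / kd_b
    denominator = 1.0 + a + b
    return a / denominator, b / denominator


def expected_mixture_signal(conc_a, kd_a, rmax_a,
                            conc_b, kd_b, rmax_b,
                            competitive: bool) -> float:
    """Expected response of a mixture of analyte A and reference B."""
    if competitive:
        theta_a, theta_b = occupancies_competitive(conc_a, kd_a, conc_b, kd_b)
    else:
        # Independent sites: each compound binds as if it were alone
        theta_a = occupancy_alone(conc_a, kd_a)
        theta_b = occupancy_alone(conc_b, kd_b)
    return rmax_a * theta_a + rmax_b * theta_b


# Invented example: analyte at its K_D, reference at ten times its K_D
args = dict(conc_a=1.0, kd_a=1.0, rmax_a=20.0, conc_b=10.0, kd_b=1.0, rmax_b=60.0)
signal_a = 20.0 * occupancy_alone(1.0, 1.0)
signal_b = 60.0 * occupancy_alone(10.0, 1.0)
print(signal_a, signal_b)
print(expected_mixture_signal(**args, competitive=False))
print(expected_mixture_signal(**args, competitive=True))
```

In this invented example the noncompetitive mixture gives the sum of the two individual signals, whereas the competitive mixture gives a value between them, which is the pattern described in the text.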

2.3.8 Dose Response Filters

Dose response filters are very valuable but require the highest workload. They are based on data recorded for dilution series of compounds (8–10 concentrations per compound). Exclusion criteria are based on the fit of the experimental data points to theoretical curves with respect to curve slope and saturation behavior. Sigmoidal dose response (response versus logarithm of concentration) or hyperbolic (response versus concentration) functions are both used as theoretical fit functions. Due to throughput limitations of most of the presently available SPR systems, complete dose response curves can only be recorded for a limited number of compounds. In case no specificity filter can be applied, a primary screen could deliver several hundreds of positives, as hit rates of 10–20% are observed for some targets. For such projects, a rough dose response filter can be applied by measurement of two concentrations per compound. Based on the theoretical background dose/response, the behavior of a compound can be tested by measuring the response at two different concentrations and comparing the resulting response ratio with the theoretically expected one.
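The two-concentration check can be sketched for a hyperbolic 1:1 binding curve. For such a curve the ratio of the responses at a high and a low concentration must lie between 1 (both concentrations saturating) and the ratio of the concentrations (both far below K_D). The function names, the example concentrations and the 20% tolerance are assumptions for illustration; the text does not specify them.

```python
def expected_response_ratio(c_low: float, c_high: float, kd: float) -> float:
    """Theoretical R(c_high)/R(c_low) for hyperbolic 1:1 binding."""
    return (c_high / (c_high + kd)) / (c_low / (c_low + kd))


def passes_two_point_filter(r_low: float, r_high: float,
                            c_low: float, c_high: float,
                            tolerance: float = 0.2) -> bool:
    """Accept a compound if its response ratio is compatible with 1:1 binding."""
    if r_low <= 0.0:
        return False  # no meaningful ratio without a positive low-dose response
    ratio = r_high / r_low
    lower_bound = 1.0 * (1.0 - tolerance)
    upper_bound = (c_high / c_low) * (1.0 + tolerance)
    return lower_bound <= ratio <= upper_bound


# Invented example with 100 and 400 uM test concentrations
print(expected_response_ratio(100.0, 400.0, kd=400.0))
print(passes_two_point_filter(10.0, 25.0, 100.0, 400.0))
print(passes_two_point_filter(10.0, 80.0, 100.0, 400.0))
```

A response that grows faster than the concentration, as in the last call, is not compatible with a single saturable site and would point to superstoichiometric or unspecific binding.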

The number and accuracy of filters varies from target to target and strongly depends on the properties of the target and the feasibility of developing the respective assays. Sometimes the application of the selectivity filter is not feasible as the ligand binding site is not well defined (e.g., targets with many allosteric sites) or an appropriate reference ligand is not available to perform competition assays. In such cases selection of positive hits relies on filters such as affinity, promiscuity and dose response, only. Material and measurement time is saved, if several of the filter criteria can be covered by a single assay. With the flexibility offered by modern SPR instruments, many reference proteins can be immobilized in parallel with the target protein making data for selectivity, promiscuity and affinity criteria available in one single assay. The sequence of assays in a screening cascade is guided by efficiency consideration, i.e., assays with lower time demand per tested compound are generally located at the top of the cascade whereas more time consuming assays are at the bottom when filtering has already reduced the number of test compounds. In general, the more complex an assay the more stringent the filter criteria related to it, i.e., the filtering becomes more and more stringent along the screening cascade.

2.4 SPR Based Screening with Pharmaceutically Relevant Targets

In order to illustrate the theoretical considerations of a fragment screening effort we selected two targets, chymase and β-secretase (BACE) for a more detailed discussion of the experimental set-up.

2.4.1 Chymase

Binding experiments were performed with the wt-protein and the zymogen (an inactive proprotein) immobilized via amine coupling on a CM5 sensor chip. Figure 1 depicts typical binding curves monitored for the two proteins during the contact with solutions containing a positive control compound at different concentrations. The figure indicates that this set-up is an ideal filter to distinguish between selective active site binding and nonselective binding of compounds. Nonselective binding would lead to positive response in the channel with the proprotein as well as in the channel with the wt-protein. Based on the saturation response of about 60 RU and the response of about 6,000 RU monitored upon immobilization of the protein, the relative amount of active protein was estimated to be 66% considering the two different molecular weights of the protein (30,000 Da) and the positive control (456 Da). The equilibrium dissociation constant for the positive control compound was determined to be 290 nM. The Z′ factor determined for this positive control was 0.83 indicating excellent quality data. For the determination of the robustness of the assay with fragments, one has however to consider a MW corrected Z′ factor [39]. For the average molecular weight of 214 Da of the library screened the Z′ factor is around 0.73. Figure 2 shows the results from reproducibility testing with the samples of one 96 well plate. The statistical data of the correlation of the responses from each plate indicate that the measurements are highly reproducible. The slope of the correlation is as expected 1.0 and the standard error is about 2.4.
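The estimate of the active protein fraction can be retraced with the numbers given above. The sketch assumes 1:1 binding and that the SPR response per unit mass is the same for the protein and the small molecule; the function names are hypothetical.

```python
def theoretical_rmax(immobilized_ru: float, mw_ligand: float,
                     mw_protein: float, stoichiometry: float = 1.0) -> float:
    """Saturation response expected if every immobilized protein were active."""
    return immobilized_ru * (mw_ligand / mw_protein) * stoichiometry


def active_fraction(observed_rmax: float, immobilized_ru: float,
                    mw_ligand: float, mw_protein: float) -> float:
    """Ratio of observed to theoretical saturation response."""
    return observed_rmax / theoretical_rmax(immobilized_ru, mw_ligand, mw_protein)


# Values quoted in the text for chymase and the positive control
rmax = theoretical_rmax(6000.0, 456.0, 30000.0)        # about 91 RU
fraction = active_fraction(60.0, 6000.0, 456.0, 30000.0)
print(f"theoretical Rmax: {rmax:.1f} RU, active fraction: {fraction:.0%}")
```

With 6,000 RU of immobilized protein, a fully active surface would give about 91 RU for the 456 Da control, so the observed 60 RU corresponds to the 66% stated above.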

Fig. 1

Sensorgrams monitored from the sensor surface with immobilized active chymase (left) and zymogen (right) in contact with solutions at different concentrations of the positive control (structure shown in the inset). This set-up is highly valuable to differentiate between binders that bind to the active site (same pattern as for the positive control) or to a different site (no response monitored from the surface with the immobilized zymogen). For the active protein the experimental response curves are overlaid with the theoretical curves obtained by fitting the experimental curves with the mathematical equations for a 1:1 kinetic model. Kinetic (k_on and k_off) as well as equilibrium binding parameters of the positive control given in the inset are extracted using this model.

Fig. 2

The graph shows the reproducibility of the assay. 96 compounds are measured twice and the responses correlated with each other. Positives are marked in the inset

A total of 2,226 fragments were tested in the assay described above. Figure 3 shows the results of the screen of one 96-well plate in a graphical representation with the responses at the report points as vertical bars. The plate contained 96 solutions of test compounds and 8 solutions of the control compound. In addition, four negative controls (buffer with DMSO) were injected during the run. Figure 3 shows that many compounds bind to the immobilized chymase, but only a few of them show a significant difference in binding between the wild-type protein and the inactive zymogen, which indicates selective binding to the active site.

Fig. 3

Responses monitored in a screening set-up for compounds in a 96-well plate from the surface with active chymase (black bars) and with zymogen (white bars). Responses marked with a star are from injections of the positive control. The high quality of the assay is obvious from the amplitude of the signal as well as from the stable ratio of active and zymogen response. Signals from positive compounds showing selectivity are marked with triangles, signals from positive nonselective compounds with filled circles. The dotted line marks three times the standard deviation of the response of the negative control, and corresponds to the threshold for the positive hits

Selection of the primary positives was based first on a promiscuity filter applying the criteria defined by Giannetti et al. [52], and then on the affinity and selectivity filters. Compounds were taken as positives if they exhibited no indication of promiscuity, showed a response on the active protein higher than three times the standard deviation of the negative control, and had a ratio of the responses on the active protein and the zymogen greater than two. One hundred and eighty fragments passed all the filters and were defined as positives.
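The three selection criteria can be written down as a short filter. The sketch below is only an illustration under stated assumptions: the promiscuity decision is taken as a precomputed flag (the criteria of reference [52] are not reproduced here), the data values are hypothetical, and a reference response at or below zero is treated as fully selective, which is a choice made for this example rather than something stated in the text.

```python
from dataclasses import dataclass
from statistics import stdev


@dataclass
class Measurement:
    compound_id: str
    response_active: float     # RU on the channel with active protein
    response_reference: float  # RU on the channel with zymogen
    promiscuous: bool          # result of the promiscuity filter, decided upstream


def select_positives(measurements, negative_controls, sd_factor=3.0, min_ratio=2.0):
    """Return IDs of compounds passing the promiscuity, affinity and selectivity filters."""
    threshold = sd_factor * stdev(negative_controls)  # affinity threshold in RU
    positives = []
    for m in measurements:
        if m.promiscuous:
            continue  # promiscuity filter
        if m.response_active <= threshold:
            continue  # affinity filter
        # Selectivity filter; a reference response <= 0 is treated as selective (assumption).
        if m.response_reference <= 0 or m.response_active / m.response_reference > min_ratio:
            positives.append(m.compound_id)
    return positives


# Hypothetical plate data, for illustration only.
negative_controls = [-0.5, 0.4, 0.2, -0.3]
measurements = [
    Measurement("A", 12.0, 3.0, False),   # selective binder
    Measurement("B", 15.0, 14.0, False),  # binds both surfaces: nonselective
    Measurement("C", 0.8, 0.1, False),    # below the affinity threshold
    Measurement("D", 20.0, 2.0, True),    # flagged as promiscuous
    Measurement("E", 9.0, 0.0, False),    # no response on the reference channel
]
print(select_positives(measurements, negative_controls))
```

Applying the filters in this order means that a compound is rejected by the first criterion it fails, so the ratio is only evaluated for responses that are clearly above the noise of the negative controls.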

In addition, these positives were confirmed in a competition assay with a positive control, leaving 80 compounds for further characterization. The next validation step consisted of the determination of the KD values via 10-point dose response experiments. This left 36 substances with well-defined dose response in an affinity range from 10 to 60 μM for further characterization in X-ray crystallographic experiments.
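The attrition along the chymase triage cascade can be summarized from the counts given in the text. The short sketch below only restates those counts as fractions of the library and of the preceding stage; the stage labels are shorthand chosen for this example.

```python
# Counts taken from the text; labels are shorthand for the triage stages.
funnel = [
    ("fragments screened", 2226),
    ("primary positives", 180),
    ("confirmed in competition assay", 80),
    ("well-defined dose response", 36),
]


def funnel_rates(stages):
    """For each stage return (name, count, fraction of library, fraction of previous stage)."""
    total = stages[0][1]
    previous = total
    rows = []
    for name, count in stages:
        rows.append((name, count, count / total, count / previous))
        previous = count
    return rows


rates = funnel_rates(funnel)
for name, count, of_library, of_previous in rates:
    print(f"{name:32s} {count:5d} {of_library:8.2%} {of_previous:8.2%}")
```

Viewed this way, about 8% of the library passes the primary filters, and somewhat less than half of the compounds survive each of the two subsequent validation steps.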

2.4.2 BACE

A similar assay set-up was used for the fragment screening of BACE [11]. BACE was immobilized (12,000 RU) by standard amine coupling chemistry on a CM5 sensor chip. A mutant protein with the essential active site aspartate D39 mutated to alanine was used as a reference protein in a second channel. Figure 4 shows a typical sensorgram monitored for the wild-type and mutant protein when contacted with a known high affinity (60 nM) small molecule inhibitor. The set-up is well suited as a selectivity filter, as compounds with selective binding to the active site of BACE show no or a reduced signal on the channel with the mutated protein. Figure 5 shows the screening results obtained from 96 compounds, demonstrating the importance of such a selectivity filter for the BACE screen. Application of the affinity filter (response greater than 3 times the standard deviation) and the promiscuity filter alone would lead to a hit rate of about 60%, but the specificity filter that considers the ratio of the responses of wild-type and mutant protein reduces this number to 2.1%. It should be noted in this context that a 60% hit rate without a specificity filter is not frequently observed. The hit rate for primary positives depends on the screening concentration and the target protein. Whereas the screening concentration was not exceptionally high (250 μM), the properties of the target protein could be responsible for this high primary positive rate. The protein used for this screening was the full length protein, which contains a hydrophobic membrane anchor, and this area could be the source of the numerous unspecific positives.

Fig. 4

Sensorgrams monitored from the sensor surface with immobilized active BACE-1 (left) and blocked BACE-1 (right) in contact with solutions of different concentrations of the positive control (structure shown in the inset). This set-up enables differentiation of binders that bind to the active site (same pattern as for the positive control) or to a different site (no response monitored from the surface with the immobilized blocked BACE-1). For the active protein the experimental response curves are overlaid with the theoretical curves obtained by fitting the experimental curves with the mathematical equations for a 1:1 kinetic model. The kinetic and equilibrium binding parameters of the positive control given in the inset are extracted using this model

Fig. 5

Responses monitored in a screening set-up for compounds in a 96-well plate from the surface with active BACE-1 (black bars) and with active site mutated BACE-1 (white bars). Responses marked with a star are from injections of the positive control. The high stability of the set-up is obvious from the amplitude of the signal as well as from the stable ratio of active and blocked BACE-1 response. Signals from positive compounds showing selectivity are marked with triangles. The dotted line marks three times the standard deviation of the response of the negative control, and corresponds to the threshold for the positive hits

All specific primary positives were confirmed in a competition assay using a known high affinity active site binder as competitor compound, followed by dose response experiments to determine the KD values.

3 X-Ray Crystallography

Although direct crystallographic screening can be successfully applied for fragment screening, and offers a number of advantages, it is now less commonly used in this way compared with biophysical or biochemical assays that require less resource [14, 53]. However, most fragment based drug discovery programs that have advanced beyond mere screening have used structural biology [54] to drive hit progression. Indeed, only a few groups have applied the fragment approach to target classes like transmembrane proteins (e.g., GPCRs and ion channels) where protein structures are not easily accessible [55]. The additional information coming from the structures of hits in complex with their target helps to select the most promising candidates for subsequent fragment growth or fragment optimization. Structure based molecular modeling allows more efficient optimization of low affinity fragment hits to leads. Indeed for targets whose 3D structure is not available a fragment screening often is not considered at all. X-ray crystallography is the preferred biostructural technique, because it can be applied to most protein targets and delivers exact structural information for structure based optimization of chemical leads.

3.1 Prerequisites to Generate Fragment Complex Structures

To optimize the resources needed in following up a fragment screening with crystal structures, the setup of an efficient crystallographic workflow is important. This includes a good supply of crystallization grade protein, a reproducible crystal form diffracting to high resolution or robust soaking system, a reliable crystal harvesting procedure and an optimized X-ray data collection and structure determination process.

3.1.1 Protein

Generating suitable protein is often the most labor intensive step on the way to 3D structures. Enough protein to create hundreds of crystals is needed and therefore care needs to be taken with expression and purification procedures. High yield expression and simple and effective purification protocols are beneficial. Optimized protein constructs for crystallization often lack glycosylation sites and carry affinity tags for purification. All standard structural biology protein expression systems are used to produce the proteins for fragment cocrystallization, i.e., E. coli, baculovirus insect cell systems or mammalian cell lines.

3.1.2 Crystallization System

Many crystals will be needed for determining complex structures of the typically 10² hits from a fragment screening. Reproducible production of well diffracting crystals is therefore important for the crystallization system used and care is taken to optimize the crystallization procedure. To obtain cocrystal structures there are two options: soaking fragment hits into existing crystals and cocrystallization. Each has advantages and disadvantages. For soaking, the crystals can be produced in advance in a few crystallization experiments and low amounts of protein are consumed. Also, higher ligand concentrations can be used during soaking to shift the binding equilibrium towards full occupancy. DMSO can frequently be added to concentrations as high as 20%, favoring solubilization of compounds. In contrast, addition of DMSO in ligand cocrystallization experiments often prevents crystal growth.

A disadvantage of soaking is that the target conformation in the crystal may not be optimal for binding of a specific ligand. In addition, the crystal packing may hinder access to the pocket, as the fragment hit needs to diffuse through the crystal solvent channels to the binding site. Crystal packing differences are therefore expected to influence the soaking success. The use of several different crystal forms with different packing of the molecules is one way to reduce false negatives.

During cocrystallization the complex is already formed in solution and the target protein is free to assume any conformation necessary to bind the specific ligand. However for cocrystallization the protein consumption may be higher and the maximum concentration of ligand is limited, because ligand and organic solvent may influence crystallization. Both soaking and cocrystallization methods require that the fragment binding site of the target is not blocked by the lattice packing of the protein target crystal and therefore some crystal forms of the target may not be suitable at all.

3.1.3 Diffraction Data Collection and Structure Determination

Following up a fragment screening with X-ray structures can involve collecting hundreds of datasets. Access to state of the art synchrotron beam lines equipped with modern fast detectors such as PILATUS [56] and automated sample changers such as CATS [57] greatly reduces the time needed for data collection. For example, at the beam line X10SA at the SLS equipped with the PILATUS detector, about 60 data sets are now routinely collected per shift of 8 h, and diffraction data for the hits from a typical fragment screening campaign can be collected in less than a day. Data from the diffraction images are often processed by automated scripts that output difference electron density maps without the need for manual intervention. For well behaving crystals the crystallographer’s task is reduced to inspecting difference electron density maps and building and refining the model of the complex structure. Tracking the large number of crystallization, soaking and diffraction experiments done in parallel is a challenge in itself, and bookkeeping is best managed with a laboratory information management system (LIMS) that supports this comprehensive workflow [58]. In pharmaceutical companies the resulting complex structures are deposited in in-house databases that are similar to the PDB but can be accessed easily by medicinal chemists, e.g., by querying the ligand properties or generating superpositions based on the ligand binding pockets (e.g., Proasis2® from http://www.desertsci.com). Last but not least, the obtained cocrystal structures are communicated to the drug discovery chemists in front of the computer screen in a modeling session. For this purpose molecular graphics programs such as Moloc [59] and PyMOL (http://www.pymol.org) are used.

3.2 Determinants for Success in Cocrystallization

The success rate of getting cocrystal structures of fragment screening hits varies greatly. Whereas only very few hit structures were achieved in several independent fragment screenings on BACE (reviewed in [60]), for some other targets (e.g., chymase) cocrystal structures were obtained for about a third of the selected fragment screening hits [39]. What determines the differences in success rate has not yet been well assessed and is somewhat controversial. Here we list several factors without claiming to be comprehensive: (1) the ability of the target or the ligand binding site investigated to bind small molecules (drugability) and the resulting potency of the fragment hits; (2) the packing environment in the available crystal forms; (3) the difference in solubility and binding affinity of ligands between the crystallization or soaking conditions and the assay conditions for the upstream fragment screening (SPR). Whereas little can be done for (1) and (2), the following paragraphs give some considerations of how to optimize the experimental set-up for (3).

3.2.1 Matching of Conditions for SPR-Screening/Cocrystallization

For fragments that contain ionizable groups or interact with acidic or basic groups of the target protein, the protonation state greatly influences the KD and, therefore, fragment binding is pH dependent. Differences in pH between the screening conditions and the crystallization or soaking conditions can lead to reduced or increased affinity of the fragment and to failure to get complex structures. Fragment solubility is dependent on buffer pH and buffer composition. The precipitants in protein crystallization experiments are selected to reduce the solubility of proteins and, unfortunately, have the same effect on small molecules. During the biophysical screening, organic solvents such as DMSO are often present or detergents are added to increase the solubility of organic compounds. Such additives or solvents, however, could prevent growth of crystals or even dissolve them and are therefore often omitted from the crystallization experiment.

Ideally the conditions from which the cocrystal structures are obtained should be identical to those where the upstream screening experiment was performed. Matching the conditions between the primary screening and the crystallization or soaking experiment as closely as possible is one strategy to increase the yield of structures. If crystals do not grow at such conditions, the search for crystal soaking conditions that match the screening conditions can be tried. Another approach was used by AstraZeneca [44], who used the surrogate protein endothiapepsin to get complex structures of BACE fragment screening hits, as endothiapepsin crystallizes at pH 4.6, which is closer to the acidic assay conditions. In contrast, BACE crystallization conditions have a neutral pH. If the crystallization conditions cannot be changed, it may be possible to run the primary biophysical screening assay at conditions (pH, buffer and salt concentration) closer to those suitable for the X-ray crystal system.
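How strongly a pH mismatch can change the protonation state of an ionizable group follows from the Henderson-Hasselbalch relation. The sketch below is a minimal illustration: the pKa of 6.5 is a hypothetical value chosen for the example, and pH 7.0 is assumed here as a stand-in for "neutral"; neither is taken from the screening data.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a titratable group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical basic group on a fragment; the pKa is an assumed value for illustration.
ASSUMED_PKA = 6.5

for label, ph in [("acidic, pH 4.6", 4.6), ("neutral, pH 7.0", 7.0)]:
    print(f"{label}: {fraction_protonated(ph, ASSUMED_PKA):.1%} protonated")
```

For this assumed pKa the group is almost fully protonated at pH 4.6 but only about a quarter protonated at pH 7.0, which illustrates why a fragment that relies on a charged interaction may bind under one condition and not the other.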

Overall, we have to accept that a perfect match of experimental conditions is not feasible and that a lack of hit confirmation may not result from an issue with a particular biophysical method. Further we need to accept that some valid hits will not be confirmed and, consequently, not considered for follow up work.

3.2.2 Prioritization of Ligands for X-Ray Experiments: KD and Solubility

Cocrystallization and structure determination need more time and resources than primary screening methods like SPR. In order to limit the number of X-ray experiments, prioritization of the experiments is important. This focuses the crystallization effort on those ligands for which the chances of obtaining complex structures are highest and deprioritizes experiments with ligands that are less likely to yield structures, or unlikely to yield them at all. Among other criteria, binding affinity and solubility of the ligands can be used to prioritize experiments.

The affinities of fragment screening hits range from a few μM to mM. Most fragment screening hits therefore have lower affinities than compounds from already advanced chemistry series or HTS with affinities in the nM to μM range. It is important to note that there seems to be no minimum affinity required for successful determination of complex structures, and even structures with mM compounds have been reported [11]. The experience of many fragment projects suggests that it takes more effort to get complex structures of low affinity fragments. The main reason could be the high compound concentration required for the experiment (more than about 10 times the KD), which can result in concentrations as high as 10–50 mM that are often at the solubility limit of the compounds. Analysis of Roche fragment screening efforts indicated, however, that KD alone was a better indicator of cocrystallization success than compound solubility (data not shown). The pH influences both affinity and solubility of the ligand. Besides matching the crystallization conditions to the assay conditions, the use of two or more independent X-ray systems with different crystal packing and different crystallization conditions and pH could also be expected to increase the yield of cocrystallization efforts, and this was indeed the case in our labs (data to be published elsewhere).
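The rule of thumb of using at least ten times the KD can be related to site occupancy with the simple 1:1 binding isotherm, occupancy = [L] / ([L] + KD). The sketch below assumes that the free ligand concentration equals the total concentration (ligand in large excess over protein) and ignores any effect of the crystal environment on affinity; the millimolar KD values are hypothetical examples of weak binders.

```python
def occupancy(ligand_conc, kd):
    """Fraction of binding sites occupied at equilibrium (same units for both arguments)."""
    return ligand_conc / (ligand_conc + kd)


def concentration_for_occupancy(target_occupancy, kd):
    """Free ligand concentration needed to reach a target occupancy (0 < target < 1)."""
    if not 0.0 < target_occupancy < 1.0:
        raise ValueError("target_occupancy must be between 0 and 1 (exclusive)")
    return kd * target_occupancy / (1.0 - target_occupancy)


# KD values in mM: 0.01 and 0.06 mM correspond to the 10 to 60 uM range quoted for
# the chymase hits; 1 and 5 mM are hypothetical weak binders.
for kd_mm in (0.01, 0.06, 1.0, 5.0):
    conc = 10.0 * kd_mm
    print(f"KD = {kd_mm:g} mM -> 10 x KD = {conc:g} mM, occupancy = {occupancy(conc, kd_mm):.1%}")
```

At ten times the KD about 91% of the sites are occupied regardless of the affinity, but the absolute concentration this requires grows linearly with the KD, which is how millimolar binders end up at 10–50 mM and at the solubility limit.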

3.2.3 Hit Expansion

Another approach to get more structural information from fragment screening hits is to use hit expansion [11, 44]. Hit expansion is a similarity search or in silico screening for potent analogs of the initial fragment screening hits from public or proprietary compound libraries. Application of synthetic chemistry by growing fragments (for example addition of solubilizing groups) or exchange of moieties or single atoms is more resource intensive, but can quickly generate SAR information for fragments. In addition, it provides further compounds for cocrystallization that have a lower KD than the original screening hits, albeit with higher molecular weight and possibly lower ligand efficiency. In the BACE fragment screening at Roche, additional binders were identified during hit expansion which subsequently delivered several structures that could be used for computer assisted molecular modeling [11].

3.3 Making Use of Structural Information in Synthetic Chemistry

Much has been reported about drug discovery facilitated by fragment screening and the transformation of fragments into clinical candidates and there are some excellent reviews on this subject. Here we focus on three examples because of their association with Roche to exemplify such a drug discovery effort.

One of the most intensively characterized targets regarding fragment screening is BACE, and many complex structures of different fragments targeting the active site of this aspartyl protease have been published [11, 44, 61–63]. The primary fragment screening methods included SPR, NMR, crystal soaking and computational methods. The resulting hits belong to different scaffolds. All hits hydrogen bond directly or indirectly to the catalytic aspartates and have hydrophobic interactions at the S1 pocket and often at one of the other subpockets in the BACE substrate binding site (Fig. 6a). Taken together they map the most tractable or drugable part of the BACE substrate binding pocket [60]. From the complex structures two binding hot spots could be identified: the side chains of the two catalytic aspartates 32 and 228, and the S1 pocket. Amines or other basic groups are observed binding the aspartates, and a benzyl ring always fills S1. Subpocket S3 is frequently occupied, and a variety of hydrophobic groups are accepted there. From the structures of these fragments a common pharmacophore can be derived, which can guide the computer aided molecular modeling of BACE inhibitors with new chemical scaffolds. Further optimization and growth of the fragments by structure guided medicinal chemistry efforts resulted in potent inhibitors that extend to the prime side subpockets S1′ and S2′. One larger and more potent compound even displaces the tyrosine side chain of the flap loop, opening up a new pocket that does not exist in the unliganded structure and in most of the fragment complex structures (Fig. 6c; [60]).

Fig. 6

(a) Fragments bound to the BACE active site aspartates (PDB entries 2OHK, 2BRA). (b) More structures after fragment hit expansion. Compounds occupy more of the BACE active site. (PDB entries 2OHM, 2OHQ, 3BUG, 3BUH, 2V00). (c) BACE inhibitor leads from fragments extend into prime sites (PDB entries 2OHT, 2OHU, 2VA7)

Many cocrystal structures of fragments were obtained for the serine protease chymase [39]. The common feature of all fragments was an aromatic group binding to the S1 pocket, and many fragments had an acidic group or oxygen atom in the oxyanion hole. The observation can be explained by the substrate specificity of chymase, which cleaves after aromatic side chains. The structures of the fragment screening hits highlight the importance of the S1 pocket and the oxyanion hole as hot spots for inhibitor binding. The different binding geometries exemplify the possibilities and limitations for groups fitting S1 and for possible exit vectors from S1 to the rest of the binding site (Fig. 7). The high hit rate suggested a good drugability of the target, which was soon confirmed by rapid progress in drug discovery.

Fig. 7

Fragments bound to chymase. The S1 pocket is always filled by aromatic rings, although these are not precisely oriented due to the lack of hydrogen bonds. Only the ring plane is very well conserved. Figure 7a relates to Fig. 7b by 90° rotation around the vertical axis

The most advanced example of a lead compound derived from fragment screening is the B-Raf protein kinase inhibitor discovered at Plexxikon and shown to be successful in advanced clinical studies for melanoma at Roche. In a fragment screening at Plexxikon with several kinases a nonspecific kinase binding fragment was found. Subsequently, selectivity was built into this novel lead series during fragment growth, taking into consideration the information from X-ray structures with several kinases. The lead compound PLX4720 binds to a pocket almost unique to the activated B-Raf. It is highly selective and shows nanomolar affinity for the oncogenic B-Raf(V600E) mutant. Studies in animal models have confirmed its therapeutic potential for treating B-Raf(V600E)-driven tumors [64].

4 Discussion and Conclusions

4.1 Combination of Efforts for Fragment Screening in a Seamless Workflow

There are numerous ways to establish a workflow for fragment screening that can successfully be applied in drug discovery projects. Today, most of the fragment screening efforts reported in the literature are performed with a combination of biophysical methods. Figure 8 outlines one possible workflow as applied in a number of projects at Roche. A method capable of high(er) throughput, like SPR, is used for screening a fragment library of several thousand compounds, and hit confirmation is carried out with the same assay, as outlined in this review. The filtered and confirmed hits are further characterized by an orthogonal assay in order to improve the confidence that the fragment identified really binds to the target and to the binding site of interest. Here X-ray crystallography is of extreme importance to visualize the binding of the fragment in detail and to facilitate analysis of the binding mode by computational chemistry. This fragment binding information leads to the establishment or refinement of pharmacophore models and gives insight into new patterns of interaction of small molecules with their protein targets. At this point, or earlier in the workflow after confirmed hits from the SPR screening are analyzed, a hit expansion can be performed by a similarity or pharmacophore search. Additional compounds from the internal library or purchased from external vendors are in our experience a great source for improving the chance of finding more potent ligands and of getting X-ray structures before synthetic chemistry efforts are required. Several cycles of screening, hit analysis and characterization can be applied to a target and essential information for the drug discovery project derived.

Fig. 8

Workflow for fragment screening used at Roche

There is no doubt that the results from a fragment screening effort influence the decision making of a project. In particular, the use of such data to inspire synthetic chemistry by identification of new binding scaffolds for a particular target is well established. Such information can be used to initiate novel lead series or to optimize chemical leads by replacement of a moiety. Another use of the results from a fragment screening is the assessment of the small molecule drugability of a target. The hit rate as well as the average ligand efficiency of the best fragments for a given target gives a good indication of the effort required to identify potent small molecule ligands.
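Ligand efficiency, the binding free energy per non-hydrogen atom, can be estimated from a measured KD via ΔG = RT ln(KD). The sketch below is a minimal illustration assuming a 1 M standard state and a temperature of 298.15 K; the example fragment (KD of 10 μM, 15 non-hydrogen atoms) is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from KD (in mol/L, 1 M standard state)."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Binding free energy per non-hydrogen atom, reported as a positive number."""
    return -binding_free_energy(kd_molar, temperature_k) / heavy_atoms


# Hypothetical fragment: KD = 10 uM and 15 non-hydrogen atoms (assumed for illustration).
le = ligand_efficiency(10e-6, 15)
print(f"LE = {le:.2f} kcal/mol per non-hydrogen atom")
```

Because the value is normalized by size, it allows small, weakly binding fragments to be compared with larger and more potent compounds on the same scale.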

4.2 Outlook

There are still a number of ways to further improve the success of fragment screening efforts, defined as the identification and validation of novel binding motifs for a drug target in order to inspire synthetic chemistry efforts. Improvement of the fragment library (potential areas are solubility, structural diversity, compound purity etc.) and better alignment of assay conditions should both be considered. Adding to the workflow further assay methods with a protein consumption, throughput and sensitivity profile similar to SPR, but without the need for immobilization (i.e., a homogeneous assay), would also be welcome.

We see the greatest value of this approach for novel drug targets with limited knowledge regarding small molecule ligands, targets with perceived low drugability or in projects with limited chemical space. This includes protein–protein interactions as well as targets like proteases or other enzymes.

Encouraging progress in the application of biophysical methods to transmembrane proteins, including GPCRs, will pave the way for extending fragment screening to further target classes.