The theoretical foundation of binding free energy (BFE) calculations, free energy perturbation (FEP), was laid down by Zwanzig [1] in the 1950s and later refined by Bennett, who derived the optimal estimator of free energy differences from simulations [2], and by others [3]. A related technique for free energy calculations, thermodynamic integration (TI), was invented by Kirkwood even earlier [4]. In the 1980s, a number of groups demonstrated that such free energy methods could be used to compute the hydration free energies of small molecule solutes [5, 6] and the binding free energies between protein receptors and small molecule ligands [7,8,9,10,11,12]. Techniques were introduced to compute either the individual binding free energy between a ligand and a receptor (by so-called “absolute” binding free energy calculations, or ABFE for short) [13, 14], or the difference in the binding free energies between two ligands against the same receptor (by “relative” binding free energy calculations, or RBFE) [15, 16]. In the early days, however, BFE calculations were hard to set up and slow to run, and they seemed a long way from commonplace utility in drug discovery.

The simple and elegant theoretical foundation for BFE belies the subtleties in performing correct and efficient calculations. The statistical precision of an FEP calculation depends on the extent of change in the equilibrium distribution of molecular configurations from the initial state to the end state of the alchemical transformations: the smaller the change, the higher the precision [17]. Key to efficient BFE calculations is to limit this change by restraining the ligand in position and in conformation during the transformations, in such a way that the restraints’ contribution to the free energies can be accounted for [13, 18]. A general set of criteria for setting the restraints can be derived by separability of integrals in the partition functions.
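The dependence of precision on the change between the two states can be illustrated with a toy calculation. The sketch below applies Zwanzig's exponential-averaging formula, \(\Delta G = -kT \ln \langle e^{-(U_1 - U_0)/kT} \rangle_0\), to a one-dimensional harmonic well whose minimum is shifted, a case for which the exact free energy difference is zero. The function names, the harmonic model, and the sample sizes are assumptions chosen for illustration; they are not taken from any BFE package.

```python
import numpy as np

def fep_zwanzig(delta_u, kT=1.0):
    """Zwanzig estimator: dG = -kT ln <exp(-dU/kT)>_0, with dU = U1 - U0 sampled in state 0."""
    x = -np.asarray(delta_u, dtype=float) / kT
    # Evaluate the log of the mean exponential stably by factoring out the largest exponent.
    x_max = x.max()
    return -kT * (x_max + np.log(np.mean(np.exp(x - x_max))))

def harmonic_shift_estimates(shift, n_samples=2000, n_repeats=200, seed=0):
    """Repeated FEP estimates for U0 = x^2/2 -> U1 = (x - shift)^2/2 (exact dG = 0)."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_repeats):
        x = rng.normal(0.0, 1.0, n_samples)  # Boltzmann samples of state 0 at kT = 1
        delta_u = 0.5 * (x - shift) ** 2 - 0.5 * x ** 2
        estimates.append(fep_zwanzig(delta_u))
    return np.array(estimates)

# A small shift keeps the two distributions overlapping; a large shift does not.
small = harmonic_shift_estimates(shift=0.5)
large = harmonic_shift_estimates(shift=3.0)
print(f"shift 0.5: mean {small.mean():+.3f}, spread {small.std():.3f}")
print(f"shift 3.0: mean {large.mean():+.3f}, spread {large.std():.3f}")
```

With the same number of samples, the spread of the estimates grows sharply as the two distributions move apart, which is the behavior that restraints and intermediate alchemical states are meant to control.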

One by one, the technical challenges of performing correct and precise BFE calculations have been resolved by a number of groups, most of them academic. We have learned how to avoid numerical instabilities in BFE calculations by the introduction of softcore potentials [19, 20], how to treat ligands with net charges [21, 22], how to enhance the sampling of the ligand binding pose and the conformation of the binding pocket [23,24,25,26], how to treat the non-negligible contribution of the omitted dispersion interactions between atoms beyond the cutoff distance by a mean field approximation [27], and how to best analyze the results [3, 28] and estimate the statistical errors [29]. BFE has been validated against the independent method of computing binding affinities by long molecular dynamics (MD) simulations of reversible protein-ligand binding [22]. The best practices for BFE are summarized in a recent review [30]. Academic drug hunters, along with a few industrial early adopters, have developed their own BFE solutions and successfully applied BFE in identifying potent drug candidates [31,32,33,34,35,36,37]. Such early successes, however, required expertise in BFE.
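To make the idea of a softcore potential concrete, the sketch below implements one commonly used soft-core form of the Lennard-Jones interaction, in which a term proportional to \(1 - \lambda\) is added to the scaled distance so that the energy stays finite when a partially decoupled atom overlaps with another. The specific functional form, the parameter `alpha`, and the function names are assumptions for illustration and are not necessarily the forms used in [19, 20] or in any particular program.

```python
import numpy as np

def softcore_lj(r, lam, epsilon=1.0, sigma=1.0, alpha=0.5):
    """Soft-core Lennard-Jones energy; lam = 1 is fully interacting, lam = 0 is decoupled."""
    r = np.asarray(r, dtype=float)
    # The alpha * (1 - lam) term keeps the denominator finite as r -> 0 whenever lam < 1.
    s6 = alpha * (1.0 - lam) + (r / sigma) ** 6
    return 4.0 * epsilon * lam * (1.0 / s6 ** 2 - 1.0 / s6)

def plain_lj(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones energy, which diverges as r -> 0."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

# At lam = 1 the soft-core form reduces to the standard potential.
print(softcore_lj(1.2, 1.0), plain_lj(1.2))
# At an intermediate lam the energy at full overlap (r = 0) is finite.
print(softcore_lj(0.0, 0.5))
```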

To bring BFE from academia to the drug discovery industry [38,39,40], so that non-experts could use it effectively, one had to implement the simple and elegant idea from physics in the messy reality of chemistry, at scale, with sufficient accuracy and throughput. An integrated tool chain had to be developed to prepare the ligands (in their correct protonation and tautomeric states), parametrize their force fields, generate their binding poses, map atoms from one ligand to another in RBFE calculations, submit and monitor the many simulations in BFE, analyze the output, and report the predicted binding free energies with associated error estimates (Fig. 1).

As is common in developing an academic concept into an industrial product, one group needs to assemble into a complete solution all the puzzle pieces worked out by many (scattered across papers, books, presentations, and personal communications), and then some.

Fig. 1 The Tower of Binding Free Energy Calculations. Built on the simple foundations of free energy perturbation theory and enabled by advances in force field models and the readily available computing power afforded by graphics processing units (GPUs), BFE required an integrated tool chain to become a routine computational tool in industrial drug discovery

Here I share a brief personal account of the inception and development of the FEP+ software, probably the most widely used commercial implementation of BFE in the pharmaceutical industry today. Its intellectual seed was planted when I learned extensively about BFE from the academic research of my lab-mates in Ken Dill’s group [41], where I was a postdoc. Later, after working with my colleagues at D. E. Shaw Research (DESRES) to finish an early version of the DESMOND MD simulation program [42] in 2006, I started to develop BFE as an extension to DESMOND, which I called the Gibbs module. In that same year, the computational chemistry software company Schrodinger expressed an interest in DESMOND, intending it to be a tool to sample the protein’s conformations in docking studies [43]. Soon that interest pivoted to developing a new software solution for BFE (to complement MCPRO+ [16]). A handful of scientific developers at DESRES and Schrodinger, in collaboration with a few academic groups, persisted through early disappointing results and prevalent skepticism (outside the circle of BFE experts, BFE was joked to be the most expensive random number generator). In 2013, almost three decades after the first proof-of-concept BFE calculation was published and seven years after I implemented the bare-bones functionality of BFE in DESMOND, Schrodinger started to ship the new BFE solution, bundled with Schrodinger’s OPLS3 force field [44] and branded FEP+, to customers in pharmaceutical companies. A validation study on eight different targets was published in 2015 [45]. Others have since developed their own toolboxes for running BFE calculations using a variety of MD programs, including AMBER [46, 47], OpenMM [48], and GROMACS [49, 50].

Two concurrent developments drove the adoption of BFE in drug discovery. First, graphics processing units (GPUs) became ubiquitous and a number of MD software packages implemented GPU-accelerated codes that were an order of magnitude faster than the CPU codes [51,52,53]. What used to take a month could now be completed in only three days, which fit in a typical weekly design-predict-make-test-analyze (DPMTA) cycle in drug discovery. Second, the force field models for both proteins and, importantly, small drug-like molecules were finally good enough to make BFE predictions adequately accurate for prioritizing the candidate molecules by their predicted affinities.

The large-scale deployment of BFE exposed unexpected problems, each requiring its own solution. For example, during the RBFE calculations, the molecular geometry may be distorted when the system is in the midst of changing between two ligands, because its Hamiltonian then does not correspond to that of a realistic molecular system; this leads to numerical instabilities. A solution to this problem was to introduce additional bonded interactions within an alchemical group that is no longer interacting with the rest of the molecular system: these extraneous interactions help maintain reasonable molecular geometries, and their contributions to the BFE results cancel out because of the separability of integrals in the partition functions. Another example is RBFE calculations between enantiomers: a restraining potential is required to ensure the correct chirality as one molecule is transformed to its mirror image. For each problem encountered, a programmatic solution must be coded into the standard tool chain, so that the same problem never has to be solved more than once.
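The cancellation of the extra bonded terms can be made explicit with a short sketch, under the assumption (not spelled out above) that the retained bonded terms \(U_{\mathrm{b}}\) couple the decoupled group, with coordinates \(y\), to the rest of the system, with coordinates \(x\), in such a way that the integral over \(y\) takes the same value \(z_{\mathrm{d}}\) for every \(x\). The symbols \(U_{\mathrm{rest}}\), \(U_{\mathrm{b}}\), \(z_{\mathrm{d}}\), and \(\beta = 1/(kT)\) are notation introduced here for illustration.

$$\begin{aligned} Z &= \int dx\, dy\, e^{-\beta [U_{\mathrm{rest}}(x) + U_{\mathrm{b}}(x, y)]} \\ &= \int dx\, e^{-\beta U_{\mathrm{rest}}(x)} \int dy\, e^{-\beta U_{\mathrm{b}}(x, y)} \\ &= z_{\mathrm{d}} \int dx\, e^{-\beta U_{\mathrm{rest}}(x)} \end{aligned}$$

Under this assumption the decoupled group contributes the same additive term, \(-kT \ln z_{\mathrm{d}}\), to the end-state free energy in both the receptor-bound and the solvent legs of the RBFE cycle, so it drops out of the difference between the two legs.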

Fig. 2 The number of published applications of binding free energy calculations in drug discovery each year. Only journals in medicinal chemistry and drug discovery (including a few general journals) are considered (see Supplementary Information for the PubMed search query used). The empty bar represents the incomplete year of 2022. As discussed in the main text, the true numbers of publications reporting BFE applications in drug discovery may be twice as many

Despite the increasing adoption of BFE in drug discovery projects [54, 55], the number of published studies reporting discovery and optimization of small molecule drugs by BFE is still relatively small, albeit growing. Out of more than 790 citations (per Google Scholar) garnered by Schrodinger’s landmark BFE paper [45], only 19 (2.4%) reported drug discovery efforts resulting in new chemical matter or new activities [56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74], out of which four did not report the actual use of BFE [62,63,64, 69] and one was unclear [68]. Since 1988, 3646 papers have been published that contain key words related to binding free energy calculations, but only 145 (4%) of these were published in the medicinal chemistry and drug discovery journals (Fig. 2 and Supplementary Information). Even if the number of publications reporting drug discovery employing BFE is twice the total count in Fig. 2 (7 out of the above-mentioned 19 publications citing the Schrodinger paper are counted in Fig. 2, implying an under-counting factor of \((19 - 4.5)/7 \approx 2\), where the unclear case is counted as one half), it still represents a tiny fraction (0.4%) of the total number of publications in the medicinal chemistry and drug discovery journals (74,179 since 1988, Supplementary Information).
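The percentages and the under-counting factor quoted above can be reproduced from the stated counts. The short script below is only an arithmetic check; the variable names are introduced here, and the treatment of the one unclear paper as half a paper is the reading suggested by the 4.5 in the formula.

```python
citing_papers = 790            # "more than 790", so the first fraction is an upper estimate
reporting_new_matter = 19
no_bfe_use = 4
unclear = 1                    # counted as one half, following the 4.5 in the text
counted_in_fig2 = 7
bfe_keyword_papers = 3646
bfe_in_medchem_journals = 145
medchem_journal_papers = 74_179

fraction_reporting = reporting_new_matter / citing_papers
undercount = (reporting_new_matter - no_bfe_use - 0.5 * unclear) / counted_in_fig2
fraction_in_medchem = bfe_in_medchem_journals / bfe_keyword_papers
fraction_of_all = 2 * bfe_in_medchem_journals / medchem_journal_papers

print(f"citing papers reporting new matter: {100 * fraction_reporting:.1f}%")
print(f"under-counting factor: {undercount:.2f}")
print(f"BFE papers in medicinal chemistry journals: {100 * fraction_in_medchem:.1f}%")
print(f"share of all medicinal chemistry papers (doubled count): {100 * fraction_of_all:.2f}%")
```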

The following are some active areas of research that may help broaden the use of BFE in drug discovery.

The more dissimilar two molecules are, the harder it is to compute their binding free energy difference by RBFE [75], but the more valuable such predictions are, because they allow larger chemical modifications, which often entail higher synthesis costs, to be explored computationally. For example, RBFE attained much wider adoption after it accommodated scaffold hopping [76, 77]. Its domain of applicability will continue to expand as we enable RBFE to predict the binding free energy changes associated with ever larger chemical transformations.

One type of change of particular interest to drug discovery is a ligand modification associated with the displacement of a water molecule inside the binding pocket [78], as large binding affinities may be gained if the displaced water molecule is of high free energy. A number of approaches have been proposed to take into account such “water hopping” in RBFE [79,80,81,82,83]; this functionality should come standard in future BFE toolboxes.

The accuracy of BFE calculations is fundamentally limited by the accuracy of the underlying force field models. One promising avenue of research is multi-fidelity modeling: BFE first uses conventional force field models in the MD simulations, then more accurate but more computationally expensive energy models—such as QM/MM models [84, 85] or ML models trained on QM results [86,87,88,89]—are applied sparingly, so that an energy difference between the models can be computed and applied to correct the BFE results by FEP.
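A minimal sketch of such a correction is given below, assuming that the expensive model is evaluated on a modest number of frames sampled with the conventional force field at each end state, and that the correction to each state is computed by Zwanzig's formula. The energies are synthetic random numbers standing in for real evaluations; the function names, the choice of end states, and all numerical values are assumptions for illustration only.

```python
import numpy as np

KT = 0.593  # kcal/mol, roughly kT at 298 K

def fidelity_correction(u_high, u_low, kT=KT):
    """Free energy of switching low -> high model, from frames sampled with the low model."""
    x = -(np.asarray(u_high, dtype=float) - np.asarray(u_low, dtype=float)) / kT
    x_max = x.max()  # factor out the largest exponent for numerical stability
    return -kT * (x_max + np.log(np.mean(np.exp(x - x_max))))

def corrected_binding_free_energy(dg_low, corr_bound, corr_unbound):
    """Close the thermodynamic cycle: add the bound-state and subtract the unbound-state correction."""
    return dg_low + corr_bound - corr_unbound

# Synthetic stand-ins for energies re-evaluated on frames from the low-level simulations.
rng = np.random.default_rng(1)
n_frames = 2000
u_low_bound = rng.normal(-50.0, 3.0, n_frames)
u_high_bound = u_low_bound + rng.normal(1.0, 0.3, n_frames)
u_low_unbound = rng.normal(-30.0, 3.0, n_frames)
u_high_unbound = u_low_unbound + rng.normal(0.4, 0.3, n_frames)

corr_bound = fidelity_correction(u_high_bound, u_low_bound)
corr_unbound = fidelity_correction(u_high_unbound, u_low_unbound)
dg_high = corrected_binding_free_energy(-9.0, corr_bound, corr_unbound)
print(f"corrections: {corr_bound:.2f} (bound), {corr_unbound:.2f} (unbound); corrected dG: {dg_high:.2f}")
```

The estimate is reliable only when the energy difference between the two models fluctuates by no more than about kT across the sampled frames; larger fluctuations bring back the overlap problem of any one-step FEP calculation.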

Often not all relevant molecular conformations are sampled in the simulations of BFE, and their contributions to the binding free energies are thus unaccounted for. A fruitful area of research is to combine conformational free energy calculations with BFE to incorporate the effect of receptor conformational flexibility and potentially multiple binding poses of each ligand [90, 91] into BFE [23, 92, 93]. For example, RBFE may be used to compute the difference in the binding free energies \(\Delta \Delta G^{\mathrm {bind}}_{ab,\mu }\) between two ligands a and b to each receptor conformation \(\mu \), and an enhanced conformational sampling method [94, 95] may be used to compute the conformational free energy differences between any two conformations \(\mu \) and \(\nu \) of either the apo receptor (\(\Delta \Delta G^{\mathrm {conf}}_{\mu \nu }\)) or the receptor in complex with a ligand a (\(\Delta \Delta G^{\mathrm {conf}}_{a,\mu \nu }\)), as illustrated in Fig. 3. From these results, the conformation-specific binding free energy, \(\Delta G^{\mathrm {bind}}_{a,\mu }\), of each ligand a to each receptor conformation \(\mu \) may be solved from the (over-determined) simultaneous equations

$$\begin{aligned} \Delta G^{\mathrm{bind}}_{a,\mu} - \Delta G^{\mathrm{bind}}_{b,\mu} &= \Delta \Delta G^{\mathrm{bind}}_{ab,\mu} \\ (\Delta G^{\mathrm{bind}}_{a,\mu} - \Delta G^{\mathrm{bind}}_{a,\nu}) + (\Delta G^{\mathrm{conf}}_{\mu} - \Delta G^{\mathrm{conf}}_{\nu}) &= \Delta \Delta G^{\mathrm{conf}}_{a,\mu \nu} \\ \Delta G^{\mathrm{conf}}_{\mu} - \Delta G^{\mathrm{conf}}_{\nu} &= \Delta \Delta G^{\mathrm{conf}}_{\mu \nu} \end{aligned} \tag{1}$$

where \(\Delta G^{\mathrm {conf}}_\mu \) (or \(\Delta G^{\mathrm {conf}}_\nu \)) is the conformational free energy of the apo receptor in conformation \(\mu \) (or \(\nu \)). The collection of pairwise free energy differences in Eq. 1 may be planned and analyzed using an optimal measurement network of pairwise differences [96]. The overall binding free energy of a ligand a to the receptor is derived from the combination of the conformation-specific binding free energies:

$$\begin{aligned} \Delta G^{\mathrm{bind}}_a = -kT \ln \sum_{\mu} \exp \left( -(\Delta G^{\mathrm{bind}}_{a,\mu} + \Delta G^{\mathrm{conf}}_{\mu})/(kT)\right) \end{aligned} \tag{2}$$

where k is the Boltzmann constant and T the temperature. Note that in the above the binding free energies for a set of ligands are determined up to a constant (\(\Delta G_0\) in Fig. 3).
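As a worked illustration of Eqs. 1 and 2, the sketch below treats the smallest non-trivial case: two ligands a and b and two receptor conformations, labeled 0 and 1. It solves the over-determined system by unweighted least squares, fixing the arbitrary constants by setting \(\Delta G^{\mathrm{bind}}_{a,0} = 0\) and \(\Delta G^{\mathrm{conf}}_{0} = 0\), and then combines the conformation-specific values according to Eq. 2. The input numbers are invented, and the unweighted fit is a simplifying assumption; an actual analysis would weight each measurement by its statistical uncertainty.

```python
import numpy as np

KT = 0.593  # kcal/mol, roughly kT at 298 K

def solve_conformation_specific(ddg_bind_ab, ddg_conf_a, ddg_conf_b, ddg_conf_apo):
    """Least-squares solution of Eq. 1 for two ligands and two conformations.

    Unknowns, in order: Ga0, Ga1, Gb0, Gb1 (conformation-specific binding free
    energies) and C0, C1 (apo conformational free energies).
    """
    A = np.array([
        [1, 0, -1, 0, 0, 0],    # Ga0 - Gb0 = ddG_bind(ab, conf 0)
        [0, 1, 0, -1, 0, 0],    # Ga1 - Gb1 = ddG_bind(ab, conf 1)
        [1, -1, 0, 0, 1, -1],   # (Ga0 - Ga1) + (C0 - C1) = ddG_conf(a, 01)
        [0, 0, 1, -1, 1, -1],   # (Gb0 - Gb1) + (C0 - C1) = ddG_conf(b, 01)
        [0, 0, 0, 0, 1, -1],    # C0 - C1 = ddG_conf(apo, 01)
        [1, 0, 0, 0, 0, 0],     # reference choice: Ga0 = 0
        [0, 0, 0, 0, 1, 0],     # reference choice: C0 = 0
    ], dtype=float)
    y = np.array([ddg_bind_ab[0], ddg_bind_ab[1], ddg_conf_a, ddg_conf_b,
                  ddg_conf_apo, 0.0, 0.0])
    solution, *_ = np.linalg.lstsq(A, y, rcond=None)
    return solution

def overall_binding(bind_by_conf, conf, kT=KT):
    """Eq. 2: Boltzmann-weighted combination over receptor conformations."""
    x = -(np.asarray(bind_by_conf, dtype=float) + np.asarray(conf, dtype=float)) / kT
    x_max = x.max()  # factor out the largest exponent for numerical stability
    return -kT * (x_max + np.log(np.sum(np.exp(x - x_max))))

# Invented, mutually consistent inputs in kcal/mol.
ga0, ga1, gb0, gb1, c0, c1 = solve_conformation_specific(
    ddg_bind_ab=(-0.5, 1.0), ddg_conf_a=-0.5, ddg_conf_b=1.0, ddg_conf_apo=-1.5)
dg_a = overall_binding([ga0, ga1], [c0, c1])
dg_b = overall_binding([gb0, gb1], [c0, c1])
print(f"dG_a = {dg_a:.3f}, dG_b = {dg_b:.3f}, difference = {dg_b - dg_a:.3f}")
```

Only the difference between the two overall values is meaningful here, since both carry the same undetermined constant.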

Fig. 3 Conformational free energy calculations and conformation-specific RBFE can be combined to properly account for conformational flexibility of receptor-ligand complexes in BFE calculations. Each vertex represents a specific ligand binding pose in a specific receptor conformation

In drug discovery, many molecules need to be considered in each DPMTA cycle, which calls for an efficient plan of RBFE calculations between well-chosen pairs of molecules [97]. New computational methods have recently been published that optimize the organization of RBFE calculations for many molecules, using the theory of experimental design to minimize the total statistical uncertainty in the calculations [96, 98, 99]. Bennett’s method has also been extended to the analysis of such calculations [100].

A related and exciting area of research is to efficiently integrate BFE and other computational and experimental techniques in a seamless workflow to drastically accelerate (by 10 to 100 times) the exploration of chemical space in the DPMTA cycle. For example, starting with 10,000 molecular designs from generative models [101,102,103], one may perform BFE on 100 diverse molecules chosen by a machine-learning model of quantitative structure-activity relationship (QSAR) trained on previous experimental and computational results. The QSAR model is then updated by the new BFE results (and new experimental results when available) and guides the selection of another 100 molecules for a second round of BFE, and the cycle repeats. Such active learning [104] may enable tens of thousands of molecular designs to be computationally generated and ranked by a feasible number of rigorous BFE calculations each week, and may substantially shorten the hit-to-lead and lead-optimization stages of drug discovery.
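The loop described above can be sketched in a few lines. Everything in the example below is a toy stand-in: the molecular designs are random descriptor vectors, `run_bfe` is a hypothetical placeholder that returns a noisy linear function instead of running simulations, the QSAR model is a ridge regression, and each round simply picks the designs with the lowest predicted binding free energy rather than enforcing diversity.

```python
import numpy as np

rng = np.random.default_rng(0)
N_DESIGNS, N_FEATURES, BATCH, ROUNDS = 10_000, 8, 100, 3

features = rng.normal(size=(N_DESIGNS, N_FEATURES))  # hypothetical molecular descriptors
_hidden_w = rng.normal(size=N_FEATURES)               # toy ground truth, unknown to the model

def run_bfe(indices):
    """Placeholder for a BFE campaign: returns noisy binding free energies (kcal/mol)."""
    return features[indices] @ _hidden_w + rng.normal(scale=1.0, size=len(indices))

def fit_qsar(x, y, ridge=1.0):
    """Ridge regression as a stand-in for the QSAR model."""
    a = x.T @ x + ridge * np.eye(x.shape[1])
    return np.linalg.solve(a, x.T @ y)

labelled = np.empty(0, dtype=int)
values = np.empty(0)
batch = rng.choice(N_DESIGNS, size=BATCH, replace=False)  # round 1: random picks
batch_means = []
for _ in range(ROUNDS):
    new_values = run_bfe(batch)                      # "compute" BFE for the chosen designs
    labelled = np.concatenate([labelled, batch])
    values = np.concatenate([values, new_values])
    batch_means.append(new_values.mean())
    weights = fit_qsar(features[labelled], values)   # update the model with all results so far
    predicted = features @ weights
    predicted[labelled] = np.inf                     # never re-select a computed design
    batch = np.argsort(predicted)[:BATCH]            # next round: most promising designs

print("mean computed free energy per round:", np.round(batch_means, 2))
```

In this toy setting the average computed free energy of the selected batch drops after the first round, because the model steers the limited BFE budget toward the more promising designs.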

I would like to end with a personal reflection. I was fortunate to enjoy long-standing friendships with many people who shared an unwavering interest in BFE. We believed that together we could harness our understanding of physics to make a difference in the development of medicine for patients. When there was limited acceptance of BFE in drug discovery, attending free energy workshops and being surrounded by these friends helped sustain my interest and spurred me to contribute to this endeavor. It is no small comfort to see that the workshops grew bigger each year and that BFE has started to play a key role in the development of molecules currently in clinical trials [105].