Introduction

Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) is a recurring series of blind prediction challenges for the computational chemistry community [1, 2]. Through these challenges, SAMPL aims to evaluate and advance computational tools for rational drug design. SAMPL has driven progress in a number of areas over seven previous rounds of challenge cycles [3,4,5,6,7,8,9,10,11,12,13,14,15] by focusing the community on specific phenomena relevant to drug discovery that are poorly predicted by current models, isolating those phenomena from other confounding factors in well-designed test systems, evaluating tools prospectively, enabling data sharing to learn from failures, and releasing the resulting high-quality datasets into the community as benchmark sets.

As a stepping stone to enabling the accurate prediction of protein–ligand binding affinities, SAMPL has focused on evaluating how well physical and empirical modeling methodologies can predict various physicochemical properties relevant to binding and drug discovery, such as hydration free energies (which model aspects of desolvation in isolation), distribution coefficients (which model transfer from relatively homogeneous aqueous to nonpolar environments), and host–guest binding affinities (which model high-affinity association without the complication of slow protein dynamics). These physicochemical property prediction challenges—in addition to assessing the predictive accuracy of quantities that are useful in various stages of drug discovery in their own right—have been helpful in pinpointing deficiencies in computational models that can lead to substantial errors in affinity predictions.

Neglect of protonation state effects can lead to large modeling errors

As part of the SAMPL5 challenge series, a new cyclohexane–water distribution coefficient (log D) prediction challenge was introduced, where participants predicted the transfer free energy of small drug-like molecules between an aqueous buffer phase at pH 7.4 and a nonaqueous cyclohexane phase [16, 17]. While octanol–water distribution coefficient measurements are more common, cyclohexane was selected for the simplicity of its liquid phase and relative dryness compared to wet octanol phases. Although the expectation was that this challenge would be relatively straightforward given the lack of complexity of cyclohexane phases, analysis of participant performance revealed that multiple factors contributed to significant prediction failures: poor conformational sampling of flexible solute molecules, misprediction of relevant protonation and tautomeric states (or failure to accommodate shifts in their populations), and force field inaccuracies resulting in bias towards the cyclohexane phase. These findings justified future iterations of blind distribution or partition coefficient challenges, but the most surprising observation from this initial log D challenge was that participants almost uniformly neglected to accurately model protonation state effects, and that neglect of these effects led to surprisingly large errors in transfer free energies [16,17,18]. Careful quantum chemical assessments of the magnitude of these protonation state effects found that their neglect could introduce errors up to 6–8 kcal/mol for some compounds [18]. This effect stems from the need to account for the free energy difference between the major ionization state in cyclohexane (most likely the neutral state) and in the water phase (which could be neutral or charged).
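To make the origin of this error concrete, the following minimal sketch applies the textbook relation between log P and log D for a monoprotic compound, under the assumption that only the neutral species partitions into the nonaqueous phase. The function names and the example values (log P of 3.0, a base with pKa 9.4) are illustrative assumptions and are not SAMPL5 data.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def log_d_monoprotic(log_p, pka, ph, is_acid):
    """log D at a given pH, assuming only the neutral form leaves the water phase."""
    exponent = (ph - pka) if is_acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)


def ionization_term_kcal(pka, ph, is_acid, temperature=298.15):
    """Free energy term (kcal/mol) that is lost if aqueous ionization is ignored."""
    exponent = (ph - pka) if is_acid else (pka - ph)
    return R_KCAL * temperature * math.log(1.0 + 10.0 ** exponent)


# Hypothetical basic compound, two pKa units above the buffer pH of 7.4.
log_d = log_d_monoprotic(log_p=3.0, pka=9.4, ph=7.4, is_acid=False)
penalty = ionization_term_kcal(pka=9.4, ph=7.4, is_acid=False)
print(f"log D = {log_d:.2f}, neglected term = {penalty:.2f} kcal/mol")
```

Each pKa unit separating the pKa from the buffer pH (on the ionized side) contributes roughly 1.4 kcal/mol at room temperature in this simple model, so errors of several kcal/mol are plausible for strongly ionized compounds.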

To isolate these surprisingly large protonation state modeling errors from difficulties related to lipophilicity (log P and log D) prediction methods, we decided to organize a set of staged physicochemical property challenges using a consistent set of molecules that resemble small molecule kinase inhibitors—an important drug class replete with multiple titratable moieties. This series of challenges will first evaluate the ability of current-generation modeling tools to predict acid dissociation constants (pKa). It will be followed by a partition/distribution coefficient challenge to evaluate the ability to incorporate experimentally-provided pKa values into prediction of distribution coefficients to ensure methodologies can correctly incorporate protonation state effects into their predictions. A third challenge stage will follow: a new blinded partition/distribution coefficient challenge where participants must predict pKa values on their own. At the conclusion of this series of challenges, we will ensure that modern physical and empirical modeling methods have eliminated this large source of spurious errors from modeling both simple and complex phenomena.

This article reports on the experiments for the first stage of this series of challenges: the SAMPL6 pKa prediction challenge. The selection of a small molecule set and collection of experimental pKa data are described in detail.

Conceptualization of a blind pKa challenge

This is the first time a blind pKa prediction challenge has been fielded as part of SAMPL. In this first iteration of the challenge, we aimed to assess the performance of current pKa prediction methods and isolate potential causes of inaccurate pKa estimates.

The prediction of pKa values for drug-like molecules can be complicated by several effects: the presence of multiple (potentially coupled) titratable sites, the presence of heterocycles, tautomerization, the conformational flexibility of large molecules, and the ability of intramolecular hydrogen bonds to form. We decided to focus on the chemical space of small molecule kinase inhibitors in the first iteration of the pKa prediction challenge. A total of 24 small organic molecules (17 drug-fragment-like and 7 drug-like) were selected for their similarity to known small molecule kinase inhibitors, while also considering properties predicted to affect the experimental tractability of pKa and log P measurements, such as solubility and predicted pKas. Macroscopic pKa values were collected experimentally with UV-absorbance spectroscopy-based pKa measurements using a Sirius T3 instrument, which automates the sample handling, titration, and spectroscopic measurements to allow high-quality pKa determination. The Sirius T3 is equipped with an autosampler, which allowed us to run 8–10 measurements per day. Experimental data were kept blinded for three months (25 October 2017 through 23 January 2018) to allow participants in the SAMPL6 pKa challenge to submit truly blinded computational predictions. Eleven research groups participated in this challenge, providing a total of 93 prediction submission sets that cover a large variety of contemporary pKa prediction methods.

Our selected experimental approach determines macroscopic pKa values

Whenever experimental pKa measurements are used for evaluating pKa predictions, it is important to differentiate between microscopic and macroscopic pKa values. In molecules containing multiple titratable moieties, the protonation state of one group can affect the proton dissociation propensity of another functional group. In such cases, the microscopic pKa (group pKa) refers to the pKa of deprotonation of a single titratable group while all the other titratable and tautomerizable functional groups of the same molecule are held fixed. Different protonation states and tautomer combinations constitute different microstates. The macroscopic pKa (molecular pKa) defines the acid dissociation constant related to the observable loss of a proton from a molecule regardless of which functional group the proton is dissociating from, so it doesn’t necessarily convey structural information.

Whether a measured pKa is microscopic or macroscopic depends on the experimental method used (Fig. 1). For a molecule with only one titratable proton, the microscopic pKa is equal to the macroscopic pKa. For a molecule with multiple titratable groups, however, throughout a titration from acidic to basic pH, the deprotonation of some functional groups can take place almost simultaneously. For these multiprotic molecules, the experimentally-measured macroscopic pKa will include contributions from multiple microscopic pKas with similar values (i.e., acid dissociation of multiple microstates). Cysteine provides an example of this behavior with its two macroscopic pKas observable by spectrophotometric or potentiometric pKa measurement experiments [19, 20].

While four microscopic pKas can be defined for cysteine, experimentally observed pKa values cannot be assigned to individual functional groups directly (Fig. 2, top). More advanced techniques capable of resolving individual protonation sites—such as nuclear magnetic resonance (NMR) [21], Raman spectroscopy [22, 23], and the analysis of pKas in molecular fragments or derivatives—are required to unambiguously assign the site of protonation state changes. On the other hand, when there is a large difference between microscopic pKas in a multiprotic molecule, the proton dissociations won’t overlap and macroscopic pKas observed by experiments can be assigned to individual titratable groups. The pKa values of glycine provide a good example of this scenario (Fig. 2, bottom) [19, 20, 22]. We recommend the short review on the assignment of pKa values authored by Darvey [20] for a good introduction to the concepts of macroscopic vs microscopic pKa values.
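The relationship between the two kinds of constants can be sketched for a molecule with two titratable sites. The sketch below assumes the standard equilibrium relations (the first macroscopic Ka is the sum of the two first-step microscopic Ka values, and the inverse of the second macroscopic Ka is the sum of the inverses of the two second-step values). The site labels and all numerical values are hypothetical and are not the measured cysteine or glycine constants.

```python
import math


def macroscopic_pkas(pk_a, pk_b, pk_a_prime):
    """Macroscopic pKa1 and pKa2 of a two-site molecule from microscopic pKas.

    pk_a:       site A deprotonates while site B is still protonated
    pk_b:       site B deprotonates while site A is still protonated
    pk_a_prime: site A deprotonates after site B has already deprotonated
    The fourth microscopic pKa is fixed by closing the thermodynamic cycle.
    """
    pk_b_prime = pk_b + pk_a_prime - pk_a
    pka1 = -math.log10(10.0 ** -pk_a + 10.0 ** -pk_b)
    pka2 = math.log10(10.0 ** pk_a_prime + 10.0 ** pk_b_prime)
    return pka1, pka2, pk_b_prime


# Hypothetical case 1: well-separated sites, macroscopic values track one site each.
separated = macroscopic_pkas(pk_a=2.3, pk_b=9.0, pk_a_prime=2.5)

# Hypothetical case 2: similar microscopic pKas, macroscopic values are blends.
overlapping = macroscopic_pkas(pk_a=8.5, pk_b=8.9, pk_a_prime=10.0)

print("separated  :", [round(v, 2) for v in separated])
print("overlapping:", [round(v, 2) for v in overlapping])
```

In the overlapping case neither macroscopic value coincides with any single microscopic pKa, which is why additional structural evidence is needed to assign protonation sites.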

Fig. 1

Here, the illustrative diagram style of microstates was adopted from [24], and NMR-determined microscopic pKas for cetirizine were taken from [25]

Comparison of macroscopic and microscopic pKa measurement methods. Filled circles represent protonated sites and empty circles represent deprotonated sites with the order of carboxylic acid (1), piperazine nitrogen (2), and piperazine nitrogen (3). Protonation state populations shown for pH-metric and UV-metric pKa measurement methods are simulations, calculated using NMR-based microscopic pKa values. a Cetirizine has \(n = 3\) titratable sites, shown in bold. b Left: 8 microstates (\(2^n\)) and 12 microscopic pKas (\(n 2^{n-1}\)) of cetirizine. Right: relative population of microspecies with respect to pH. Potentially all microstates can be resolved via NMR. c Simulated pH-metric (potentiometric) titration and macroscopic populations. For a polyprotic molecule, only macroscopic pKas can be measured with pH-metric titration. Microstates with different total charges (related to the number of protons) can be resolved, but microstates with the same total charge are observed as one macroscopic population. d Simulated microscopic populations for UV-metric (spectrophotometric) titration of cetirizine. Since only protonation of the titration sites within four heavy atoms of the UV-chromophore is likely to cause an observable change in the UV-absorbance spectra, microstates that only differ by protonation of the distal carboxylic acid cannot be differentiated. Moreover, populations that overlap may or may not be resolvable depending on how much their absorbance spectra in the UV region differ. Both UV-metric and pH-metric pKa determination methods measure macroscopic pKas for polyprotic molecules, which cannot easily be assigned to individual titration sites and underlying microstate populations in the absence of other experimental evidence that provides structural resolution, such as NMR. Note that macroscopic populations observed in these two methods are composed of different combinations of microstates depending on the principles of measurement technique.

Fig. 2

Subfigure of cysteine microscopic pKas was reproduced based on [19]

Assignment of cysteine and glycine pKa values. \(pK_{a1}\), \(pK_{a2}\), and \(pK_{a3}\) are macroscopic acid dissociation constants for cysteine and glycine [26]. When pKa values of a polyprotic molecule are very different, such as in the case of glycine, it is possible to assign the pKas to individual groups since the dissociation of protons is stepwise [19]. However, stepwise dissociation cannot be assumed for cysteine, because \(pK_{a2}\) and \(pK_{a3}\) are very close in value. Four underlying microscopic pKas (\(pK_{a,S}\), \(pK_{a,N}\), \(pK_{a,S'}\), and \(pK_{a,N'}\)) for cysteine were measured using UV spectra analysis of cysteine and derivatives [27]. Notice that the proximity of \(pK_{a,S}\) and \(pK_{a,N}\) values indicates similar probability of proton dissociation from these groups.

The most common methods for measuring small molecule pKas are UV-absorbance spectroscopy (UV-metric titration) [28,29,30], potentiometry (pH-metric titration) [30, 31], capillary electrophoresis [32, 33], and NMR spectroscopy [21], with NMR being the most time-consuming approach. Other, less popular pKa measurement techniques include conductometry, HPLC, solubility- or partition-based estimations, calorimetry, fluorometry, and polarimetry [34]. The UV-metric and pH-metric methods (Fig. 3) of the Sirius T3 are limited to measuring aqueous pKa values between 2 and 12 due to limitations of the pH electrode used in these measurements. The pH-metric method relies on determining the stoichiometry of bound protons with respect to pH, calculated from volumetric titration with acid or base solutions. Accurate pH-metric measurements require high concentrations of analyte as well as analytically prepared acid/base stocks and analyte solutions. By contrast, UV-metric pKa measurements rely on the differences in UV absorbance spectra of different protonation states, generally permitting lower concentrations of analyte to be used. The pH and UV absorbance of the analyte solution are monitored during titration.

Fig. 3

UV-metric (spectrophotometric) and pH-metric (potentiometric) pKa measurements of pyridoxine HCl with Sirius T3. Spectrophotometric pKa measurement (a–c) relies on differences in the UV absorbance spectra between microscopic protonation states to deconvolute the population of macrostate species as a function of pH. While highly sensitive (and therefore requiring a very low analyte concentration of ~ 50 μM), this approach can only resolve changes in protonation states for titratable sites near chromophores and cannot separate the populations of microstates that change in the same manner as a function of pH. a Multiwavelength UV absorbance versus pH. Purple lines represent absorbance at distinct wavelengths in the UV region. b Derivative of multiwavelength absorbance with respect to pH (dA/dpH) versus pH is plotted with purple lines. In a and b, blue, red, and green triangles represent the populations of protonation states (from most protonated to least protonated) as calculated from a global fit to experimental UV absorbances for all pH values, while thin lines denote model fits that utilize the fitted model pKas to compute populations. pKa values (green flags) correspond to the inflection points of the multiwavelength absorbance data, where the change in absorbance with respect to pH is maximum. c Molar absorption coefficients versus wavelength for each protonation state as resolved by TFA. d–f illustrate potentiometric pKa measurement, where molar addition of acid or base is tracked as pH is titrated. d Mean molecular charge versus pH. Mean molecular charge is calculated based on the model provided for the analyte: predicted number and nature of titratable sites (acid or base type), and number of counter ions present. pKa values are calculated as inflection points of the charge versus pH plot. e Predicted macroscopic protonation state populations versus pH calculated based on pKa values (\(\text {H}_{2}\text {A}^{+}\): blue, \(\text {HA}\): red, and \({\text {A}^{-}}\): green). f Buffering index versus pH profile of water (grey solid line, theoretical) and the sample solution (blue triangles represent experimental data points). A higher concentration of analyte (\(\sim 5\,\text {mM}\)) is necessary for the potentiometric method than for the spectrophotometric method in order to provide a large enough buffering capacity signal above water for an accurate measurement

Both UV-metric and pH-metric pKa determination methods measure macroscopic pKas for polyprotic molecules, which cannot be easily assigned to individual titration sites and underlying microstate populations in the absence of other experimental evidence that provides structural information, such as NMR (Fig. 1). Macroscopic populations observed in these two methods are composed of different combinations of microstates depending on the principles of measurement technique. In potentiometric titrations, microstates with same total charge will be observed as one macrostate, while in spectrophotometric titrations, protonation sites remote from chromophores might be spectroscopically invisible, and macrostates will be formed from collections of microstates that manifest similar UV-absorbance spectra.
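The macroscopic populations referred to here follow directly from the macroscopic pKas. The sketch below uses the standard polyprotic equilibrium expressions to compute the fraction of each macrostate at a given pH; the function name and the two pKa values (5.0 and 9.0) are hypothetical choices for illustration, not measured values for any compound in this study.

```python
def macrostate_populations(ph, pkas):
    """Fractional populations of macrostates, ordered from most to least protonated.

    Each successive deprotonation multiplies the relative weight of a state
    by 10**(pH - pKa) for the corresponding macroscopic pKa.
    """
    weights = [1.0]
    for pka in sorted(pkas):
        weights.append(weights[-1] * 10.0 ** (ph - pka))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical diprotic analyte with macroscopic pKas of 5.0 and 9.0.
hypothetical_pkas = [5.0, 9.0]
for ph in (2.0, 5.0, 7.0, 9.0, 12.0):
    fractions = macrostate_populations(ph, hypothetical_pkas)
    print(ph, [round(f, 3) for f in fractions])
```

At a pH equal to one of the macroscopic pKas, the two macrostates on either side of that transition are equally populated, which is what the crossing points of the population curves in a titration plot represent.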

For the UV-metric method to resolve populations of microstates, sufficiently different UV spectra between microstates and sufficiently non-overlapping changes in populations with respect to pH are needed. However, relative tautomer populations of microstates with the same total charge do not depend on pH and stay constant while pH is titrated (Fig. 1b); therefore, they cannot be resolved by the UV-metric method. The pH-metric method also cannot resolve microstates that have the same total charge, as shown in Fig. 1c.
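The counting behind these statements can be checked with a short enumeration. The sketch below lists all microstates of a molecule with n titratable sites, counts the single-proton transitions between them (the microscopic pKas), and groups microstates by proton count, which is how a pH-metric titration lumps them into macrostates. The function names are illustrative; n = 3 matches the cetirizine example of Fig. 1.

```python
from collections import defaultdict
from itertools import product


def enumerate_microstates(n_sites):
    """All protonation patterns: 1 = protonated site, 0 = deprotonated site."""
    return list(product((1, 0), repeat=n_sites))


def count_microscopic_pkas(states):
    """Each protonated site in each microstate defines one deprotonation step."""
    return sum(sum(state) for state in states)


def group_by_proton_count(states):
    """Microstates with equal proton count share a total charge (one macrostate)."""
    groups = defaultdict(list)
    for state in states:
        groups[sum(state)].append(state)
    return dict(groups)


states = enumerate_microstates(3)
macrostates = group_by_proton_count(states)
print(len(states), "microstates")
print(count_microscopic_pkas(states), "microscopic pKas")
for n_protons in sorted(macrostates, reverse=True):
    print(n_protons, "protons:", macrostates[n_protons])
```

For three sites this yields four charge-based macrostates, two of which each contain three microstates that a pH-metric titration cannot distinguish.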

Spectrophotometric pKa determination is more sensitive than potentiometric determination and requires only low analyte concentrations (50–100 μM), which is especially advantageous for compounds with low solubilities, but it is only applicable to titration sites near chromophores. For protonation state changes to affect UV absorbance, a useful rule of thumb is that the protonation site should be at most four heavy atoms away from the chromophore, which might consist of conjugated double bonds, carbonyl groups, aromatic rings, etc. Although potentiometric measurements do not suffer from the same observability limitations, higher analyte concentrations (\(\sim 5\) mM) are necessary for the analyte to provide a buffering capacity signal sufficiently far above the inherent buffering capacity of water to produce an accurate measurement. The accuracy of pKas fit to potentiometric titrations can also be sensitive to errors in the estimated concentration of the analyte in the sample solution, while UV-metric titrations are insensitive to concentration errors. We therefore decided to adopt spectrophotometric measurements for collecting the experimental pKa data for this challenge, and selected a compound set to ensure that all potential titration sites are in the vicinity of UV chromophores.

Here, we report on the selection of SAMPL6 pKa challenge compounds, their macroscopic pKa values measured by UV-metric titrations using a Sirius T3, as well as NMR-based microstate characterization of two SAMPL6 compounds with ambiguous protonation states associated with the observed macroscopic pKas (SM07 and SM14). We discuss implications of the use of this experimental technique for the interpretation of pKa data, and provide suggestions for future pKa data collection efforts with the goal of evaluating or training computational pKa predictions.

Methods

Compound selection and procurement

To select a set of small molecules focusing on the chemical space representative of kinase inhibitors for physicochemical property prediction challenges (pKa and lipophilicity), we started from the kinase-targeted subclass of the ZINC15 chemical library [35] and applied a series of filtering and selection rules as depicted in Fig. 4a. We focused on the availability “now” and reactivity “anodyne” subsets of ZINC15 in the first filtering step [http://zinc15.docking.org/subclasses/kinase/substances/subsets/now+anodyne/]. The “now” label indicates the compounds were available for immediate delivery, while the “anodyne” label excludes compounds matching filters that flag compounds with the potential for reactivity or pan-assay interference (PAINS) [36, 37].

Next, we identified resulting molecules that were also available for procurement through eMolecules [38] (free version, downloaded 1 June 2017), the supplier that would be used for procurement in this exercise. To find the intersection of ZINC15 kinase subset and eMolecules database, we matched molecules using their canonical isomeric SMILES strings, as computed via the OpenEye OEChem Toolkit (version 2017.Feb.1) [39].

To extract availability and price information from eMolecules, we queried using a list of SMILES (as reported in the eMolecules database) of the intersection set. We further filtered the intersection set (1204 compounds) based on delivery time (Tier 1 suppliers, 2-week delivery) and at least 100 mg availability in powder form (format: Supplier Standard Vial). We aimed to purchase 100 mg of each compound in powder form with at least 90% purity. We calculated that 100 mg was enough for optimization and replicate pKa, log P, and solubility measurements with the Sirius T3. UV-metric and pH-metric pKa measurements require a minimum of 0.01 mg and 1.00 mg of compound [solid or delivered via dimethyl sulfoxide (DMSO) stock solution], respectively. log P and pH-dependent solubility measurements require around 2 mg and 10 mg of solid chemical, respectively.

Filtering for predicted measurable pKas and lack of experimental data

The Sirius T3 (Pion) instrument used to collect pKa and log P/log D measurements requires a titratable group in the pKa range of 2–12, so we aimed to select compounds with predicted pKas in the range of 3–11 to allow a \(\sim 1\) pKa unit margin of error in pKa predictions. pKa predictions for compound selection were calculated using Epik (Schrödinger, LLC) sequential pKa prediction (scan) [40, 41] with target pH 7.0 and tautomerization allowed for generated states. We filtered out all compounds that did not have any predicted pKas between 3 and 11, as well as compounds with two pKa values predicted to be less than 1 pKa unit apart, in the hope that individual pKas of multiprotic compounds could be resolved with spectrophotometric pKa measurements. With the goal of selecting compounds suitable for subsequent log P measurements, we eliminated compounds with OpenEye XlogP [42] values less than \(-\,\)1 or greater than 6. Subsets of compounds with molecular weights of 150–350 and 350–500 g/mol were selected for the fragment-like and drug-like categories, respectively. Compounds without available price or stock quantity information were eliminated. As the goal was to provide a blind challenge, compounds with publicly available experimental log P measurements were also removed. The sources we checked for publicly available experimental log P values were the following: DrugBank [43] (queried with eMolecules SMILES), ChemSpider [44] (queried by canonical isomeric SMILES), NCI Open Database August 2006 release [45], Enhanced NCI Database Browser [46] (queried with canonical isomeric SMILES), and PubChem [47] (queried with InChIKeys generated from canonical isomeric SMILES with NCI CACTUS Chemical Identifier Resolver [48]).
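The property-based rules in this paragraph can be written as a small filter. The sketch below is not the published selection script; it is a minimal reimplementation of the stated thresholds, with a hypothetical `Candidate` record, hypothetical compounds, and one assumption the text leaves open: a molecular weight of exactly 350 g/mol is assigned to the fragment-like class.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    predicted_pkas: List[float]
    xlogp: float
    mol_weight: float  # g/mol


def has_measurable_pka(pkas, low=3.0, high=11.0):
    """At least one predicted pKa inside the 3-11 window."""
    return any(low <= p <= high for p in pkas)


def pkas_well_separated(pkas, min_gap=1.0):
    """No two predicted pKas closer than min_gap units."""
    ordered = sorted(pkas)
    return all(b - a >= min_gap for a, b in zip(ordered, ordered[1:]))


def size_class(mol_weight) -> Optional[str]:
    """Assumption: 350 g/mol exactly counts as fragment-like."""
    if 150.0 <= mol_weight <= 350.0:
        return "fragment-like"
    if 350.0 < mol_weight <= 500.0:
        return "drug-like"
    return None


def passes_filters(c: Candidate) -> bool:
    return (has_measurable_pka(c.predicted_pkas)
            and pkas_well_separated(c.predicted_pkas)
            and -1.0 <= c.xlogp <= 6.0
            and size_class(c.mol_weight) is not None)


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("hyp-1", [4.2, 9.8], 2.1, 240.0),
    Candidate("hyp-2", [5.0, 5.6], 1.0, 300.0),  # pKas too close together
    Candidate("hyp-3", [1.5], 3.0, 410.0),       # no pKa in the 3-11 window
    Candidate("hyp-4", [7.3], 6.8, 450.0),       # XlogP above 6
    Candidate("hyp-5", [6.1], 3.3, 420.0),
]
selected = [(c.name, size_class(c.mol_weight)) for c in candidates if passes_filters(c)]
print(selected)
```

The availability, price, and prior-data checks described in the text depend on external databases and are deliberately left out of this sketch.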

Filtering for kinase inhibitor-like scaffolds

In order to include common scaffolds found in kinase inhibitors, we analyzed the frequency of rings found in FDA-approved kinase inhibitors via Bemis–Murcko fragmentation using OEMedChem Toolkit of OpenEye [49, 50]. Heterocycles found more than once in FDA-approved kinase inhibitors are shown in Fig. 4b. In selecting 25 compounds for the fragment-like set and 10 compounds for the drug-like set, we prioritized including at least one example of each heterocycle, although we failed to find compounds with piperazine and indazole that satisfied all other selection criteria. We observed that certain heterocycles (shown in Fig. 4c) were overrepresented based on our selection criteria; therefore, we limited the number of these structures in the SAMPL6 challenge set to at most one in each set. To achieve broad and uniform sampling of the measurable log P dynamic range, we segregated the molecules into bins of predicted XlogP values and selected compounds from each bin, prioritizing less expensive compounds.
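The binning step at the end of this paragraph can be illustrated as follows. This is a minimal sketch that assumes a bin width of one XlogP unit and one pick per bin; neither value is stated in the text, and the compound names and prices are hypothetical.

```python
import math
from collections import defaultdict


def pick_cheapest_per_bin(compounds, bin_width=1.0):
    """Return one compound name per XlogP bin, choosing the lowest price.

    compounds: iterable of (name, predicted_xlogp, price) tuples.
    bin_width: assumed bin size in XlogP units (not specified in the text).
    """
    bins = defaultdict(list)
    for name, xlogp, price in compounds:
        bins[math.floor(xlogp / bin_width)].append((price, name))
    return [min(members)[1] for _, members in sorted(bins.items())]


# Hypothetical compounds: (name, predicted XlogP, price).
pool = [
    ("hyp-A", 0.2, 50.0),
    ("hyp-B", 0.8, 30.0),
    ("hyp-C", 2.5, 90.0),
    ("hyp-D", 2.1, 120.0),
    ("hyp-E", 4.9, 75.0),
]
picks = pick_cheapest_per_bin(pool)
print(picks)
```

In practice the scaffold-coverage constraints described above would be applied together with this price-based choice, so the cheapest compound in a bin would not always be the one selected.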

Filtering for UV chromophores

The presence of UV chromophores (absorbing in the 200–400 nm range) in close proximity to protonation sites is necessary for spectrophotometric pKa measurements. To filter for molecules with UV chromophores, we looked at the substructure matches to the SMARTS pattern [n,o,c][c,n,o]cc, which was considered the smallest unit of pi-conjugation that can constitute a UV chromophore. This SMARTS pattern describes extended conjugation systems consisting of four heavy atoms, each an aromatic carbon, nitrogen, or oxygen, such as 1,3-butadiene, which possesses an absorption peak at 217 nm. Additionally, the final set of selected molecules was manually inspected to make sure all potentially titratable groups were no more than four heavy atoms away from a UV chromophore.

Of the 25 fragment-like and 10 drug-like compounds selected, procurement of 28 was completed in time. pKa measurements were successful for 17 fragment-like (SM01–SM17) and 7 drug-like (SM18–SM24) compounds. The resulting set of 24 small molecules constitutes the SAMPL6 pKa challenge set. For the other four compounds, UV-metric pKa measurements showed no detectable pKas in the range of 2–12, so we decided not to include them in the SAMPL6 pKa challenge. Experiments for these four compounds are not reported in this publication.

Python scripts used in the compound selection process are available from GitHub [https://github.com/choderalab/sampl6-physicochemical-properties]. Procurement details for each compound can be found in Supplementary Table 1. Chemical properties used in the selection of compounds are summarized in Supplementary Table 2.

Fig. 4

Compound selection for the SAMPL6 pKa challenge, with the goal of running subsequent log P/log D challenges on the same compound set. a Flowchart of filtering steps for the selection of compounds that resemble kinase inhibitors and their fragments. Numbers next to arrows indicate the number of compounds remaining after each filtering step. A total of 25 fragment-like and 10 drug-like compounds were selected, out of which procurement and pKa measurements for 17 fragment-like and 7 drug-like compounds were successful, respectively. b Frequent heterocycles found in FDA approved kinase inhibitors, as determined by Bemis–Murcko fragmentation into rings [49]. Black structures were represented in SAMPL6 set at least once. Compounds with piperazine and indazole (gray structures) could not be included in the challenge set due to library and selection limitations. c Structures of heterocycles that were overrepresented based on our compound selection workflow. We have limited the number of occurrences of these heterocycles to at most one

UV-metric pKa measurements

Experimental pKa measurements were collected using the spectrophotometric pKa measurement method with a Sirius T3 automated titrator instrument (Pion) at 25.0 °C and constant ionic strength. The Sirius T3 is equipped with an Ag/AgCl double-junction reference electrode to monitor pH, a dip probe attached to a UV spectrophotometer, a stirrer, and automated volumetric titration capability. The Sirius T3 UV-metric pKa measurement protocol measures the change in multi-wavelength absorbance in the UV region of the absorbance spectrum while the pH is titrated over pH 1.8–12.2 [28, 29]. UV absorbance data are collected from 160–760 nm, while the 250–450 nm region is typically used for pKa determinations. Subsequent global data analysis identifies pH-dependent populations of macrostates and fits one or more pKa values to match this population with a pH-dependent model.

DMSO stock solutions of each compound with 10 mg/mL concentration were prepared by weighing 1 mg of powder chemical with a Sartorius Analytical Balance (Model: ME235P) and dissolving it in 100 μL DMSO (Fisher Bioreagents, CAT: BP231-100, LOT: 116070, purity \(\ge 99.7\%\)). DMSO stock solutions were capped immediately to limit water absorption from the atmosphere due to the high hygroscopicity of DMSO and sonicated for 5–10 min in a water bath sonicator at room temperature to ensure proper dissolution. These DMSO stock solutions were stored at room temperature for up to 2 weeks in capped glass vials. The 10 mg/mL DMSO solutions were used as stock solutions for the preparation of three replicate samples for the independent titrations. For each experiment, 1–5 μL of 10 mg/mL DMSO stock solution was delivered to a 4 mL Sirius T3 glass sample vial with an electronic micropipette (Rainin EDP3 LTS 1–10 μL). The volume of delivered DMSO stock solution, which determines the sample concentration following dilution by the Sirius T3, was optimized individually for each compound to achieve a sufficient but not saturated absorbance signal (targeting 0.5–1.0 AU) in the linear response region. Another limiting factor for sample concentration was ensuring that the compound remained soluble throughout the entire pH titration range. An aliquot of 25 μL of mid-range buffer (14.7 mM \(\text {K}_{2}\text {HPO}_{4}\) and 0.15 M \(\text {KCl}\) in \(\text {H}_2\text {O}\)) was added to each sample, transferred with a micropipette (Rainin EDP3 LTS 10–100 μL) to provide enough buffering capacity in middle pH ranges so that pH could be controlled incrementally throughout the titration.

Measured pH depends on both temperature and ionic strength. A Peltier device on the Sirius T3 kept the analyte solution at \(25.0\,\pm \,0.5\) °C throughout the titration. Sample ionic strength was adjusted by dilution in 1.5 mL ionic strength-adjusted water (ISA water \(= 0.15\,\text {M KCl}\) in \(\text {H}_2\text {O}\)) by the Sirius T3. Analyte dilution, mixing, acid/base titration, and measurement of UV absorbance were automated by the Sirius T3 UV-metric pKa measurement protocol. The pH was titrated between pH 1.8 and 12.2 via the addition of acid (0.5 M HCl) and base (0.5 M KOH), targeting 0.2 pH steps between UV absorbance spectrum measurements. Titrations were performed under argon flow on the surface of the sample solution to limit the absorption of carbon dioxide from air, which can alter the sample pH to a measurable degree. To fully capture all sources of experimental variability, instead of performing three sequential pH titrations on the same sample solution, three replicate samples (prepared from the same DMSO stock solution) were subjected to one round of pH titration each. Although this choice reduced throughput and increased analyte consumption, it limited the dilution of the analyte during multiple titrations, resulting in a stronger absorbance signal for pKa fitting. Under circumstances where analyte is scarce, it is also possible to do three sequential titrations using the same sample to limit consumption when the loss of accuracy is acceptable.
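The sample concentrations implied by this protocol can be estimated with simple arithmetic. The sketch below assumes that the starting volume is the sum of the ISA water, the mid-range buffer aliquot, and the DMSO stock, and it ignores the volume of acid and base added during titration; the molecular weight of 300 g/mol is a hypothetical example, not a property of any challenge compound.

```python
def analyte_concentration_um(stock_ul, mol_weight,
                             stock_mg_per_ml=10.0,
                             isa_water_ml=1.5,
                             buffer_ul=25.0):
    """Approximate starting analyte concentration in micromolar.

    stock_ul:   volume of DMSO stock delivered (uL)
    mol_weight: molecular weight of the analyte (g/mol)
    Titrant volume added later is ignored (simplifying assumption).
    """
    mass_ug = stock_ul * stock_mg_per_ml          # uL x mg/mL = ug
    amount_nmol = mass_ug / mol_weight * 1000.0   # ug / (g/mol) = umol, then nmol
    total_ml = isa_water_ml + (buffer_ul + stock_ul) / 1000.0
    return amount_nmol / total_ml                 # nmol/mL = uM


# Hypothetical analyte with a molecular weight of 300 g/mol.
for volume_ul in (1.0, 3.0, 5.0):
    conc = analyte_concentration_um(volume_ul, mol_weight=300.0)
    print(f"{volume_ul:.0f} uL stock -> {conc:.0f} uM")
```

Under these assumptions, the 1–5 μL delivery range corresponds to roughly 20–110 μM for a compound of this size, which brackets the 50–100 μM range quoted for spectrophotometric measurements.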

Visual inspection of the sample solutions after titration and inspection of the pH-dependent absorbance shift in the 500–600 nm region of the UV spectra were used to verify that no detectable precipitation occurred during the course of the measurement. Increased absorbance in the 500–600 nm region of the UV spectra is associated with scattering of longer wavelengths of light in the presence of colloidal aggregates. For each analyte, we optimized analyte concentration, direction of the titration, and pH titration range in order to maintain solubility over the entire experiment. The titration direction was specified so that each titration would start from the pH where the compound is most soluble: low-to-high pH for bases and high-to-low pH for acids. While UV-metric pKa measurements can be performed with analyte concentrations as low as 50 μM (although this depends on the absorbance properties of the analyte), some compounds may still not be soluble at these low concentrations throughout the pH range of the titration. As the sample is titrated through a wide range of pH values, it is likely that low-solubility ionization states (such as neutral and zwitterionic states) will also be populated, limiting the highest analyte concentration that can be titrated without encountering solubility issues. For compounds with insufficient solubility to accurately determine a pKa value directly with a UV-metric titration, a cosolvent protocol was used, as described in the next section.

Two Sirius T3 computer programs, Sirius T3 Control v1.1.3.0 and Sirius T3 Refine v1.1.3.0, were used to execute measurement protocols and analyze pH-dependent multiwavelength spectra, respectively. Protonation state changes at titratable sites near chromophores will modulate the UV-absorbance spectra of these chromophores, allowing populations of distinct UV-active species to be resolved as a function of pH. To do this, basis spectra are identified and populations extracted via target factor analysis (TFA) of the pH-dependent multi-wavelength absorbance [29]. When fitting the absorbance data to a titratable molecule model to estimate pKas, we selected the minimum number of pKas sufficient to provide a high-quality fit between experimental and modeled data based on visual inspection of pH-dependent populations.
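To make the idea of pH-dependent populations concrete, the following minimal sketch computes the relative populations of macroscopic protonation states from a set of macroscopic pKa values using the standard stepwise-dissociation relationship. It is an illustration only and not the analysis performed by the Sirius T3 Refine software; the function name `macrostate_populations` and the state indexing are assumptions introduced here.

```python
import numpy as np

def macrostate_populations(pH, pKas):
    """Relative populations of macroscopic protonation states.

    State j (j = 0..n) has lost j protons relative to the most
    protonated state; pKas are the n macroscopic pKa values.
    Returns an array of shape (len(pH), n + 1) whose rows sum to 1.
    """
    pH = np.atleast_1d(np.asarray(pH, dtype=float))
    pKas = np.sort(np.asarray(pKas, dtype=float))  # assume ascending stepwise pKas
    cumulative = np.concatenate(([0.0], np.cumsum(pKas)))
    j = np.arange(len(pKas) + 1)
    # log10 of [state j]/[state 0] = j*pH - sum of the first j pKas
    log_w = j[None, :] * pH[:, None] - cumulative[None, :]
    log_w -= log_w.max(axis=1, keepdims=True)  # numerical stability
    w = 10.0 ** log_w
    return w / w.sum(axis=1, keepdims=True)

# Example: a monoprotic base with pKa 6.08 is half protonated at pH 6.08.
pops = macrostate_populations([6.08], [6.08])
```

At a pH equal to the pKa of a monoprotic compound, the two states are equally populated, which is the midpoint that the fitted titration model locates.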

This method is capable of measuring pKa values between 2 and 12 when titratable groups are at most 4–5 heavy atoms away from chromophores such that a change in protonation state alters the absorbance spectrum of the chromophore. We selected compounds where titratable groups are close to potential chromophores (generally aromatic ring systems), but the possibility exists that our experiments did not detect protonation state changes of titratable groups distal from UV chromophores.

Cosolvent UV-metric pKa measurements of molecules with poor aqueous solubilities

If analytes are not sufficiently soluble during the titration, pKa values cannot be accurately determined via aqueous titration directly. If precipitation occurs, the UV-absorbance signal from pH-dependent precipitate formation cannot be differentiated from the pH-dependent signal of soluble microstate species. For compounds with low aqueous solubility, pKa values were estimated from multiple apparent pKa measurements performed in ISA methanol:ISA water cosolvent solutions with various mole fractions, from which the pKa at 0% methanol (100% ISA water) can be extrapolated. This method is referred to as a UV-metric psKa measurement in the Sirius T3 Manual [51]. The psKa value is the apparent pKa value measured in the presence of a cosolvent.

The cosolvent spectrophotometric pKa measurement protocol was very similar to the standard aqueous UV-metric pKa measurement protocol, with the following differences: titrations were typically performed in 30%, 40%, and 50% mixtures of ISA methanol:ISA water by volume to measure apparent pKa values (psKa) in these mixtures. Yasuda–Shedlovsky extrapolation [52, 53] was subsequently used to estimate the pKa value at 0% cosolvent (Fig. 5) [31, 54, 55].

$$\text{p}_{\text{s}}\text{K}_{\text{a}} + \log [\text{H}_{2}\text{O}] = A/\epsilon + B \tag{1}$$

Yasuda–Shedlovsky extrapolation relies on the linear correlation between \({\text{p}}_{{\text{s}}} {\text{K}}_{{\text{a}}} + \log [\text {H}_2\text {O}]\) and the reciprocal dielectric constant of the cosolvent mixture (\(1/\epsilon\)). In Eq. 1, A and B are the slope and intercept of the line fitted to experimental datapoints. Depending on the solubility requirements of the analyte, the methanol ratio of the cosolvent mixtures was adjusted. We designed the experiments to have at least 5% cosolvent ratio difference between datapoints and no more than 60% methanol content. Calculation of the Yasuda–Shedlovsky extrapolation was performed by the Sirius T3 software using at least 3 psKa values measured in different ratios of methanol:water. Addition of methanol (80%, 0.15  M KCl) was controlled by the instrument before each titration. Three consecutive pH titrations at different methanol concentrations were performed using the same sample solution. In addition, three replicate measurements with independent samples (prepared from the same DMSO stock) were collected.
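The extrapolation in Eq. 1 can be sketched as an ordinary least-squares line fit followed by evaluation at the properties of pure water. This is a minimal illustration, not the Sirius T3 implementation: the function name, the requirement that the caller supplies the water molarity and dielectric constant of each cosolvent mixture, and the default pure-water values (approximately 55.5 M and \(\epsilon \approx 78.3\) at 25 °C) are all assumptions made here.

```python
import numpy as np

def yasuda_shedlovsky(pska, water_molarity, dielectric,
                      water_molarity_0=55.5, dielectric_0=78.3):
    """Extrapolate apparent pKa values (psKa) to 0% cosolvent.

    pska, water_molarity and dielectric are per-mixture values
    (at least three mixtures). The *_0 defaults are approximate
    pure-water values at 25 C and are assumptions of this sketch.
    Returns (aqueous pKa, slope A, intercept B) of Eq. 1.
    """
    pska = np.asarray(pska, dtype=float)
    water_molarity = np.asarray(water_molarity, dtype=float)
    dielectric = np.asarray(dielectric, dtype=float)
    if pska.size < 3:
        raise ValueError("at least three psKa values are required")
    y = pska + np.log10(water_molarity)   # left-hand side of Eq. 1
    x = 1.0 / dielectric                  # reciprocal dielectric constant
    A, B = np.polyfit(x, y, 1)            # slope and intercept
    pka_aq = A / dielectric_0 + B - np.log10(water_molarity_0)
    return pka_aq, A, B
```

The sign of the returned slope `A` is the quantity interpreted in the Fig. 5 caption as indicating dominantly acidic (positive) or basic (negative) character.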

Fig. 5

Determination of SM22 pKa values with cosolvent method and Yasuda–Shedlovsky extrapolation. a–c show psKa of SM22 determined at various methanol concentrations: 59.07%, 49.72%, 40.08% by weight. Purple solid lines indicate the derivative of the absorbance signal with respect to pH versus pH at multiple wavelengths. psKa values (green flags) were determined by Sirius T3 Refine Software. Blue, red, and green triangles show relative populations of macroscopic protonation states with respect to pH calculated from the experimental data. Notice that as cosolvent concentration increases, psKa1 decreases from 1.90 to 1.47 and psKa2 increases from 7.84 to 8.24. d Yasuda–Shedlovsky extrapolation plot for SM22. Red datapoints correspond to psKa determined at various cosolvent ratios. Based on linear fitting to \(p_{s}K_{a} + \log [\text {H}_2\text {O}]\) versus \(1/\epsilon\), pKa1 and pKa2 in 0% cosolvent (aqueous solution) were determined as 2.45 and 7.42, respectively. \(R^2\) values of the linear fits are both 0.99. The slope of the Yasuda–Shedlovsky extrapolation shows whether the observed titration has dominantly acidic (positive slope) or basic (negative slope) character, although this is a macroscopic observation and should not be relied on for annotation of pKas to functional groups (microscopic pKas)

Calculation of uncertainty in pKa measurements

Experimental uncertainties were reported as the standard error of the mean (SEM) of three replicate pKa measurements, estimated as

$$\text{SEM} = \frac{\sigma}{\sqrt{N}};\quad \sigma = \sqrt{\frac{1}{N}\sum\limits_{i = 1}^{N} (x_{i} - \mu )^{2}};\quad \mu = \frac{1}{N}\sum\limits_{i = 1}^{N} x_{i} \tag{2}$$

where \(\sigma\) denotes the sample standard deviation and \(\mu\) denotes the sample mean. \(x_i\) are observations and N is the number of observations.

Since the Sirius T3 software reports pKa values to only two decimal places, we have reported the SEM as 0.01 in cases where SEM values calculated from three replicates were lower than 0.01. SEM calculated from replicate measurements were found to be larger than non-linear fit error reported by the Sirius T3 Refine Software from UV-absorbance model fit of a single experiment, thus leading us to believe that running replicate measurements and reporting mean and SEM of pKa measurements is better for capturing all sources of experimental uncertainty. Notably, for UV-metric measurements, the measured pKa values should be insensitive to final analyte concentration and any uncertainty in the exact analyte concentration of the original DMSO stock solution, justifying the use of the same stock solution (rather than independently prepared stock solutions) for multiple replicates.
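As an illustration of Eq. 2 and the reporting rule described above, the sketch below computes the mean and SEM of replicate measurements using the 1/N normalization written in Eq. 2, and applies the 0.01 floor to the reported SEM. The function name and the replicate values in the example are hypothetical and are not taken from the dataset.

```python
import math

def mean_and_sem(values, floor=0.01):
    """Mean and SEM following Eq. 2 (sigma uses 1/N normalization).

    The reported SEM is never lower than `floor`, mirroring the
    two-decimal precision of the reported pKa values.
    """
    n = len(values)
    if n == 0:
        raise ValueError("no observations")
    mu = sum(values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    sem = sigma / math.sqrt(n)
    return mu, max(sem, floor)

# Hypothetical replicate pKa values (not from the dataset):
mu, sem = mean_and_sem([6.07, 6.08, 6.09])
# The raw SEM here is below 0.01, so 0.01 is reported.
```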

Quality control for chemicals

Compound purity was assessed by LC–MS using an Agilent HPLC 1200 Series equipped with an auto-sampler, a UV diode array detector, and a Quadrupole MS detector 6140. ChemStation version C01.07SR2 was used to analyze the LC and LC/MS data. An Ascentis Express C18 column (3.0 × 100 mm, 2.7 μm) was used, with the column temperature set at 45 °C.

  • Mobile phase A: 2 mM ammonium formate aqueous (pH 3.5)

  • Mobile phase B: 2 mM ammonium formate in 90:10 acetonitrile:water  (pH 3.5)

  • Flow rate: 0.75 mL/min

  • Gradient: starting with 10% B to 95% B in 10 min then hold at 95% B for 5 min

  • Post run length: 5 min

  • Mass condition: ESI positive and negative mode

  • Capillary voltage: 3000 V

  • Drying gas flow: 12 mL/min

  • Nebulizer pressure: 35 psi

  • Drying temperature: 350 °C

  • Mass range: 5–1350 Da; fragmentor: 70; threshold: 100

The percent area for the primary peak is calculated as the area of that peak divided by the total area of all peaks. The percent area of the primary peak is reported as an estimate of sample purity. The purity of the primary LC peak was checked by ChemStation software with threshold 995 to confirm that there is no significant impurity underneath the main peak.
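The percent-area estimate described above amounts to a single ratio, sketched below. The function name is hypothetical, and the sketch assumes that the primary peak is the peak with the largest integrated area; the peak areas in the example are made up for illustration.

```python
def percent_area_purity(peak_areas):
    """Percent area of the primary peak relative to all peaks.

    Assumes the primary peak is the one with the largest area.
    """
    total = sum(peak_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * max(peak_areas) / total

# Made-up integrated peak areas (arbitrary units):
purity = percent_area_purity([870.0, 80.0, 50.0])
```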

NMR determination of protonation microstates

In general, the chemical shifts of nuclear species observed in NMR spectra report on and are very sensitive to the chemical environment. Consequently, small changes in chemical environment, such as the protonation events described in this work, are manifest as changes in the chemical shift(s) of the nuclei. If the perturbation occurs at a rate that is fast on the NMR timescale (fast exchange), an average chemical shift is observed. This phenomenon has been exploited as a probe for determining the order of protonation for molecules with more than one titratable site [56]. In some cases, direct observation of the titrated nuclei, for example nitrogen and oxygen, can be difficult due to sample limitations and/or low natural abundance of the NMR-active nuclei (0.37% for 15N and 0.038% for 17O), amongst other factors. In these situations, chemical shift changes of the so-called “reporter” NMR nuclei (1H, 31P, or 13C nuclei that are directly attached to or are a few bonds away from the titrated nuclei) have been utilized as the probe for NMR-pH titrations [21, 57, 58]. This approach is advantageous since the sensitive NMR nuclides (1H and 31P) are observed. In addition, 31P and 13C offer large spectral widths of ~300 ppm and ~200 ppm, respectively, which minimize peak overlap.

However, reporter nuclei chemical shifts provide indirect information subject to interpretation. In complex systems with multiple titratable groups, such analysis will be complicated by the cumulative effect of these groups on the reporter nuclide, due to their close proximity or the resonance observed in aromatic systems. In contrast, direct observation of the titratable nuclide, where possible, affords a more straightforward approach to studying the protonation events. In this study, the chemical shifts of the titratable nitrogen nuclei were observed using 1H–15N-HMBC (heteronuclear multiple-bond correlation) experiments, a method that affords the observation of 15N chemical shifts while leveraging the sensitivity accrued from the high-abundance 1H nuclide.

The structures of samples SM07 and SM14 were assigned via a suite of NMR experiments, which included 1H NMR, 13C NMR, homonuclear correlated spectroscopy (1H–1H COSY), heteronuclear single quantum coherence (1H–13C HSQC), 13C heteronuclear multiple-bond correlation (1H–13C-HMBC) and 15N heteronuclear multiple-bond correlation (1H–15N-HMBC); see SI. All NMR data used in this analysis were acquired on a Bruker 500 MHz spectrometer equipped with a 5 mm TCI CryoProbe™ Prodigy at 298 K. The poor solubility of the analytes precluded analysis in water, and thus a water-d2/methanol-d4 mixture and acetonitrile-d3 were used as solvents. The basic sites were then determined by titration of the appropriate solutions of the samples with equivalent amounts of deutero-trifluoroacetic acid (TFA-d) solution.

SM07

5.8 mg of SM07 was dissolved in 600 μL of methanol-d4:water-d2 (2:1 v/v ratio). A 9% v/v TFA-d solution in water-d2 was prepared, such that each 20 μL volume contained approximately 1 equivalent of TFA-d with respect to the base. The SM07 solution was then titrated with the TFA-d solution at 0.5, 1.0, 1.5, and 5.0 equivalents with 1H–15N HMBC spectra (optimized for 5 Hz) acquired after each TFA addition. A reference 1H–15N HMBC experiment was first acquired on the SM07 solution prior to commencement of the titration.

SM14

5.5 mg of SM14 was dissolved in 600 μL of acetonitrile-d3. A 10% v/v TFA-d solution in acetonitrile-d3 was prepared, 20 μL of which corresponds to 1 equivalent of TFA-d with respect to the base. A further 1:10 dilution of the TFA-d solution in acetonitrile-d3 allowed measurement of 0.1 equivalent of TFA-d per 20 μL of solution. The SM14 solution was then titrated with the TFA-d solutions at 0.0, 0.5, 1.0, 1.1, 1.2, 1.3, 1.5, 1.8, 2.0, 2.1, 2.6, 5.1, and 10.1 equivalents. The chemical shift changes were monitored by the acquisition of 1H–15N HMBC spectra (optimized for 5 Hz) after each TFA addition.

Results

Spectrophotometric pKa measurements

Spectrophotometrically-determined pKa values for all molecules from the SAMPL6 pKa challenge are shown in Fig. 6 and Table 1. The protocol used—cosolvent or aqueous UV-metric titration—is indicated in Table 1 together with SEM of each reported measurement. Out of 24 molecules successfully assayed, five molecules have two resolvable pKa values, while one has three resolvable pKa values within the measurable pKa range of 2–12. The SEM of reported pKa measurements is low, with the largest uncertainty reported being 0.04 pKa units (pKa1 of SM06 and pKa3 of SM18). Individual replicate measurements can be found in Supplementary Table 3. Reports generated for each pKa measurement by the Sirius T3 Refine software can also be found in the Supplementary Information. Experimental pKa values for nearly all compounds with multiple resolvable pKas are well-separated (more than 4 pKa units), except for SM14 and SM18.

Fig. 6

Molecules used in the SAMPL6 pKa challenge. Experimental UV-metric pKa measurements were performed for these 24 molecules and discernable macroscopic pKas are reported. Uncertainties are expressed as the standard error of the mean (SEM) of three independent measurements. We depicted neutral states of the molecules as sites of protonation were not determined by UV-metric methods. 2D structures were created with OpenEye OEDepict Toolkit [59]. Canonical isomeric SMILES of molecules in this figure and pKa values measured in replicate experiments can be found in Table SI 1 and Table SI 3, respectively

Table 1 Experimental pKas of SAMPL6 compounds

Impact of cosolvent on UV-metric pKa measurements

For molecules with insufficient aqueous solubilities throughout the titration range (pH 2–12), we resorted to cosolvent UV-metric pKa measurements, with methanol used as cosolvent. To confirm that cosolvent UV-metric pKa measurements led to indistinguishable results compared to aqueous UV-metric measurements, we collected pKa values of 12 highly soluble SAMPL6 compounds, as well as pyridoxine, using both cosolvent and aqueous methods. Correlation analysis of pKa values determined with both methods demonstrated that using methanol as cosolvent and determining aqueous pKas via Yasuda–Shedlovsky extrapolation did not result in significant bias (Fig. 7), since the 95% CI for the mean deviation (MD) between the two sets of measurements includes zero. Means and standard errors of UV-metric pKa measurements with and without cosolvent are provided in Supplementary Table 5. pKa measurement results of individual replicate measurements with and without cosolvent can be found in Supplementary Table 4.
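The agreement statistics and bootstrap confidence intervals used for this comparison can be sketched as follows. This is a hedged illustration rather than the analysis script used in the study: the function names are hypothetical, the sign convention of the deviation (second set minus first set) is an assumption, and a simple percentile bootstrap over paired values is assumed because the exact resampling scheme is not specified here.

```python
import numpy as np

def agreement_stats(x, y):
    """Mean deviation, mean absolute deviation and RMSD of y - x."""
    d = np.asarray(y, dtype=float) - np.asarray(x, dtype=float)
    return {"MD": d.mean(),
            "MAD": np.abs(d).mean(),
            "RMSD": np.sqrt((d ** 2).mean())}

def bootstrap_ci(x, y, n_boot=10_000, ci=0.95, seed=0):
    """Percentile bootstrap CIs, resampling (x, y) pairs with replacement."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))
    d = y[idx] - x[idx]                     # shape (n_boot, n_pairs)
    samples = {"MD": d.mean(axis=1),
               "MAD": np.abs(d).mean(axis=1),
               "RMSD": np.sqrt((d ** 2).mean(axis=1))}
    lo, hi = 100 * (1 - ci) / 2, 100 * (1 + ci) / 2
    return {k: tuple(np.percentile(v, [lo, hi])) for k, v in samples.items()}
```

Under this scheme, a bias between methods would be considered insignificant when the interval returned for `"MD"` contains zero.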

Fig. 7

pKa measurements with the UV-metric method with cosolvent and the UV-metric method in water show good correlation. 17 pKa values (blue marks) of 13 chemicals were measured with both the UV-metric pKa method in water and the UV-metric pKa method with methanol as cosolvent (Yasuda–Shedlovsky extrapolation to 0% methanol). The dashed black line has a slope of 1, representing perfect correlation. Dark and light green shaded areas indicate \(\pm\,0.5\) and \(\pm\,1.0\) pKa unit difference regions, respectively. Error bars are plotted as the SEM of replicate measurements, although they are not visible since the largest SEM is 0.04. Mean difference (MD), mean absolute deviation (MAD), and root-mean-square deviation (RMSD) are reported. Confidence intervals (reported in brackets) report the 95% CI calculated over 10,000 bootstrap samples. Experimental data used in this plot are reported in Supplementary Table 4

Purity of SAMPL6 compounds

LC–MS based purity measurements showed that powder stocks of 23 of the SAMPL6 pKa challenge compounds were >90% pure, while the purity of SM22 was 87%, the lowest in the set (Supplementary Table 6). Additionally, molecular weights detected by the LC–MS method were consistent with those reported in eMolecules, as well as supplier-reported molecular weights, when provided. Sirius/Pion technical specialists recommend using compounds with \(\sim 90\%\) purity to minimize the impact on high-accuracy pKa measurements. Impurities that lack a UV-chromophore or that elute too late in LC may not be detected with this method, although the chances are small. The peak purity check of the primary peak can detect the presence of a large impurity underneath the main peak, but an impurity whose UV spectrum is exactly the same as that of the analyte in the main peak may not be resolved. The HPLC UV detector’s wavelength inaccuracy is \(<1\%\). The mass inaccuracy of the MS instrument is ~0.13 um within the calibrated mass range in scan mode.

Characterization of SM07 microstates with NMR

15N Chemical shifts (ppm, referenced to external liquid ammonia at 0 ppm) for N-8, N-10 and N-12—measured from the 1H–15N HMBC experiments—were plotted against the titrated TFA-d equivalents (0.0, 0.5, 1.0, 1.5, and 5.0 equivalents) (Fig. 8a). A large upfield shift of ~82 ppm is observed for N-12. The initial linear relationship between chemical shift and TFA equivalents, shown in Fig. 8a for N-12, is expected for strong monoprotic bases—as is the case for SM07. The large upfield chemical shift change (82 ppm) is consistent with a charge delocalization as shown in the resonance structures in Fig. 8a. Further evidence for this delocalization is observed for N-8, which exhibited a downfield chemical shift change of ~28 ppm compared to just ~1.5 ppm for N-10. Titration of SM07 with more than 1 equivalent of TFA-d did not result in further significant chemical shift changes—establishing that SM07 is a monoprotic base.

Characterization of SM14 microstates with NMR

Determining the protonation sites for SM14, which has pKa values of 2.58 and 5.30 (Table 1), was more challenging due to multiple possible resonance structures in the mono- and di-protonated states. We noticed that the water/methanol co-solvent exhibited strong solvent effects, which complicated the data interpretation for SM14. For instance, titration of SM14 in methanol/water (Figs. SI 36) showed incomplete protonation of N-9 even after 5 equivalents of TFA-d were added. This observation is consistent with UV-metric psKa measurements done in the presence of methanol as cosolvent, where both psKa values were decreasing as the percentage of methanol was increased, making observation of these protonation states more difficult. Thus the utilization of an aprotic solvent was necessary for unambiguous interpretation of the data.

Due to the problem just delineated for the methanol/water cosolvent, acetonitrile-d3 was selected as our solvent of choice. Titration of SM14 (5.5 mg) with up to 10 equivalents of TFA-d in acetonitrile-d3 (0.0, 0.5, 1.0, 1.1, 1.2, 1.3, 1.5, 1.8, 2.0, 2.1, 2.6, 5.1, and 10.1 equivalents) provided a much clearer picture of its protonation states (Fig. 8b). N-9, with a large upfield chemical shift change of ~72 ppm at 1 equivalent of TFA-d, is clearly the site of first protonation. Concurrently, the downfield chemical shift changes observed for N-7 (\(\Delta \delta \approx 6.5\)) and N-16 (\(\Delta \delta \approx 5\)) can be attributed to electronic effects rather than a direct protonation. Complete protonation of N-9 was attained at roughly 2.5 equivalents of TFA-d, suggesting that SM14 is a weak base under these experimental conditions. Following the protonation of N-9, a second protonation event occurs at N-16, as evident from the upfield chemical shift change observed for N-16. However, the continuous change in the chemical shift of N-16 even after addition of ten equivalents of TFA-d indicates that this protonation event is incomplete, while still providing evidence for N-16 being the second protonation site. This observation is consistent with N-16 being an even weaker base than N-9, which is expected of aniline-type amines. Other notable observations were the slight downfield chemical shift changes for N-7 and N-9 during the second protonation event. These changes were attributed to electronic effects from the protonation of N-16.

Fig. 8

Dominant protonation microstates of SM07 and SM14 characterized by NMR. a Sequence of protonation sites of SM07 was determined by 1H–15N HMBC experiments in 1:2 water:methanol mixture. Left: the plot of 15N chemical shifts of the N-10, N-12, and N-8 resonances of SM07 versus titrated TFA-d equivalents, showing the mono-protonation of N-12 as evidenced by its large upfield chemical shift change. Acidity of the medium increased as more equivalents of TFA-d were added. Electronic effects due to protonation of N-12 caused downfield chemical shift changes of N-10 and N-8 between 0 and 1 equivalents of TFA-d. Right: NMR-based model of the order of dominant protonation states for SM07. The protonation event was only observed at N-12. Microstates shown in the figure are the most likely contributors to the UV-metric pKa of \(6.08 \pm 0.01\). b Sequence of protonation sites of SM14 was determined by 1H–15N HMBC experiments in acetonitrile. Left: the plot of 15N chemical shifts of N-9, N-7, and N-16 of SM14 versus titrated TFA-d equivalents, showing two sequential protonation events. The first protonation occurred at N-9; a large upfield chemical shift change of 71.6 ppm was seen between 0 and 1 equivalents of TFA-d. Downfield chemical shift changes observed for N-7 and N-16 in this region were due to the electronic effect from the protonation of N-9. N-16 also exhibited a small upfield chemical shift change of 4.4 ppm between 2.5 and 10 equivalents of TFA-d, which indicated N-16 as the second site of protonation. Right: NMR-based model of the order of dominant protonation states for SM14, showing two sequential protonation events. Also, two pKa values were detected with UV-metric pKa measurements for SM14. Assuming that the sequence of protonation events will be conserved between water and acetonitrile solvents, the SM14\(^{0}\) and SM14\(^{+1}\) microstates shown in the figure are the major contributors to the UV-metric pKa value \(5.30 \pm 0.01\). The SM14\(^{+1}\) and SM14\(^{+2}\) microstates shown in the figure are the major pair of microstates contributing to the UV-metric pKa value \(2.58 \pm 0.01\). There could be minor microstates with very low populations that could not be distinguished in these NMR experiments

Discussion

Effect of sample preparation and cosolvents in UV-metric measurements

Samples for UV-metric pKa measurements were prepared by dilution of up to 5 μL DMSO stock solution of analyte in 1.5 mL ISA water. The resulting \(\sim 0.3\%\) DMSO present during titration is presumed to have a negligible effect on pKa measurements. For UV-metric or pH-metric measurements, it is possible to prepare samples without DMSO, but it is difficult to prepare samples by weighing extremely low amounts of solid stocks (on the order of 0.01–0.10 mg) to target 50 μM analyte concentrations, even with an analytical balance. For experimental throughput, we therefore preferred using DMSO stock solutions. Another advantage of starting from DMSO stock solutions is that it helps to overcome kinetic solubility problems of analytes.

A lower analyte concentration is needed for spectrophotometric pKa measurement than for the potentiometric method. With the spectrophotometric method, very dilute analyte solutions, as low as \(10^{-5}\)–\(10^{-6}\) M, can be used [28], with the strength of the UV signal as the limiting factor. In this study we used analyte concentrations around 50 μM, which is 2 orders of magnitude lower than the minimum concentration required for typical potentiometric pKa measurements. Theoretically, low analyte concentrations lead to more accurate pKa measurements by minimizing the potential for the solute to influence solvent properties. Ideally, the pKa would be measured at infinite dilution of the analyte; in practice, however, the minimum analyte concentration is limited by the detection strength of the UV signal. The higher the analyte concentration, the more it affects solvent properties such as ionic strength and dielectric constant. The risk of analyte aggregation or precipitation also increases with higher concentration.

In UV-metric measurements, both water and methanol (when used as cosolvent) stock solutions were ionic strength adjusted with 150 mM KCl, but acid and base solutions were not. This means that the ionic strength fluctuates slightly throughout the pH titration, but on average the ionic strength of the samples stayed around 150–180 mM. By using ISA solutions, the effect of salt concentration change on pKa measurements was minimized.

If an analyte is soluble enough, UV-metric pKa measurements in water should be preferred over cosolvent methods, since pKa measurement in water is more direct. For pKa determination via cosolvent extrapolation using methanol, the apparent pKas (psKa) in at least three different methanol:water ratios must be measured, and the pKa in 0% cosolvent computed by Yasuda–Shedlovsky extrapolation. The number and spread of psKa measurements and the error in the linear fit extrapolation influence the accuracy of pKas determined by this approach. To test that UV-metric methods with or without cosolvent have indistinguishable performance, we collected 17 pKa values for 12 SAMPL6 compounds and pyridoxine with both methods. Figure 7 shows there is good correlation between both methods, and the mean absolute deviation between the two methods is 0.12 (95% CI [0.07, 0.18]). The mean deviation between the two sets is −0.04 (95% CI [−0.12, 0.03]), showing there is no significant bias in cosolvent measurements as the 95% CI includes zero. The largest absolute deviation observed was 0.41 for SM06.

Impact of impurities on UV-metric pKa measurements

Precisely how much the presence of small amounts of impurities impacts UV-metric pKa measurements is unpredictable. For an impurity to alter UV-metric pKa measurements, it must possess a UV-chromophore and a titratable group in the vicinity of the chromophore; otherwise, it would not interfere with the absorbance signal of the analyte. If a titratable impurity does possess a UV-chromophore, UV multiwavelength absorbance from the analyte and impurity will be convoluted. How much the presence of an impurity will impact the multiwavelength absorbance spectra and pKa determination depends on the strength of the impurity’s molar absorption coefficient and concentration, relative to the analyte’s. In the worst case scenario, an impurity with high concentration or strong UV absorbance can shift the measured pKa value or create the appearance of an extra pKa. As a result, it is important to use analytes with high purities to obtain high accuracy pKa measurements. Therefore, we confirmed the purities of SAMPL6 compounds with LC–MS.

Interpretation of UV-metric pKa measurements

Multiwavelength absorbance analysis on the Sirius T3 allows for good resolution of pKas based on UV-absorbance change with respect to pH, but it is important to note that pKa values determined from this method are often difficult to assign as either microscopic or macroscopic in nature. This method potentially produces macroscopic pKas for polyprotic compounds. If multiple microscopic pKas have close pKa values and overlapping changes in UV absorbance spectra associated with protonation/deprotonation, the spectral analysis could produce a single macroscopic pKa that represents an aggregation of multiple microscopic pKas. An extreme example of such a case is demonstrated in the simulated macrostate populations of cetirizine that would be observed with UV-metric titration (Fig. 1).

If protonation state populations observed via UV-metric titrations (such as in Fig. 3b) are composed of a single microstate, experimentally measured pKas are indeed microscopic pKas. Unfortunately, judging the composition of experimental populations is not possible by just using UV-metric or pH-metric titration. Molecules in the SAMPL6 pKa challenge dataset with only one pKa value measured in the 2–12 range could therefore be monoprotic (possessing a single titratable group that changes protonation state by gain or loss of a single proton over this pH range) or polyprotic (gaining or losing multiple protons from one or more sites with overlapping microscopic pKa values). Similarly, titration curves of molecules with multiple experimental pKas may show well-separated microscopic pKas or macroscopic experimental pKas that are really composites of microscopic pKas with similar values. Therefore, without additional experimental evidence, UV-metric pKas should not be assigned to individual titratable groups.
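The relationship between microscopic and macroscopic pKas can be illustrated with the textbook case of a molecule with two titratable sites, labeled here as hypothetical sites a and b. The sketch below is not derived from the SAMPL6 data; it uses the standard relations that the first macroscopic dissociation constant is the sum of the two first-step microscopic constants, and that the reciprocal of the second macroscopic constant is the sum of the reciprocals of the two second-step microscopic constants. The function and argument names are assumptions introduced for illustration.

```python
import math

def macro_pkas_two_site(pk_a, pk_b, pk_ab):
    """Macroscopic pKas of a two-site molecule from microscopic pKas.

    pk_a  : micro pKa for losing the proton of site a first
    pk_b  : micro pKa for losing the proton of site b first
    pk_ab : micro pKa for losing the site-b proton after site a
    The fourth micro pKa (site a after site b) follows from the
    thermodynamic cycle: pk_a + pk_ab = pk_b + pk_ba.
    """
    pk_ba = pk_a + pk_ab - pk_b
    # K1 = k_a + k_b
    macro1 = -math.log10(10.0 ** (-pk_a) + 10.0 ** (-pk_b))
    # 1/K2 = 1/k_ab + 1/k_ba
    macro2 = math.log10(10.0 ** pk_ab + 10.0 ** pk_ba)
    return macro1, macro2

# Two equivalent sites with identical micro pKas of 5 and 7:
# the macroscopic values are shifted by log10(2) from the micro values.
m1, m2 = macro_pkas_two_site(5.0, 5.0, 7.0)
```

When the two first-step microscopic pKas differ by several units (for example 3 and 8), the macroscopic values are numerically almost identical to the dominant microscopic ones, which is the regime in which a measured pKa can be treated as effectively microscopic.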

Sometimes it can be possible to assign pKas to ionizable groups if they produce different UV-absorbance shifts upon ionization, but it is not a straight-forward analysis and it is not a part of the analysis pipeline of Sirius T3 Refine Software. Such an analysis would require fragmentation of the molecule and determining how UV-spectra of each chromophore changes upon ionization in isolation.

UV-metric pKa values for nearly all compounds in our dataset with multiple resolvable pKas are well-separated (more than 4 pKa units), except for SM14 and SM18. Tam et al. state that spectrophotometric pKa values of multiprotic molecules can be unambiguously assigned to the functional groups as microscopic pKas “if the pKa values are at least 4 pH units apart (i.e. \(pK_{a,2} - pK_{a,1} \ge 4\))” based on general knowledge of functional groups and consideration of electronic and inductive effects [28]. In this study, we refrained from reporting such a knowledge-based assignment of pKa values to functional groups without experimental evidence.

Determination of the exact microstates populated at different pH values via NMR can provide a complementary means of differentiating between microscopic and macroscopic pKas in cases where there is ambiguity. As determination of protonation microstates via NMR is very laborious, we were only able to characterize microstates of two molecules: SM07 and SM14.

In UV-metric pKa measurements with cosolvent, the slope of the Yasuda–Shedlovsky extrapolation can be interpreted to understand if the pKa has dominantly acidic or basic character. As the methanol ratio is increased, psKa values of acids increase, while psKa values for bases decrease. However, it is important to remember that if the measured pKa is macroscopic, acid/base assignment from cosolvent psKa trends is also a macroscopic property, and should not be used as a guide for assigning pKa values to functional groups [60].

NMR microstate characterization

The goal of NMR characterization was to collect information on microscopic states related to experimental pKa measurements, i.e., to determine the exact sites of protonation. pKa measurements performed with the spectrophotometric method provide macroscopic pKa values, but do not provide information on the specific site(s) of protonation. Conversely, most computational prediction methods primarily predict microscopic pKa values. Protonation sites can be determined by NMR methods, although these measurements are very laborious in terms of data collection and interpretation compared to pKa measurements with the automated Sirius T3. Moreover, not all SAMPL6 molecules were suitable for NMR measurements due to the high sample concentration requirements (for methods other than proton NMR, such as 13C and 15N based 2D experiments) and limiting analyte solubility. Heavy atom spectra that rely on natural isotope abundance require high sample concentrations (preferably on the order of 100 mM). It is possible that drug or drug-fragment-like compounds, such as the compounds used in this study, have insufficient aqueous solubility, limiting the choice of solvent and pH. It may be necessary to use organic cosolvents to prepare these high concentration solutions or only prepare samples at pH values that correspond to high solubility states (e.g., when the charged state of the compound is populated).

We performed NMR based microstate characterization only for SM07 and SM14. We were able to identify the order of dominant protonation microstates, as shown in Fig. 8. These pairs of microstates and the order of microscopic transitions can be associated with experimental pKas determined by UV-metric titrations, under the assumption that the different organic solvents used in NMR measurements have a negligible effect on the sequence of microstates observed as the medium was titrated with acid, although a shift in pKa values is expected. NMR measurements for SM07 and SM14 were done in water:methanol [1:2 (v/v)] and acetonitrile solutions, respectively. On the other hand, pKa values of these two compounds were determined by UV-metric titrations in ISA water. Additional UV-metric pKa measurements of these compounds with methanol as a cosolvent showed that their psKa values decreased as the cosolvent ratio increased (i.e., dielectric constant decreased), as expected for basic titration sites. The identification of the SM07 and SM14 titratable sites as basic is consistent between NMR based models and UV-metric cosolvent titrations. The order of microstates observed in the titration of NMR samples is very likely to correspond to the dominant microstates associated with UV-metric pKa measurements. N-12 of SM07 was observed as the only protonation site of SM07 during TFA-d titration up to 5 equivalents, which supports that SM07 is monoprotic and that the UV-metric pKa value \(6.08 \pm 0.01\) corresponds to microscopic protonation of N-12. For SM14, two protonation sites were observed (N-16 and N-9, in the order of increasing psKa). Microstate pairs shown in Fig. 8b were determined as dominant contributors to UV-metric pKas \(2.58 \pm 0.01\) and \(5.30 \pm 0.01\), although minor microspecies with very low populations (undetected in NMR experiments) could be contributing to the macroscopic pKa values observed by the UV-metric method.

In addition to SM07, there were five other 4-aminoquinazoline derivatives in the SAMPL6 set: SM02, SM04, SM09, SM12, and SM13. For this series, all the potential titratable sites are located in the 4-aminoquinazoline scaffold, and these compounds contain no additional titratable sites compared to SM07. Therefore, based on structural similarity, it is reasonable to predict that N-12 is the microscopic protonation site for all of these compounds. We can infer that the UV-metric pKa values measured for the 4-aminoquinazoline series are also microscopic pKas, and that they are related to the protonation of the same quinazoline nitrogen with the same neutral background protonation states as shown for SM07 in Fig. 8a.

Recommendations for future pKa prediction challenges

Most high-throughput pKa measurement methods measure macroscopic pKas, which cannot always be assigned to a specific protonation site. One way to circumvent this problem is to confine our interest in future pKa challenges to experimental datasets containing only monoprotic compounds if UV-metric or pH-metric pKa measurements are the method of choice, allowing unambiguous assignment of pKa values to underlying protonation states. However, it is important to consider that multiprotic compounds are common among pharmaceutically interesting molecules, necessitating the ability to model them reliably. It might also be interesting to select a series of polyprotic compounds and their monoprotic fragments, to see if the fragments can be used to disambiguate the pKa values.

Although relatively efficient, UV-metric pKa measurements with the Sirius T3 do not provide structural information about microstates. Even the acid–base assignment based on the direction of psKa shift with cosolvent is not a reliable indicator for assigning experimental pKa values to individual functional groups in multiprotic compounds. On the other hand, most computational pKa prediction methods output microscopic pKas. It is therefore difficult to use experimental macroscopic pKa values to assess and train microscopic pKa prediction methods directly without further means of annotating the macroscopic-microscopic correspondence. It is not straightforward to infer the underlying microscopic pKa values from macroscopic measurements of a polyprotic compound without complementary experiments that can provide structural information. Therefore, for future data collection efforts for evaluation of pKa predictions, if measurement of pKas via NMR is not possible, we advise supplementing UV-metric measurements with NMR characterization of microstates to show whether observed pKas are microscopic (related to a single group) or macroscopic (related to dissociation of multiple groups), as performed for SM07 and SM14 in this study.

Another source of complexity in interpreting macroscopic pKa values is that the composition of macroscopic pKas can change between different experimental methods, as illustrated in Fig. 1. Different subsets of microstates can become indistinguishable depending on the type of signal on which the experimental method is based. In potentiometric titrations, microstates with the same total charge are indistinguishable and are observed as one macroscopic population. In spectrophotometric pKa measurements, the factor that determines whether microstates can be resolved is not charge. Instead, microstates whose populations, and therefore UV-absorbance spectra, change around the same pH value become indistinguishable.

The “macroscopic” label is commonly ascribed to transitions between different ionization states of a molecule (all microstates that have the same total charge form one macrostate), but this definition only applies to potentiometric methods. In UV-absorbance based methods, the principle that determines which microstates will be distinguishable is not charge or number of bound protons, but molecular absorbance changes, and how closely underlying microscopic pKa values overlap. To compare experimental macroscopic pKa and microscopic computational predictions on common ground, the best solution is to compute “predicted” macroscopic pKa values from microscopic pKas based on the detection limitations of the experiment. A disadvantage of this approach is that experimental data cannot provide direct guidance on microscopic pKa resolution for improving pKa prediction methods.
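As an illustration of computing "predicted" macroscopic pKa values from microscopic ones, the following minimal sketch handles the simplest case: a molecule with two titratable sites, with macrostates grouped by total charge as in a potentiometric experiment. The function name and the example numbers are hypothetical and are not taken from this study. The sketch does not model the UV-metric case, where the grouping of microstates depends on spectral resolvability rather than charge.

```python
import math

def macro_pkas_two_site(pk_a, pk_b, pk_ab):
    """Charge-based macroscopic pKas for a two-site molecule (hypothetical helper).

    pk_a  : microscopic pKa for losing the proton at site a first
    pk_b  : microscopic pKa for losing the proton at site b first
    pk_ab : microscopic pKa for losing the proton at site b after site a
    The fourth microscopic constant follows from closing the thermodynamic cycle.
    """
    k_a = 10.0 ** -pk_a
    k_b = 10.0 ** -pk_b
    k_ab = 10.0 ** -pk_ab
    # Cycle closure: k_a * k_ab == k_b * k_ba
    k_ba = k_a * k_ab / k_b
    # First macroscopic step: both singly deprotonated microstates are pooled.
    ka1 = k_a + k_b
    # Second macroscopic step: the pooled intermediate loses its last proton.
    ka2 = 1.0 / (1.0 / k_ab + 1.0 / k_ba)
    return -math.log10(ka1), -math.log10(ka2)

# Two equivalent sites (illustrative values only): the macroscopic pKas are
# shifted from the microscopic ones by a statistical factor of log10(2).
pka1, pka2 = macro_pkas_two_site(5.0, 5.0, 7.0)
print(round(pka1, 3), round(pka2, 3))
```

When one site is far more acidic than the other, the macroscopic values returned by this sketch approach the microscopic pKas of the dominant pathway, which is the situation in which a measured macroscopic pKa can be read as a microscopic one.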

Since analyte purity is critical for accuracy, necessary quality control experiments must be performed to ensure at least 90% purity for UV-metric pKa measurements. Higher purities may be necessary for other methods. For potentiometric methods, knowing the stoichiometry of any counterions present in the original powder stocks is also necessary. Identity of counterions also needs to be known to incorporate titratable counterions, e.g. ammonia, in the titration model.

For the set of SAMPL6 pKa challenge compounds, we could not use potentiometric pKa measurements due to the low aqueous solubility of many of these compounds. The lowest solubility observed somewhere in the experimental pH range of titration is the limiting factor, since for accurate measurements the analyte must stay in the solution phase throughout the entire titration. Since the titration pH range is determined with the goal of capturing all ionization states, the analyte is inevitably exposed to pH values that correspond to low solubility. Neutral and zwitterionic species can be orders of magnitude less soluble than ionic species. If a compound has a significantly insoluble ionization state, the pH range of titration could be narrowed to avoid precipitation, but it would limit the range of pKa values that could be accurately measured.

For future pKa challenges with multiprotic compounds, if sufficient time and effort can be spared, it would be ideal to construct an experimental pKa dataset using experimental methods that can measure microscopic pKas directly, such as NMR. In the present study, we were only able to perform follow-up NMR microstate characterization of two compounds because we relied on intrinsically low-sensitivity and time-consuming 1H–15N HMBC experiments at natural abundance of 15N nuclei. 1H–15N HMBC experiments of SM07 and SM14 required high analyte concentrations and thus the use of organic solvents for solubility. Alternatively, it might be possible to determine microstates with 1H NMR by analyzing chemical shift changes of reporter protons [21] in aqueous solutions with lower analyte concentrations and with much higher throughput than 15N-based experiments. However, it should be noted that 1H NMR titration data may not always be sufficient for unambiguous microstate characterization. In this case, other reporter nuclei such as 13C, 19F and 31P can be used where appropriate to supplement 1H data. To prepare sample solutions for NMR at specific pH conditions, the Sirius T3 can be used to automate the pH adjustment of samples. Other advantages of using the Sirius T3 for NMR sample preparation include the ability to prepare ionic strength adjusted NMR samples and to minimize consumption of the analyte, since small volumes (as low as 1.5 mL) of pH adjusted solutions can be prepared.

In future pKa challenges, it would be especially interesting to expand this exercise to larger and more flexible drug-like molecules. pKa values are environment dependent, and it would be useful to be able to predict pKa shifts based on ionic strength, temperature, and lipophilic content, as well as shifts that occur with cosolvents or in organic solvents. Measuring the pKa of molecules in organic solvents would be useful for guiding process chemistry. To test such predictions, special pKa experiments would need to be designed to measure pKas under different conditions.

The next iteration of the SAMPL log P/log D prediction challenge will include a subset of compounds from the pKa challenge. We therefore envision that the collected dataset of pKa measurements will also be of use for this challenge. Experimental pKa values will be provided as an input to separate the pKa prediction issue from other problems related to log D predictions. We expect that the experimental pKas can be used as an indication of whether protonation states need to be taken into account for a log D prediction at a certain pH, and for the validation of protonation state population predictions in the aqueous phase. Even for compounds for which microstates were not experimentally determined, macroscopic pKa values can serve as an indicator of how likely it is that protonation states will have a significant effect on the log D of a molecule. Additionally, the information from NMR experiments in this study provided the site of protonation for six 4-aminoquinazoline compounds, which could be incorporated as microstate information for log D predictions. For predicting log D, we suggest as a rule of thumb to include protonation state effects for pKa values at least within 2 units of the pH of the log D experiment. The pKa values of the six 4-aminoquinazoline compounds in this study were determined to be within 2 pKa units of 7.
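The rule of thumb above can be made concrete with a short sketch. It assumes a monoprotic base whose neutral form is the only species that partitions into the nonaqueous phase, so that log D differs from log P by a term that depends only on pKa and pH. The function names are hypothetical, and the pH of 7.4 used in the example is an illustrative choice rather than a value prescribed here; the pKa of 6.08 is the UV-metric value reported for SM07.

```python
import math

def fraction_protonated(pka, ph):
    """Aqueous-phase fraction of a monoprotic base in its protonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def logd_minus_logp_base(pka, ph):
    """log D - log P for a monoprotic base, assuming only the neutral form partitions."""
    return -math.log10(1.0 + 10.0 ** (pka - ph))

def needs_protonation_treatment(pka, ph, window=2.0):
    """Rule of thumb: treat protonation states if the pKa is within `window` units of pH."""
    return abs(pka - ph) <= window

# SM07 (pKa 6.08) at an assumed experimental pH of 7.4
pka, ph = 6.08, 7.4
print(needs_protonation_treatment(pka, ph))
print(round(fraction_protonated(pka, ph), 3))
print(round(logd_minus_logp_base(pka, ph), 3))
```

Under these assumptions, a pKa two units from the experimental pH corresponds to roughly 1% of the minor protonation state, which gives a sense of where the 2-unit window begins to matter.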

Conclusion

This study reports the collection of experimental data for the SAMPL6 pKa prediction challenge. Collection of experimental pKa data was performed with the goal of evaluating computational pKa predictions; therefore, necessary quality control and uncertainty propagation measures were incorporated. The challenge was constructed for a set of fragment-like and drug-like small molecules, selected from kinase-targeted chemical libraries, resulting in a set of compounds containing heterocycles frequently found in FDA-approved kinase inhibitors. We collected pKa values for 24 compounds with the Sirius T3 UV-metric titration method, which were then used as the experimental reference dataset for the SAMPL6 pKa challenge. For compounds with poor aqueous solubilities, we were able to use the Yasuda–Shedlovsky extrapolation method to measure pKa values in the presence of methanol and extrapolate to a purely aqueous phase.
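The Yasuda–Shedlovsky extrapolation mentioned above can be sketched as a linear fit of psKa + log[H2O] against the reciprocal dielectric constant of the solvent mixture, evaluated at the conditions of pure water. The code below is a minimal sketch under stated assumptions, not a description of the Sirius T3 software: the function name is hypothetical, and the default dielectric constant (about 78.3) and molar concentration (about 55.5 M) of pure water are approximate values at 25 °C that a user would need to replace with those appropriate for their conditions.

```python
import math

def yasuda_shedlovsky_extrapolate(pska, water_molarity, dielectric,
                                  eps_water=78.3, water_molarity_pure=55.5):
    """Extrapolate cosolvent psKa values to an aqueous pKa (hypothetical helper).

    pska           : apparent pKa at each cosolvent ratio
    water_molarity : molar water concentration of each mixture
    dielectric     : dielectric constant of each mixture
    Fits psKa + log10[H2O] = a / eps + b by ordinary least squares.
    """
    x = [1.0 / e for e in dielectric]
    y = [p + math.log10(w) for p, w in zip(pska, water_molarity)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Evaluate the fitted line at pure water, then remove the log[H2O] term.
    y_aqueous = slope / eps_water + intercept
    return y_aqueous - math.log10(water_molarity_pure)
```

The function expects one psKa value per cosolvent ratio together with the water concentration and dielectric constant of each mixture; at least two distinct mixtures are needed for the fit, and more points allow the linearity of the extrapolation to be checked.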

In our work, we highlighted that the distinction between microscopic and macroscopic pKas depends on the experimental method used, and especially how the underlying microstate composition can differ for macroscopic pKa values measured with UV-metric versus pH-metric titration methods. We discussed how macroscopic pKa values determined by UV-metric titration introduce an identifiability problem when compared to microscopic computational predictions. For two compounds (SM07 and SM14), we were able to alleviate this problem by determining the sequence of microscopic protonation states using 1H–15N HMBC experiments. Microstates of five other compounds with the 4-aminoquinazoline scaffold were inferred based on the NMR characterization of SM07 microstates, which showed that it is monoprotic.

The collected experimental data constitute a potentially useful dataset for future evaluation of small molecule pKa predictions, even outside of SAMPL challenges. We expect that these data will also be useful for participants in the next SAMPL challenge on small molecule lipophilicity predictions.

Code and data availability