Introduction

Sulfur fluoride exchange (SuFEx) describes the reaction process in which S(VI)–F bonds are exchanged with incoming nucleophiles to yield stable S(VI)–O and S(VI)–N linked products (Fig. 1) [1]. As a new generation of click chemistry [1,2,3,4], SuFEx has been widely used in diverse fields such as chemical synthesis [2, 5,6,7,8,9,10], medicinal chemistry [11,12,13,14,15,16,17], polymer chemistry [18,19,20], material chemistry [17, 21, 22], and chemical biology [23, 24]. In particular, proximity-enabled SuFEx generates specific covalent linkages between interacting proteins and biomolecules in vivo while minimizing non-specific cross-linking [25, 26], which opens innovative avenues for mapping elusive protein–biomolecule interactions. Aryl fluorosulfate and aryl sulfonyl fluoride are relatively latent towards biological nucleophiles (Fig. 1). However, upon binding to a biological target, the aryl fluorosulfate or aryl sulfonyl fluoride is brought into proximity to a nucleophile on the target [27]. This proximity enables the warhead to react with the nucleophile on the target via SuFEx, irreversibly cross-linking the interacting biomolecules (Fig. 2A, aryl fluorosulfate is used as a representative SuFEx warhead to show the mechanism of proximity-enabled SuFEx) [26]. Acting akin to “sleeping beauty,” these warheads become reactive only upon being placed in proximity to target residues, enabling the precise cross-linking of the interacting biomolecules [28].

Fig. 1

Sulfur (VI) fluoride exchange (SuFEx) and the structures of aryl fluorosulfate and aryl sulfonyl fluoride

Fig. 2

The application of SuFEx in probing protein-biomolecule interactions. A Schematic illustration of proximity-enabled SuFEx; B Aryl fluorosulfate or aryl sulfonyl fluoride can form a stable covalent bond with Lys, His, and Tyr residues; C Aryl fluorosulfate and aryl sulfonyl fluoride fail to generate a stable bond with Cys; D Aryl fluorosulfate fails to generate a stable bond with Ser and Thr

Aryl fluorosulfate and aryl sulfonyl fluoride are the most commonly used SuFEx warheads and can form stable covalent bonds with Lys, His, and Tyr residues, as verified by several techniques such as mass spectrometry and X-ray crystallography (Fig. 2B). Aryl fluorosulfate and aryl sulfonyl fluoride can also react with Cys; however, the resulting bonds are not stable under physiological conditions, leading to the formation of phenol and sulfinic acid, respectively (Fig. 2C) [29]. Aryl fluorosulfate fails to generate a stable bond with Ser and Thr via SuFEx (Fig. 2D): following the initial reaction, an elimination reaction occurs, resulting in the formation of dehydroalanine (Dha) and dehydrobutyrine (Dhb), respectively [30, 31]. Proteomic studies suggested that cross-links between aryl sulfonyl fluorides and Ser or Thr can be detected via mass spectrometry (MS), but further experiments are needed to establish the stability of these bonds [32].

Proteins interact with other biomolecules including proteins [33], carbohydrates [34], nucleic acids [35], and lipids [36]. These interactions govern almost every aspect of life, and understanding them is fundamental to drug discovery, as abnormal interactions often lead to disease progression [37,38,39]. Because these interactions are transient and reversible under physiological conditions, investigating them and harnessing them for therapeutic purposes is profoundly challenging [40, 41]. Therefore, platforms that can stabilize these interactions for mechanistic investigation are valuable tools for biomedical research. To this end, a variety of chemical probes based on SuFEx have been developed to probe protein-biomolecule interactions. Aryl fluorosulfates and aryl sulfonyl fluorides can be installed on small molecule ligands to probe protein-ligand interactions [42,43,44,45]. Additionally, they can be incorporated into proteins via genetic code expansion, a technique for incorporating unnatural amino acids (Uaas) into proteins in a site-specific manner [46,47,48]. These engineered proteins bearing SuFEx warheads have been used to study protein-protein, protein-RNA, and protein-carbohydrate interactions.

In this review, we first provide an overview of the small molecule approach, in which SuFEx is used to profile protein-ligand interactions. Subsequently, we showcase the protein approach, where engineered proteins containing SuFEx warheads serve as chemical probes to profile protein-protein and protein-RNA interactions. Additionally, we illustrate the factors that influence SuFEx kinetics and discuss strategies to fine-tune the reactivities of SuFEx warheads. Given the extensive application of SuFEx in biomedical research, our aim is not to comprehensively cover every aspect. Instead, this review provides an overview and roadmap to grasp the potential and advantages of SuFEx in probing protein-biomolecule interactions.

SuFEx warheads installed on small molecules to probe protein-ligand interactions

Understanding the detailed mechanisms of drug-target interactions is crucial for drug discovery [49, 50]. To this end, SuFEx warheads including aryl fluorosulfate and aryl sulfonyl fluoride have been installed on small molecules to study their interactions with their protein targets [32, 51]. In this section, we discuss this strategy of probing protein-ligand interactions by integrating a SuFEx warhead into small molecules.

A reactive adenosine derivative, FSBA, featuring an aryl sulfonyl fluoride was originally developed 40 years ago to elucidate the binding sites of glutamate dehydrogenase (Fig. 3) [52]. Subsequently, FSBA was discovered to react with the conserved Lys residue within the ATP site of kinases. Taunton and co-workers designed probe 1, based on FSBA and bearing a click handle (terminal alkyne), which can react with the conserved catalytic lysine (Lys295) of SRC-family tyrosine kinases (Fig. 3) [53]. After incubation of probe 1 with the kinase, a rhodamine-azide was conjugated to probe 1-modified proteins via click chemistry. Subsequent fluorescence gel electrophoresis analysis showed a rhodamine-labeled protein band, demonstrating covalent bond formation. Additionally, the K295R mutation abolished covalent bond formation, indicating that probe 1 targeted Lys295 of SRC-family tyrosine kinases. Using probe 1 as a competitive probe, it was found that ponatinib, a clinical Bcr-Abl inhibitor, targeted SRC-family kinases. Later, aryl sulfonyl fluorides were found to react with Tyr residues in glutathione transferases (GSTs), further expanding the protein target scope of SuFEx [54]. Weerapana and co-workers designed several serine protease inhibitors featuring an aryl sulfonyl fluoride and an alkyne, with the structure of DAS1 serving as a representative example (Fig. 3). Modeled after AEBSF, DAS1 demonstrated the capability to profile serine proteases within live cells, thereby expanding the repertoire of activity-based protein profiling (ABPP) for serine proteases.

Fig. 3

Structures of covalent small molecule probes featuring SuFEx warheads

Kelly and co-workers reported fluorosulfate-containing compounds that are capable of selectively reacting with the intracellular lipid binding protein (iLBP) family (Fig. 3) [44]. These probes reacted with a conserved tyrosine phenolic group within the binding site of iLBPs. Incubation of probe 2 (100 µM) with CRABP2 (2 µM), an iLBP, for 48 h led to quantitative covalent labeling as revealed by liquid chromatography-electrospray ionization mass spectrometry (LC-ESI-MS). To determine which iLBP amino acid residue reacted with probe 2, CRABP2 modified by probe 2 was subjected to tandem mass spectrometry analysis, which showed that the Tyr located in the conserved Arg-Arg-Tyr binding motif was responsible for the covalent bond formation with probe 2. To gain additional insights into this binding and further investigate the reaction kinetics, probe 3 was prepared by incorporating a biphenyl moiety into probe 2 to improve the binding affinity towards iLBPs. Covalent modification of CRABP2 (2 μM) by probe 3 (100 μM) reached completion within 1 h in pH 8.0 buffer at 25 °C as revealed by LC-ESI-MS analysis. On the other hand, incubation of probe 3 with FABP3, FABP5, or FABP4 under the same conditions for up to 24 h did not lead to significant adduct formation, indicating the selectivity of probe 3 towards CRABP2. Subsequently, crystallography experiments were performed to gain structural insights into the probe 3-CRABP2 interaction (PDB: 5HZQ). The biphenyl substructure of probe 3 binds within the spacious ligand binding pocket of CRABP2 and is within van der Waals distance of various hydrophobic side chains. Structural alignment showed that probe 3 occupies almost the same space as retinoic acid (RA, the natural ligand of CRABP2) bound to CRABP2. The diarylsulfate linkage resulting from covalent bond formation between Tyr134 of CRABP2 and probe 3 was clearly shown in the electron density map.
It has been known that the major function of iLBP is to deliver RARα-mediated retinoic acid (RA) across the nuclear membrane for interaction with RA receptors [44, 55, 56]. Probe 3 was found to inhibit the transcription of CRBP1 mRNA, which is a downstream gene target of CRABP2-mediated RARα-RA transcriptional reprogramming. RA (100 µM) treatment of MCF-7 cells led to a 5.5-fold induction of CRBP1 mRNA, and pretreatment with probe 3 (20 μM, 4 h) attenuated this induction by 2-fold. These studies showed that aryl fluorosulfates can selectively target single iLBPs when installed on appropriate small molecule ligands, demonstrating their value in deciphering the functions of iLBPs.

Taunton and co-workers reported a series of small molecule probes featuring sulfonyl fluoride (compounds 4 to 6), which can covalently bind to the ATP binding domain of SRC-family kinases [12]. These probes were used to investigate the interaction between drug molecules and intracellular kinases. The pyrimidine 3-aminopyrazole scaffold of probes 4 to 6 can form 3 hydrogen bonds with the conserved hinge region, and the short linker attached to the C2 of pyrimidine can position the aryl sulfonyl fluoride proximal to the catalytic lysine. As shown in Fig. 3, compounds 4 to 6 are probes that contain a clickable alkyne moiety and an aryl sulfonyl fluoride for covalent conjugation. To assess whether probes 4 to 6 could covalently modify SRC-family kinases, probes 4 to 6 (15 µM) were incubated with the SRC kinase domain (5 µM) for 1 h at room temperature. LC-MS analysis showed the formation of a 1:1 adduct between the probe and SRC. Probes 4 and 5 reacted with SRC in quantitative yield, whereas 6 achieved 30% labeling yield, showing the critical impact of the orientation of the covalent warhead on labeling yield. To confirm the conjugation linkage, the formed adduct was further digested by trypsin and analyzed by LC-MS/MS. The covalent conjugation site was found to be the catalytic lysine, Lys295. Subsequently, probe 5 (renamed XO44) was applied to study the covalent modification of endogenous kinases in human cell lines. Specifically, XO44 (2 µM) was incubated with Jurkat T cells for 30 min. XO44-labeled proteins were further reacted with a biotin-azide for click conjugation, enriched via affinity pull-down, and analyzed by LC-MS/MS. XO44 was able to covalently modify 133 protein kinases in Jurkat T cells (identified with >1 unique peptide). The authors then utilized XO44 to monitor intracellular kinase engagement by dasatinib, a tyrosine kinase inhibitor used for treating chronic myeloid leukemia.
To explore the interaction of dasatinib with various kinases, Jurkat cells were pre-incubated with dasatinib (100 or 300 nM) for 1 h followed by treatment with XO44 (2 µM) for 30 min. LC-MS/MS analysis of the lysates suggested that three kinases, ABL1, BLK, and SRC, were fully blocked from reacting with XO44 after incubation with 100 nM dasatinib. Several other kinases that are responsible for T-cell proliferation, including ZAP70, ITK, JAK1, MAPK1, and AURKB, were still able to react with XO44, with minimal competition from dasatinib. These studies underscore the potential of SuFEx-based covalent probes in profiling drug-kinase interactions.
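The competition readout described above can be summarized numerically. The following is a minimal sketch, not the authors' analysis pipeline: it assumes that target engagement is estimated as the fraction of probe labeling lost when cells are pre-treated with the competitor. The function name `percent_engagement` and the intensity values are hypothetical and chosen for illustration only.

```python
def percent_engagement(signal_probe_only: float, signal_with_competitor: float) -> float:
    """Percent of probe labeling blocked by a competitor (clamped to 0-100).

    signal_probe_only: MS intensity of a kinase labeled by the probe alone.
    signal_with_competitor: intensity after pre-incubation with the competitor.
    """
    if signal_probe_only <= 0:
        raise ValueError("probe-only signal must be positive")
    blocked = 1.0 - signal_with_competitor / signal_probe_only
    # Clamp so that noise (competitor signal > probe-only signal) reads as 0 %.
    return 100.0 * min(max(blocked, 0.0), 1.0)


# Hypothetical intensities (arbitrary units), NOT data from the cited study.
example = {"kinase_A": (1000.0, 20.0), "kinase_B": (800.0, 760.0)}
for name, (probe_only, with_drug) in example.items():
    print(f"{name}: {percent_engagement(probe_only, with_drug):.0f}% engaged")
```

In this reading, a kinase whose labeling is almost fully lost is treated as engaged by the drug, whereas a kinase whose labeling is essentially unchanged is treated as not engaged at that drug concentration.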

Kelly and co-workers reported a series of compounds containing a clickable alkyne moiety and a fluorosulfate group to explore the concept of “Inverse Drug Discovery” [13]. These compounds were designed to harbor a suitable electrophile for covalent conjugation with potential targets in cells or cell lysates (7 to 9, Fig. 4). Additionally, competing compounds, which lack an alkyne group, were prepared (7c to 9c, Fig. 4). The presence of excess competitor should block the reaction between target proteins and the alkyne probes 7 to 9, leading to lowered protein abundance after affinity purification. These competition experiments further enhance the selectivity of the designed probes. HEK293T cell lysate was treated with probe 7, 8, or 9 (10 µM) in the presence or absence of competitor 7c, 8c, or 9c (90 µM), respectively, for 24 h. The samples were then enriched via affinity purification and analyzed via MS/MS. The proteins whose covalent adduction with probes was strongly attenuated by the corresponding competitors were deemed of high interest and further investigated. The identified proteins were then expressed and tested to ascertain whether they could be labeled with the covalent probes in vitro for further validation. These studies revealed a variety of new protein-ligand interactions. For example, a protein named HSDL2 was enriched by probe 7 but not by the other two probes, which makes probe 7 a validated ligand to modulate HSDL2 function. A nucleoside diphosphate kinase, NME1, was enriched by probe 8. The X-ray structure of the NME1-probe 8 complex showed that probe 8 reacted with Lys12 of NME1. Secretion of NME1 and NME2 from cancers is linked to enhanced growth and metastatic potential, and there has been a lack of potent inhibitors for NME1 [57, 58].
The “Inverse Drug Discovery” strategy, which harnesses the power of SuFEx, could serve as an efficient platform for linking chemical structures with novel or important protein targets and identifying lead structures for medicinal chemistry efforts.

Fig. 4

Structure of covalent probes used in the strategy of “Inverse Drug Discovery”

Besides probing protein-ligand interactions, aryl fluorosulfates and aryl sulfonyl fluorides have been widely used in designing covalent small-molecule drugs (Fig. 5). For example, an aryl sulfonyl fluoride-containing inhibitor (SF-p1) and a fluorosulfate-containing inhibitor (FS-p1) have been developed to selectively target DcpS, an RNA decapping scavenger enzyme [58, 59]. Notably, aryl fluorosulfates showed much greater chemical stability than aryl sulfonyl fluorides. For example, incubation of SF-p1 (100 µM) in PBS (pH 7.4) for 24 h led to 50% degradation, whereas FS-p1, containing a fluorosulfate, remained intact under the same conditions. Compound 10, bearing an aryl sulfonyl fluoride, was found to react with Lys15 on transthyretin to prevent amyloidogenesis [60]. TMX-2164, featuring an aryl sulfonyl fluoride, was found to react with Tyr58 located in the lateral groove of B-cell lymphoma 6 (BCL6), a transcriptional repressor frequently deregulated in lymphoid malignancies [61]. Covalent inhibitors with SuFEx warheads have also been developed to target SIRT5 lysine deacylase [62], human neutrophil elastase [63], SRPK1/2 kinase [64], X-linked inhibitor of apoptosis protein (XIAP) [65], and melanoma-IAP (ML-IAP) [66]. Furthermore, aryl sulfonyl fluoride-modified oligonucleotides have enabled the design of covalent aptamers capable of disrupting the interaction between the SARS-CoV-2 spike protein and the human angiotensin-converting enzyme 2 (ACE2) receptor [67]. These pioneering efforts greatly expand the realm of covalent drug discovery and have been summarized in other reviews [68]. Since this review focuses on the application of SuFEx in profiling protein-biomolecule interactions, a detailed discussion of SuFEx in covalent drug discovery is beyond its scope.

Fig. 5

Structure of covalent drugs featuring covalent warheads and their protein targets

SuFEx group on proteins to probe protein-biomolecule interactions

Besides the application in profiling protein-ligand interactions described above, SuFEx warheads can be installed on proteins, and these engineered proteins can serve as probes to investigate protein-biomolecule interactions. SuFEx warheads can be installed on proteins via chemical reaction (Fig. 6) or enzymatic labeling (Fig. 9). The reagents used for this purpose combine a SuFEx warhead with a second chemical moiety, which either reacts with an amino acid side chain on the target protein or is conjugated to the target protein via enzymatic labeling. However, these two approaches usually lack site-specificity. Alternatively, SuFEx warheads can be incorporated into target proteins through genetic code expansion (Fig. 7) [25, 26, 69]. In this approach, an unnatural amino acid featuring a SuFEx warhead (e.g. FSY, see structure in Fig. 7B) is first prepared via organic synthesis. Subsequently, the unnatural amino acid is incorporated into proteins site-specifically during protein expression in live cells via a specifically engineered tRNA/tRNA synthetase pair. Therefore, SuFEx warheads can be incorporated into proteins of interest in a precise manner. In this section, we overview and discuss previous efforts to study protein-biomolecule interactions, including protein-protein and protein-RNA interactions, via SuFEx.

Fig. 6

A plant-and-cast strategy for chemical cross-linking mass spectrometry (A) and the structures of NHSF and BS2G (B)

Fig. 7

Proximity-enabled SuFEx to cross-link the interacting proteins (A) and the structures of unnatural amino acids featuring SuFEx warheads (B)

SuFEx in probing protein-protein interactions

Wang and co-workers reported the application of aryl sulfonyl fluoride for chemical cross-linking mass spectrometry (CXMS). In their design, referred to as the plant-and-cast strategy, a single cross-linker was constructed by combining a highly reactive electrophile (succinimide ester) and a weakly reactive electrophile (sulfonyl fluoride) (Fig. 6) [32]. The highly reactive succinimide ester moiety of NHSF first reacts with a Lys side chain on the surface of protein A, which brings the weaker electrophile (sulfonyl fluoride) into proximity to a nucleophile on protein B, promoting SuFEx to form a covalent bond. First, the reactivity of NHSF was tested with a model peptide named 7KR, which has a single Lys. Only monoadduct formation was observed between NHSF and the model peptide via succinimide ester reactivity. In contrast, BS2G, which contains two succinimide ester groups, generated a dimer of the model peptide, suggesting the lower reactivity of sulfonyl fluoride compared to succinimide ester. The reactivity of NHSF was further assessed using a model protein, bovine serum albumin (BSA). Incubation of BSA with NHSF (BSA:NHSF, 1:1000, in PBS for 1 h) generated various intramolecular cross-links including Lys-His, Lys-Ser, Lys-Thr, and Lys-Tyr. Interestingly, Lys-Lys conjugation was not observed with NHSF but was detected for BS2G. The authors reported that the Lys-Lys distance was calculated to be 20 Å, whereas for most of the cross-linking sites enabled by NHSF the distance was less than 20 Å. The authors suggested that these results reflect that NHSF-mediated cross-linking is largely determined by proximity. Further, the cross-linking efficiency of NHSF was evaluated with E. coli whole-cell lysate. Tandem MS analysis showed that 86% of the cross-links involved the side chains of Ser, Thr, Tyr, and His, which are not accessible via traditional succinimide ester cross-linkers.

Through the plant-and-cast strategy, aryl sulfonyl fluoride can be installed on a protein of interest to cross-link its interactors. However, the conjugation of NHSF to proteins via the reaction between Lys and succinimide ester lacks selectivity, leading to the labeling of multiple positions. To extend the applicability of SuFEx and achieve the precise installation of SuFEx warheads, Wang and co-workers pioneered the engineering of proteins with SuFEx groups via genetic code expansion. This strategy incorporates a fluorosulfate-containing unnatural amino acid (Uaa) named FSY into a protein of interest in a site-specific manner (Fig. 7A) [25]. In this approach, a specifically evolved orthogonal tRNA/synthetase pair recognizes FSY and incorporates it into proteins at designated positions in response to a stop codon. Although fluorosulfate itself exhibits low reactivity, it can react with a nucleophile (Lys, His, Tyr) on the target protein upon protein-target binding, irreversibly cross-linking the interacting proteins via SuFEx. To improve the cross-linking efficacy and expand the toolbox, Wang and co-workers developed a variety of FSY analogs including FSK [26], mFSY [70], FFY [41], and SFY [71] (Fig. 7B). Subsequently, this strategy was expanded to cross-link protein-RNA and protein-sugar interactions [26, 71, 72].

Wang and co-workers utilized FSY and FSK to probe protein-protein interactions. FSY and FSK are unnatural amino acids containing aryl fluorosulfate and can generate covalent bonds with target proteins via SuFEx (Fig. 7B) [26]. FSK has a long and flexible side chain, whereas FSY features a short and rigid side chain. To determine the optimal reaction distances for FSY and FSK, FSY or FSK was incorporated at site 103 of E. coli glutathione transferase (ecGST), strategically located at the dimer interface to target His106 and Lys107 of the other monomer. Based on the ecGST crystal structure, the distances from the Cα of residue 103 to the Nδ atom of His106 and to the Nε atom of Lys107 are 7.8 Å and 6.0 Å, respectively (Fig. 8A). Incorporation of FSY at site 103 resulted in significant dimeric cross-linking, while no cross-linking was observed with FSK, indicating that FSK was not effective in targeting nucleophilic residues located too close in the restricted space of the dimer interface. Subsequent experiments evaluated FSK’s capability to form cross-links with target residues that lie beyond FSY’s reach. Specifically, FSY and FSK were incorporated at site 65 of ecGST, which is surrounded by several nucleophilic residues including Lys93, Tyr100, Lys132, and Tyr135, ranging from 9.2 Å to 13.3 Å from the Cα of Glu65 (Fig. 8B). These distances are potentially within FSK’s reactive range but too long for FSY. Upon incorporation at site 65, FSK resulted in significant dimeric cross-linking while FSY failed to do so. These side-by-side comparative studies demonstrated the critical impact of distance on the efficiency of protein-protein cross-linking and suggested the possibility of achieving selectivity via the strategic placement of SuFEx groups.
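The distance reasoning above can be illustrated with a short sketch that screens candidate nucleophilic atoms by their distance from the Cα of the incorporation site. This is only an illustration under stated assumptions: the coordinates below are invented (they are not taken from PDB 1A0F), and the distance windows passed to the function are hypothetical placeholders, since the text reports observed distances rather than fixed reach limits for FSY or FSK.

```python
import math


def candidates_in_window(anchor, targets, min_d, max_d):
    """Return {atom_name: distance in Å} for targets within [min_d, max_d] of anchor.

    anchor: (x, y, z) of the Cα at the incorporation site.
    targets: mapping of atom label -> (x, y, z) of a nucleophilic side-chain atom.
    """
    hits = {}
    for name, xyz in targets.items():
        d = math.dist(anchor, xyz)  # Euclidean distance (Python 3.8+)
        if min_d <= d <= max_d:
            hits[name] = round(d, 1)
    return hits


# Invented coordinates in Å, for illustration only.
ca_site = (0.0, 0.0, 0.0)
nucleophiles = {
    "His_ND1": (7.8, 0.0, 0.0),
    "Lys_NZ": (0.0, 6.0, 0.0),
    "Tyr_OH": (0.0, 0.0, 13.3),
}

# Hypothetical windows for a short/rigid and a long/flexible warhead side chain.
print(candidates_in_window(ca_site, nucleophiles, 5.0, 8.0))
print(candidates_in_window(ca_site, nucleophiles, 9.0, 14.0))
```

In practice the anchor and target coordinates would be read from a crystal structure, and the windows would be calibrated against experimental cross-linking results such as those described above.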

Fig. 8

The application of FSY and FSK in mapping PPIs. A Crystal structure of ecGST dimer showing the distance between site 103 with His106 and Lys107 of the other monomer. PDB: 1A0F; B Crystal structure of ecGST dimer showing the distances between site 65 and the adjacent nucleophilic residues. PDB: 1A0F; C Workflow of profiling PPIs via FSY or FSK

Subsequently, the authors mapped the interactome of thioredoxin (Trx) in E. coli utilizing FSY and FSK. Unlike traditional methods that primarily target cysteine residues within proteins, FSY and FSK provide a significant advantage by enabling the targeting of Lys, His, and Tyr residues via SuFEx. As a proof-of-concept, FSY or FSK was incorporated at site 62 of thioredoxin in E. coli cells, which is likely located at the periphery of the binding interface with substrate proteins. Following protein-protein cross-linking in vivo, the cross-linked proteins were isolated, digested with trypsin, and analyzed through tandem mass spectrometry (Fig. 8C). MS analysis of cross-linked peptides showed that 12 substrate proteins of Trx were identified for both FSY and FSK. Among these substrate proteins were AHPC, TPX, SDHA, HPTG, and CH10, which are known substrates of Trx, validating the efficacy of the method. FSY and FSK led to the identification of different subsets of interacting proteins, with some overlaps such as DNAK, AHPC, and TPX. However, for the same substrate proteins AHPC and DNAK, FSK and FSY cross-linked to different residues, illustrating the critical importance of distance in proximity-enabled SuFEx. These studies explored the complementary utility of FSY and FSK in mapping protein-protein interactions (PPIs). By enabling the targeting of additional amino acid residues beyond cysteine and allowing the site-specific incorporation of fluorosulfate in vivo, this strategy greatly expanded the toolkit available for probing PPIs in vivo.

Burkart and co-workers employed a small molecule named BSF3, featuring an aryl sulfonyl fluoride, to cross-link the interacting acyl carrier protein (AcpP) and BioF to investigate AcpP-BioF interactions. BioF is a pyridoxal 5′-phosphate (PLP)-dependent enzyme that catalyzes the first committed step of biotin biosynthesis. BSF3 contains a pantetheine moiety, which was used to conjugate the aryl sulfonyl fluoride to AcpP via an enzymatic reaction (Fig. 9). The aryl sulfonyl fluoride group installed on AcpP could cross-link AcpP and BioF upon protein-protein binding. Crystal structure and LC-MS/MS analysis showed that the pantetheine moiety was linked to Ser36 of AcpP, enabling the aryl sulfonyl fluoride to react with Tyr264 of BioF. Subsequently, the authors performed alanine mutagenesis on the interface residues and studied the effect of each mutation on cross-linking efficiency to identify the key residue(s) involved in the AcpP-BioF interaction. For example, it was found that the R130A, R148A, and R149A mutations greatly reduced the cross-linking efficiency from 35% to 6%, 11%, and 6%, respectively, whereas Q146A and K6A both had limited effects on cross-linking. This visualization of a PPI via protein-protein cross-linking offers a molecular basis for understanding the dynamics of biotin biosynthesis and could be adapted to investigate various other PPIs.

Fig. 9

Workflow of mapping AcpP-BioF interaction via a small molecule named BSF3

The incorporation of FSY and its analogs into proteins enables the proteins to form a covalent bond with other biological macromolecules, leading to the development of covalent protein drugs. Covalent protein drugs were first demonstrated on a programmed cell death protein-1 (PD-1)/PD-L1 pair for cancer immune therapy [51]. FSY was incorporated into PD-1 to covalently bind to PD-L1, and the irreversible blocking of PD-L1 exhibited much increased antitumor efficacy over the noncovalent wildtype PD-1. Subsequently, this strategy has been used to build covalent antibodies for human rhinovirus 14 (HRV14) 3C protease [73], covalent sugar binders for cancer cell surface sialoglycans [72] and covalent protein inhibitors for neutralizing SARS-CoV-2 [41, 74]. Recently, covalent protein drugs were demonstrated to enhance efficacy and safety for targeted radionuclide therapies, showing the great potential of this strategy in developing novel biotherapeutics [75]. The recent progress in designing covalent protein drugs has been systematically summarized in other reviews and is not discussed in detail in this review [25].

SuFEx in probing protein-RNA interactions

Wang and co-workers further expanded the application of proximity-enabled SuFEx to probe protein-RNA interactions in living cells [71]. RNA-binding proteins (RBPs) regulate almost all aspects of RNA molecules inside cells, and RBP–RNA interactions regulate the fate and function of RNA. To precisely map RBP-RNA interactions, unnatural amino acids featuring aryl fluorosulfate or aryl sulfonyl fluoride were site-specifically incorporated into RBPs to cross-link with the interacting RNA (Fig. 10). As a proof-of-concept, FSY was incorporated into catalytically inactive Cas13b, a type VI RNA-guided RNA-targeting CRISPR-Cas effector. When FSY was incorporated at site 133 of Cas13b, it enabled efficient cross-linking between Cas13b and RNA as revealed by electrophoretic mobility shift assays (EMSAs). Using Cas13b as a model protein, it was found that FSY could cross-link with all four nucleotides (A, U, G, C) in RNA. Since there is no available nucleophile in uracil, the authors argued that FSY could target the ribose 2′-hydroxy group. To enhance the scope of SuFEx-based cross-linking within cellular environments, they developed another unnatural amino acid named SFY, which features an aryl sulfonyl fluoride group. Aryl sulfonyl fluoride is more reactive than aryl fluorosulfate [29, 43]. Unlike FSY, where the SuFEx warhead is placed at the para position, SFY contains the SuFEx warhead at the meta position. Because of this difference in the positioning of the reactive groups, SFY and FSY can complement each other in targeting nucleophiles with different orientations. An in vivo method for detecting N6-methyladenosine (m6A) in mammalian cells with single-nucleotide resolution was developed using SFY. By integrating SFY into the m6A recognition site of an m6A-binding protein, the modified protein could form covalent bonds with nucleotides adjacent to m6A sites, allowing for the precise mapping of m6A locations.
This strategy, which can covalently cross-link interacting protein-RNA pairs with high specificity in vivo, is a powerful platform to profile RBP-RNA interactions and provides an innovative solution for RNA-related research and therapeutics.

Fig. 10

The application of FSY and SFY in mapping protein-RNA interactions. FSY is used as a representative SuFEx warhead

Analytical methodologies for verifying covalent bond formation

Various techniques have been employed to verify the covalent bond formation between SuFEx probes and their targets, including mass spectrometry, electrophoresis gel analysis, and X-ray crystallography [12, 25, 44, 53]. Intact protein mass spectrometry has been used to verify the adduct products between SuFEx probes and their targets. To confirm the specific amino acid that is subjected to covalent modification, the formed adduct can be further digested by trypsin and analyzed by LC-MS/MS. Additionally, the covalent bond formation can be visualized via electrophoresis gel analysis. Many small molecule SuFEx probes have a click handle (e.g. the terminal alkyne in probes 1 to 6). Upon covalent bond formation with their protein targets, a fluorophore or biotin can be conjugated to SuFEx probe-modified proteins via click chemistry and give corresponding protein bands on electrophoresis gel. SuFEx can cross-link the interacting proteins, leading to protein molecular weight change, which can be visualized in SDS-PAGE. The covalent linkage between the SuFEx probes and their targets can be shown in the electron density map of X-ray crystal structure, providing detailed confirmation of the covalent interactions.
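For intact protein mass spectrometry, the expected mass of a SuFEx adduct can be estimated from the reaction stoichiometry: each exchange event joins the probe to the protein with loss of one equivalent of HF. The sketch below encodes only that bookkeeping. The function name and the example masses are hypothetical, and the calculation ignores charge states, isotopic envelopes, and any other modifications.

```python
HF_AVG_MASS = 20.0063  # Da, average mass of HF released per SuFEx event


def expected_adduct_mass(protein_mass: float, probe_mass: float, n_labels: int = 1) -> float:
    """Expected average mass (Da) of a protein carrying n_labels SuFEx adducts.

    Each labeling event adds the probe mass minus the mass of HF.
    """
    if n_labels < 0:
        raise ValueError("n_labels must be non-negative")
    return protein_mass + n_labels * (probe_mass - HF_AVG_MASS)


# Hypothetical masses for illustration only.
protein = 30000.0  # Da
probe = 400.0      # Da
print(f"1:1 adduct: {expected_adduct_mass(protein, probe):.2f} Da")
print(f"mass shift per label: {probe - HF_AVG_MASS:.2f} Da")
```

A deconvoluted mass that matches this value supports a 1:1 adduct, after which peptide-level LC-MS/MS can localize the modified residue as described above.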

SuFEx kinetics and implications

For various emerging SuFEx applications in proteins, the formation of the covalent linkage is a key step. SuFEx kinetics dictate the rate of covalent bond formation, directly influencing potency and labeling efficacy. Therefore, it’s imperative to understand SuFEx reaction kinetics under physiological conditions. In this section, we delve into the factors that can be manipulated to modulate the reactivity and stability of SuFEx warheads.

Grimster and co-workers extensively evaluated the effect of substitution, including steric and electronic factors, on the reactivity and stability of aryl sulfonyl fluoride analogs [29]. First, a series of mono-substituted aryl sulfonyl fluoride compounds featuring various substituents at the para and ortho positions were synthesized to examine the electronic effect (Fig. 11). The synthesized sulfonyl fluoride compounds (1 mM) were incubated with various nucleophilic amino acids (10 mM) including N-acetylcysteine, N-acetyltyrosine, N-acetyllysine, or N-acetylserine in PBS (pH 7.5 with 5% ACN). It was found that the intermediate generated from the reaction of sulfonyl fluoride with cysteine is unstable and can further react with excess cysteine to produce a sulfinic acid and a disulfide product (Fig. 2C). When N-acetyltyrosine was used as a nucleophile, Hammett analysis revealed a strong correlation between the electron-withdrawing properties of the substituents and the reaction rates. ortho-Substituted sulfonyl fluorides exhibited a similar correlation between electron-withdrawing properties and reaction rates. N-acetyllysine reacted with various sulfonyl fluorides (14a to 14e) to afford the corresponding sulfonamides, though at a slower rate than N-acetyltyrosine (2.9-fold) and N-acetylcysteine (10-fold), possibly because the lysine side chain is predominantly protonated at physiological pH. No reaction between N-acetylserine and sulfonyl fluorides was observed under these conditions, even with the most electron-deficient sulfonyl fluoride analog, suggesting the low reactivity of serine towards sulfonyl fluoride. Taken together, under near-physiological conditions (pH 7.5, PBS), the reactivity of nucleophilic amino acids towards sulfonyl fluoride follows the order cysteine > tyrosine > lysine > serine.
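A Hammett analysis of the kind mentioned above fits log10(k_X / k_H) against the substituent constant σ, with the slope ρ reporting how strongly electron-withdrawing groups accelerate the reaction (a positive ρ means acceleration). The sketch below shows the fit with an ordinary least-squares slope. The σ values are standard para-substituent constants, but the rate constants are synthetic numbers generated for illustration and are not data from the cited study; the function name `hammett_rho` is likewise hypothetical.

```python
import math


def hammett_rho(sigmas, rate_constants, k_ref):
    """Least-squares slope (rho) of log10(k / k_ref) versus sigma."""
    ys = [math.log10(k / k_ref) for k in rate_constants]
    n = len(sigmas)
    mean_x = sum(sigmas) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in sigmas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sigmas, ys))
    return sxy / sxx


# Standard sigma_para constants: OMe, H, Cl, NO2.
sigmas = [-0.27, 0.0, 0.23, 0.78]

# Synthetic rate constants built to follow rho = 2 exactly (illustration only).
k_h = 1.0e-3  # rate constant of the unsubstituted compound, arbitrary units
rates = [k_h * 10 ** (2.0 * s) for s in sigmas]

print(f"rho = {hammett_rho(sigmas, rates, k_h):.2f}")
```

With real measurements, scatter around the fitted line and the sign and magnitude of ρ would indicate how well a single electronic parameter accounts for the observed reactivity trend.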

Fig. 11

Structure of aryl sulfonyl fluoride and aryl fluorosulfate analogs used to evaluate the substitution effect on their reactivity and stability

Subsequently, the authors studied the hydrolysis rates of the sulfonyl fluoride analogs at physiological pH and confirmed that the hydrolysis rate correlates with the electron deficiency of the analog. The half-lives of analogs bearing strong electron-withdrawing groups (14b, 14c, 14f, and 14g) were around 5–15 min, whereas analogs bearing electron-donating groups, such as 14e and 14o, were significantly more stable, with half-lives of up to several days. The fluorosulfate analogs (16 and 17) were found to be very stable: no hydrolysis was observed at pH 7.5 over 24 h at 37 °C, and no substitution product was observed when they were incubated with either N-acetyltyrosine or N-acetylcysteine (10 mM).
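To put these half-lives in perspective, they can be converted into rate constants and into the fraction of warhead that survives a typical incubation. The sketch below assumes simple first-order (pseudo-first-order in buffered water) decay, which is an assumption made here for illustration; the function names are hypothetical.

```python
import math


def rate_constant_from_half_life(t_half):
    """First-order rate constant k = ln(2) / t_half (inverse unit of t_half)."""
    return math.log(2) / t_half


def fraction_remaining(t_half, t):
    """Fraction of intact warhead after time t, assuming first-order decay."""
    return 0.5 ** (t / t_half)


# Half-lives at the two ends of the 5-15 min range quoted in the text.
for t_half_min in (5.0, 15.0):
    k = rate_constant_from_half_life(t_half_min)
    left = fraction_remaining(t_half_min, 60.0)
    print(f"t1/2 = {t_half_min:4.1f} min -> k = {k:.3f} min^-1, "
          f"{left:.2%} intact after 60 min")
```

Under this assumption, an analog with a 15 min half-life retains about 6% of the intact warhead after 1 h, and one with a 5 min half-life retains essentially none, which illustrates why the most electron-deficient analogs are hard to use in long incubations.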

The relationship between structure and chemical reactivity of SuFEx warheads was further studied by Bush and co-workers [76]. A series of analogs bearing SuFEx warheads was prepared (Fig. 12A). The half-lives of these compounds ranged from 35 min to more than 1700 h when incubated in PBS (pH 7). Para-amide and para-sulfonamide substituted sulfonyl fluorides hydrolyzed faster than the meta analogs (18a vs 18b, 18c vs 18d). Electron-donating substituents markedly increased the stability (18e vs 18c, 18f vs 18d). The fluorosulfate (18g), the sulfonyl fluoride conjugated to pyrrole (18h), and the N-linked sulfonyl fluoride (18i) displayed the greatest stability, with negligible hydrolysis over 24 h at pH 8. The reactivity of 18a–i with the nucleophilic amino acids correlated closely with their hydrolysis rates: the less stable analogs were the more reactive. The reactivity towards different amino acids increased in the order His < Lys < Tyr < Cys under physiological conditions. To explore the structural effects of various sulfonyl fluorides in proteome applications, a panel of XO44 analogs with different SuFEx warheads was prepared (Fig. 12B). XO44 is a covalent probe for kinases that reacts with the conserved catalytic lysine residue in the kinase ATP-binding pocket (see structure of XO44 in Fig. 3). These probes (10 µM) were incubated with CDK2 kinase protein (1 µM) in buffer (pH 7.5), and the protein adducts were monitored via LC-MS. The majority of probes reacted with CDK2 with similar kobs (0.3–0.8 × 10⁻³ s⁻¹) despite the variance in intrinsic reactivity. For the highly reactive probes 19a and 19b, the reaction yield reached a plateau of 75% and 90%, respectively, indicating a competing hydrolysis of the sulfonyl fluoride. Probes 19f and 19h reacted slowly with CDK2 (20% after 2 h), and 19i did not yield any modification of CDK2.
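The link between a sub-quantitative plateau and competing hydrolysis can be made explicit with a simple model. The sketch below is an illustrative simplification, not the analysis performed in ref. [76]. It assumes that the probe is in excess and is depleted only by first-order hydrolysis (rate constant k_hyd), and that the protein is labeled with a pseudo-first-order rate constant that starts at k_obs0 and decays in step with the remaining probe. The non-covalent binding step and probe consumption by the protein are ignored, and all function names are hypothetical.

```python
import math


def labeled_fraction(k_obs0, k_hyd, t):
    """Fraction of protein labeled at time t.

    Solves d[P]/dt = -k_obs0 * exp(-k_hyd * t) * [P] for the unlabeled protein P.
    """
    if k_hyd == 0.0:
        return 1.0 - math.exp(-k_obs0 * t)
    exposure = (k_obs0 / k_hyd) * (1.0 - math.exp(-k_hyd * t))
    return 1.0 - math.exp(-exposure)


def plateau(k_obs0, k_hyd):
    """Labeled fraction at infinite time, once all probe has hydrolyzed."""
    return 1.0 - math.exp(-k_obs0 / k_hyd)


def k_hyd_for_plateau(k_obs0, plateau_fraction):
    """Hydrolysis rate constant that would produce the given plateau."""
    return k_obs0 / -math.log(1.0 - plateau_fraction)


# ASSUMPTION: k_obs0 is set to the upper end of the reported kobs range purely
# for illustration; it is not the measured value for any specific probe.
K_OBS0 = 0.8e-3  # s^-1
for target in (0.75, 0.90):
    k_hyd = k_hyd_for_plateau(K_OBS0, target)
    print(f"plateau {target:.0%}: k_obs0/k_hyd = {K_OBS0 / k_hyd:.2f}, "
          f"implied hydrolysis half-life = {math.log(2) / k_hyd / 60:.0f} min")
```

Within this model, the plateau depends only on the ratio k_obs0/k_hyd: a 75% plateau corresponds to a ratio of ln 4 (about 1.39) and a 90% plateau to ln 10 (about 2.30), so a probe must label its target only a few times faster than it hydrolyzes to approach completion.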

Fig. 12

Compounds used to evaluate the structure and chemical reactivity relationship of SuFEx warheads

Wang and co-workers systematically studied the kinetics of proximity-enabled SuFEx in the protein context. Proximity-enabled SuFEx has emerged as a powerful strategy to cross-link interacting proteins, enabling the profiling of PPIs in vivo and the development of covalent protein drugs. Since the formation of the covalent linkage between proteins is the critical step, SuFEx kinetics was investigated in different protein pairs and under different conditions. SuFEx kinetics was first studied using protein pairs with distinct dissociation constants (Kd). The Zspa affibody (Afb) binds its target Z protein with a Kd of ∼6 µM. FSY (see structure in Fig. 7B) was incorporated at site 24 of the Z protein (Z-24FSY) to react with Lys7 of Afb (Fig. 13A). Z-24FSY (6 µM) was incubated with Afb at concentrations ranging from 3 to 192 µM in PBS (pH 7.4) at 37 °C. The kobs for covalent protein complex formation initially increased with increasing Afb concentration and plateaued at 24 µM Afb with a maximum rate constant of kmax = 0.0597 ± 0.0019 h⁻¹, showing a nonlinear dependence on protein concentration (Fig. 13B). This nonlinear dependence arises because the covalent linkage between two interacting proteins forms in two steps. First, protein–protein binding forms a noncovalent complex, which places the SuFEx warhead close to a nucleophile on the target protein. Subsequently, covalent bond formation leads to a cross-linked complex. These kinetics are similar to those of covalent small-molecule inhibitors acting on a target protein. The second-order rate constant, k = kmax/KS, was determined to be (1.32 ± 0.04) × 10⁴ M⁻¹ h⁻¹, where KS stands for the concentration needed to reach half of the maximum reaction rate. The kinetics was then evaluated using another protein pair, the 7D12 nanobody and human epidermal growth factor receptor (EGFR), which has a Kd of 200 nM. FSY was incorporated at site 109 of the 7D12 nanobody to react with Lys443 of EGFR (Fig. 13C). The second-order rate constant k was measured to be (1.68 ± 0.09) × 10⁵ M⁻¹ h⁻¹, which is much higher than that of the Z protein–Afb pair, showing the impact of binding affinity on SuFEx kinetics (Fig. 13D).
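The two-step scheme implies a saturation curve for kobs. The sketch below assumes the usual hyperbolic form kobs = kmax·[Afb]/(KS + [Afb]), which is consistent with the definitions of kmax and KS given above but is not written out in the text, and it uses the reported kmax and k values to back-calculate KS. The doubling series of concentrations is an assumption chosen to span the stated 3–192 µM range, and the model ignores depletion of Afb at the lowest concentrations.

```python
def k_obs(conc, k_max, k_s):
    """Observed rate constant for a two-step (bind, then react) scheme."""
    return k_max * conc / (k_s + conc)


# Values reported in the text for the Z-24FSY / Afb pair.
K_MAX_Z_AFB = 0.0597      # h^-1
K2_Z_AFB = 1.32e4         # M^-1 h^-1, second-order rate constant k = kmax / KS
# Value reported in the text for the 7D12(109FSY) / EGFR pair.
K2_7D12_EGFR = 1.68e5     # M^-1 h^-1

# KS follows from the definition k = kmax / KS.
K_S_Z_AFB = K_MAX_Z_AFB / K2_Z_AFB   # in M
print(f"KS (Z-24FSY/Afb) = {K_S_Z_AFB * 1e6:.1f} uM")

# ASSUMED doubling series covering the 3-192 uM range mentioned in the text.
for conc_um in (3, 6, 12, 24, 48, 96, 192):
    rate = k_obs(conc_um * 1e-6, K_MAX_Z_AFB, K_S_Z_AFB)
    print(f"[Afb] = {conc_um:3d} uM -> kobs = {rate:.4f} h^-1 "
          f"({rate / K_MAX_Z_AFB:.0%} of kmax)")

print(f"k(7D12/EGFR) / k(Z/Afb) = {K2_7D12_EGFR / K2_Z_AFB:.1f}")
```

The back-calculated KS of about 4.5 µM is of the same order as the ∼6 µM Kd of the Z protein–Afb pair, and the second-order rate constant of the tighter-binding 7D12–EGFR pair is roughly 13-fold higher.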

Fig. 13

Kinetics of proximity-enabled SuFEx in the protein context. A Crystal structure of the Afb (cyan) in complex with the Z protein (yellow). FSY was incorporated at site 24 of the Z protein to target Lys7 of Afb, shown in sticks (PDB: 1LP1); B Plot of kobs against Afb concentration used to determine the second-order rate constant k; C Crystal structure of nanobody 7D12 (cyan) in complex with human EGFR (yellow). FSY was incorporated at site 109 of 7D12 to target Lys443 of EGFR, shown in sticks; D Plot of kobs against 7D12(109FSY) concentration used to determine the second-order rate constant k

The authors then studied the effects of different amino acid side chains and pH on protein SuFEx kinetics. Lys7 of Afb (Afb-7Lys) was mutated to His or Tyr to generate Afb-7His or Afb-7Tyr, respectively. Z-24FSY (6 µM) was incubated with 192 µM Afb-7Lys, Afb-7His, or Afb-7Tyr in PBS (pH 7.4) at 37 °C. The kmax values for the reaction of Afb-7His, Afb-7Lys, and Afb-7Tyr with Z-24FSY were 0.110 ± 0.001 h⁻¹, 0.057 ± 0.006 h⁻¹, and 0.022 ± 0.007 h⁻¹, respectively. At pH 8.8, the corresponding kmax values were 0.135 ± 0.015 h⁻¹, 0.273 ± 0.058 h⁻¹, and 0.077 ± 0.001 h⁻¹. Thus, Afb-7His showed the fastest reaction with FSY at pH 7.4, whereas Afb-7Lys was fastest at pH 8.8. Increasing the pH from 7.4 to 8.8 significantly elevated the SuFEx rates for Lys (5-fold) and Tyr (3.5-fold) but had only a minor effect on His. These results highlight the impact of the pKa of amino acid side chains on reaction kinetics. The authors then compared the SuFEx kinetics of aryl fluorosulfate and aryl sulfonyl fluoride using E. coli glutathione transferase (ecGST), a homodimeric protein. mFSY and SFY (see structures in Fig. 7B), featuring aryl fluorosulfate and aryl sulfonyl fluoride, respectively, were incorporated at site 103 of ecGST at the dimer interface to target Lys107 of the other monomer. SFY exhibited a much faster covalent dimerization rate than mFSY, suggesting that aryl sulfonyl fluoride is more reactive than aryl fluorosulfate in this protein context.
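The pH dependence can be rationalized, at least qualitatively, by how much of each side chain is in its deprotonated, nucleophilic form. The sketch below computes the measured fold changes from the kmax values quoted above and compares them with a Henderson–Hasselbalch estimate. The pKa values are typical textbook numbers for free side chains and are an assumption made here; pKa values in a protein environment can differ substantially.

```python
# kmax values (h^-1) quoted in the text for reaction with Z-24FSY.
K_MAX = {
    "His": {7.4: 0.110, 8.8: 0.135},
    "Lys": {7.4: 0.057, 8.8: 0.273},
    "Tyr": {7.4: 0.022, 8.8: 0.077},
}

# ASSUMED textbook side-chain pKa values (not taken from the cited work).
ASSUMED_PKA = {"His": 6.0, "Lys": 10.5, "Tyr": 10.1}


def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch fraction of the deprotonated (nucleophilic) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def measured_fold_change(residue):
    """Ratio of kmax at pH 8.8 to kmax at pH 7.4."""
    return K_MAX[residue][8.8] / K_MAX[residue][7.4]


def estimated_fold_change(residue):
    """Ratio of deprotonated fractions at pH 8.8 and pH 7.4."""
    pka = ASSUMED_PKA[residue]
    return fraction_deprotonated(pka, 8.8) / fraction_deprotonated(pka, 7.4)


for residue in ("His", "Lys", "Tyr"):
    print(f"{residue}: measured {measured_fold_change(residue):.1f}-fold, "
          f"deprotonated-fraction estimate {estimated_fold_change(residue):.1f}-fold")
```

With these assumed pKa values, the estimate reproduces the trend (His is almost fully deprotonated at both pH values, whereas Lys and Tyr gain nucleophilic form at higher pH) but gives fold changes for Lys and Tyr considerably larger than those measured, so side-chain ionization alone does not account for the observed rates.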

Overall, the kinetics of SuFEx should be carefully evaluated before the chemistry is applied in a given setting.

  1. Stability and reactivity are closely correlated. Higher electrophilic reactivity often leads to faster competing hydrolysis. Therefore, achieving a balance between reactivity and stability is crucial when designing versatile SuFEx warheads.

  2. Proximity effects drive SuFEx reactions in protein environments. For example, fluorosulfate shows minimal reactivity towards free nucleophilic amino acids, yet reacts rapidly with protein targets when it is placed close to a nucleophilic residue through protein-protein or protein-ligand binding. Installing SuFEx warheads at different positions (e.g. para vs meta, or different sites of a protein) can lead to significantly different reaction rates.

  3. The microenvironment of the reactive amino acid residue plays an important role in modulating the kinetics of SuFEx. The pKa of the side chains of nucleophilic amino acids can be lowered (pKa perturbation) through interactions with surrounding residues, making them ideal targets for SuFEx warheads. Another example is the reactivity of His, Lys, and Tyr towards SuFEx warheads. At the small-molecule level, the reactivity order is His < Lys < Tyr. However, in the protein context, based on limited data, the order is His > Lys > Tyr. This discrepancy may be attributed to pKa perturbation caused by the microenvironment, and further studies on additional protein pairs are needed to clarify this issue.

Conclusion/Outlook

SuFEx warheads such as aryl sulfonyl fluoride and aryl fluorosulfate are mildly reactive, and their reaction with nucleophilic amino acids (e.g. Lys, Tyr, His) can be greatly accelerated by the proximity effect that accompanies a binding event. Therefore, they can be installed on both small-molecule ligands and proteins to investigate protein-biomolecule interactions, including protein-ligand, protein-protein, and protein-RNA interactions. Target identification is the critical first step in drug discovery. SuFEx chemistry specifically generates covalent bonds during protein-biomolecule interactions, offering several advantages over traditional methods such as non-covalent affinity pulldown assays. In non-covalent affinity pulldown assays, the non-covalent interactions often do not withstand stringent washing conditions. Therefore, mild washing conditions are usually necessary, and non-specific interactions are unavoidably preserved, complicating the target identification process. In contrast, the covalent bonds formed via SuFEx survive stringent washing, which reduces non-specific interactions and provides cleaner target identification data. Additionally, covalent bond formation via SuFEx could provide a higher-resolution view of protein-biomolecule interaction patterns than traditional non-covalent methods. Mass spectrometry analysis of cross-linked peptides enables the identification of the specific domains or amino acids involved in protein-biomolecule interactions, providing information that is typically not captured by non-covalent methods.

The selection of a suitable SuFEx warhead also depends on the application. Aryl sulfonyl fluoride is more reactive than aryl fluorosulfate, and the reactivity of both can be fine-tuned via electron-donating or electron-withdrawing substituents on the aromatic ring. For proteome work, SuFEx warheads with higher reactivity might be favorable to afford a desirable pull-down yield. For covalent drug development, stability and reactivity should be fully evaluated to minimize off-target cross-linking. Based on our experience in developing covalent protein drugs, aryl fluorosulfate is a better option than aryl sulfonyl fluoride.

In theory, SuFEx warheads offer unique advantages over other chemical functional groups such as acrylamide and photoreactive groups (e.g. azido, diazirine). Acrylamide mainly targets cysteine, while aryl sulfonyl fluoride and aryl fluorosulfate can effectively form a covalent bond with His, Lys, and Tyr, greatly extending the application of covalent probes. Photoreactive groups generate highly reactive intermediates and often lead to off-target cross-linking. Comparative studies of SuFEx warheads and other cross-linkers are still needed. The application of SuFEx warheads in profiling protein-biomolecule interactions is at an early stage, and this strategy awaits further investigation in more protein examples.