
1 Introduction to Native Chemical Ligation

Native chemical ligation (NCL) is a proven and versatile method for the preparation of peptides, cyclic peptides, proteins, and glycoproteins via synthetic or semi-synthetic pathways [1–5]. The concept of NCL dates back to the pioneering work of Wieland et al. [6], but the method gained widespread attention only in 1994, when Kent et al. reported its application in the synthesis of interleukin-8, a cytokine responsible for the proliferation of B cells during immune response [7].

Approaches to ligation of two peptide segments include prior thiol capture [8], NCL [1], conformationally assisted ligation [9], and Staudinger ligation [10], of which NCL is the most widely used method. NCL involves the chemoselective coupling of two protein or peptide fragments, one containing a C-terminal thioester and the other, typically, an N-terminal Cys residue (Fig. 1). The two components combine via an intramolecular S- to N-acyl shift to produce, irreversibly, the ligated amide bond (a native peptide bond) at the point of ligation. The fact that this reaction occurs in aqueous solution and in the absence of protecting groups has placed this powerful technology at the forefront of protein synthesis. The driving force for NCL is the formation of the thermodynamically stable amide link, at around neutral pH [3].

Fig. 1

Intermolecular chemical ligation (NCL)

The rate of the ligation reaction depends on the C-terminal amino acid. Dawson et al. reported that all 20 amino acids could be used, but amino acids such as Pro, Val, and Ile gave reduced reaction rates (Fig. 2) [10]. Additives such as urea and guanidinium chloride can be added to the buffer solution to prevent the aggregation of amino acids or peptides in the solution phase synthesis [11].

Fig. 2

NCL with all amino acids
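As a rough illustration of how the Cys requirement and the dependence on the C-terminal residue combine when choosing a ligation site, the sketch below scans a one-letter sequence for Xaa-Cys junctions and flags those where Xaa is Pro, Val, or Ile, the residues reported above to give reduced rates. The function name, the example sequence, and the binary slow/normal label are assumptions made for illustration only; real ligation rates vary continuously with the residue and the reaction conditions.

```python
# Residues reported in the text to give reduced ligation rates when they
# occupy the C-terminal (thioester) position.
SLOW_C_TERMINAL = {"P", "V", "I"}


def candidate_junctions(sequence):
    """Return (position, Xaa, is_slow) for every Xaa-Cys junction.

    `position` is the 1-based index of the Cys that would become the
    N-terminal residue of the downstream fragment.
    """
    sequence = sequence.upper()
    junctions = []
    for i in range(1, len(sequence)):
        if sequence[i] == "C":
            xaa = sequence[i - 1]
            junctions.append((i + 1, xaa, xaa in SLOW_C_TERMINAL))
    return junctions


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to exercise both outcomes.
    for pos, xaa, slow in candidate_junctions("GAVCKLGCSIC"):
        label = "slow" if slow else "normal"
        print(f"{xaa}-Cys junction at residue {pos}: {label}")
```

A Cys at the very first position is skipped because it has no preceding residue and therefore defines no junction.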

The requirement of a Cys residue (Xaa-Cys, where Xaa=any amino acid) at the ligation site can be problematic because not all proteins contain a Cys residue and the Cys residues may be present at locations in the protein which are not appropriate for NCL [12]. This has resulted in efforts to carry out ligations at Xaa-Xaa sites in which a thiol group is attached to a side chain of the N-terminal amino acid. Subsequent desulfurization then affords the desired peptide (Fig. 3) [13]. Desulfurization of thiol-modified amino acids can be achieved using Raney nickel or Pd/Al2O3, or under free-radical-based conditions [12, 14]. To date, the ligation-desulfurization technique has been achieved at Ala [12], Phe [15], Val [16], Lys [17, 18], Thr [19], Leu [20], Pro [21], Gln [22], and, most recently, Arg residues [23]. The design of thiol-containing amino acids was recently reviewed by He et al. [24]. However, where cysteine residues are present in the peptide sequence at positions other than the ligation junction, protection of the side chain is required to prevent unselective desulfurization. In this instance the acetamidomethyl group is typically used [25], and Pentelute and Kent were the first to show its application in NCL [26].

Fig. 3

NCL and subsequent desulfurization

Dawson et al. demonstrated that the addition of thiophenol or benzyl mercaptan can increase the rate of ligation [27], and Johnson and Kent studied the effect of various thiols on NCL [28]. Aryl thiols were found to exchange with peptide alkyl thioesters to form peptide aryl thioesters, which are more reactive because the aryl thiolate is a better leaving group, thus facilitating the ligation. 4-Mercaptophenylacetic acid (MPAA) was found to be a more effective thiol additive than those previously used.

2 Solid Phase vs Solution Phase Chemical Ligation

The most successful method of fragment condensation for the synthesis of polypeptides and proteins in solution phase is NCL, first reported by Dawson et al. in 1994 [7]. This was a significant contribution because NCL overcomes one of the main limitations of solid phase peptide synthesis (SPPS), namely the production of long peptide sequences (>50 amino acid residues) [1, 3, 29]. NCL may be used in both solution and solid phases; solution phase NCL has been used for the synthesis of small peptides and cyclic peptides [30–32], whereas SPPS is more widely applied in polypeptide and protein synthesis.

SPPS has proved more useful than solution phase techniques in terms of ligation reactions for long peptides or polypeptides [33], because both N-terminal Cys-containing peptides and C-terminal thioester-containing peptides can be efficiently prepared on solid supports. Zhang et al. used 3-mercaptopropionyl MBHA resin for the preparation of thioesters using Boc chemistry [34]. However, harsh acidic conditions (HF/anisole, 9:1) were required to cleave the peptide thioester from the solid support. Clippingdale et al. reported the synthesis of peptide thioesters via Fmoc SPPS in order to avoid the use of HF [35]. However, piperidine, widely used for Fmoc-deprotection, may cause hydrolysis of peptide-α-thioesters, and thus methods to overcome this difficulty have been the subject of a number of studies [36].

Raibaut et al. demonstrated the advantages of SPPS for preparing peptide sequences for use in NCL. An efficient solid-phase synthesis of large polypeptides was achieved by iterative ligations of bis(2-sulfanylethyl)amido (SEA) peptide segments [37]. Sequential NCL by N- to C-elongation cycles between the supported peptide thioester (blocked with SEA) and a free C-terminal thiol group-SEA activated N-terminus peptide (Fig. 4) allowed the synthesis of peptide thioesters containing 60 amino acids and the assembly of five peptide segments to give a 15-kDa polypeptide [38].

Fig. 4

N- to C-assembly of peptides

3 Intramolecular Chemical Ligation (Acyl Migration)

In general, NCL means intermolecular ligation, but the phrase has also been applied to intramolecular ligation, often called acyl migration, which occurs when an acyl group migrates from X to N (X = S, O, N) within an isopeptide (Fig. 5). Recently, Panda et al. investigated chemical ligation from isopeptides in the solution phase via different cyclic transition states [39].

Fig. 5

Intramolecular chemical ligation

3.1 S- to N-Acyl Migration

Isopeptide ligation is an alternative method for the synthesis of cysteine peptides via an intramolecular chemical ligation by an entropically favored mechanism. S- to N-Acyl migration via various cyclic transition states was investigated by carrying out the ligation with mono-isopeptides under microwave irradiation (50°C, 50 W, 1–3 h) using 1 M NaH2PO4/Na2HPO4 phosphate buffer to maintain pH 7.3 (Scheme 1). The feasibility of intramolecular acyl migrations via 5- to 19-membered cyclic transition states was demonstrated and the yields of long-range S- to N-acyl transfers were found to depend on the size of the macrocyclic transition state (TS). Thus the relative rates, based on yields, depended on the ring size of the TS in the order 5 > 10 > 11 > 14, 16, 17 > 12 > 13, 15, 19 > 18 >>> 9 > 8 [40–44].

Scheme 1

Intramolecular ligation studies via S- to N-acyl migration
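The NaH2PO4/Na2HPO4 buffer at pH 7.3 mentioned above can be related to the proportions of its two components through the Henderson–Hasselbalch equation. The sketch below is only an estimate: it assumes the commonly tabulated second pKa of phosphoric acid (7.21 at 25 °C in dilute solution), whereas the apparent pKa in a 1 M buffer at 50 °C is lower, so the actual pH should be checked with a meter. The function names are illustrative.

```python
def base_to_acid_ratio(ph, pka):
    """Henderson-Hasselbalch: [base]/[acid] = 10 ** (pH - pKa)."""
    return 10 ** (ph - pka)


def phosphate_fractions(ph, pka=7.21):
    """Mole fractions of HPO4(2-) and H2PO4(-) at the given pH.

    pka=7.21 is the commonly tabulated second pKa of phosphoric acid at
    25 degrees C in dilute solution (an assumption; see the lead-in).
    """
    ratio = base_to_acid_ratio(ph, pka)
    base = ratio / (1.0 + ratio)
    return base, 1.0 - base


if __name__ == "__main__":
    base, acid = phosphate_fractions(7.3)
    print(f"Na2HPO4 fraction: {base:.2f}, NaH2PO4 fraction: {acid:.2f}")
```

Under these assumptions, slightly more than half of the phosphate is present as the dibasic salt at pH 7.3.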

3.2 O- to N-Acyl Migrations

Chemical ligation of serine isopeptides via O- to N-acyl transfer with 8- and 11-membered TSs occurs without the use of an auxiliary group (Scheme 2) [45]. This is in contrast to cysteine isopeptides, in which the 8-membered TS was disfavored even under basic conditions. Intramolecular acyl transfer of Thr isopeptide through 5- and 9-membered TSs was favored over 8- and 11-membered TSs (Scheme 2) [46].

Scheme 2

Ligation studies via O- to N-acyl migration

Chemical ligation studies of Tyr isopeptides under microwave irradiation (50°C, 50 W, 3 h) using 1 M phosphate buffer and a DMF-piperidine medium showed that intramolecular O- to N-acyl transfer occurs via 11- to 13-membered TSs under basic conditions and with 14- to 18-membered TSs in aqueous media (Scheme 3) [47].

Scheme 3

Ligation studies via O- to N-acyl migration

3.3 N- to N-Acyl Migrations

The intramolecular chemical ligation of tryptophan isopeptides via N- to N-acyl migration occurs through 7- to 18-membered cyclic TSs forming the native peptides in basic, non-aqueous media rather than aqueous buffered conditions (Scheme 4) [48].

Scheme 4

Ligation studies via N- to N-acyl migration

4 Applications of Native Chemical Ligation

4.1 Synthesis of Cyclic Peptides

In 1944 Gause and Brazhnikova discovered gramicidin S, a cyclic peptide [49], and used it in the treatment of septic gunshot wounds during the Second World War. Since then, both natural and unnatural cyclic peptides have become important synthetic targets because of their potential applications as antibiotics and other therapeutic agents [50–54]. Selected examples include anticancer agents (ADH-1), antibiotics (colistin), growth hormone inhibitors (octreotide), and immunosuppressant agents (cyclosporine A) (Fig. 6) [55].

Fig. 6

Selected cyclic peptides

The constrained conformation of cyclic peptides often results in increased exo- and endopeptidase resistance, enhanced binding affinity, and in certain cases, increased cell penetration compared to their linear counterparts. Numerous strategies, both in solution and solid-phase, have been reported for the synthesis of cyclic peptides [56–58]; NCL, the reaction of a C-terminal peptide thioester with an N-terminal cysteine peptide, is now an established method for production of peptides with a cyclized backbone [59–61].

Dawson demonstrated ligation strategies on 1,2-aminothiols which depend on a capture/rearrangement mechanism to link two peptide fragments under mild conditions [7]. The cyclization process involves a reaction between a weakly activated C-terminal thioester and an unprotected N-terminal cysteine residue. The thermodynamic stability of an amide bond over a thioester is again the driving force behind this reaction, made possible through a proximity-driven S- to N-acyl migration. Zhang and Tam used the above methodology to synthesize cyclic peptides in a head-to-tail fashion (Fig. 7) [62].

Fig. 7

Native chemical ligation applied to the head-to-tail cyclization of peptides

Hackenberger and Kleineweischede reported a traceless Staudinger ligation for the head-to-tail macrocyclization of peptides without a deprotection step (Fig. 8) [63]. In this strategy, a phosphine tethered to a thioester at the C-terminus of a peptide reacts intramolecularly with an azide at the N-terminus to form the cyclic peptide.

Fig. 8

The head-to-tail macrocyclization of peptides through a traceless Staudinger ligation strategy

Recently, cLac (cyclic peptide-mimicking lactadherin) was synthesized using NCL and studied for phosphatidylserine (PS) recognition [64, 65]. The linear precursors for the synthesis of cLac derivatives were prepared following standard protocols for automated Fmoc-peptide synthesis and the cyclic peptides (cLac variants) were obtained in the presence of 4-mercaptophenylacetic acid and isolated via HPLC in yields of 60–70% (Fig. 9). All the synthesized cLac variants were labeled with the thiol-reactive fluorescein-5-maleimide in DMF containing 2% N-methylmorpholine (NMM) by taking advantage of the free cysteine side chain [65].

Fig. 9

cLac and a cLac variant

The phosphatidylserine (PS) recognition study suggests that the cLac peptide effectively mimics the PS binding mechanism of lactadherin, in which multiple polar residues are conformationally preorganized by the protein or the cyclic peptide to balance various noncovalent forces (desolvation, hydrogen bonding, and salt bridges) for specific PS recognition [65].

Liu et al. reported a modified version of NCL to synthesize the native peptide from a C-terminal peptide hydrazide and an N-terminal Cys under NaNO2-mediated activation [66, 67]. This method (hydrazide ligation) was also applied in the synthesis of cyclic peptides. One of its important advantages is that peptide hydrazides can be easily prepared through routine Fmoc SPPS. The linear hydrazide peptides were synthesized from hydrazine-Trt(2-Cl) by following standard Fmoc SPPS. The peptide hydrazides cyclize in two steps, in a one-pot fashion, in the presence of NaNO2 and the thio-additive MPAA. Aqueous phosphate buffer containing 6.0 M guanidinium chloride was used as the solvent system. A mixture of an organic solvent with aqueous phosphate buffer also works as medium for this transformation (Scheme 5). A number of cyclic peptides were prepared by following this methodology in 18–65% yields (Table 1) [68].

Scheme 5

Synthesis of cyclic peptides on a solid support

Table 1 Cyclic peptides prepared by hydrazide ligation

SPPS is often used for the preparation of linear precursors required for NCL. Barany and Tulla-Puche reported NCL on resin to avoid the tedious purification usually necessary after each step. The linear precursor was prepared using 1-hydroxy-7-aza-benzotriazole (HOAt) or 1-hydroxybenzotriazole (HOBt) and N,N′-diisopropylcarbodiimide (DIPCDI) as coupling agents. Key aspects of on-resin NCL include Fmoc/tBu chemistry, side-chain anchoring, allyl protection of the penultimate residue to allow introduction of the C-terminal thioester later in the synthetic sequence, and a new derivative, Trt-Cys(Xan)-OH, which facilitates selective and mild removal of both protecting groups. The synthesis of cyclo(Cys-Thr-Abu-Gly-Gly-Ala-Arg-Pro-Asp-Phe) using on-resin NCL is illustrated in Scheme 6 [69].

Scheme 6

Synthesis of cyclic peptides via on-resin NCL

Fukuzumi et al. introduced a strategy in which linear peptides bearing side chains with unprotected functional groups could be cyclized under high dilution conditions [70]. This strategy, known as α-ketoacid-hydroxylamine amide-ligation, is achieved by the Fmoc-based SPPS of linear peptides bearing a C-terminal sulfur ylide linker, which acts as a ‘masked’ α-ketoacid (Fig. 10). The introduction of the N-hydroxylamine can also be carried out on a solid support. Subsequent global deprotection and cleavage from the solid support affords the linear peptide which, following sulfur ylide oxidation, undergoes the desired cyclization. The general applicability of this strategy was demonstrated by the preparation of five natural product cyclic peptides (Table 2).

Fig. 10

General strategy for the preparation of unprotected linear peptides with N-terminal hydroxylamine and C-terminal α-ketoacids for direct, reagent-less cyclizations

Table 2 Cyclic peptides prepared by α-ketoacid-hydroxylamine amide-ligation

van de Langemheen et al. synthesized cyclic peptides containing a thioester handle using a ‘sulfo-click’ linker. In this approach the ‘sulfo-click’ linker used for the synthesis of the linear precursor was prepared by Fmoc SPPS (Scheme 7) [71]. Three different cyclic peptide sequences were synthesized, corresponding to the loops present in HIV protein gp120 interacting with CD4 as found in the X-ray structure of the gp120-CD4 complex. On the basis of this structure, the SGGDPEIVT (residues 365–373), INMWQEVGKA (residues 424–433), and LTRDGGN (residues 454–460) peptide sequences were selected for the preparation of cyclic peptide thioesters (Fig. 11) [72]. HIV-gp120 plays a crucial role in the first steps of HIV-infection through its attachment to the CD4 receptor [73]. Preventing attachment of gp120 to cells and/or using gp120 as a starting point to develop a vaccine may offer alternative approaches to avoid the further spread of HIV [74, 75].

Scheme 7

Synthesis of linear precursors for NCL by Fmoc SPPS

Fig. 11

SGGDPEIVT (residues 365–373), INMWQEVGKA (residues 424–433), and LTRDGGN (residues 454–460) loops

Chen et al. developed a strategy for preparing cyclic peptides via in situ generation of a thioester resulting from disulfide reduction; subsequent NCL results in the desired peptide (Fig. 12) [76]. This strategy was used to synthesize linear glycopeptides, which, after thioester formation, resulted in cyclization to form a model glycopeptide in 73% yield (Scheme 8).

Fig. 12

Proposed mechanism for the in situ generation of thioesters and subsequent ligation

Scheme 8

Synthesis of cyclic glycopeptides following the in situ generation of thioesters

The synthesis of branched peptides using masked side-chain thioester derivatives of Asp and Glu which are compatible with Fmoc-SPPS is an important goal. Boll et al. synthesized cyclic and branched chain peptides using bis(2-sulfanylethyl)amido (SEA) side-chain derivatives of Asp and Glu via Fmoc SPPS [77]. The tail-to-side-chain cyclization via an in situ reduction of both acyclic and cyclic disulfides with tris(2-carboxyethyl)phosphine (TCEP) triggered the SEA intramolecular ligation. Glu derivatives cyclized more readily than the Asp analogues and without formation of side products (Scheme 9).

Scheme 9

Tail to side-chain cyclization using bis(2-sulfanylethyl)amido (SEA) ligation

There are several drawbacks associated with the use of thioester surrogates such as SEA in SPPS, especially when the thioester surrogate is attached to the resin via a MeCys linker [78]. These limitations include the need for multiple coupling reactions to attach the C-terminal amino acid to the resin and the possibility that the protected MeCys linker can, during peptide elongation, undergo β-elimination followed by conjugate addition of piperidine to form MeAla(Pip) as a side product [79]. To overcome these limitations, Taichi et al. developed the thioethylbutylamido (TEBA) group as an alternative to SEA-derived thioester surrogates (Fig. 13). The utility of the TEBA-thioester surrogate was demonstrated by the synthesis of cysteine-rich cyclic peptides [78, 80].

Fig. 13

Thioester surrogates

Li et al. developed an efficient method for the synthesis of peptides and proteins. In this method, an O-salicylaldehyde ester at the C-terminus reacts with N-terminal serine or threonine to realize peptide ligations via an O- to N-acyl transfer (Scheme 10) [81].

Scheme 10

NCL via an O- to N-acyl transfer

The utility of this ligation approach has been demonstrated through the convergent syntheses of therapeutic peptides (ovine-corticoliberin and Forteo) and the human erythrocyte acylphosphatase protein (∼11 kDa) [82]. The requisite peptide salicylaldehyde ester precursor is prepared in an epimerization-free manner via Fmoc solid-phase peptide synthesis. This approach was also used for the synthesis of daptomycin (Fig. 14), a lipodepsipeptide isolated from Streptomyces roseosporus, which was obtained from a soil sample from Mount Ararat (Turkey) [83].

Fig. 14

Daptomycin

4.2 Synthesis of Glycopeptides

Post-translational modifications of proteins and peptides can have profound effects on the overall biological and chemical properties of these biopolymers. In particular, glycosylation of proteins and peptides results in highly complex structures known as glycoproteins [84, 85]. The presence of glycans can ensure the stability and activity of glycoproteins, with biological functions such as cell adhesion, differentiation, and growth [86–88]. Carbohydrates are linked to proteins via either an N-glycosidic bond at an asparagine residue or an O-glycosidic bond at a serine or threonine residue [89]. N-Linked glycoproteins are biosynthesized in a process commencing in the endoplasmic reticulum (ER), in which a 14-mer oligosaccharide is transferred from a dolichol phosphate to the amide nitrogen of an asparagine residue in an Asn-Xxx-Ser/Thr sequon, where Xxx is any amino acid apart from proline. Glycosidases then truncate the 14-mer to a pentasaccharide core fragment which is further modified to afford the final N-glycoprotein, which is in turn either transported to the cell surface or secreted (Fig. 15) [90].

Fig. 15

Pentasaccharide core fragment present on N-linked glycopeptides where R = Asn
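The Asn-Xxx-Ser/Thr sequon rule described above (Xxx being any residue except Pro) translates directly into a pattern search. The following sketch, with an illustrative function name and a made-up test sequence, lists the positions of Asn residues that sit in such a sequon. Note that the presence of a sequon is a necessary condition for N-glycosylation, not a guarantee that the site is occupied.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in Asn-Xxx-Ser/Thr sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence: N-G-S and N-V-T qualify, N-P-T does not.
    print(find_sequons("MNGSANPTNVT"))
```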

The most prevalent class of O-linked glycoproteins are the mucins, in which an N-acetylgalactosamine (GalNAc) monosaccharide is linked to the peptide backbone via an α-glycosidic bond from GalNAc to a Ser or Thr residue [91]. The GalNAc monosaccharide, often known as the TN antigen (Fig. 17), is then modified at the C-3 or C-6 positions with GalNAc, galactose, or glucosamine via α- or β-glycosidic bonds, resulting in a series of di- and trisaccharides which are known as the core structures (Fig. 16).

Fig. 16

Core structures of mucin O-linked glycans where R is a Ser or Thr residue [91]

O-Linked glycoproteins are biosynthesized in a continuous process that occurs in the ER and the Golgi apparatus. This is not a template-driven process, but rather is subject to numerous sequential and competitive enzymatic pathways, and the O-glycans vary according to the cell lineage, tissue location, and developmental stage of the cell [91, 92]. Furthermore, as the pattern of O-glycosylation alters in response to mucosal infection and inflammation, O-linked glycoproteins have important implications in a wide range of diseases, including cystic fibrosis, Crohn’s disease, and cancers [93–99]. Aberrant glycosylation can result in truncated core structures, lacking backbone motifs, which leave the TN and T antigens or their sialylated versions exposed, providing disease markers which might have important implications in the design and synthesis of anti-cancer vaccines (Fig. 17) [100, 101].

Fig. 17

Common tumor-associated carbohydrate antigens (TACAs) where R=H (Ser) or CH3 (Thr)

To determine the exact role of glycoproteins and glycopeptides in biological systems, it is necessary to access pure samples of these compounds. Furthermore, glycoproteins and glycopeptides are important compounds in drug discovery because of their potential therapeutic benefits [102, 103]. However, because of the dynamic and heterogeneous nature of biological systems, progress in glycobiology has been hindered, and it is not currently possible to isolate single glycoforms from natural sources [104]. Thus chemical synthesis of glycoproteins and glycopeptides is an important area of research [105, 106].

Chemical ligation is a very attractive method for the synthesis of glycoproteins [107], and NCL, traceless Staudinger ligation [108], and sugar-assisted ligation (SAL) (Fig. 18) [109–111] have all been used for this purpose. SAL has been used for the synthesis of a large number of glycopeptides [112–114], but difficulties were encountered in the presence of larger glycans [115].

Fig. 18

Sugar-assisted ligation (SAL)

The synthesis of glycoproteins has been reviewed recently [84, 85, 106, 116, 117], and hence the focus here is on the use of NCL for the synthesis of glycopeptides and glycopeptide mimetics, the so-called ‘neoglycopeptides’ [118, 119]. Although NCL has been applied to the synthesis of glycoproteins and glycopeptides [120], the synthesis of glycopeptides is inherently challenging [121].

Boc-SPPS, traditionally used to prepare peptides required for NCL, is not used for the synthesis of glycopeptides because it is incompatible with acid-labile glycosidic linkages such as sialyl- or fucosyl-glycosidic bonds. However, Murakami et al. were able to prepare a sialylglycopeptide by replacing the harsh acidic conditions (HF) typically used in Boc chemistry with an acidic deprotection cocktail of TFA/TfOH/DMS/m-cresol (5:1:3:1) [122]. Conversely, Fmoc-SPPS, used in glycopeptide synthesis, is not compatible with the synthesis of peptide-α-thioesters, as the thioester may be cleaved by the piperidine used to remove Fmoc protecting groups. Because the Fmoc strategy is more widely used than Boc strategies, especially in automated SPPS, methods to overcome difficulties associated with Fmoc-protected peptide-α-thioesters have received a great deal of attention [36].

The requirement of a Cys residue at the ligation junction for successful NCL can be problematic when attempting to synthesize glycopeptides because of the low natural abundance of Cys residues (only 1.7%). As a result, the target glycoprotein may lack a suitably placed Cys residue, or glycopeptide segments of 30–50 amino acid residues must be synthesized. Segments of this length are demanding because an N-linked glycopeptide chain is more challenging to synthesize than the corresponding amino acid sequence without an attached glycan [106, 123, 124]. Consequently, a large number of techniques have been developed, including Ser or Thr ligation strategies [125] and the addition of thiol groups to amino acids [24] which, subject to successful desulfurization, might afford the target glycopeptide. Desulfurization of peptides was originally achieved with either Raney nickel or Pd/Al2O3 [12], but these conditions are not always compatible with desulfurization of glycopeptides [16, 126].
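The effect of the 1.7% abundance quoted above can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that residues occur independently at their average frequency, which real protein sequences do not obey; the function names are hypothetical. On that assumption Cys residues are spaced roughly 60 residues apart on average, and a stretch of 50 residues has about a 40% chance of containing no Cys at all.

```python
CYS_FREQUENCY = 0.017  # natural abundance of Cys quoted in the text


def mean_spacing(frequency):
    """Expected number of residues per occurrence, 1 / frequency."""
    return 1.0 / frequency


def probability_no_cys(length, frequency=CYS_FREQUENCY):
    """Chance that `length` consecutive residues contain no Cys,
    assuming residues are drawn independently at the given frequency."""
    return (1.0 - frequency) ** length


if __name__ == "__main__":
    print(f"Mean spacing between Cys residues: {mean_spacing(CYS_FREQUENCY):.0f}")
    for n in (30, 50):
        print(f"P(no Cys in {n} residues) = {probability_no_cys(n):.2f}")
```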

The in situ generation of thioesters has been applied to the synthesis of cyclic glycopeptides bearing a single carbohydrate moiety (Fig. 12) [76]. Wan and Danishefsky applied this strategy to the synthesis of glycopeptides containing strategically placed Cys residues to facilitate NCL. Following NCL, a radical-based, metal-free desulfurization process using TCEP and tBuSH enabled the Cys residue to be converted into the desired Ala (Scheme 11) [14]. The NCL free-radical-based desulfurization was subsequently reported to be applicable to thiol-modified Val residues [127] and thiol-protected Thr residues [19].

Scheme 11

Free-radical desulfurization of a glycopeptide

Efforts have been made to improve free-radical desulfurization strategies and in this respect one-pot strategies are particularly attractive because they avoid time-consuming purification after each reaction. Moyal et al. [128] recently reported a one-pot ligation-desulfurization protocol and subsequently Thompson et al. developed a one-pot ligation-desulfurization protocol, using a novel thiol additive, 2,2,2-trifluoroethanethiol, for the synthesis of short peptide fragments [129]. Neither of these strategies has yet been applied to glycopeptide synthesis.

To achieve a more widely applicable method for glycopeptide synthesis, Okamoto and Kajihara developed a strategy for the conversion of Cys to a Ser residue after NCL. Thus, S-methylation of Cys, followed by an intramolecular rearrangement activated by CNBr, results in an N- to O-acyl shift, which affords an O-ester peptide intermediate. A subsequent O- to N-acyl shift generates the desired glycopeptide (Fig. 19) [130]. This strategy was used to access a fragment (residues 79–98) of EPO, an N-linked glycoprotein, and a repeat sequence of MUC-1, an O-linked glycoprotein. Furthermore, this procedure can be carried out in the presence of methionine residues when they are protected as the sulfoxide. Subsequently, Okamoto et al. reported that the substitution of Cys to Ser could be accomplished in the presence of acid-labile sialyl-glycosidic linkages such as those found on sialyl-TN antigens [131].

Fig. 19

Conversion of Cys to Ser following NCL

NCL has been achieved at unprotected Ser and Thr residues via an O- to N-acyl transfer [81]. Hojo et al. employed a slightly different strategy in which a mercaptomethyl group was used to protect the Ser and Thr residues. In this instance the ligation step occurs through an S- to N-acyl transfer via a seven-membered ring and this approach was used to access O-linked glycoprotein contulakin-G (Fig. 20) [132].

Fig. 20

Synthesis of contulakin-G via an S- to N-acyl transfer

N- to S-acyl shifts have been used to access peptide thioesters (Fig. 21) [133, 134]. Macmillan et al. used this strategy to prepare thioesters, which were subsequently used in NCL to access model glycopeptides based on the erythropoietin (EPO) amino acid sequence [135, 136].

Fig. 21

Synthesis of peptide thioesters via an N- to S-acyl shift

Hsieh et al. prepared S-glycosylated peptides using NCL (Scheme 12) [137]. In this instance, the glycopeptide fragments were prepared using Fmoc SPPS and the subsequent NCL proceeded in yields of over 80% for the three examples. Disulfide-bridge formation at key cysteine residues afforded the natural product, the bacteriocin glycopeptide sublancin 168, and two derivatives bearing alternative sugars.

Scheme 12

Synthesis of S-linked glycopeptides via NCL

Glycoconjugate mimetics, known as neoglycoconjugates, which bear an ‘unnatural’ linkage between the carbohydrate and aglycon moieties, have been widely explored, as this allows access to novel glycopeptides which might have enhanced activity relative to the naturally occurring glycopeptide [119]. The triazole unit is one of the most widely studied ‘unnatural’ glycosidic linkages and, in this context, Macmillan and Blanc synthesized a neoglycopeptide, using Boc-SPPS, in which the peptide sequence was based upon human erythropoietin. Cleavage of the neoglycopeptide from the solid support afforded an 11-mer peptide sequence which was able to undergo NCL with a suitable thioester to afford the target neoglycoconjugate (Scheme 13), neatly demonstrating that NCL is compatible with triazole moieties [138].

Scheme 13

NCL of neoglycopeptides containing a triazole motif

Lee et al. subsequently synthesized similar neoglycopeptides using a one-pot strategy in which two propargyl peptides were coupled via NCL; subsequent 1,3-dipolar cycloaddition afforded the glycosylated peptide (Fig. 22) [139].

Fig. 22

One-pot NCL followed by 1,3-dipolar cycloaddition

Finally, it has been shown that NCL has applications in the context of synthesizing RNA mimics [140], and selective desulfurization again plays an important role in producing target molecules [141].

5 Computational Rationalization of Chemical Ligation

Chemical ligation has been widely reported and discussed, and numerous attempts have been made to increase reactivity and improve yields to make the peptide ligation method more effective. In particular, mechanistic studies of the reaction have played an important role in rationalizing the rate of ligation.

Wang et al. investigated the mechanism of NCL and calculated energy barriers for various C-terminal amino acids in order to compare their reactivity [142]. In the reaction between a peptide thioalkyl ester and a Cys-peptide in the presence of an aryl thiol catalyst, it is proposed that both the thiol–thioester exchange step and the trans-thioesterification step proceed by a concerted SN2 displacement, whereas the intramolecular rearrangement occurs by an addition-elimination mechanism (Scheme 14).

Scheme 14

Hypothetical routes for thioesterification

The energy barrier for the thiol–thioester exchange step depends on steric hindrance associated with the side-chain of C-terminal amino acids, whereas that of the acyl-transfer step depends on steric hindrance caused by the side-chain of the N-terminal amino acid. In auxiliary-mediated peptide ligation, between a peptide thiophenyl ester and an N-2-mercaptobenzyl peptide, the thiol–thioester exchange step and intramolecular acyl-transfer step proceed by a concerted SN2-type displacement mechanism (Scheme 15). For N-terminal Gly, the thioester exchange is rate limiting, whereas the acyl transfer is the rate-limiting step for N-terminal non-Gly amino acids (Table 3) [142]. When the difference in ΔG between the two transition states is, for example, 3 kcal mol−1, then the reaction with the lower activation energy is 150 times faster (Table 3).
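The rate factor quoted above follows from transition-state theory: if two competing steps share the same pre-exponential factor, their rate constants differ by exp(ΔΔG/RT). The sketch below evaluates this ratio. The temperature of 298.15 K and the function name are assumptions, since the text does not state the temperature used. Under these assumptions, a 3 kcal mol−1 difference gives a ratio on the order of 150, consistent with the figure in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def rate_ratio(delta_delta_g, temperature=298.15):
    """Ratio k_fast / k_slow for two pathways whose activation free
    energies differ by `delta_delta_g` (kcal/mol), assuming equal
    pre-exponential factors (transition-state theory)."""
    return math.exp(delta_delta_g / (R_KCAL * temperature))


if __name__ == "__main__":
    for ddg in (1.0, 2.0, 3.0):
        print(f"ddG = {ddg:.1f} kcal/mol -> rate ratio ~ {rate_ratio(ddg):.0f}")
```

Because the dependence is exponential, each additional 1.4 kcal mol−1 or so at room temperature corresponds to roughly another factor of ten in rate.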

Scheme 15

Mechanism of auxiliary-mediated peptide ligation

Table 3 Energy barriers of auxiliary-mediated peptide ligations for various conjugation sites

When compared to intermolecular NCL, there have been far more computational studies into the intramolecular chemical ligation of isopeptides. Monbaliu et al. discussed the computational approach and developed the first systematic theoretical background for an n-exo-trig intramolecular S- to N-acyl transfer [143]. Cyclic TSs in the ring size range n = 5–10 were controlled by enthalpic factors and a classical range of ΔG values was found. The calculations also emphasized that the substituents at R1 and R2 (Fig. 23b) had little impact on the nature of the TS. The preorganization of the structures and, in particular, the emergence of stabilizing hydrogen bonds in intramolecular TSs appeared as major factors governing the variations of ΔG with ring size. In isomeric structures, the presence of an internal non-natural amino acid (β-Ala or GABA) directly after the Cys residue favored hydrogen bonds (5.7–8.4 kcal mol−1), which stabilize the TS. The competition between the intra- and intermolecular acyl transfers is driven by parameters which govern the approach of the reactive termini, for example, the ring strain (Fig. 23) [143].

Fig. 23 (a) 5-Exo-trig S- to N-acyl transfer as commonly encountered in classical NCL and (S)-isopeptide rearrangements (left) and n-exo-trig S- to N-acyl transfer in internal Cys NCL and extended isopeptide rearrangements (right). (b) Isomerization of (S)-acyl isopeptides to native peptide analogues if m = 0, n = 5

The feasibility and rate of intramolecular ligation depend on various factors such as preorganization energy, hydrogen bonding, and the distance between the reaction sites. Oliferenko and Katritzky used computational chemistry to rationalize the curious behavior of the intramolecular ligation of Cys-containing isopeptides. Preorganization is an important factor in chemical ligation because the reaction proceeds through a cyclic TS [144]. The more easily the starting material adopts an appropriate cyclic conformation, the higher the probability of intramolecular reaction. Hydrogen bonding and NH–π interactions play a major role in the stabilization of a preorganized conformer [44]. Intramolecular ligation via eight-membered cyclic TSs of Cys-containing isopeptides was disfavored, whereas, surprisingly, Ser-containing isopeptides showed an efficient O- to N-acyl migration via eight-membered cyclic TSs. Tyr-containing isopeptides underwent O- to N-acyl migration via 12- to 14-membered cyclic TSs in the presence of a base and via 15- to 19-membered cyclic TSs in a buffered medium. The reactivity of Tyr isopeptides is described in terms of preorganization energy, hydrogen bonding, and bond distance (Table 4) [145].

Table 4 Determination of parameters governing O- to N-acyl migrations for Tyr isopeptides

Biswas et al. reported the design of a predictive model that uses statistical techniques to correlate the relative abundance (percentage of ligated product) with molecular structure through a quantitative structure–activity/property relationship (QSAR/QSPR) [146]. The genetic algorithm linear regression method was performed using QSARINS software [147], which establishes a correlation between the dependent variable (the property or response, here the relative abundance of ligated peptide) and the independent variables (molecular descriptors or factors) (Table 5). It was found that the percentage of ligated peptides increases with both a shorter spatial b(N-C) distance and a higher Balaban index (Fig. 24).
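To make the Balaban index concrete, the following is a minimal sketch of the standard unweighted, distance-based Balaban J index for a hydrogen-suppressed molecular graph: J = m/(μ + 1) × Σ (s_i s_j)^(−1/2) over all edges, where m is the number of edges, μ = m − n + 1 is the cyclomatic number, and s_i is the sum of topological distances from vertex i to all other vertices. The function names and the edge-list input format are illustrative assumptions. The descriptor used in the cited model may have been computed with a different variant (for example, one weighted for heteroatoms or bond orders).

```python
from collections import deque
from math import sqrt


def distance_sums(adjacency):
    """Sum of shortest-path distances from each vertex to all others
    (breadth-first search on an unweighted, connected graph)."""
    sums = {}
    for start in adjacency:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        sums[start] = sum(dist.values())
    return sums


def balaban_j(edges):
    """Unweighted Balaban J index from a list of (atom, atom) bonds.
    Assumes a connected graph with each bond listed once."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    n_vertices = len(adjacency)
    n_edges = len(edges)
    cyclomatic = n_edges - n_vertices + 1  # number of independent rings
    s = distance_sums(adjacency)
    total = sum(1.0 / sqrt(s[a] * s[b]) for a, b in edges)
    return n_edges / (cyclomatic + 1) * total


if __name__ == "__main__":
    propane = [(0, 1), (1, 2)]
    cyclohexane = [(i, (i + 1) % 6) for i in range(6)]
    print(balaban_j(propane), balaban_j(cyclohexane))
```

In this unweighted form, J depends only on the connectivity of the heavy-atom skeleton, so more compact and more branched skeletons generally give larger values than linear chains of the same size.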

Table 5 Statistical model for the relative abundance of ligated peptides

Fig. 24 (a) Correlation plot for relative abundance model of ligated product. (b) Correlation between relative abundance and distance b(N-C). (c) Correlation between relative abundance and Balaban index

6 Conclusions

The increasing demand for peptides and chemically modified peptides has driven a significant expansion in alternative synthetic routes for peptide and protein synthesis. Native chemical ligation has proved to be a very useful tool for the synthesis of proteins, peptides, cyclic peptides, glycopeptides, and neoglycoconjugates. Despite the development of ligation-desulfurization techniques, the general requirement of an N-terminal cysteine and a C-terminal thioester remains a limitation, although considerable efforts have been made to generalize NCL. Both solution and solid phase peptide synthesis have been used for NCL, and NCL is widely accepted as a method for preparing peptides because long-chain peptides and cyclic peptides can be isolated in high yield with a lower likelihood of thioester racemization. Because thioesterification can occur at mildly acidic or neutral pH, NCL promises to have a major impact on the synthesis of peptides and peptide mimetics and is particularly suitable for the synthesis of natural products, cyclic peptides, and glycopeptides. The rate of acyl transfer in chemical ligation reactions depends on various factors, and these have been analyzed by computational studies. This theoretical rationalization should assist in the preparation of new cyclic peptides and glycopeptides via chemical ligation.