Introduction

Amino acid and peptide chemistry has the unique strength of giving access to peptide sequences that reach beyond the 20 canonical amino acids (Blaskovich 2016). Non-natural amino acids are regularly used in drug discovery, benefiting from the extraordinary development of solid-phase peptide synthesis (SPPS) (de la Torre and Albericio 2020; Drucker 2020). Each natural amino acid has numerous isosteres (or analogues), which are used to modulate the structure–activity relationship (SAR) and the pharmacokinetics and pharmacodynamics (PKPD) of a given peptide drug (Muttenthaler et al. 2021; Blaskovich 2016). Amongst the regularly used non-canonical amino acids, N-(α)-methylated amino acids (Luisa Di Gioia et al. 2016), d-configured amino acids (Feng and Xu 2016), β/γ-amino acids (Cabrele et al. 2014), N-linked side chains (or peptoid bonds) (Olsen 2010), (α,α′)-disubstituted amino acids and homo- or nor-amino acids represent subtle changes in the peptide sequence (Chatterjee et al. 2007). Some of those amino acids are chemically engineered while others are inspired by compounds found in nature. Overall, the increasing pool of amino acids is driving the expansion of peptide therapeutics (de la Torre and Albericio 2020; Drucker 2020; Blaskovich 2016); an expansion that outpaces protein engineering, where the introduction of non-canonical amino acids remains amongst the greatest current challenges (Ngo and Tirrell 2011; Hodgson and Sanderson 2004).

Natural Products and Non-natural Amino Acids

Non-ribosomal peptides (NRPs) and ribosomally synthesized and post-translationally modified peptides (RiPPs) have many features that are interesting from a peptide medicinal chemistry perspective and that lead to broad structural diversity (Süssmuth and Mainz 2017; Hetrick and van der Donk 2017). Antibiotic resistance (Aslam et al. 2018) is a tremendous incentive to study the two biosynthetic systems behind them, non-ribosomal peptide synthetases (NRPS) and the RiPP machinery, which provide most of the peptide antimicrobials and antibiotics available on the market (Dang and Süssmuth 2017). Each type of biosynthetic machinery has its advantages: NRPS give access to non-natural amino acids, whereas RiPP pathways are easier to manipulate to generate libraries of bioactive compounds (Hudson and Mitchell 2018; Hetrick and van der Donk 2017). An NRPS is an intricate multi-modular protein assembly that can produce bioactive peptides bearing numerous amino acid isosteres and post-synthetic modifications (Figs. 1 and 2). These include N-terminal capping (by fatty acid synthase or polyketide synthase), N-(α)-methylated backbones (by N-methyl transferase), d-configured amino acids (by epimerization domain), β-hydroxylation (by β-hydroxylase), halogenated aromatic amino acids (by halogenase), regioisomers, and homo- or nor-amino acid isosteres encoded by the NRPS gene cluster (Süssmuth and Mainz 2017; Payne et al. 2017). This cluster encodes amino acid biosynthesis, the non-ribosomal peptide synthetase (peptide elongation and cleavage) and every post-NRPS modification, including methylation, glycosylation, sulfation and phosphorylation. Amongst this amino acid diversity, the arylglycine family (Fig. 1) provides essential building blocks for several antibiotics and antimicrobial peptides (AMPs) (Figs. 1 and 2) (Al Toma et al. 2015).

Fig. 1

Examples of bioactive peptides incorporating arylglycines. The square highlights the biosynthesised amino acids associated with the NRPS gene cluster (l-Phg, l-Hpg and l-Dpg), which are further modified before, during and after the non-ribosomal peptide synthesis. The yellow arrow represents the direction of the non-ribosomal peptide synthesis, ending with the action of the thioesterase domain. Phg phenylglycine, Hpg 4-hydroxyphenylglycine, Dpg 3,5-dihydroxyphenylglycine, CDA calcium-dependent antibiotics (Color figure online)

Fig. 2

Examples of peptide bonds formed by arylglycine coupling during total synthesis (see Table 1). A ArylomycinA-C16; B Teicoplanin aglycone; C Ramoplanin aglycone; D Feglymycin. Peptide bond colours: red for Hpg coupling, purple for Dpg coupling and green for Hpg ester formation. The yellow square highlights the last coupling performed to assemble the full peptide chain (total synthesis) (Color figure online)

Natural Diversity in the Arylglycine Family

Starting from three biosynthesized amino acids (l-Phg, l-Hpg and l-Dpg), NRPS incorporate and regularly modify those arylglycines to increase both amino acid and structural diversity. The timing of the amino acid modifications is critical to understanding the biosynthesis of the bioactive molecules in vivo, and the modifications can be divided into two groups: those occurring during and those occurring after non-ribosomal peptide synthesis. While tethered to the multi-modular protein through a peptidyl carrier protein (PCP), an arylglycine can be N(α)-methylated (Fig. 2. Arylomycin A2-C16), d-configured (Fig. 1. Nocardicin A, Vancomycin, CDA1b and Fig. 2. Feglymycin), mono-chlorinated (Fig. 2. Ramoplanin-A2) or bis-chlorinated, as in complestatin (Kittilä et al. 2017; Kaniusaite et al. 2019). After peptide elongation, the final stage of the NRPS introduces structural diversity through cyclizations mediated either by the thioesterase domain, which can perform macrolactamisation or macrolactonisation (Fig. 1. Pristinamycin IA and Fig. 2. Ramoplanin-A2), or by P450 enzymes (Fig. 1. Vancomycin and Fig. 2. Teicoplanin) before the hydrolysis performed by the thioesterase domain. Glycosyltransferases further modify specific arylglycine residues, such as the central Hpg in vancomycin, the C-terminal Dpg in teicoplanin, or residues of ramoplanin A2.

What seems routinely achieved by NRPS is extremely challenging from a synthetic perspective, particularly for highly complex molecules such as those from the glycopeptide antibiotic (GPA) family (Marschall et al. 2019). To date, the usage of arylglycines is restricted to small peptides in which their epimerisation can be limited, such as Pasireotide or Rapadocin, and to small molecules such as ampicillin (Fig. 1. Ampicillin) (Ma et al. 2019; Rolinson 1998; Wang et al. 2021). This review aims to bring together the amino acid synthesis, total synthesis and SPPS strategies involving arylglycines.

Overview on Arylglycine Stereoselective Synthesis

Nowadays, phenylglycine (Phg) and 4-hydroxyphenylglycine (Hpg) are commercially available and can be modified efficiently to generate the building blocks used in total chemical synthesis or SPPS. Recently, the nitration of phenylglycine at the meta position allowed, after reduction, the synthesis of anilino- and guanidino-phenylglycine derivatives (Weigel et al. 2015; Liu et al. 2018). In contrast, 3,5-dihydroxyphenylglycine (l- or d-Dpg), which is critical for glycopeptide antibiotics (GPAs), is mostly obtained by the Sharpless strategy starting from styrenes (Scheme 1) (Reddy and Sharpless 1998). This strategy remains the main route despite alternatives that include racemic synthesis followed by amino acid resolution and other asymmetric syntheses (Williams and Hendrix 1992).

Scheme 1

Sharpless catalytic asymmetric aminohydroxylation for preparing the critical arylglycinol intermediate, applied to Dpg synthesis. A Alkaloid ligand (DHQ)2PHAL or (DHQD)2PHAL for controlling the stereochemistry of the arylglycinol; B original synthesis by Sharpless; C application to ristocetin aglycon total synthesis by Boger; D application to feglymycin total synthesis by Süssmuth. For the incorrect regioisomer (benzyl alcohol), the enantiomeric excess was not described in the original manuscripts

The commercial availability of substituted/modified styrenes, the low catalyst loading (4% of osmium catalyst and 6% of ligand) and the short synthetic sequence (two steps) favour this strategy over the others (Williams and Hendrix 1992). Interestingly, the catalytic asymmetric aminohydroxylation of substituted/modified styrenes is driven by two parameters: the solvent, which controls the formation of the correct regioisomer, and the ligand, which enhances the formation of the correct enantiomer. With benzyl or tert-butyl carbamates, the use of n-propanol as the reaction solvent favours the formation of the benzylic amine (correct regioisomer) over the benzyl alcohol. The use of a catalytic amount of (DHQ)2PHAL or (DHQD)2PHAL (Scheme 1A) confers high enantiomeric excess (> 80%). Importantly, bulky substituents on the styrene help with the formation of the right regioisomer, albeit with a limited impact on the enantiomeric excess (ee). The arylglycinols (Scheme 1B–D) are further modified and incorporated during the total chemical synthesis of numerous GPA aglycones such as vancomycin (Boger et al. 1999; Evans et al. 1998; Nicolaou et al. 1998), teicoplanin (Boger et al. 2000) and ristocetin aglycones (Crowley et al. 2004) as well as feglymycin (Figs. 1 and 2) (Dettner et al. 2009; Fuse et al. 2016). The oxidation leading to the carboxylic acid function is one of the last steps in the chemical synthesis of GPA aglycones (Okano et al. 2017). In every case, the oxidation proceeds efficiently under various conditions such as TEMPO/NaOCl or Dess-Martin/NaClO2 (Boger et al. 1999, 2000; Crowley et al. 2004; Reddy and Sharpless 1998).
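The enantiomeric excess values discussed here follow the standard definition, ee (%) = 100 × |major − minor| / (major + minor), where the two quantities are the amounts (or chromatographic peak areas) of the two enantiomers. The short Python sketch below only illustrates this arithmetic; the function names are hypothetical and do not come from the cited works.

```python
def enantiomeric_excess(major, minor):
    """Return the ee in percent from the amounts (or peak areas) of two enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("the two amounts must sum to a positive value")
    return 100.0 * abs(major - minor) / total


def enantiomer_ratio(ee_percent):
    """Return (major %, minor %) of the mixture for a given ee in percent."""
    if not 0.0 <= ee_percent <= 100.0:
        raise ValueError("ee must lie between 0 and 100")
    major = (100.0 + ee_percent) / 2.0
    return major, 100.0 - major


# A 90:10 mixture of enantiomers corresponds to 80% ee, and vice versa.
print(enantiomeric_excess(90.0, 10.0))
print(enantiomer_ratio(80.0))
```

Read this way, an ee of 80% still means that one molecule in ten is the undesired enantiomer, which matters once several arylglycines are combined in the same peptide.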

Recently, other strategies have been developed for the stereoselective synthesis of arylglycines, targeting bioactive small molecules such as an inhibitor of the ileal bile acid transporter (IBAT) (Elobixibat hydrate), the P53-MDM2 inhibitor (RO-5963), antiplatelet agents (clopidogrel or vicagrel), or an HCV NS3/4A protease inhibitor. Imines (Yamamoto et al. 2011; Makley and Johnston 2014), para-methoxyphenyl (PMP)-protected glycine esters (Liu et al. 2020), N-PMP imino esters (Wei et al. 2015), a chiral nickel(II) glycinate (Zhang et al. 2015) and N-tert-butylsulfinyl ketimines (Wei et al. 2017) are amongst the precursors used to prepare arylglycines (Scheme 2).

Scheme 2

Recent examples of stereoselective arylglycine syntheses targeting small bioactive molecules. A Nickel-catalyzed asymmetric hydrogenation of N-PMP imino esters (R2 group: amide or ester); B ruthenium-catalyzed hydrogenation of N-tert-butylsulfinyl ketimines; C palladium-catalyzed α-arylation of a chiral nickel(II) glycinate; D palladium C–H oxidative cross-coupling on protected glycine ester; E three-component reaction involving an aryl boronic acid precursor. PMP para-methoxyphenyl, TFE trifluoroethanol, DMSO dimethylsulfoxide, T+BF4 2,2,6,6-tetramethylpiperidine-1-oxoammonium tetrafluoroborate, DCE dichloroethane

Two main strategies have emerged in the last decade: the stereoselective reduction of an achiral imine (Scheme 2A, B) or the creation of the chiral centre from a glycine precursor (Scheme 2C, D). In addition, one recent example used a three-component reaction (sulfonamide, aryl boronic acid and ethyl glyoxalate) to access arylglycines in high enantiomeric excess (Scheme 2E) (Beisel et al. 2016). The stereoselective reduction of an imine is an efficient process with high yield and enantiomeric excess. However, access to the imine precursors is often a limitation in comparison to the Sharpless strategy (Reddy and Sharpless 1998). The creation of the chiral centre from a glycine precursor is a good alternative for preparing a broad library of arylglycines, but suffers from poor to moderate enantiomeric excess, whether using a palladium-catalyzed α-arylation or a palladium C–H oxidative cross-coupling (Noisier and Brimble 2014).

In all cases, the amino group can be released using hydrochloric acid, together with cerium ammonium nitrate under aqueous conditions for PMP-protected glycine esters or in methanol for N-tert-butylsulfinyl ketimines and the chiral nickel(II) glycinate (Scheme 2A–D), or using TFA for the Pbf group (2,2,4,6,7-pentamethyldihydrobenzofuran-5-sulfonyl) (Scheme 2E). Overall, all these amino acid syntheses were designed from a solution-phase perspective, to access small molecules or for use in total chemical synthesis, and not from a solid-phase peptide synthesis perspective. Arylglycines are sensitive to basic treatment, which leads to epimerization, and this is a particular concern during SPPS, which is well known for its iterative basic treatments. Nevertheless, some strategies have been developed to limit the epimerisation by selecting the right conditions for coupling and for removal of the amino protecting group.

Total Synthesis and Arylglycine Coupling Condition

Over the years, numerous teams have tackled the total synthesis of some of the most complex non-ribosomal peptides (Fig. 2). A convergent approach is used, in which specific amino acids and small fragments are prepared and then assembled until the synthesis is complete. The overall effort has been summarised in detailed reviews on GPAs (Okano et al. 2017) and ramoplanin (McCafferty et al. 2002), while the cyclization of the arylomycin C-terminal tripeptide core has been thoroughly studied (Dufour et al. 2010; Lim et al. 2019; Liu et al. 2011; Peters et al. 2018; Roberts et al. 2007, 2011; Wong et al. 2019).

Cyclic peptides have the advantage of being relatively flexible with respect to the cyclization strategy. In the case of the arylomycin or vancomycin (or teicoplanin) C-terminal tripeptide core, the chemical strategies alternate between bis-aryl bond formation followed by macrolactamization and peptide bond formation followed by an oxidative aryl bond cyclization (Nicolaou et al. 1998; Evans et al. 1998; Boger et al. 1999). Some reactions are often higher yielding between two synthons (intermolecular) than within the same molecule (intramolecular). Consequently, such synthetic strategies diverge from the original biosynthesis, in which the cyclization(s) take place on a fully elongated peptide (Tailhades et al. 2019). In addition, the total synthesis must take atropisomers into account (Gulder and Baran 2012), which is another difficulty for which the conditions must be specifically optimised. Despite these clear complexities, all the total synthesis routes have in common the formation of peptide bonds (Table 1).

Table 1 Summary of the coupling conditions reported for the total synthesis of arylomycinA-C16, feglymycin, vancomycin, teicoplanin, ristocetin, eremomycin and ramoplanin aglycones

For that purpose, the choice of the coupling reagent is important because arylglycines are sensitive to epimerization (Al Toma et al. 2015). In this respect, N-protected arylglycines are comparable to cysteine or histidine (El-Faham and Albericio 2011). Priority is therefore given to coupling conditions known to limit epimerisation, such as EDC with any additive other than DMAP (which is used for forming ester bonds), DEPBT/NaHCO3 or DPPC/Oxyma. Other coupling reagents such as IBCF/NMM, PyBOP/NaHCO3 or HATU/HOAt/collidine are good alternatives. In terms of strategy, the key is to limit the exposure of the α-proton to the base by using a heterogeneous mixture (NaHCO3 in DMF) or conditions free of nucleophilic base. The coupling conditions often become a compromise between overall yield and level of epimerisation. N-protected arylglycines are often used as test substrates to evaluate alternative coupling conditions such as coupling additives (Jad et al. 2014), solvents (Jad et al. 2016) or ball-milling/solvent-free methods (Yeboue et al. 2021). Some greener solvents, such as MeTHF, have recently been used and have shown promise for limiting epimerization during the coupling (Wong et al. 2019; Jad et al. 2016). Most arylglycines are N-protected and O-protected during the coupling, and the choice of protecting groups is guided by the total synthesis strategy so as to retain a certain level of orthogonality (Isidro-Llobet et al. 2009). With several coupling conditions and protecting groups routinely used in total synthesis, the focus in solid-phase peptide synthesis has been on the removal of the N(α)-protecting group to maximise the formation of the correct peptide.

Solid-Phase Peptide Synthesis and Unmasking the Amino Group

The shift of peptide synthesis from the Boc/Bzl (Merrifield 1963) to the Fmoc/tBu strategy has contributed to the expansion of SPPS all over the world (Atherton et al. 1978). Nevertheless, the Fmoc/tBu strategy had to overcome limitations arising from the iterative basic treatment or from the bulkiness of amino acid side chain protecting groups. Aside from the difficulty of incorporating protected arylglycines, direct peptide thioester synthesis and on-resin aggregation were amongst the limitations of the Fmoc/tBu strategy. Beyond the coupling reagents mentioned earlier, multiple options became available to ensure the success of peptide synthesis: the resin matrix, the use of temperature for protecting group removal and for the coupling reaction, and the ligation of unprotected fragments for the synthesis of peptides or proteins (Behrendt et al. 2016; Palomo 2014). Despite all that, the incorporation of arylglycines by SPPS remains largely limited to the biochemical study of non-ribosomal peptides (Zhao et al. 2020b). In particular, the enzymatic transformation of a linear peptide precursor into a monocyclic (arylomycin) or polycyclic (vancomycin or teicoplanin) peptide helped to fill a gap in SPPS (Scheme 3), the final target being a peptide thioester of Co-enzyme A.

Scheme 3

SPPS of heptapeptide precursors containing numerous arylglycines for chemoenzymatic assay. A Alloc chemistry and mild cleavage to protect the β-hydroxy groups (R2″ and R3″). R1 = –H or Me; R2′ = R3′ = –H or –Cl; B Fmoc chemistry applied to the synthesis of peptide hydrazides. Final tripeptide for vancomycin: H-DLeu-DClTyr-LAsn and teicoplanin: H-DHpg-DClTyr-LDpg. R1 = R3 = Any amino acid side chain; R2 = –H or –Cl

The early work on the synthesis of the heptapeptide precursor of vancomycin was achieved using Alloc chemistry on the 2-chlorotrityl resin (Scheme 3A) (Bo Li and Robinson 2005). The peptide elongation was achieved with repetitive cycles of Alloc removal and protected amino acid coupling using DIC with either HOBt or pentafluorophenol. The limitation of that strategy was the formation of the peptide thioester of Co-enzyme A, which required activating the C-terminal l-Dpg; this led to epimerisation and consequently limited the chemoenzymatic transformation of the linear peptide into the corresponding GPA aglycone (Woithe et al. 2007). Later, SPPS was adapted to Fmoc-protected amino acids by optimising the Fmoc removal and coupling conditions using new reagents or by applying temperature (Brieke and Cryle 2014; Elsawy et al. 2012; Liang et al. 2017). In terms of Fmoc removal, several studies have shown that piperidine and piperazine are poorly compatible with arylglycines during SPPS, leading to a high level of epimerization (Elsawy et al. 2012; Liang et al. 2017). To date, the usage of DBU (a sterically hindered base) together with a short reaction time (3 × 30 s) is the only reported way to overcome the incompatibility of arylglycine incorporation with SPPS (Scheme 3B) (Brieke and Cryle 2014). Over the years, this protocol, with DBU for Fmoc removal and COMU/lutidine for coupling, was extensively used to prepare a library of peptide hydrazides (Fang et al. 2011) that were transformed into peptide thioesters of Co-enzyme A and successfully tested in a chemoenzymatic assay (Tailhades et al. 2018, 2020). Another positive point is the atom economy of this protocol, which uses unprotected phenol groups for Hpg and Dpg. However, this protocol has only been applied to short peptide sequences and must be further optimised to meet the demand for longer peptide sequences. Nevertheless, it is possible to prepare arylglycine-rich sequences such as the vancomycin (3 arylglycines) and teicoplanin (5 arylglycines) heptapeptides (Zhao et al. 2020a).
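The Fmoc cycle described above can be laid out as a simple step list. The Python sketch below is only an illustration under stated assumptions: it encodes the two facts given in the text (Fmoc removal with DBU for 3 × 30 s, coupling with COMU/lutidine) and nothing else. Reagent concentrations, coupling times, washes and resin loading are not stated here and are deliberately left out; the class and function names, the placeholder residue labels, and the treatment of the first residue as an ordinary coupling are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    residue: str
    operation: str                      # "couple" or "remove Fmoc"
    reagent: str
    repeats: int = 1
    seconds_each: Optional[int] = None  # None where the text gives no duration


def plan_synthesis(sequence_n_to_c: List[str]) -> List[Step]:
    """Build the step list for a peptide written N-terminus to C-terminus.

    SPPS elongates from the C-terminus, so the sequence is walked in reverse.
    Assumption: the first residue is attached like any other coupling.
    """
    steps = []
    for residue in reversed(sequence_n_to_c):
        steps.append(Step(residue, "couple", "COMU/lutidine"))
        # Short, repeated DBU treatments as quoted in the text: 3 x 30 s.
        steps.append(Step(residue, "remove Fmoc", "DBU", repeats=3, seconds_each=30))
    return steps


def dbu_exposure_seconds(steps: List[Step]) -> int:
    """Total time the growing peptide spends in DBU over the whole synthesis."""
    return sum(s.repeats * s.seconds_each
               for s in steps if s.operation == "remove Fmoc")


# Placeholder heptapeptide; the labels are not real residues.
heptapeptide = ["Xaa1", "Xaa2", "Xaa3", "Xaa4", "Xaa5", "Xaa6", "Xaa7"]
plan = plan_synthesis(heptapeptide)
print(len(plan), "steps;", dbu_exposure_seconds(plan), "s of DBU exposure in total")
```

Counting the cumulative DBU exposure in this way makes it clear why longer sequences are harder: every additional residue adds another set of basic treatments to arylglycines already present on the resin.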

Future Perspective

Numerous bioactive peptides incorporating arylglycines, produced by biosynthesis or chemical synthesis, are available on the market, while others such as ramoplanin-A2 are still in clinical trials (Koo and Seo 2019). Additionally, modification of GPAs such as vancomycin or teicoplanin through a coupling reaction at the C-terminal Dpg remains the ideal strategy to retain the antimicrobial properties and add new ones (Marschall et al. 2019). Over the past 10 years, modifications of GPAs have been successfully applied to optimise the original compound (Okano et al. 2017), to target Gram-negative bacteria (Antonoplis et al. 2019), to reduce biofilm formation (Antonoplis et al. 2018) and to recruit the immune system to the site of infection (Payne et al. 2021). Finally, the usage of arylglycines in peptide drug design applied to non-natural sequences is also ongoing, with the recent success of Pasireotide and Rapadocin (Ma et al. 2019; Wang et al. 2021).

Altogether, this review brings together the strategies available to expand the usage of arylglycines in peptide medicinal chemistry. The multitude of stereoselective amino acid syntheses and peptide elongation conditions should promote the incorporation of arylglycines, including Dpg with its unusual reactivity profile (Cohen et al. 2019; Pavlov et al. 1997), into any peptide of interest. The recent optimisation of SPPS on GPA peptide precursors is also a strong argument for proposing arylglycines as an alternative to other aromatic amino acids. Hopefully, this interest in arylglycines will create a positive loop in which more structure–activity relationship (SAR) studies include those amino acids, leading to more of them becoming commercially available and, ultimately, to more information about in vivo pharmacokinetics and pharmacodynamics (PKPD).