
1 Introduction

The free fatty acid receptors are members of the rhodopsin family of G protein-coupled receptors (GPCRs) and relay binding of endogenous free fatty acids at the cell surface into intracellular activation of a heterotrimeric G protein. Currently, five GPCRs have been identified that bind free fatty acids with different levels of specificity. The free fatty acid receptor 1 (FFA1), previously known as GPR40, preferentially binds long- and medium-chain fatty acids with more than 12 carbon atoms (Briscoe et al. 2006; Itoh and Hinuma 2005). The free fatty acid receptor 2 (FFA2), previously known as GPR43, and the free fatty acid receptor 3 (FFA3), previously known as GPR41, respond to short-chain fatty acids with fewer than 5 carbon atoms (Brown et al. 2003). FFA1–3 share more than 30% sequence identity and belong to the branch of nucleotide receptors in the phylogenetic tree of the rhodopsin family of GPCRs (Fig. 1). The free fatty acid receptor 4 (FFA4), also known as GPR120, preferentially binds long-chain fatty acids (Hirasawa et al. 2005), like FFA1, but has low sequence identity with FFA1 (<19%) and other members of the family. Phylogenetically, FFA4 clusters with several orphan receptors. The last receptor, GPR84, is activated by medium-chain fatty acids with 7–12 carbon atoms (Wang et al. 2006), has low similarity with FFA1–4 (<21%) and branches with the prostanoid receptors. Since free fatty acid signalling is a molecular messenger system regulating energy storage, metabolism and inflammation, these receptors are thought to be involved in numerous metabolic and inflammatory conditions such as obesity, type 2 diabetes, atherosclerosis, cardiovascular diseases, ulcerative colitis, Crohn’s disease and irritable bowel disease (Ulven 2012; Dranse et al. 2013; Milligan et al. 2014; Bindels et al. 2013).
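The chain-length preferences quoted above can be summarised as a small lookup. The following Python sketch is only an illustration of that summary: the function name is hypothetical, the cutoffs are the carbon counts given in the text, and applying the same ">12 carbons" cutoff to FFA4 as to FFA1 is an assumption, since the text only states that FFA4 prefers long-chain fatty acids.

```python
def candidate_receptors(n_carbons):
    """Return the receptors whose quoted chain-length preference covers n_carbons.

    Cutoffs follow the text; the FFA4 cutoff is an assumption (see above).
    """
    hits = []
    if n_carbons > 12:            # long- and medium-chain: FFA1 (and, by assumption, FFA4)
        hits += ["FFA1", "FFA4"]
    if n_carbons < 5:             # short-chain: FFA2 and FFA3
        hits += ["FFA2", "FFA3"]
    if 7 <= n_carbons <= 12:      # medium-chain: GPR84
        hits.append("GPR84")
    return hits


# Linoleate has 18 carbons, propionate 3 and decanoate 10.
for name, carbons in [("linoleate", 18), ("propionate", 3), ("decanoate", 10)]:
    print(name, candidate_receptors(carbons))
```

Real ligand preferences are graded rather than all-or-nothing, so such a lookup is at best a first orientation.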

Fig. 1

A phylogenetic tree of the rhodopsin family of GPCRs. The phylogenetic tree was constructed using the sequence of the transmembrane helices by the online GPCRDB server (www.gpcrdb.org). The circular tree type, neighbour-joining distance calculation and no bootstrapping options were used. Clusters to which the free fatty acid receptors belong are highlighted

Knowledge of the three-dimensional structure of a receptor is crucial for understanding the molecular mechanisms underlying diseases as well as for structure-based design of small-molecule modulators. The FFA receptors share the overall topology typical of GPCRs: seven alpha-helices that cross the cell membrane and are connected by three extracellular and three intracellular loops, with the N terminus on the extracellular side and the C terminus on the intracellular side of the receptor. The recent breakthrough in structural biology of GPCRs has yielded crystallographic structures for 27 receptors of the rhodopsin family of GPCRs, including the FFA1 receptor. The FFA1 structure was solved in complex with the ago-allosteric ligand TAK-875 (Srivastava et al. 2014) and provides a crucial starting point for studying ligand recognition at FFA1. However, FFA2, FFA3, GPR120 and GPR84 have not yet been crystallised. The crystallisation of GPCRs remains a challenging and lengthy process, and it will take years to obtain the structures of the other free fatty acid receptors. Solving GPCR structures in complex with various ligands is also labour-intensive and time-consuming. Therefore, computational techniques such as homology modelling and ligand docking will play an important role in mapping ligand-receptor contacts and providing working hypotheses for site-directed mutagenesis of the receptors.

Homology modelling in combination with mutagenesis data provides indirect structural information and has proven insightful for understanding ligand-protein interactions in many GPCRs. In homology modelling, the structure of the protein is constructed from its amino acid sequence and an experimental structure of a related homologous protein. A detailed procedure for GPCR homology modelling is provided in a recent article by Costanzi (Costanzi 2012). Crystal structures as well as homology models of GPCRs serve as a basis for exploring ligand binding through various molecular docking protocols. Ligand docking provides a rapid answer as to how and where a ligand binds in a receptor and predicts key interactions important for ligand recognition. Besides scoring functions, the obtained docking pose is evaluated based on its agreement with mutagenesis data and structure-activity relationships (SARs) of the ligands. GPCR docking has recently been reviewed in several studies (Costanzi 2013; Beuming et al. 2015; Weiss et al. 2016; Levit et al. 2012). The ability to explain experimental data and accurately predict the results of new experiments defines the quality of receptor modelling.

In this chapter, we will focus on molecular modelling of the FFA structures and ligand binding using available structural information, mutagenesis data and ligand SARs. We will analyse the recently published FFA1 crystal structure and assess past and future modelling of ligand interactions at FFA1. We then examine the application of available structural templates in modelling of structures and prediction of ligand-receptor interactions for other FFA receptors. In the text, we will provide the Ballesteros-Weinstein index (Ballesteros and Weinstein 1995) for residues in the transmembrane helices as a superscript. This index represents X.Y, where X is the helix number and Y is the residue number relative to position 50, which is assigned to the most conserved residue in a given helix.
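The index arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any published workflow: the function names are hypothetical, and the X.50 reference positions for FFA1 are not taken from an alignment but back-calculated from indices quoted later in this chapter (for example, R183 at 5.39 implies that position 5.50 is residue 194).

```python
# Residue number of the most conserved (X.50) position in each FFA1 helix.
# Assumption: values are back-calculated from indices quoted in this chapter
# (K62 = 2.60, Y91 = 3.37, H137 = 4.56, R183 = 5.39, Y240 = 6.51, R258 = 7.35).
FFA1_X50 = {2: 52, 3: 104, 4: 131, 5: 194, 6: 239, 7: 273}


def bw_index(helix, resnum, x50=FFA1_X50):
    """Return the Ballesteros-Weinstein index X.Y for a transmembrane residue."""
    return f"{helix}.{50 + resnum - x50[helix]}"


def bw_to_resnum(index, x50=FFA1_X50):
    """Convert an index such as '5.39' back to a sequence residue number."""
    helix, position = index.split(".")
    return x50[int(helix)] + int(position) - 50


print(bw_index(5, 183))      # index of R183
print(bw_to_resnum("7.35"))  # residue number at position 7.35
```

Because the index is defined relative to a conserved position, equivalent residues in different receptors share the same index even when their sequence numbers differ.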

2 Free Fatty Acid Receptor 1

2.1 X-Ray Crystallography of FFA1

In 2014, an X-ray crystallographic structure of the human FFA1 in complex with the ago-allosteric ligand, TAK-875, provided unprecedented three-dimensional insight into molecular recognition at FFA1 (Srivastava et al. 2014). The crystal structure revealed that TAK-875 binds to a site involving transmembrane helices 3–5 and the second extracellular loop (EL2). In the binding site, one oxygen atom of the carboxyl group of the ligand forms hydrogen bonds with two arginines, R183^5.39 and R258^7.35, whereas the other oxygen atom forms hydrogen bonds with two tyrosines, Y240^6.51 and Y91^3.37; the dihydrobenzofuran ring of TAK-875 is stabilised via π-π stacking with W174^EL2 and F142^EL2; the biphenyl scaffold protrudes into the interhelical gap between helices 3 and 4; and the methylsulfonyl linker points towards the extracellular side (Fig. 2a, b).

Fig. 2

The binding mode of TAK-875 in the crystal structure of FFA1. (a) The overall view of the binding site. (b) The zoomed view of the hydrogen bonding between the carboxyl group of TAK-875 and four residues of the binding site. (c) The binding site of TAK-875 in the presence of water molecules. Hydrogen bonds and π-π stacking are in pink and blue, respectively. The crystal water molecules are shown in light cyan and the water molecules with a low B factor are shown in dark cyan. Residues involved in ligand binding, water-mediated interactions and coordination of the arginine pairing, as well as used in site-directed mutagenesis, are shown in sticklike representation. Numbering of receptor helices in Roman numerals

Interestingly, the guanidinium side chains of R183^5.39 and R258^7.35 are in an unusual charge-pairing interaction. Despite both being positively charged residues, these arginines do not repel each other but are in planar stacking, with a distance of 3.6 Å between the Cζ atoms. From the detailed analysis of the crystal structure, it becomes clear that the electrostatic repulsion between the positively charged guanidinium groups is balanced by the surrounding environment. The glutamate residue E172^EL2 acts as a counterion and forms an ionic bridge with R258^7.35, interacting with two hydrogens of the guanidinium (Fig. 2c). Two other guanidinium donors of R258^7.35 are involved in a direct hydrogen bond with Y240^6.51 and a water-mediated hydrogen bond with N244^6.55. In the case of R183^5.39, two guanidinium donors form a direct hydrogen bond with the backbone of L171^EL2 and a water-mediated hydrogen bond with the backbone of W174^EL2 and E172^EL2. Considering that the guanidinium group can be involved in five hydrogen bonds, R258^7.35 and R183^5.39 have one and three guanidinium donors, respectively, that are free from interactions and could coordinate the carboxyl group of a ligand (Fig. 2). Indeed, the carboxyl group of TAK-875 forms one hydrogen bond with R258^7.35 and two hydrogen bonds with R183^5.39, thus fully balancing the arginine pairing.
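A distance such as the Cζ-Cζ separation quoted above can be measured directly from a coordinate file. The Python sketch below is a minimal illustration that reads fixed-column PDB ATOM records with the standard library only. The helper names are hypothetical, and the two example records carry made-up coordinates chosen to lie 3.6 Å apart; they are not taken from the FFA1 crystal structure.

```python
import math


def parse_atom(line):
    """Parse one fixed-column PDB ATOM/HETATM record into a small dict."""
    return {
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "resseq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def find_atom(lines, resname, resseq, atom_name):
    """Return the first atom matching residue name, residue number and atom name."""
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            atom = parse_atom(line)
            if (atom["resname"], atom["resseq"], atom["name"]) == (resname, resseq, atom_name):
                return atom
    raise KeyError(f"{resname}{resseq} {atom_name} not found")


def cz_cz_distance(lines, resseq_a, resseq_b):
    """Distance in angstroms between the CZ atoms of two arginine residues."""
    a = find_atom(lines, "ARG", resseq_a, "CZ")
    b = find_atom(lines, "ARG", resseq_b, "CZ")
    return math.dist(a["xyz"], b["xyz"])


def make_atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a minimal ATOM record with PDB column positions (for the demo only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Hypothetical coordinates, NOT crystallographic data.
demo = [
    make_atom_line(1, "CZ", "ARG", "A", 183, 0.0, 0.0, 0.0),
    make_atom_line(2, "CZ", "ARG", "A", 258, 0.0, 0.0, 3.6),
]
print(f"{cz_cz_distance(demo, 183, 258):.1f} A")
```

With a real structure, the same functions would be applied to the lines of the deposited coordinate file; chain selection and alternate locations are ignored here for brevity.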

Although this is the first arginine pairing observed in GPCRs, short-range arginine-arginine interactions are found in many other proteins, where these residues are involved in the binding of nucleotides, organic acids and other negatively charged bioactive molecules. From an analysis of 67,520 crystal structures containing clusters of positively charged arginines, Neves and colleagues (Neves et al. 2012) demonstrated that guanidinium groups tend to sit in polar pockets or are exposed to solvent. In a much earlier study, computer simulations by Magalhaes and colleagues (Magalhaes et al. 1994) predicted the importance of the solvent environment for arginine pairing. As in other proteins, the two pairing arginines in FFA1 are solvent exposed and interact with hydrophilic residues. It is believed that controlling a countercharge in arginine-arginine interactions represents a biological regulatory mechanism. I propose here that stabilising the arginine-arginine pairing might be crucial for activation of FFA1.

In addition to the water molecules involved in the arginine pairing, several other water molecules in the binding site are observed in the crystal structure at 2.33 Å resolution (Fig. 2c). Although the water molecules do not interact with the ligand directly, they are engaged in a hydrogen-bonding network with the residues of the binding site. There are three water molecules with a low B factor that are engaged in several hydrogen bonds with the receptor (Fig. 2c). One water molecule links R258^7.35 via a water-mediated interaction with N244^6.55, the second water molecule is within the extracellular tip of helix 7 and the third water molecule is trapped in EL2. Given the recognised importance of water molecules in GPCRs (Mason et al. 2012), it is likely that the ordered water molecules found in the FFA1 binding site also play a critical role in ligand binding and receptor modulation.

Our recent examination of the crystal structure has revealed K62^2.60 as a third positively charged residue in the binding cavity of FFA1 (Tikhonova and Poerio 2015). This residue lies in the binding site because a proline kink causes unwinding of helix 2. Notably, the residue at the same position in the peptide (PAR1, opioid and CXCR4), lysosphingolipid (S1P1) and nucleotide (P2Y12) receptor crystal structures contributes to ligand recognition.

The intracellular side of helix 6, which is known to move away from the helical bundle during receptor activation (Rasmussen et al. 2011), is in the inwards position, suggesting that the FFA1 receptor in the crystal structure is in the inactive state.

Four point mutations, F88A, L42A, G103A and Y202F, were made to improve expression and thermal stability of FFA1 for crystallisation. The alanine mutation at position 3.34 is located close to TAK-875 (5 Å). Our docking study indicates that the phenylalanine of the wild type at this position slightly changes the docked position of the biphenyl moiety of TAK-875 (Tikhonova and Poerio 2015).

2.2 Molecular Modelling of FFA1

The FFA1 X-ray structure follows a nearly decade-long progression of knowledge of the binding site at FFA1 and other structural features based on homology modelling and mutagenesis. The first structural model of FFA1 was obtained through an iterative approach that combined rhodopsin-based homology modelling and receptor mutagenesis (Tikhonova et al. 2007). Rhodopsin, a light-activated receptor found in retinal rod cells, was the only template available in 2007 for modelling GPCRs, before the breakthroughs in GPCR crystallography. Although the sequence identity with FFA1 in the transmembrane helices is only 16%, the first model of FFA1 helped to identify the putative binding cavity within helices 3, 4 and 5 and the positively charged residues R183^5.39 and R258^7.35 anchoring the negatively charged carboxyl group of agonists. The homology model in conjunction with mutagenesis also predicted Y91^3.37, Y240^6.51, N244^6.55, H137^4.56 and L186^5.42 to form interactions with an agonist. From the available FFA1 crystal structure bound to TAK-875, it is clear that Y91^3.37 and Y240^6.51 are in hydrogen bonding with the carboxyl group of the ligand and not in aromatic and hydrophobic contacts, as was predicted by homology modelling, and that H137^4.56, L186^5.42 and N244^6.55 do not form a direct contact with the ligand. Overall, the hydrophobic tail of the ligand points into the gap between helices 3 and 4 and is not situated inside the helical bundle as was previously predicted.

Because EL2 is buried in the helical bundle in rhodopsin and prevents ligand interactions with the anchoring residues, the first FFA1 model was built without EL2 (Tikhonova et al. 2007). The next model of FFA1 was constructed using a template based on the β2-adrenergic receptor, the first GPCR with a diffusible ligand to be crystallised (Sum et al. 2009). Although sequence conservation in the transmembrane region is also low (18%) and there is no homology in EL2, the β2-adrenergic-based model allowed mapping of the solvent-accessible orientation of EL2 and predicted ionic interactions between two glutamates of EL2, E145 and E172, and two arginines of the transmembrane helices, R183^5.39 and R258^7.35. Molecular dynamics (MD) simulations of the receptor and mutagenesis of E145^EL2 and E172^EL2 suggested that these ionic interactions act as ionic locks that keep the receptor in an inactive state and break upon receptor activation. The FFA1 crystal structure has confirmed the existence of the ionic interaction between E172^EL2 and R258^7.35 (Fig. 2). However, it remains unclear whether this ionic interaction breaks upon activation, as the available crystal structure represents an inactive conformation of the receptor. E145^EL2 is outside the binding cavity and in hydrogen bonding with S178^EL2 in the experimental structure (Fig. 2a).

FFA1 homology modelling based on templates with 16–18% sequence identity helped to map the ligand-binding area and anchoring residues but was not precise in predicting the type of interactions and the ligand-binding mode (Tikhonova and Poerio 2015). This conclusion agrees with the results of the community-wide GPCR homology modelling assessments conducted by the Abagyan and Stevens labs (Michino et al. 2008; Kufareva et al. 2011, 2014). According to these assessments, templates with a sequence identity of at least 30–35% allow accurate homology models to be built.
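As a rough illustration of how a sequence-identity figure is obtained and compared with the 30–35% guideline, the following Python sketch computes percent identity from two pre-aligned sequences. It is a simplified sketch under stated assumptions: identity is counted over aligned, non-gap columns only (other definitions of the denominator exist), the function names are hypothetical, and the toy sequences are invented rather than taken from any receptor.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def meets_guideline(identity, threshold=30.0):
    """True if a template reaches the lower bound of the 30-35% guideline."""
    return identity >= threshold


# Toy alignment (invented sequences): 5 identical residues in 6 gap-free columns.
print(round(percent_identity("ACDE-FG", "ACDQLFG"), 1))

# Identities quoted in this chapter for FFA1 templates and for FFA1 as an FFA2 template.
for label, identity in [("rhodopsin -> FFA1", 16), ("beta2-adrenergic -> FFA1", 18),
                        ("FFA1 -> FFA2", 32)]:
    print(label, meets_guideline(identity))
```

A threshold of this kind is only a heuristic; local conservation in the binding site and loops can matter more than the global figure.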

2.3 Docking to the FFA1 Crystal Structure

The crystal structure of FFA1 provides a good starting point for ligand docking. In our recent work, we used different docking protocols, including standard docking, induced fit docking and docking to conformations obtained from a conformational search of the binding site residues, to explore docking of linoleate, GW9508, TUG-770, AMG837, AM1638 and AM8182 in the wild-type receptor (Tikhonova and Poerio 2015). The carboxyl group of all the compounds is coordinated by two arginines and two tyrosines in all docking studies, similar to TAK-875. However, the position of the hydrophobic tail of the ligands differs. While a standard docking protocol with a rigid receptor yields a TAK-875-like binding mode for most of the agonists, in which the hydrophobic tail is in the interhelical gap between helices 3 and 4, docking to the flexible receptor predicts, in addition, a new binding mode, in which the hydrophobic tail of the agonists points into the gap between helices 4 and 5. Whether this mode truly exists or is an artefact of docking could be tested by receptor mutagenesis. For example, residues facing one of the interhelical gaps could be mutated to bulky ones so that they hinder binding to that gap and thereby validate binding to the other gap.

Figure 3 shows different binding modes using linoleate as an example. For linoleate, in addition to the two modes of binding in the gaps between helices (Fig. 3a, b), there are two other possible binding modes (Fig. 3c, d). In one of these modes, the carboxyl group has the same interactions with the four residues, but the hydrophobic tail is within the helical bundle, around helices 1, 2 and 7. In the other mode, the carboxyl group forms hydrogen bonds with the arginines from the extracellular side, and the rest of the molecule occupies the sites of the water molecules protruding between helices 6 and 7. Although docking of the flexible linoleate is the least reliable, it helps to generate hypotheses as to how the agonistic activity of linoleate could be amplified by synthetic agonists, like TAK-875, at a structural level. For example, targeting the arginines from the extracellular side by substituting existing water-mediated contacts could still leave the possibility of targeting the arginines from the side involved in interactions with the tyrosines.

Fig. 3

Putative binding sites in FFA1. (a–d) Docking solutions for linoleate binding at FFA1. (e) The surface-based representation of the FFA1 extracellular cavity with contiguous dummy atoms that define the putative binding sites. Prediction of the binding sites was made with SiteMap (SiteMap 3.3; Schrodinger, LLC, New York, NY, USA 2014b). The surface is coloured based on the electrostatic potential. Hydrogen bonds are in pink. Numbering of receptor helices in Roman numerals

Interestingly, docking of the bulky AM1638 does not place the ligand between helices in either way and instead gives a solution in which the ligand is located within the extracellular side (Tikhonova and Poerio 2015). Since mutagenesis studies suggest that AM1638 does not bind to the arginines (Lin et al. 2012; Luo et al. 2012), the carboxyl group of AM1638 might be coordinated by other charged or hydrophilic residues within the extracellular binding cavity. We have previously shown that K62^2.60 is the third positively charged residue in the extracellular side and could potentially interact with AM1638 (Tikhonova and Poerio 2015).

That the ligands could have different binding modes and occupy different binding sites at FFA1 can also be seen from a SiteMap search (Schrodinger, LLC, New York, NY, USA 2014b) (Fig. 3). This computational technique places dummy atoms and clusters contiguous dummy atoms into putative sites in several areas of the FFA1 extracellular binding cavity: sites around the arginine pairing and a site between helices 2 and 7, involving K62^2.60.

X-ray structural data now provide greater insight into ligand recognition than earlier approaches. Given that four residues are involved in coordinating the carboxyl group of an agonist, we could hypothesise that more variation is allowed in the distances between the ligand-coordinating atoms of these residues than if only one residue, an arginine, chelated the carboxyl group (see the example of GPR120). Therefore, replacement of the carboxyl group with nonclassical bioisosteric groups is likely to be better tolerated, providing a new route to further modify agonists and improve ligand selectivity.

Examination of the available structures of antagonists (Holliday et al. 2012) shows that antagonists either lack a carboxyl group or have a carboxyl group that is less accessible. This suggests that antagonists probably do not interact with the four anchoring residues or interact with only a few of them.

In summary, new mutagenesis efforts will be instrumental in mapping the precise binding mode of orthosteric and allosteric agonists and antagonists. The recent development of a fluorescent agonist could further assist in distinguishing between orthosteric and allosteric sites at FFA1 (Christiansen et al. 2016). The crystal structure is also helpful in initiation of MD simulation studies to probe flexibility of the receptor and explore the dynamics of the arginine pairing and the surrounding hydrogen bonding network in relation to ligand binding and receptor activation. In addition, simulations of the empty form of the receptor could help to clarify the position of helix 4 and therefore the dynamics of the interhelical gaps, further clarifying the possibility of different ligand-binding modes.

3 Free Fatty Acid Receptor 2

The initial structural model of FFA2 was built based on the crystal structure of the β2-adrenergic receptor with sequence identity of 18%. This model, in conjunction with mutagenesis, was used for an understanding of the structural basis of selectivity at FFA2/FFA3 (Schmidt et al. 2011) and FFA2 species orthologues (Hudson et al. 2012a) for short carboxylic acids; elucidation of molecular determinants in binding of potent synthetic agonists, cmp 1 and cmp 2, in human and rodent orthologues of FFA2 (Hudson et al. 2013); and identification of possible ionic locks of activation (Hudson et al. 2012a, b), similar to FFA1.

The recently published crystal structure of FFA1 provides a more suitable template for the modelling of FFA2. FFA1 and FFA2 share 32% sequence identity and have conserved anchoring residues for the carboxylate of the ligand. It should be noted, however, that as receptor subtypes they have relatively low sequence identity in comparison with the adrenergic, muscarinic and opioid receptor subtypes, which have over 60% sequence identity. This could suggest more structural divergence in the ligand-binding site between FFA1 and FFA2. Indeed, our recent modelling study suggests that the FFA2 binding cavity is notably smaller than that of FFA1 as a result of a more extensive hydrogen bonding and aromatic network, created by nonconserved residues (Tikhonova and Poerio 2015). In terms of physicochemical properties, the FFA2 binding site is more aromatic than those of FFA1 and FFA3.

Redocking of selective carboxylic acids to the new FFA2 model and comparison with the FFA1-based FFA3 model suggest that Y90^3.33, I145^4.61 and E166^EL2 in FFA2 (F96^3.33, Y151^4.61 and L171^EL2 in FFA3) are likely responsible for ligand selectivity at FFA2/FFA3 (Tikhonova and Poerio 2015).

We compare docking of cmp 1, an agonist, and CATPB, an antagonist, in the new FFA1-based homology model in Fig. 4. Similar to FFA1, the carboxylate of the agonist can form interactions with two arginines and two tyrosines (Fig. 4). In contrast, docking of CATPB predicts that the carboxylate could form interactions with only the arginines. More importantly, a recent mutagenesis study suggests that CATPB preferentially binds to R255^7.35, whereas GLPG0974, another antagonist, binds to R180^5.39 (Sergeev et al. 2016). It appears that the carboxylates of agonists and antagonists coordinate the arginine network of interactions differently. While agonists are able to form hydrogen bonds with two arginines and two tyrosines, antagonists are unable to form interactions with all of these residues. As in FFA1, I suggest that modulation of the arginine pairing is critical in triggering receptor activation. The arginine pairing in FFA2 is also stabilised by the countercharged E166^EL2, similar to FFA1.

Fig. 4

Ligand binding at FFA2. (a) The agonist binding mode using the example of cmp 1 (Hudson et al. 2013). (b) The antagonist binding mode using the example of CATPB. (c, d) The binding mode of 4-CMTB in the orthosteric and allosteric sites. Hydrogen bonds and hydrophobic interactions are in pink and yellow, respectively

The different hydrogen bonding networks of the carboxylate in agonists and antagonists are likely due to variation in the hydrophobic and aromatic moieties. Docking predicts that the phenylthiazole moiety of cmp 1 points towards helices 4 and 5, while the remaining group is directed towards helices 3 and 4 (Fig. 4a). The two aromatic moieties of CATPB point in similar directions (Fig. 4b). Unlike in FFA1, it appears that the interhelical space between helices 3 and 4 is less accessible. Although it seems that the phenylthiazole moiety, conformationally restricted by an ortho substituent in the phenyl ring, stabilises the position of the carboxylate in the agonist so that it can form hydrogen bonds with four polar residues, the precise position of this moiety needs to be further validated by mutagenesis.

Finally, the FFA1-based homology model of FFA2 was helpful in understanding the two-step activation process of 4-CMTB (also known as AMG7703) (Grundmann et al. 2016) at a molecular level. Innovative label-free biosensors and functional assays have shown that 4-CMTB briefly activates the receptor through the orthosteric site and subsequently induces prolonged activation through the allosteric site. We used the new FFA2 homology model to identify interactions of 4-CMTB with the orthosteric and allosteric sites and, using steered MD simulations, to characterise a precise scenario of dynamic binding that highlighted key interactions with the receptor. In particular, the new model has helped to establish four residues, S86^3.29, Y90^3.33, I145^4.61 and E166^EL2, in the orthosteric site and K65^2.60 in the allosteric site (Fig. 4c, d). Similar to FFA1, K65^2.60 is within the binding cavity and not outside it, as was predicted in earlier models based on the β2-adrenergic receptor. Simulations of 4-CMTB binding and unbinding suggested that the ligand enters the receptor through the extracellular tips of helices 4 and 5; moves to the orthosteric site composed of S86^3.29, Y90^3.33, I145^4.61, H242^6.55 and R180^5.39; and subsequently, with the help of the inter-site translator residues Y165^EL2 and Y90^3.33, occupies the allosteric site by interacting with K65^2.60. The obtained allosteric binding pose of 4-CMTB is consistent with the SAR studies (Smith et al. 2011; Wang et al. 2010). In the binding site, the carbonyl group of the amide group of 4-CMTB forms a hydrogen bond with K65^2.60, while the hydrogen of the amide group forms a hydrogen bond with the backbone of C164^EL2. The importance of the amide group acting as both a hydrogen bond acceptor and donor in binding of 4-CMTB to the receptor is confirmed by the absence of activity for analogues containing amino or methylamide groups (Smith et al. 2011; Wang et al. 2010).
The isopropyl moiety of 4-CMTB packs against helices 2 and 3, forming hydrophobic interactions with L61^2.56, W75^EL1, T85^3.28, F89^3.32 and the hydrophobic tail of K65^2.60. The small hydrophobic pocket does not allow the isopropyl group to be enlarged. Indeed, replacement of the isopropyl group with bulkier groups led to a reduction of potency (Smith et al. 2011). The polar thiazole ring of 4-CMTB points towards the extracellular side and is in π-π stacking with F73^EL1. The contribution of the aromatic interaction to the potency of the ligand is demonstrated by the absence of activity for analogues with acetylaminoethyl and cyclooctyl groups (Smith et al. 2011). The remaining phenyl group of 4-CMTB points towards the region covered by EL2 and is in hydrophobic and aromatic interactions with Y165^EL2 and Y90^3.33.

Overall, the homology model built based on a template with sequence identity of 32% together with mutagenesis data enabled the characterisation of binding of orthosteric agonists and antagonists as well as a first sequential activator. Unequal binding to two arginines suggests that planar pairing of arginines is differently stabilised in the presence of agonists and antagonists and could be crucial for receptor activation.

4 Free Fatty Acid Receptor 3

FFA3 has higher sequence identity in the transmembrane domains with FFA2 (49%) than with FFA1 (33%). The first model of FFA3 was constructed based on the β2-adrenergic receptor and used for an understanding of a preference in binding of short carboxylic acids with sp3 over sp2 hybridised alpha-carbons (Hudson et al. 2012a, b).

The recently available FFA1 crystal structure provides a better quality template for FFA3 modelling. As in FFA1 and FFA2, this receptor has two conserved arginines and two tyrosines for coordination of the carboxyl group, suggesting a similar network of interactions involving the arginine pairing (Fig. 5). Interestingly, FFA3 has L171^EL2 instead of the glutamate of FFA1 and FFA2, indicating that this residue cannot sit near the arginines to stabilise the planar pairing. It is likely that the conformation of EL2 in FFA3 is notably distinct from that in FFA1 and FFA2, bringing available hydrophilic or negatively charged residues of EL2 close to the arginine pairing to balance this interaction. Indeed, there is low sequence homology in EL2 with FFA1. In the binding site of FFA3, there is a third arginine at position 2.60, in contrast to the lysine residue at this position in FFA1 and FFA2. The role of this residue in receptor function is unclear; however, it likely contributes to ligand recognition, similar to the lysine in FFA2, and, owing to its close spatial proximity, clusters with the other two arginines in planar pairing, thus potentially modulating receptor activation.

Fig. 5

The extracellular binding cavity of FFA3. The residues anchoring the carboxyl group of free fatty acids, L171 of the second extracellular loop, and R71^2.60, the third positively charged residue in the extracellular binding cavity, are visualised

The medicinal chemistry of FFA3 is the less developed of the two short-chain fatty acid receptor subtypes, with only a series of allosteric modulators with modest activity available to date. Pharmacological studies have shown that mutation of R185^5.39 and R258^7.35 to alanine does not change the binding of these compounds (Hudson et al. 2014a, b). To further assess how FFA3 recognises these aromatic compounds, the residues F96^3.33, Y100^3.37, Y151^4.61, Y170^EL2 and F173^EL2, which form a putative binding site, could be mutated.

With the availability of a reasonable homology model of FFA3, progress in the development of ligands could be accelerated in the years to come. Indeed, the structural data allow a move from traditional high-throughput screening methods to cheaper and more efficient virtual screening approaches for the identification of novel ligands. In a recent study, Huang and colleagues used homology models of the poorly characterised receptors GPR68 and GPR65 for virtual screening and identified potent ligands (Huang et al. 2015). A similar approach could be applied to FFA3 to find potent modulators for future pharmacological and modelling studies of the receptor.

5 Free Fatty Acid Receptor 4 (GPR120)

Although GPR120 binds long- and medium-chain fatty acids like FFA1 as well as some synthetic FFA1 agonists, the anchoring site for the carboxylate and the overall location of the binding site are distinct. While two arginines and two tyrosines of FFA1 are involved in coordinating the oxygen atoms of the carboxylate, these residues are absent in GPR120. Instead, R99^2.64 chelates the carboxyl group of the ligand by forming two hydrogen bonds, as predicted initially by the rhodopsin-based homology model of GPR120 (Negoro et al. 2010; Takeuchi et al. 2013). The binding site is located in the centre of the helical bundle and involves helices 2, 3, 5, 6 and 7 (Hudson et al. 2014a, b) (Fig. 6a). This is different from FFA1, where the TAK-875 binding site is on the side of the extracellular cavity within the helical bundle and is created by helices 3, 4, 5 and 6 (Fig. 3e).

Fig. 6

The binding site of FFA4. (a) The surface-based representation of the FFA4 extracellular cavity. (b) The superimposition of the β2-adrenergic and OX2-based homology models based on the backbone of the helices. The models are in pink and cyan ribbons, respectively. (c) The binding mode of TUG-891. R99^2.64 coordinates the carboxylate of TUG-891. The surface is coloured based on the electrostatic potential

Recently, Hudson and colleagues reported docking of several agonists to a GPR120 homology model built on the active state of the β2-adrenergic receptor (24% sequence identity) and supported by residue mutagenesis (Hudson et al. 2014a, b). The authors suggested that, besides R992.64, W104EL1 forms a hydrogen bond with the carboxylate, while the hydrophobic residues F882.53, F1153.29, W2075.38, F2115.42, I2806.51, W2776.48, F3037.35 and F3047.36 create a favourable environment for the hydrophobic tails of linoleate, TUG-891, TUG-670 and GW9508 (Hudson et al. 2014a, b). The study predicts that V2125.43 and I2816.52 form a hydrophobic pocket for the methyl group of TUG-891 (Hudson et al. 2014a, b). Unlike the other ligands, GW9508 interacts with F3117.43 and not F882.53, suggesting a difference in the binding pose. The importance of the hydrophilic residues T1193.33, T3107.42 and N2155.46 in ligand binding was also demonstrated.
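The residue labels used here combine a one-letter amino acid code, a sequence number and either a Ballesteros-Weinstein index (helix.position, where position 50 marks the most conserved residue of the helix) or a loop name. In plain text the index runs directly into the sequence number, so R99 with index 2.64 appears as R992.64. The following minimal Python sketch shows one way such labels could be split apart. The function and type names are illustrative, and the parser assumes a single-digit helix number (1 to 7) and a two-digit position.

```python
import re
from typing import NamedTuple, Optional


class ResidueLabel(NamedTuple):
    amino_acid: str            # one-letter code, e.g. "R"
    number: int                # sequence number in the receptor, e.g. 99
    helix: Optional[int]       # transmembrane helix 1-7, None for loop residues
    position: Optional[int]    # index within the helix (50 = most conserved)
    loop: Optional[str]        # loop name such as "EL2", None for helix residues


# Assumption: helix number is one digit (1-7) and position is two digits,
# so the last four characters "h.pp" are the Ballesteros-Weinstein index.
_TM_PATTERN = re.compile(r"^([A-Z])(\d+?)([1-7])\.(\d{2})$")
# Loop residues carry a loop name instead, optionally separated by a space.
_LOOP_PATTERN = re.compile(r"^([A-Z])(\d+)\s*((?:EL|IL)\d)$")


def parse_label(label: str) -> ResidueLabel:
    """Split a flattened residue label into its components."""
    label = label.strip()
    match = _TM_PATTERN.match(label)
    if match:
        aa, number, helix, position = match.groups()
        return ResidueLabel(aa, int(number), int(helix), int(position), None)
    match = _LOOP_PATTERN.match(label)
    if match:
        aa, number, loop = match.groups()
        return ResidueLabel(aa, int(number), None, None, loop)
    raise ValueError(f"unrecognised residue label: {label!r}")


if __name__ == "__main__":
    for text in ("R992.64", "W104EL1", "F1153.29", "I2806.51"):
        print(text, "->", parse_label(text))
```

Because the position index allows residues to be compared across receptors regardless of their sequence numbers, separating the two parts is a convenient first step before comparing binding sites.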

Recent advances in the structural biology of GPCRs have provided alternative templates for homology modelling of GPR120. Among the currently available templates, the orexin receptors (OX1 and OX2), whose crystal structures were published recently (Yin et al. 2015, 2016), have the highest sequence conservation, 30% in the transmembrane region. Moreover, EL2 of the OX2 receptor has a similar length and some similarity (16%), especially in the region after the disulphide bridge, making the OX crystal structures more suitable templates than other available GPCR crystal structures. We have built an improved GPR120 model using these templates to evaluate the ligand-binding site. Unlike the β2-adrenergic-based model of GPR120, in which only the side chain of R992.64 clearly points to the centre of the extracellular cavity, in the new model both the backbone and the side chain of R992.64 are orientated towards the centre of the helical bundle (Fig. 6b). This somewhat affects ligand docking. Figure 6c shows the docked pose of TUG-891, in which the carboxyl group interacts only with R992.64. Other contacts are broadly similar to those in the β2-adrenergic-based model of GPR120, apart from W2075.38, F3037.35 and F3047.36, which likely have an indirect effect. The biphenyl moiety of TUG-891 is predicted to lie perpendicular to the helix and to be firmly locked in the pocket between helices 2–3 and 4–5. This pose is consistent with SAR data, provides steric reasons why large substituents in the para position of the terminal ring are unfavourable and explains the preference for the ortho position of the terminal ring over the meta or para positions. Comparison of the binding cavities of GPR120 and FFA1 shows that the GPR120 cavity is large and likely more exposed to the extracellular side, in contrast to the narrow cavity of FFA1, which is covered by EL2 and extends to the interhelical space between helices 3 and 4 (Figs. 3 and 6). In GPR120 the ligand needs to adopt a ‘V’ shape and should be relatively short but bulky, while in FFA1 the ligand is longer but needs to be narrower to squeeze between the helices and adopt a ‘U’ shape. The new model predicts that the side chains of L196EL2 and W198EL2 face the binding cavity. W198EL2 is conserved in the OX receptors, suggesting a similar position in GPR120. In the GPR120 model, W198EL2 forms a π-π stacking interaction with W2075.38, a critical residue for ligand binding (Hudson et al. 2014a, b).

The known microswitches of activation and inactivation for the rhodopsin family of GPCRs are conserved in GPR120, in contrast to the other free fatty acid receptors. Thus, W6.48 of the conserved ‘CWxP’ motif, the so-called toggle switch of GPCR activation, is conserved in GPR120 and predicted to be in close proximity to the ligand. In addition, GPR120 has conserved residues in the D[E]RY motif and an aspartate at position 6.30, which together form a salt bridge known as the intracellular ionic lock that stabilises the inactive state of the receptor. The tyrosine of the ‘NPxxY’ motif of GPR120 is predicted to interact with the tyrosine at position 7.53 in the active state, similar to rhodopsin and the β2-adrenergic receptor. Therefore, GPR120 likely has the typical mechanism of activation observed in many GPCRs of the rhodopsin family.

The validated GPR120 binding site, together with the improved homology model, could be further used to establish the binding properties of a recently identified antagonist (Sparks et al. 2014) and a biased agonist (Li et al. 2015). For instance, by introducing modelling-guided modifications in the cholecystokinin 2 (CCK2) receptor as well as in its biased antagonist, Magnan and colleagues (Magnan et al. 2013) were able to identify the key moiety in the ligand responsible for biased signalling and a microswitch of activation, involving M3.32 and Y7.43, that stabilises the β-arrestin active state of the CCK2 receptor. Such a bidirectional study could be explored in GPR120.

6 GPR84

The first attempt to delineate the binding site of GPR84 was made recently by Nikaido and colleagues (Nikaido et al. 2015). They used a GPR84 homology model based on a β2-adrenergic template with 20% sequence conservation to predict, and then validate by mutagenesis, the putative residues interacting with decanoic acid, a native agonist, and diindolylmethane, a synthetic agonist. They found that mutations of L1003.32, F1013.33 and N1043.36 dramatically altered the potency of decanoate. However, these residues are not important for binding of diindolylmethane. The authors suggested that N1043.36 plays an anchoring role for the carboxylate of the ligand at GPR84.

To further explore the properties of this receptor, we modelled its structure using the recently published crystal structure of the OX1 receptor (Yin et al. 2016), the template with the highest sequence identity, 31%. Importantly, sequence analysis of EL2 among the receptors with available crystal structures shows that EL2 of GPR84 has 35% identity with, and a similar length to, EL2 of rhodopsin (Fig. 7a). This is interesting, as EL2 is generally quite divergent, even within the same receptor family. From such relatively high similarity, we could suggest that EL2 of GPR84 has a similar β-sheet conformation and covers the cavity within the helical bundle, as in rhodopsin. We modelled the GPR84 structure using a hybrid template, in which EL2 of rhodopsin was attached to the OX1 crystal structure. The obtained model was subjected to a short optimisation using the molecular mechanics and dynamics tools of MacroModel (Schrödinger, LLC, New York, NY, USA 2014a). The resulting model is shown in Fig. 7b, c. The model predicts that R172 of EL2 points into the binding cavity within the helical bundle and might play a coordinating role for the carboxylate of the ligand. This residue could not have been predicted with the β2-adrenergic receptor template used by Nikaido and colleagues (Nikaido et al. 2015) owing to the absence of any similarity in EL2. In addition, there are several residues with hydrogen bonding capability, such as Y812.65, Y692.53, N1043.36, Y1865.45, N3396.55 and H3527.36, in the putative ligand-binding site. Interestingly, the asparagine at position 6.55 is conserved in the OX1 receptor, in which it forms a hydrogen bond with the ligand. Overall, the putative binding site is located at the centre of the binding cavity within the helical bundle and is created by aromatic residues, F170EL2, F1013.33, F3356.51 and W3607.43, and aliphatic residues, L732.57 and L1003.32. Unlike FFA1–4, GPR84 has a few aliphatic residues in the binding cavity.
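Template choice above rests on percentage sequence identity, both over the whole receptor and within EL2. As an illustration of how such a figure can be obtained from a pairwise alignment, the Python sketch below counts identical residues over the aligned, ungapped columns. The function name is illustrative, the two sequences are toy strings rather than real receptor sequences, and the choice of denominator (columns without gaps) is an assumption, since other conventions such as dividing by the length of the shorter sequence are also in use and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the ungapped alignment columns.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and use `gap` to mark insertions or deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns where both sequences have a residue
    identical = 0  # columns where those residues are the same
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments, not taken from GPR84 or rhodopsin.
    loop_a = "ACDEFGHIK-"
    loop_b = "ACDQFGH-KL"
    print(f"{percent_identity(loop_a, loop_b):.1f}% identity")
```

When identities from different sources are compared, it is worth checking that the same region (for example, the transmembrane helices only) and the same denominator were used.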

Fig. 7

A GPR84 homology model. (a) The sequence alignment of extracellular loop 2 (EL2) for GPR84 and rhodopsin. The putative anchoring R172EL2 and the corresponding residue in rhodopsin are highlighted. (b) A homology model based on a hybrid template involving the OX1 receptor and EL2 of rhodopsin. (c) The putative ligand-binding cavity of GPR84. Only potentially interesting residues for ligand binding are visualised

From sequence analysis and homology modelling, it is evident that the ligand-binding site of GPR84 is distinct from those of FFA1–4. GPR84 does not contain positively charged residues at positions 5.39 and 7.35, as in FFA1–3, or at position 2.64, as in FFA4. Moreover, no charged residues are found in the transmembrane bundle facing the putative binding cavity. The improved homology model based on the hybrid template predicts that EL2 could play an important role in anchoring a free fatty acid at GPR84 by means of a positively charged arginine. Given that GPR84 phylogenetically belongs to the prostanoid receptor subfamily, in which a positively charged residue is predicted to coordinate the negatively charged ligands, as exemplified by the prostacyclin receptor (Stitham et al. 2003), it is suggested that GPR84 could also attract the anionic part of the ligand via a countercharged residue.
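The comparison above amounts to asking which of a small set of candidate positions carries a positively charged residue in each receptor. The Python sketch below expresses that check. The function name and the dictionary layout are illustrative, the residue maps contain only what is stated in this chapter (arginines at 5.39 and 7.35 for FFA1–3, arginine at 2.64 for FFA4, none of the three for GPR84), and treating only arginine and lysine as positively charged is an assumption.

```python
from typing import Dict, List

# Assumption: only arginine (R) and lysine (K) count as positively charged.
POSITIVE_RESIDUES = {"R", "K"}

# Candidate carboxylate-anchoring positions named in the text.
ANCHOR_POSITIONS = ("2.64", "5.39", "7.35")


def cationic_anchors(residues: Dict[str, str]) -> List[str]:
    """Return the candidate positions occupied by a positively charged residue.

    `residues` maps a Ballesteros-Weinstein position to a one-letter amino
    acid code; positions that are missing are treated as not cationic.
    """
    return [pos for pos in ANCHOR_POSITIONS
            if residues.get(pos) in POSITIVE_RESIDUES]


# Patterns as described in the text; positions not mentioned are left out.
profiles = {
    "FFA1-3 pattern": {"5.39": "R", "7.35": "R"},
    "FFA4 pattern": {"2.64": "R"},
    "GPR84 pattern": {},  # no cationic residue at these positions
}

if __name__ == "__main__":
    for name, residues in profiles.items():
        anchors = cationic_anchors(residues)
        print(name, "->", anchors if anchors else "none in the helices")
```

An empty result for a receptor that still binds anionic ligands, as for GPR84 here, is what directs attention to residues outside the helices, such as the EL2 arginine proposed by the model.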

7 Summary and Future Directions

Recent progress in GPCR crystallography has enabled the elucidation of the structure of FFA1 and provided reliable templates (>30% sequence conservation) for homology modelling of the other free fatty acid receptors. The atomic models of the receptors can now be used to understand ligand recognition and, subsequently, for structure-based ligand design. In particular, it will be important to explore the binding modes of ligands with different chemotypes and different pharmacological profiles (orthosteric agonists and antagonists, allosteric modulators and biased ligands) for FFA1, FFA2 and GPR120 using molecular docking in combination with mutagenesis and ligand SARs. This knowledge should pave the way for the discovery of small-molecule drugs with improved properties and a higher chance of success.

For FFA3 and GPR84, receptors with a limited number of ligands available to date, it will be interesting to use homology models for virtual screening in the hope of identifying new binders and thereby expanding the arsenal of ligands for studying receptor pharmacology and physiology. Structure-based virtual screening of chemical libraries using an X-ray structure or even a homology model is now an accepted method of discovering new chemotypes for GPCRs (Ngo et al. 2016). Indeed, prior to FFA1 crystallisation, virtual screening based on the FFA1 homology model led to the identification of 15 compounds acting as agonists, partial agonists and antagonists (Tikhonova et al. 2008).

FFA1–3 represent the first GPCRs with an unusual arginine pairing interaction. It appears that the regulation of this interaction is crucial for defining the pharmacological properties of ligands. Further studies are required to explore the role of arginine charge pairing in receptor modulation.