Introduction

The olfactory system detects and distinguishes among a large number of structurally diverse odorant molecules. In the nasal cavity of vertebrates, volatile molecules interact with odorant receptors (ORs), which are located in the cilia of olfactory sensory neurons. Odorant binding to ORs activates a transduction cascade that leads to the production of action potentials, which are transmitted to the brain (for review, see Menini et al. [1]).

Mammalian ORs were first identified in 1991 by Buck and Axel [2]. Because ORs are difficult to express on the membrane surface of heterologous cells, only a limited number of relations between ORs and odorant molecules have been established to date (for reviews see Mombaerts [3] and Godfrey et al. [4]). One odorant can activate numerous types of ORs, while a single OR can be activated by several different odorants [5]. Each type of odorant molecule is recognized through the activation of a unique combination of ORs, in a combinatorial code. ORs have been subdivided into two classes [6, 7]: Class #1 ORs, the so-called fish-like ORs, which share a rather high sequence identity (SI), and Class #2 ORs, the mammalian-like ORs, which comprise all the others. Class #1 ORs bind structurally similar ligands [5, 8], such as aliphatic acids, alcohols and aldehydes, whereas Class #2 ORs bind structurally divergent odorants [5, 8–19].

ORs belong to Class A of the GPCR superfamily, the so-called Rhodopsin-like Class [20]. They feature seven transmembrane (TM) helices in the membrane domain, an extracellular N-terminus and an intracellular C-terminus [20]. Based on the significant divergence of SI and on the results of site-directed mutagenesis experiments, helices TM3, TM4, TM5 and TM6 are believed to form the binding pocket [1, 14, 21]. Since experimental structures at the atomic level are still lacking, structural information on ORs is essentially based on computational methods.

As for the prediction of the entire transmembrane domain, structural models of some ORs were based on the bacteriorhodopsin X-ray structure [22–24] or on a low-resolution (7.5 Å) electron-microscopy map of rhodopsin [21, 25–29]. Molecular-dynamics simulations of one of these models, the rat OR I7 [28], helped to identify the dissociative pathway along which the ligand accesses the pocket [30]. Models of ORs (MOR31-2, MOR32-4, MOR33-1, MOR42-1, MOR42-3, MOR103-15, MOR175-1 and MOR204-32) were also predicted using molecular-simulation-based protocols [31–34]. These models were structurally rather different from each other and from rhodopsin, although several positions in TM3 and TM6, which are involved in ligand binding, are conserved across the family. The binding pockets of these models were formed by helices TM3 to TM7 along with extracellular loops. Free-energy calculations based on these models agreed with experimentally determined binding affinities [9–11, 15].

Man et al. [35] suggested that residues that might be involved in ligand binding in ORs are located on helices TM2 to TM7 and on the second extracellular loop. Their work was based on a comparison between mouse and human OR sequences, under the assumption that functional contact residues would be conserved among pairs of orthologous receptors, but considerably less conserved among paralogous pairs [35]. Recently, Katada et al. [14] published a study that combines computational biology with molecular biology tools. They performed several site-directed mutagenesis experiments on the MOR174-9 OR to identify residues involved in ligand binding and selectivity, and recognized nine crucial amino acids (Ser113, Phe206, Asn207, Thr211, Leu212, Phe252, Thr255, Ile256 and Leu259) located on helices TM3, TM5 and TM6. Models based on the X-ray structure of bovine rhodopsin [36], the only GPCR for which an X-ray structure has been determined, were fully consistent with the experimental results [14].

The reliability of homology models for ORs has been questioned, based on first-principles simulations and the low SI between rhodopsin and ORs (in the case of those considered in this work it ranges between 14 and 20%) [31–34]. However, homology models with low SI (ranging from 10 to 30%), based on rhodopsin and other templates, have been reported for a variety of membrane proteins [37–39]. These models provided insights into structure/function relationships for these systems. In addition, the recent observation that structural features are well conserved across all class A GPCRs [40], which include ORs and rhodopsin, has led to the conclusion that OR rhodopsin-based structural predictions may be reliable, strengthening a posteriori the conclusions of Katada et al. [14].

Prompted by these studies, we have here extended the rhodopsin-based modeling to 29 ORs for which ligand affinity data have been measured (Table 1). Because of the aforementioned structural conservation, we postulate that: (1) the presence of conserved residues in ORs binding ligands possessing the same functional groups may provide hints on their functional role, in spite of the low SI across the family. This assumption is further substantiated a posteriori by an analysis of the sequences, which shows that, based on this assumption, ORs binding to specific ligands feature conserved residues that bind to those ligands. (2) The relative orientation of the seven TM helices is similar across the OR family, in spite of the diversity of their sequence [2, 5, 35].

Table 1 Residues in the putative binding pocket of mouse ORs identified by this study

In addition, the structural information for MOR174-9 [14] may be included in the modeling for most ORs (Table 1), given the structural conservation among members of GPCR family [40]. Our calculations permit us to identify a number of residues located on helices TM3, TM4, TM5 and TM6, which may play a role in odorant binding for 23 out of the 29 ORs considered (Fig. 1 and Table 1). For these receptors, mutations in the putative binding site were also suggested. These mutations could be used to validate the model against molecular biology experiments.

Fig. 1

Side (a) and top (b) views of a representative structural model of ORs (only the \(C_{\alpha}\) carbons of the TM helices are shown). The positions of the residues that may be involved in ligand binding are indicated. Nine of them, labelled I to IX, coincide with those previously obtained for the MOR174-9 receptor [14]. Figures were created with the VMD 1.8.2 program [47].

Materials and methods

Sequence alignments

The sequences of mouse ORs for which binding affinity data are available [5, 8, 10–18] (Fig. 2) were aligned using the ClustalW multiple alignment program [41].
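
The sequence identity (SI) values quoted in the Introduction can be obtained from any pair of rows of such an alignment. The following is a minimal sketch, not the procedure used in this work: the function name `percent_identity` and the toy sequences are hypothetical, and the choice of denominator (only columns where both sequences carry a residue) is an assumption, since several conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of an alignment.

    Assumption: columns with a gap ('-') in either row are excluded from
    the denominator. Other conventions (e.g. dividing by the length of
    the shorter sequence) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not taken from any receptor sequence.
toy_a = "MSLA-TYFLLG"
toy_b = "MALAQTY-LIG"
si = percent_identity(toy_a, toy_b)
print(f"SI = {si:.1f}%")
```

In this toy case nine columns are compared and seven are identical.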

Fig. 2

Sequences of all mouse ORs for which ligand affinity data have been measured [5, 8, 10–18]. In the absence of a commonly agreed recent nomenclature for ORs, we have used the nomenclature proposed by Zhang and Firestein [7], in which Class #1 OR subfamilies are given numbers lower than 100 and Class #2 OR subfamilies are given numbers higher than 100. (For a comparison among different OR names see Table 5 by Godfrey et al. [4].) Only the sequences corresponding to helices TM3, TM4, TM5 and TM6 are reported here. Basic, acidic, polar and Cys residues are colored in blue, red, green and yellow, respectively. Residues that have been shown to bind eugenol in MOR174-9 [14] are enclosed in black boxes. Positions I to IX and 1 to 14, discussed in the text, are labelled. The position in TM4 mentioned in the text is indicated with a red arrow above the alignment. Classes and specific ligands that bind to the corresponding ORs are labelled on the right side. These are: a alcohols, c carboxylic acids, b bromocarboxylic acids, d dicarboxylic acids, e alkylic aldehydes, f ketones, g benzaldehyde, h citronellal, i eugenol, j carvone, k acetophenone, l piperonal, m ethyl-vanillin, n vanillin, o citral, p limonene, q coumarin, r isovaleric acid, s cinnamaldehyde, t 2-octenal, u lyral, v chromanone, w 2-coumaranone [5, 8, 10–18]. Small differences in the response profiles of ORs to odorants have been found using different experimental systems by Malnic et al. [5] and Saito et al. [8]. In case of discrepancy, we used the most recent experimental results obtained by Saito et al. [8], because their newly developed system of OR expression makes it possible to perform site-directed mutagenesis experiments and to functionally test the odorant ligand specificity, whereas the pioneering studies by Malnic et al. [5] were performed on isolated olfactory sensory neurons.

Three-dimensional structural models

Three-dimensional (3D) structural models of ORs (Fig. 1) were constructed using the MODELLER 6.2 program [42]. The calculations were based on the X-ray structure of bovine rhodopsin [43]. Sequences of ORs were aligned against the rhodopsin sequence using ClustalW [41]. The experimental information about residues that bind eugenol in the MOR174-9 receptor [14] was included into the alignment. To achieve this, (1) we considered the positions of the residues known to bind eugenol in the MOR174-9 receptor, which are believed to face the active-site cavity [14]. (2) We looked at the alignments of MOR174-9 with the other receptors and identified the corresponding positions of the latter proteins. These positions were assumed to belong to residues facing the active-site cavity. (3) The alignments of OR sequences against bovine rhodopsin sequence were modified manually according to this information. The alignments are available as supplementary material.
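
Steps (1) and (2) above amount to transferring residue numbers from a reference receptor to another receptor through the columns of their alignment. The sketch below illustrates this bookkeeping under stated assumptions: the function names and the toy sequences are hypothetical, residue numbering is assumed to start at 1, and the manual adjustment of step (3) is not modelled.

```python
from typing import Dict, Optional, Tuple


def residue_to_column(aligned_seq: str, residue_number: int) -> int:
    """Return the 0-based alignment column that holds a given residue (1-based)."""
    seen = 0
    for col, ch in enumerate(aligned_seq):
        if ch != "-":
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError(f"residue {residue_number} is not in the sequence")


def column_to_residue(aligned_seq: str, col: int) -> Optional[Tuple[int, str]]:
    """Return (residue number, amino acid) at a column, or None for a gap."""
    if aligned_seq[col] == "-":
        return None
    number = sum(1 for ch in aligned_seq[: col + 1] if ch != "-")
    return number, aligned_seq[col]


def transfer_positions(
    ref_aligned: str, target_aligned: str, ref_positions: Dict[str, int]
) -> Dict[str, Optional[Tuple[int, str]]]:
    """Map labelled reference residues onto the equivalent target residues."""
    if len(ref_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have equal length")
    mapped = {}
    for label, resnum in ref_positions.items():
        col = residue_to_column(ref_aligned, resnum)
        mapped[label] = column_to_residue(target_aligned, col)
    return mapped


# Hypothetical toy alignment and labels, not real receptor data.
ref = "MS-FNTLAF"
tgt = "MTHY-DLGF"
labels = {"I": 2, "II": 3, "III": 4}
print(transfer_positions(ref, tgt, labels))
```

A label that falls on a gap in the target is returned as `None`, which flags the alignment regions where a manual adjustment of the kind described in step (3) would be needed.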

Energy minimization based on the parm99 AMBER force field [44] was carried out for all the 3D-models investigated here. The analysis presented in the Results and discussion section refers to such models.

Results and discussion

Comparisons among ORs

Here we attempt to provide a rationale for ligand binding to mouse ORs (Fig. 2) by first investigating the role of the nine residues, identified as constituents of the binding site in MOR174-9 (Ser113, Phe206, Asn207, Thr211, Leu212, Phe252, Thr255, Ile256 and Leu259) [14]. For purposes of clarity, we will refer to the positions occupied by these residues, as well as by the residues in equivalent positions in the other ORs, as positions I–IX throughout the text. We constructed approximate 3D-models for the 15 Class #1 ORs (Fig. 2) and the 13 Class #2 ORs (MOR103-15, MOR106-1, MOR106-13P, MOR171-2, MOR175-1, MOR203-1, MOR204-32, MOR258-5). However, for ORs MOR118-1, MOR136-6, MOR174-4, MOR267-13 and MOR276-1, the experimentally available information and/or the similarity with other ORs is not sufficient to make reliable predictions for the binding pocket regions. Consequently, these ORs were not considered further.

We will first focus on Class #1 ORs, which share some common ligands, and later on Class #2 ORs, whose ligands are more variable.

Class #1 ORs

The odorant specificities of 15 Class #1 ORs have been tested experimentally [5, 8]. Aliphatic carboxylic acids were found to bind to 12 of the 15 Class #1 ORs, while dicarboxylic acids bind to the remaining three: MOR42-1, MOR42-3 and MOR13-6 (Table 1). We investigate here the role of residues in positions I–IX in the binding of aliphatic carboxylic and dicarboxylic acids, in which the charged groups are at the ends of the chains, for the experimentally tested Class #1 ORs.

Position IV is always occupied by an Asp residue. We suggest, at a speculative level, that this Asp could interact with a basic position in TM4 (indicated by a red arrow in Fig. 2) that is conserved across this class of ORs. This position may therefore serve structural and/or functional purposes, as previously noted by Malnic et al. [5]. Position I has polar residues in most ORs, with the exception of MOR42-1 and MOR42-3, which have a Cys in position I. Most ORs show polar residues at position IX and at least one polar residue in positions II and III. Therefore, some of the residues in positions I–IV and IX might serve as polar/charged anchors for the carboxylate group present in the odorants. By contrast, positions V–VIII are mostly occupied by apolar residues, except for MOR42-1 and MOR42-3, in which polar residues are mostly present. As a result, positions V–VIII might form a hydrophobic pocket that accommodates the aliphatic tail of the ligands, except in MOR42-1 and MOR42-3, in which the polar pocket might bind the additional carboxylate group present in dicarboxylic acids. In MOR13-6, which also binds dicarboxylic acids, the only polar residue that may bind the second carboxylate group is the Ser residue in position V, in addition to positions I and II. Nevertheless, at present, it is difficult to state whether this Ser residue is involved in the binding.
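
The position-by-position comparison above can be sketched as a tally of residue classes at one alignment position across receptors. This is an illustration only: the class boundaries (for example, counting Tyr as polar and His as basic), the function names and the toy data are assumptions made here, loosely following the coloring of Fig. 2.

```python
from collections import Counter
from typing import Dict

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Assumed grouping; everything not listed is treated as apolar.
RESIDUE_CLASSES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "polar": set("STNQY"),
    "cys": set("C"),
}


def residue_class(aa: str) -> str:
    """Classify a one-letter amino acid code under the assumed grouping."""
    aa = aa.upper()
    if aa not in STANDARD:
        raise ValueError(f"not a standard amino acid: {aa!r}")
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    return "apolar"


def position_profile(residues: Dict[str, str]) -> Counter:
    """Count residue classes at one position, given {receptor: residue}."""
    return Counter(residue_class(aa) for aa in residues.values())


# Hypothetical receptors and residues at a single position.
toy_position = {"OR_a": "S", "OR_b": "T", "OR_c": "C", "OR_d": "L"}
profile = position_profile(toy_position)
print(profile.most_common())
```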

Class #2 ORs

Ligands binding to Class #2 ORs exhibit a very large structural diversity. In the experimentally tested receptors, positions I, II and IX are occupied mostly by apolar groups, positions IV and VII by polar groups, whilst the chemical nature of positions III, V, VI and VIII is similar to that of Class #1 ORs. However, as expected from the great variability of Class #2 ligands, no significant common features could be identified for these positions.

Models built on rhodopsin’s structure

By constructing 3D-models of ORs, we could identify 14 other positions (hereafter 1–14 for the sake of clarity) located on helices TM3, TM4, TM5 and TM6, which face the active-site cavity and, therefore, may play an important role in binding the odorants (Fig. 1). In this paper we analyze the possible role of all the residues we identified as putative constituents of the binding pocket and the ligand specificity for some ORs (Table 1 and Fig. 2). The main residues forming the binding pocket are summarized in Table 1. Residues that may form H-bonds with ligands are in italics. As our analysis is based on static models, at present it is not possible to determine whether additional residues exist that are not located in the cavity but might still participate in the binding in a dynamic manner.

Most of the Class #1 ORs that bind n-aliphatic carboxylic acids have a His residue in position 3, except for MOR40-1 (Asn), MOR42-1 and MOR42-3 (Tyr) (Fig. 2). Some of these ORs also bind n-aliphatic alcohols. His in position 3 could form an H-bond with the polar group of the ligands. As can be seen from Fig. 1, usually there is a second polar residue present (positions 4, 5 and 9) that can form a second H-bond with the carboxylate group of carboxylic acids. The aliphatic tail of these ligands might be surrounded by apolar residues in positions 10, II and III in TM5 and 12, VI, VII, VIII, 13 and IX in TM6.

The main difficulties in predicting which residues are responsible for binding carboxylic acids and/or alcohols are expected for the Class #2 ORs MOR106-1, MOR106-13P, MOR203-1 and MOR204-32. They show sequence diversity and therefore, as can be seen from Table 1, the distribution of polar and/or charged residues in the binding sites is very different within the class and with respect to Class #1 ORs. As a result, it is not possible to formulate a unique model of the binding pocket. For MOR106-1 and MOR203-1, which bind carboxylic acids, we can propose two different binding models. In MOR106-1, the ligand carboxylate group can bind to Thr in either position 5 or position VII. In MOR203-1, one possibility is that Thr in position VII and Tyr in position 13 bind the odorant carboxylate group; the other is that the ligand is coordinated by Thr in position 5 and Asn in position III.

MOR106-13P and MOR204-32 bind alcohols. The first could bind the ligand hydroxyl group with Ser either in position 5 or VII. The second could bind the ligand either with Thr in position 5 or VII, or with Ser in position I.

MOR42-1 and MOR42-3, which bind aliphatic dicarboxylic acids, exhibit polar residues in positions 3 at TM3, and in positions 11, VII, 13, IX at TM6. MOR42-3 also shows a polar residue in position 9 at TM5. Thereby, we may assume that there are two available sites for the binding of the two carboxylate groups. MOR13-6, which also binds aliphatic dicarboxylic acids, exhibits polar residues in positions 5 at TM3 and 13 at TM6. These also could be two sites for binding the two carboxylate groups.

Some ORs that bind aliphatic or aromatic aldehydes, such as MOR103-15, MOR31-6 and MOR171-2 (Table 1 and Fig. 2), exhibit a polar residue at position VIII, which is mostly occupied by apolar residues in other ORs. These residues could form H-bonds with the carbonyl group of the ligands. Two other ORs that bind aldehydes, MOR175-1 and MOR258-5, do not have polar residues in position VIII. However, MOR175-1, which also binds ketones, exhibits polar groups in positions 3 and VII, which can form H-bonds. MOR258-5 shows a Ser in position I. As MOR258-5 binds ligands that may form several H-bonds, Ser in position I is, at present, the only candidate in the putative binding site likely to be involved in H-bond formation with ligands.

Comparison with other models

Our proposal that position 3 forms H-bonds to ligands in most of the ORs that bind carboxylic acids and in MOR175-1 is consistent with previous models [31]. Our hypothesis that Thr in position VII of MOR204-32 forms an H-bond with its ligands (heptanol and hexanol) is also consistent with the observations of Floriano et al. [31].

Man et al. proposed a set of 22 amino-acid positions that are important for the ligand binding site [35]. Sixteen of them are located in helices TM3, TM4, TM5 and TM6 [35]. Ten of these positions, located mostly in helices TM3 and TM4 (1, 2, 3, 4, 5, I, 7, 8, III, VIII) are also proposed to be involved in ligand binding in this work, where the main discrepancies pertain to helices TM5 and TM6.

TM4 is predicted to be part of the stable core of Class A GPCRs [45, 46]. This helix has a position that is conserved in most Class A proteins and contains Trp (Trp or Tyr in many ORs), which H-bonds to a conserved Asn on TM2. In addition, we have shown that in all Class #1 ORs there is a position in TM4 that is always occupied by a basic residue when position IV in TM5 is occupied by an Asp residue. We have therefore assumed that this information is sufficient to include the TM4 helix in our model for all Class #1 ORs.

Finally, the ab initio model of MOR103-15 pointed to Lys164, located in TM4 (position 8), as the crucial residue for binding aldehydes [32]. This model is supported by free-energy calculations, which proved to be in agreement with experiment. However, our modeling suggests that this residue lies in the vicinity of the binding pocket, probably forming a salt bridge with Asp285, whereas Ser in TM6 (position VIII) is the best candidate for binding the carbonyl group of aldehydes. Several observations support this view. First, most ORs that bind aldehydes have a polar or a Cys residue in position VIII in TM6. Second, it has been shown that in MOR174-9 the most crucial residue for binding the carbonyl group is Ser in TM3 (position 3) [33], even though this OR binds an aldehyde (heptanal) and also has the basic residue His in TM4. Third, MOR171-2 has a Glu residue in position 8 and binds the ligands acetophenone and benzaldehyde, which cannot be H-bond donors. Fourth, MOR276-1 has the polar residue Gln in this position, but it binds a completely apolar ligand, limonene. Consequently, this issue could be resolved by including molecular biology data.

Concluding remarks

Our calculations suggest that: (1) residues in positions I–IV, IX and V–VIII might form electrostatic and van der Waals interactions with most ligands of Class #1 ORs; (2) in Class #2 ORs, positions I, II and IX are occupied mostly by apolar groups, positions IV and VII by polar groups, and the nature of the residues at positions III, V, VI and VIII is similar to that of Class #1 ORs; (3) 14 other positions (indicated as 1–14 for clarity), in addition to I–IX, may play an important role in ligand binding.

It is clear that the possibility of performing further mutagenesis experiments could dramatically improve the general comprehension of the odorant recognition process and eventually provide a straightforward validation to our models. We have here emphasized the achievement of Saito et al., [8] who have been able to express a set of 11 ORs by using identified accessory proteins. Therefore, it should now be experimentally possible to test the effect of site-directed mutagenesis experiments on ligand specificity for this set of ORs.

As for those ORs that bind carboxylic acids (MOR23-1, MOR31-2, MOR31-4, MOR31-6, MOR32-5, MOR32-11 and MOR203-1) [8], the critical experiment is the substitution of the polar residues that constitute the binding site with apolar residues (see Table 1). Apart from MOR203-1, which shows many polar residues facing the binding pocket and possibly participating in carboxylic acid binding, His in position 3 seems the best candidate for a mutation that changes ligand binding affinity (Fig. 3). Since all ORs that bind carboxylic acids show two very close polar residues, in positions 3 and 9 (MOR31-2, MOR31-4 and MOR31-6), 3 and 4 (MOR23-1), 3 and II (MOR32-5, MOR32-11), 5 and III or VII and 13 (MOR203-1) (Fig. 2), a double mutation could be necessary to observe a complete loss of affinity for carboxylic acids.
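
The pairs of positions listed above can be collected in a small lookup table from which candidate single and double mutants are enumerated. This is only a bookkeeping sketch: the position pairs are those given in the text, whereas the names `POLAR_PAIRS` and `mutation_plan` are hypothetical, and the choice of the replacing apolar residue is deliberately left open.

```python
from typing import Dict, List, Tuple

# Pairs of close polar positions per receptor, as listed in the text.
POLAR_PAIRS: Dict[str, List[Tuple[str, str]]] = {
    "MOR31-2": [("3", "9")],
    "MOR31-4": [("3", "9")],
    "MOR31-6": [("3", "9")],
    "MOR23-1": [("3", "4")],
    "MOR32-5": [("3", "II")],
    "MOR32-11": [("3", "II")],
    "MOR203-1": [("5", "III"), ("VII", "13")],
}


def mutation_plan(receptor: str) -> Dict[str, list]:
    """List the positions to mutate singly and the pairs to mutate together."""
    pairs = POLAR_PAIRS[receptor]
    # dict.fromkeys keeps first-seen order while removing duplicates.
    singles = list(dict.fromkeys(pos for pair in pairs for pos in pair))
    return {"single": singles, "double": list(pairs)}


for name in ("MOR23-1", "MOR203-1"):
    print(name, mutation_plan(name))
```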

Fig. 3

Schematic representations of the interactions between several ORs, for which mutagenesis experiments have been predicted, and their ligands. a–c Examples of binding of aliphatic carboxylic acids in MOR31-2, MOR31-4 and MOR31-6, respectively. Half-rounds in (a) and (c) indicate unfavorable interactions between the aliphatic tails of ligands and polar residues in position VIII. d and e Right and wrong orientation of pentanal bound to the binding site of MOR31-6, respectively. f Binding of dicarboxylic acids in MOR42-1 and MOR42-3. A redundant number of possible hydrogen bonds is shown because several candidate residues are suitable for forming H-bonds with the second carboxylate group of the ligand.

Some mutations could be suggested to modify the affinity for odorants belonging to the same chemical class. For example, MOR31-4 binds n-aliphatic carboxylic acids with a chain length ranging from six to ten carbon atoms, whereas MOR31-6 binds only pentanoic acid (and pentanal) (Fig. 3 and Table 1) [8]. We believe that the main reason for this difference lies in the nature of the amino-acid residue in position VIII (a Ser in MOR31-6 and a Gly in MOR31-4). Therefore, in principle, mutation of Gly257 in MOR31-4 to Ser or to a bulky residue should reduce the receptor's affinity for long-chain carboxylic acids. However, the very low accuracy of our models does not allow us to establish this point firmly. Conversely, mutation of Ser256 in MOR31-6 to Gly should allow the receptor to bind longer carboxylic acids. In addition, mutation of Ser256 to Gly or to an apolar residue should reduce the affinity of MOR31-6 for pentanal, which would validate the hypothesis that position VIII is involved in the binding of aldehydes.

In the case of MOR42-1 and MOR42-3, which bind dicarboxylic acids, we proposed that the binding pocket contains two polar regions, each constituted by several residues (Fig. 3 and Table 1). We propose that replacing the polar residues in one of these two regions with apolar residues should alter the ORs' affinity for dicarboxylic acids, whereas the ability to bind at least one carboxylate group should be preserved.