Introduction

Mammalian aldehyde oxidases (AOXs, EC 1.2.3.1) and xanthine oxidoreductase (XOR, EC 1.17.3.2) are molybdo-flavoenzymes characterized by a high degree of structural similarity [1–4]. While a single XOR enzyme is known, the number of AOXs varies according to the mammalian species considered. The two extremes are represented by humans, which have a single AOX, i.e., hAOX1 (the human orthologue of mAOX1) [5, 6], and rodents, which synthesize four AOX isoenzymes, i.e., mAOX1, mAOX2 (previously known as mAOX3L1) [7], mAOX3 and mAOX4. Each mammalian AOX isoenzyme is the product of a distinct gene, and the current species-specific number of such genes is the result of a series of duplication and pseudogenization/inactivation or deletion events. While the substrate specificity of XOR is very restricted, AOXs catalyze the oxidation of various types of organic aldehydes into the corresponding carboxylic acids as well as the hydroxylation of various types of heteroaromatic rings [8]. The physiological substrates of XOR are known, as the enzyme catalyzes the oxidation of hypoxanthine into xanthine and of xanthine into uric acid, two key steps in the catabolism of purines. In contrast, the physiological substrate(s) of AOXs have not yet been identified. Given their broad substrate specificity and high expression levels in liver, human AOX1 as well as mAOX1 and mAOX3 play an important role in drug and xenobiotic metabolism [8–11]. In spite of this, the physiological function of AOXs is currently unknown, and the identification of physiological substrate(s) is likely to provide clues as to the significance of this small family of enzymes. In this context, it would be of particular importance to establish whether the four mAOXs are characterized by overlapping or unique substrate specificities. Indeed, this type of information is likely to provide clues as to why certain mammals are endowed with a multiplicity of AOX isoenzymes expressed in a tissue-specific fashion.

From a structural point of view, the primary products of all mammalian AOX genes are highly conserved proteins of approximately 150 kDa. In their catalytically active form, mammalian AOXs are characterized by a dimeric structure consisting of two identical subunits corresponding to the 150 kDa primary gene product. Each subunit consists of three domains: (a) an N-terminal ~25 kDa domain containing two non-identical 2Fe/2S redox centers; (b) an ~45 kDa intermediate domain containing the FAD-binding site; (c) an ~85 kDa domain containing the molybdenum cofactor (Moco), a molybdopterin cofactor present in all molybdoenzymes with the exception of nitrogenase. Thus, mammalian AOXs consist of a short chain of redox centers, which transfers the reducing equivalents deriving from the oxidation of the substrate to molecular oxygen with the production of superoxide anions [1].

Mouse AOX3 is the only mammalian AOX for which a crystal structure is available (mAOX3, PDB ID: 3ZYV [12]); no structural information is available for the mAOX1, mAOX2 and mAOX4 isoforms. In this report, we predict the three-dimensional structures of all the mouse AOX isoenzymes using the corresponding amino acid sequences and homology modeling techniques. We have optimized the mAOX structures through molecular dynamics simulations, and we provide insights into the structural elements modulating substrate specificity and activity. Finally, we have localized the isoenzyme-specific amino acid residues characterizing the three domains of each protein through a detailed structural and sequence analysis of the four proteins.

Materials and methods

Homology modeling and identification of isoenzyme-specific residues

The primary sequences of the four mouse AOX isoforms were obtained from the Universal Protein Resource (UniProt) database (http://www.uniprot.org; UniProt IDs: Q8R387, Q8VI15, Q5SGK3 and Q148T8 for mAOX1, mAOX3, mAOX2 and mAOX4, respectively).

The mAOX3 crystal structure (PDB ID: 3ZYV) was used as the template for the homology models of the other mAOX isoforms. The homology modeling procedures and the construction of some loops missing from the X-ray structure of mAOX3 were carried out with the program MODELLER v9.11 [13–16]. The overall stereochemical quality of the enzyme models was evaluated by thorough visual inspection and by the discrete optimized protein energy (DOPE) score [17]. The initial multiple sequence alignment of the primary sequences of the mAOX isoforms was performed using Clustal O (v1.2.1) as provided on the EBI webserver (http://www.ebi.ac.uk/Tools/clustalw2/index.html). The isoenzyme-specific amino acid residues were taken from a previously reported study [3], in which genome sequencing data for 84 different Aox1, Aox3, Aox3l1 and Aox4 genes were analyzed. The relevant residues of the mouse isoenzymes were then mapped onto the multiple sequence alignment.

Moco cofactor parameterization

The missing parameters of the Moco cofactor (molybdenum pentacoordinated to a molybdopterin cofactor and to sulfido, oxo and hydroxo ligands) were determined as follows. The Moco model structure was optimized using DFT, with the B3LYP exchange-correlation functional [18–20] and the 6-31+G(d) basis set for all atoms except molybdenum, for which the LanL2DZ pseudopotential was employed. A semiflexible model approach was used to calculate the force constants for the bond and angle parameters of the molybdenum metal center [21]. Electrostatic charges were determined from a RESP fitting of Merz–Kollman charges [22]. Dihedral force constants involving Mo were set to zero, while transferable van der Waals atomic parameters were taken from the literature [23]. The parameters for the two [2Fe–2S] centers were also taken from the literature [24].

Molecular dynamics simulation

The structures of the four mouse AOX isoforms were obtained from the homology modeling protocol described above. Geometry optimizations and molecular dynamics (MD) simulations of these enzymes were performed with the parameterization adopted in AMBER 12.0 [25], using the parm99SB [26] and GAFF [27] force fields for the protein and cofactors, respectively. These force fields provide a reliable description of the structure, energetics and dynamics of biomolecules [28–31]. All hydrogen atoms were added with the Amber program X-Leap, with all residues in their physiological protonation state. X-Leap was also used to add counter-ions (17, 21, 16 and 21 Na+ in mAOX1, mAOX3, mAOX2 and mAOX4, respectively) to neutralize the high negative charge of each system. An explicit solvation model with pre-equilibrated TIP3P water molecules was used, filling a truncated rectangular box with a minimum distance of 12 Å between the box faces and any atom of each enzyme. The average size of each system was ca. 170,000 atoms. All systems were minimized in two stages: first, the protein, cofactor and substrate were kept fixed and only the positions of the water molecules were minimized; afterwards, the full system was minimized. Subsequently, an MD simulation of 100 ps at constant volume and temperature with periodic boundary conditions was run, followed by 20 ns of MD simulation in the NPT ensemble, in which Langevin dynamics (collision frequency of 1.0 ps⁻¹) was used to control the temperature at 310.15 K [32]. These simulations were carried out using the PMEMD module. Bond lengths involving hydrogen atoms were constrained using the SHAKE algorithm, and the equations of motion were integrated with a 2-fs time step using the Verlet leapfrog algorithm [33]. The Particle-Mesh Ewald (PME) method [34] was used to include the long-range interactions, and the non-bonded interactions were truncated with a 10-Å cutoff. The MD trajectory was saved every 2 ps and the MD results were analyzed with the PTRAJ module of AMBER 12.0.
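The run lengths quoted above translate into step and frame counts by simple arithmetic. The following Python sketch is only a bookkeeping aid: the variable and function names are illustrative and are not part of any AMBER input file.

```python
# Bookkeeping for the MD protocol described in the text.
# All numeric values come from the text; the names are illustrative only.
TIME_STEP_FS = 2.0        # integration time step (fs)
NVT_LENGTH_PS = 100.0     # constant volume/temperature stage (ps)
NPT_LENGTH_NS = 20.0      # NPT stage (ns)
SAVE_INTERVAL_PS = 2.0    # trajectory saving interval (ps)


def steps_for(length_ps: float, time_step_fs: float = TIME_STEP_FS) -> int:
    """Number of integration steps needed to cover length_ps picoseconds."""
    return round(length_ps * 1000.0 / time_step_fs)


nvt_steps = steps_for(NVT_LENGTH_PS)             # steps in the 100 ps stage
npt_steps = steps_for(NPT_LENGTH_NS * 1000.0)    # steps in the 20 ns stage
steps_per_frame = steps_for(SAVE_INTERVAL_PS)    # steps between saved frames
npt_frames = npt_steps // steps_per_frame        # frames saved over 20 ns
last_10ns_frames = npt_frames // 2               # frames in the final 10 ns

print(nvt_steps, npt_steps, steps_per_frame, npt_frames, last_10ns_frames)
```

Under these settings the 20 ns stage corresponds to ten million integration steps and ten thousand saved frames, half of which fall in the final 10 ns of the trajectory.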

Funnel shape, dimer interface and SASA calculations

The free volume of each isoform substrate-binding site was measured using the software VolArea and included the space that is occupied by the metal cofactor [35].

The solvent accessible surface area (SASA) corresponds to the surface area of the molecule that is accessible to the solvent. In this report, we use it to measure the area of the binding site that can interact with a potential protein substrate (using a 1.4 Å probe in all calculations), and to study the amino acid residues important for protein dimerization. To do this, the dimer interface was analyzed using the average structure extracted from the last 10 ns of the MD simulation of the mAOX3 model.
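To make the SASA definition concrete, the sketch below implements a basic Shrake-Rupley style estimate with the 1.4 Å probe mentioned above. It is an illustration only and is not the program used in this work; the function names, the number of test points and the atomic radii passed in are assumptions.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n          # z strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def sasa_per_atom(coords, radii, probe=1.4, n_points=960):
    """Per-atom SASA (in Å^2) for atoms given as (x, y, z) tuples and radii in Å.

    Each atom is expanded by the probe radius; a test point on that expanded
    sphere counts as accessible if it lies outside every other expanded sphere.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(zip(coords, radii)):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px = ci[0] + big_r * ux
            py = ci[1] + big_r * uy
            pz = ci[2] + big_r * uz
            buried = False
            for j, (cj, rj) in enumerate(zip(coords, radii)):
                if j == i:
                    continue
                cut = rj + probe
                d2 = (px - cj[0]) ** 2 + (py - cj[1]) ** 2 + (pz - cj[2]) ** 2
                if d2 < cut * cut:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas


# Hypothetical two-atom example: a neighbouring atom reduces the accessible area.
isolated = sasa_per_atom([(0.0, 0.0, 0.0)], [1.7])
paired = sasa_per_atom([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.7, 1.7])
print(isolated[0], paired[0])
```

Per-residue values follow by summing the per-atom areas of the atoms belonging to each residue. The double loop is quadratic in the number of atoms, so a system of the size simulated here would need a neighbour list in practice.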

Results and discussion

Overall homology models comparison

To identify and compare the factors that modulate the substrate specificity and activity of the four mouse isoenzymatic forms (mAOX1, mAOX2, mAOX3, and mAOX4), we built three-dimensional models, as an X-ray crystal structure is available only for mAOX3. This was accomplished by homology modeling techniques using the structure of mAOX3 (PDB ID: 3ZYV [12]) as a template. All the homology models included only one monomer of the isoform structures. In the case of the mAOX3 model, the homodimer was later generated to study the impact of conserved residues on the dimerization protein interface and for comparisons with the available crystal structure.

The primary amino acid sequences of mAOX1, mAOX2 and mAOX4 show a high proportion of sequence identity with mAOX3 (61, 65 and 64 %, respectively, Fig. 1; Table 1). From the structural point of view, the homology models present good stereochemical quality parameters, as determined by PROCHECK20, with backbone Φ and Ψ dihedral angles of 98.5, 98.4 and 98.3 % of the residues located within the generously allowed regions (84.9, 85.1 and 84.1 % in the core region) of the Ramachandran plot, respectively. The mAOX3 structure template also has 98.7 % of its residues located in the generously allowed regions, with 84.5 % of the residues in the core of the Ramachandran plot (Table 1).
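Percentages of sequence identity such as those quoted above are computed from a pairwise alignment. The Python sketch below shows one common convention, in which identical residues are counted over the aligned columns that contain no gap. This convention is an assumption, since the text does not state which denominator was used, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue                      # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment: 5 gap-free columns, 4 of them identical.
print(percent_identity("ACDE-G", "ACDFYG"))  # 80.0
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives slightly different percentages for the same alignment.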

Fig. 1 RMSD values of the AOX isoforms (blue mAOX1, red mAOX3, green mAOX2, purple mAOX4)

Table 1 Percentages of sequence identity and stereochemical quality parameters obtained from the homology modeling protocol applied to mAOXs

To confirm the stable behavior of the proteins over the period of the simulation, the predicted three-dimensional structures of mAOX1, mAOX2 and mAOX4, along with the mAOX3 X-ray structure, were optimized through MD simulations (20 ns). The protein backbone root-mean-square deviation (RMSD) values, with the starting structures as reference, are shown in Fig. 1. These values ranged from 1.0 to 1.5 Å in the last 10 ns of the simulation, allowing us to conclude that each protein system had equilibrated. An inspection of the average isoform structures in that period also revealed that the overall fold and the secondary structural elements are stable and very similar to each other (RMSD values below 2.9 Å, Table 1). Thus, the structures of the isoforms are highly preserved and even more conserved than their amino acid sequences.
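For reference, the RMSD between two sets of N matched atoms is the square root of the mean squared distance between corresponding positions. The sketch below computes that quantity in Python. It assumes that the two coordinate sets have already been superposed, a step that trajectory analysis tools normally perform first; it is not the PTRAJ implementation, and the coordinates shown are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two pre-superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate sets of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical example: every atom displaced by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, frame))  # 1.0
```

Applying the function to the backbone atoms of every saved frame, with the starting structure as reference, yields the kind of time series that an RMSD plot displays.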

A closer inspection reveals that the isoform core region presents the highest degree of structural conservation. This region directly holds and contacts the metal cofactor and is, therefore, expected to be highly conserved among all isoforms, owing to the similarity of their catalytic reactions [3]. From the structural point of view, the deviations in the isoform structures are mainly found in loops located at the protein surface and in the substrate-binding site region (Fig. 2).

Fig. 2 a, b Structurally conserved regions in mAOX1, mAOX2, mAOX3, and mAOX4 (the blue and red regions represent the amino acid residues that are structurally conserved and unique, respectively). c Active-site regions sequentially conserved in all mAOX isoforms (the blue and red regions represent the amino acid residues that are structurally conserved and unique, respectively)

A closer view of the substrate-binding site indicates that the amino acid sequence of the active site is also highly conserved in all the structures considered (Fig. 2c). This suggests that the catalytic activity of the enzymes is similar. The region with the lowest sequence and structural similarity is located above the active-site region. This region modulates the shape and chemical nature of the funnel that gives access to the active-site region, indicating that it may influence the substrate specificity of the enzymes.

Identification and localization of the isoenzyme-specific amino acid residues

The data previously collected on multiple mammalian AOX proteins allowed the identification of the common and isoenzyme-specific residues [3]. The specific residues of each mouse isoenzyme are listed in Table 2. As variations in the substrate specificity and catalytic activity of some mouse AOX isoenzymes have been described [33], these residues are likely to be significant for the prediction of differences and similarities in the ability of each isoenzyme to bind and accommodate substrates inside the catalytic substrate-binding pocket. The localization of the isoenzyme-specific residues in the homology models indicates that the majority of them are located at the protein surface and in the core region of the Moco domain (Table 2). These observations are in line with the results obtained for the homology models above, where the major deviations between the different isoenzymes were found in the same regions. Nevertheless, in structural terms, the residues located at the protein surface should have no major impact on the catalytic activity or substrate specificity of the isoforms. The isoenzyme-specific residues located in the core region of the Moco domain can be divided into two groups: the residues located at the entrance of the active-site funnel (from mAOX2 and mAOX3, Table 2) and the residues in the dimerization site (all from mAOX1, Table 2).

Table 2 Identification of the isoenzyme-specific amino acid residues according to Garattini et al. [9] and corresponding protein domain localization

Analysis of the dimerization interface

The reason why AOXs are homodimers is unknown, since the monomeric subunits have been proven to be functionally active [6]. The isoenzyme-specific residues located in the dimerization site (mAOX1 Arg1068, Gly1069 and Glu1073; Table 2) are interesting and could be related to the association of the two monomers to form the homodimer. It is now generally accepted that residues buried in the protein interface between two monomers with exposed areas above 40 Å² are major contributors to the dimerization process [36]. These residues are often called hot spots. We investigated whether the mAOX1 isoenzyme-specific amino acid residues mentioned above correspond to hot spots in the dimer interface, but our results do not support this idea, given their low exposed area of only 18 Å².

To better study the dimerization interface and to identify other key residues, the mAOX3 dimer interface was also generated and analyzed. The results indicate that the dimer interface is small when compared to the total area of each monomer (5,000 vs. 106,495 Å², respectively). As potential hot spot candidates of the mAOX3 dimer, we identified six residues: Pro607, Glu760, Asn790, Asn1078, Tyr1029 and Ser1129 (Fig. 3). These amino acids are spread along the dimerization interface, but two regions are particularly important, since they contain hot spot residues from both monomers. The first region includes residue Pro607 from both chains (Region 1; Figs. 3, 4), while the other (Region 2; Figs. 3, 4) consists of residues Asn790 and Asn1078 from chain B and residue Tyr1029 from chain A. In these two regions, and particularly in Region 2, the hot spots of each monomer interact with each other, indicating that they may be important sockets for protein dimerization. These regions are ideal candidates for mutational studies aiming to develop targeted inhibitors that could be used to prevent protein dimerization.
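The 40 Å² criterion and the interface-to-monomer ratio discussed above can be expressed in a few lines of Python. In the sketch below the per-residue SASA values and residue labels are hypothetical placeholders, and the area relevant to the criterion is assumed to be the SASA that a residue loses on going from the isolated monomer to the dimer; the text does not spell out this detail.

```python
HOT_SPOT_CUTOFF = 40.0  # Å^2, threshold quoted in the text [36]


def buried_area(sasa_monomer, sasa_dimer):
    """Per-residue SASA lost upon dimerization (monomer value minus dimer value)."""
    return {res: sasa_monomer[res] - sasa_dimer.get(res, 0.0)
            for res in sasa_monomer}


def hot_spot_candidates(buried, cutoff=HOT_SPOT_CUTOFF):
    """Residues whose buried area exceeds the cutoff, in alphabetical order."""
    return sorted(res for res, area in buried.items() if area > cutoff)


# Hypothetical per-residue SASA values (Å^2); labels are placeholders.
monomer = {"ResA": 95.0, "ResB": 30.0, "ResC": 60.0}
dimer = {"ResA": 20.0, "ResB": 12.0, "ResC": 55.0}

buried = buried_area(monomer, dimer)
candidates = hot_spot_candidates(buried)

# Interface area relative to the monomer area, using the values from the text.
interface_fraction = 5000.0 / 106495.0

print(candidates, round(100.0 * interface_fraction, 1))
```

With the areas quoted in the text, the interface corresponds to roughly 4.7 % of the monomer surface, which is the sense in which the interface is described as small.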

Fig. 3 Residues in the dimerization site considered as hot spots in the mAOX3 structure are highlighted in orange. ChainA is represented in blue and ChainB in red. a Before dimerization and b after dimerization

Fig. 4 Two important binding sites (Region 1 and Region 2) in the mAOX3 dimerization interface (center). The left and right close-ups highlight, respectively, the most important residues in Region 1 (Pro607) and Region 2 (Asn790, Tyr1029 and Asn1078). The mAOX3 ChainA is represented in blue and ChainB in green

Except for Glu761 and Ser1129, the identified mAOX3 hot spots are not conserved in the other mouse isoforms (Fig. 1). These residues also constitute an interesting target for further studies on the dimerization process. It is generally accepted that all the catalytically active mouse isoforms are homodimers [3], but these new results show that they may differ in their propensity to dimerize.

Comparison of the isoenzyme active-site funnels

To further investigate the region of the funnel leading to the active site in each isoenzyme, the corresponding residues were displayed based on a solvent accessible surface area above 8 Å² (Fig. 5). These residues should therefore coincide with the ones that would interact most closely with the substrates of the isoenzymes. Unlike the other isoforms, and mAOX1 and mAOX3 in particular, the funnel of mAOX4 has a generally hydrophobic nature, although a few charged amino acid residues line up in the funnel close to the surface. From this it can be inferred that, besides being smaller, the substrates of mAOX4 should be mainly hydrophobic. The other isoforms are likely to accept larger and more polar substrates owing to the larger size of the funnel. In the case of mAOX1 and mAOX3, it cannot be excluded that the two isoenzymes accept charged substrates, as the binding site is populated with several charged amino acid residues.

Fig. 5 Shape of the binding site in mAOX1, mAOX2, mAOX3, and mAOX4 as well as residues in close contact with the binding site funnel. The yellow-colored residues indicate the amino acids conserved in all the mAOX isoforms. Cyan-colored residues indicate the amino acids whose nature is conserved in all the mAOX isoforms

Figure 5 also illustrates the region of the active site that is highly conserved from the sequential and structural point of view. The region containing the most conserved residues is located around the molybdenum ion and its ligands, and it contains the highly conserved Gln772, Lys889, Ser1085 and Glu1266 residues (mAOX3 numbering, Fig. 1). Gln772 and Glu1266 interact directly with the molybdenum cofactor and establish two important hydrogen bonds, with the axial oxygen bound to the molybdenum ion (2.6 ± 0.5 Å on average) and with the hydroxyl ligand (2.5 ± 0.4 Å on average), respectively. These interactions are deemed to be fundamental for the correct positioning of the cofactor in the active site and for the catalytic process, owing to the direct involvement of Glu1266 in the nucleophilic attack on the substrates. In the mAOX3 crystal structure, Gln772 and Glu1266 are, respectively, 3.5 and 4.0 Å away from the axial oxygen and the hydroxyl group, supporting the correctness of the models obtained from the simulation studies. The remaining residues, Lys889 and Ser1085, interact very closely with Glu1266 through three short hydrogen bonds. These interactions are conserved in all mouse isoforms, revealing that the position and conformation adopted by Glu1266 are important for the catalytic activity of the enzymes.
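Averages such as "2.6 ± 0.5 Å" are obtained by measuring an interatomic distance in every saved frame and reporting its mean and standard deviation. The Python sketch below illustrates this with hypothetical coordinates. The use of the population standard deviation is an assumption, because the text does not specify which estimator was used.

```python
import math
import statistics


def distance_series(atom_a_frames, atom_b_frames):
    """Distance between two atoms in every frame (one (x, y, z) tuple per frame)."""
    return [math.dist(a, b) for a, b in zip(atom_a_frames, atom_b_frames)]


def mean_and_sd(values):
    """Mean and population standard deviation of a list of distances."""
    return statistics.fmean(values), statistics.pstdev(values)


# Hypothetical two-frame trajectory for a donor/acceptor pair (Å).
donor = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
acceptor = [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

distances = distance_series(donor, acceptor)
mean, sd = mean_and_sd(distances)
print(f"{mean:.1f} ± {sd:.1f} Å")  # 2.5 ± 0.5 Å
```

In practice the series would contain one value per saved frame of the analysis window rather than the two hypothetical frames shown here.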

As expected from the previously reported mAOX3 crystal structure [12], Phe919 is another key residue in the active site and its position is conserved in all the mouse isoforms. This residue is proximal to Lys889 at the entrance of the active-site cavity and may play a role in the correct alignment of the substrates onto the active site. The presence, besides Phe919, of large hydrophobic residues characterized by aromatic rings (His745, Phe772 and Phe918 in mAOX1; Phe929 and Thr1092 in mAOX2; Phe919 and Thr1082 in mAOX3; Phe921 and Phe1083 in mAOX4) is another common feature of all AOX isoforms (Figs. 1, 5). These residues make the area close to the active site very tight, which implicates them in the correct alignment of the substrates for catalysis. In addition, all mAOX isoforms share a region in the middle of the funnel that is populated with one or more negatively charged residues (Asp876 in mAOX1; Asp880, Glu883 and Asp888 in mAOX2; Asp877, Asp878 and Glu880 in mAOX3; Glu782 and Glu781 in mAOX4). Interestingly, Asp877 (mAOX3 numbering) is sequentially and structurally conserved in all isoforms. Similarly, the residue immediately downstream of Asp877 is also conserved as an acidic residue: Glu in mAOX1 and mAOX4, Asp in mAOX2 and mAOX3. The following residue in the sequence (Glu880 in mAOX3) is also conserved in mAOX2, mAOX3, and mAOX4, but not in mAOX1, where it is replaced by a Leu. These results suggest that this negatively charged region of the protein is important for orienting the substrate toward the active site.

Conclusions

The results obtained in this work have shown that the overall structures of the mouse aldehyde oxidase isoenzymes are generally conserved. However, deviations are found in some loops located at the protein surface and these should not interfere with protein function. The other deviations observed in the substrate-binding site may be responsible for the differences in catalytic activity, substrate and inhibitor specificity described for mouse AOX1 and AOX3 [37].

We have observed that the binding site is composed of a highly conserved region, formed by the Mo center and its ligands and by residues Gln772, Lys889, Arg917 and Ser1085 (mAOX3 numbering), and of a region that differs among the isoforms and is located just above the conserved region.

The conserved region of the binding site also comprises Phe919 (mAOX3 numbering) and at least one or two hydrophobic residues containing aromatic rings in their side chains (His, Tyr and Phe). We propose that this region might be important to direct and align the substrates correctly into the active site of each isoform.

The variable region of the binding site can be defined as the specificity region, since the shape of the binding site and the nature of the amino acid residues change significantly (Fig. 6). The mAOX1 isoform has the widest specificity region, with a variety of polar and charged amino acids, possibly indicating that this isoenzyme accepts a wider range of substrates with different shapes, sizes and nature. In contrast, mAOX4 has the narrowest specificity region, mainly composed of hydrophobic residues, suggesting the binding of small, linear and hydrophobic substrates. The mAOX3 and mAOX2 isoforms are very similar, but more hydrophobic amino acid residues are present in mAOX2, suggesting that the substrate specificity of mAOX2 should be closer to that of mAOX4 and that of mAOX3 closer to that of mAOX1.

Fig. 6 Schematic representation of the mAOX binding site of all the isoforms. The representation includes the conserved region of the active site and the corresponding non-conserved region that dictates substrate specificity

We have also identified potential hot spot candidates in the dimerization interface of mAOX3. The amino acid residues Pro607, Asn790 and Asn1078 can be considered as targets for future mutational studies aiming to better understand the protein dimerization process.

In general, these results suggest that the different mouse aldehyde oxidase isoforms might have different substrate specificities, which would affect the ability of the mouse to oxidize a wider range of substrates with different characteristics, and to do so more efficiently. This might constitute an advantage over other mammalian species, particularly humans, whose genome encodes only a single aldehyde oxidase protein.