1 Introduction

Carbonic anhydrases (CAs) (E.C. number 4.2.1.1) comprise an important class of enzymes catalyzing the reversible hydration of carbon dioxide to bicarbonate [1]. In this chapter, our focus will be on selected aspects of computational modeling of CAs. First, we will overview applications of molecular docking to study CA and its inhibitors. Some issues encountered when docking into the CAs will also be touched upon, such as the presence of the metal ion and the ligand rotamer problem. We will then briefly overview the existing literature on molecular dynamics (MD) and quantum mechanics (QM) calculations with CA as a target. Finally, the successes and shortcomings of the quantitative structure–activity relationship (QSAR) method will be discussed.

2 Molecular Docking

Molecular docking, or simply docking, is a popular method for determining the optimal geometry of the ligand/protein complex. The idea of docking is based on the intuitively simple “lock and key” theory suggested by Emil Fischer more than 120 years ago [2]. “Virtual screening” is the selection of compounds active against a certain receptor from a large compound database, commonly by means of molecular docking. Docking and/or virtual screening are often used to analyze the CA inhibitor interactions within the binding site, and to predict new and hopefully better binders.

A search in the Scopus database (https://www.scopus.com/) for articles containing “carbonic anhydrase” together with “docking” or “virtual screening” in their title, abstract, or keywords leads to 245 hits (end of March 2018). After a manual examination, the list was reduced to about 200 papers directly dealing with the molecular docking of small molecules (ligands) into CA. It should be noted that this list does not include papers in which CA is a part of a wide selection of docking targets, and is therefore not specifically listed in the abstract or keywords, for example, when testing the performance of docking programs and/or scoring functions (e.g., GOLD [3], FlexX [4], and Glide [5]), or as a part of specially designed ligand/receptor sets, such as the Astex diverse test set [6].

Further analysis of the list of the 200 papers shows that docking into the CA receptor is becoming progressively more widely used in research: 11, 23, and 38 papers were published in 2011, 2014, and 2017, respectively. The first relevant paper, related to the development of the first molecular docking program DOCK, goes back to 1988 [7]. Probably the most usual application of the docking procedure is the rationalization of the observed inhibition results. The examples are too numerous, so we will mention just several. Docking was used to investigate binding to CA of 5-aryl-1H-pyrazole-3-carboxylic acids [8], bisindolylmethanes [9], isatin analogs [10], indolin-2-one-based sulfonamides [11], carbohydrazones [12], coumarin derivatives [13], and so on.

It is interesting to analyze which docking programs were used the most. In about 195 papers to which we had access and where the docking program was explicitly named, about 20 docking programs have been employed. AutoDock [14] or its variants were used in 54 papers, followed by GOLD [3] (48 papers), Glide [5] (46), MOE-Dock [15] (18), FlexX [4] (7), CDOCKER [16] (5), and DOCK [17] (5). It should be noted that some papers employ several docking programs. Here we did not analyze the scoring functions used to evaluate docking quality, because the scoring functions usually have a multitude of variants, and they tend to develop much more dynamically than the docking engines. The frequencies of the use of the first three abovementioned programs are consistent with the trends observed in the general docking literature [18]. Interestingly, MOE-Dock [15] was used about three times more often when docking to the CAs compared to the general use [18].

Comparing the performance of the docking programs when dealing with CAs is especially interesting, but the performance seems to be inhibitor dependent. Tuccinardi et al. found that GOLD yielded better docked conformations compared to AutoDock for a series of CA II sulfonamide inhibitors [19]. Mori et al. tested four programs for docking a non-sulfonamide inhibitor to CA II, and found AutoDock to perform best [20]. Kontoyianni et al. tested the performance of 5 docking programs on many targets, including CA, and out of 25 CA dockings, only in 2 cases, one using Glide and the other using GOLD, was the top ranked conformation (pose) characterized as being “close” to the experimental structure [21]. This shows that the success of docking to CA is not guaranteed. Below, we will briefly analyze some of the issues which are encountered during docking.

The most important CA isoforms contain a metal ion, usually zinc. The zinc ion requires special care if the receptor is to be used for any meaningful computer simulations, including docking. For example, in the AutoDock program the proper handling of Zn, named the AutoDock4Zn force field, was implemented only relatively recently [22]. This is especially important for sulfonamide inhibitors since only the sulfonamide nitrogen is ligated to the metal, but the sulfonamide oxygens are similar to the nitrogen from the viewpoint of simple electrostatics and therefore could confuse the docking algorithm. In the Vdock docking program [23] the proper coordination with zinc can be most easily ensured by fixing the ligating atom in space by setting the dimensions of the translational box to zero [24]. Other docking programs deal with the metal ion issue in their own way; for example, in the GOLD program metal coordination is modeled by “pseudohydrogen bonding,” and metals bind to H-bond acceptors. In addition, the ion can be set to a particular coordination geometry, e.g., tetrahedral [25]. At any rate, some caution and a bit of common sense are required when docking, and probably some sort of validation of the program using known structures as well. For example, in a paper by Suthar et al., docking using AutoDock, apparently the version without the patch to treat the zinc, missed sulfonamide ligation to zinc, a result which seems unlikely [26].

It is important to use the correct structures when doing docking simulations (in fact, any kind of simulations), and especially when validating simulation results. For example, Hartshorn et al. pointed out that in the Protein Data Bank (PDB) entry 1jd0, the ligand rotamer is probably incorrect (cis amide instead of trans) [6]. This makes this particular ligand/protein complex slightly tainted when used for validating the program or comparing docking program results. The rotamers in the experimental structures could also be incorrect not only in the ligand but also in the protein [27]. However, most often simulation errors arise not from the experimental data but from the theoretical framework, e.g., force fields.

Sulfonamide derivatives are by far the most common inhibitors of the carbonic anhydrases. The importance of the correct energy profile of the dihedral angle in the substituted sulfonamides to interpret the experimental results has been shown [28, 29]. A survey by Morkūnaitė et al. showed that in the PDB the majority of carbonic anhydrases bound sulfonamides, where the sulfonamide is connected to phenyl ring with H at ortho positions, the phenyl ring plane forms average 14.7° angle with one of the bonds, i.e., it is slightly off from being aligned with (position a in Fig. 15.1) [30]. Interestingly, in most molecular (i.e., not bound to receptor) sulfonamide X-ray crystal structures the phenyl plane is rotated in such way so that p-orbital of the ipso carbon divides angle in half (position b in Fig. 15.1) [31, 32]. However, docking of the ligands parameterized using CHARMm force field [33] had a strong tendency of aligning phenyl ring with bond which was clearly incorrect [30]. To overcome this, a constraint on the torsion was imposed during the docking. A more proper way to handle the problem should involve the improvement of the force field parameters. Indeed, CGenFF force field [34], which similarly to CHARMm is also compatible with CHARMM [35], contains improved arylsulfonamide torsional angle parameters.

Fig. 15.1

Newman projection of the aryl sulfonamide torsion angle. Conformation a corresponds to the average dihedral angle for ortho unsubstituted sulfonamides in CA; angle b is most often found in unbound sulfonamides; and angles b and c are found in 2,3,5,6-tetrafluorobenzenesulfonamides bound to CAs in the Protein Data Bank

Another example of CA inhibitor rotamer problem is related to 4-substituted 2,3,5,6-tetrafluorobenzenesulfonamides. A quick PDB survey finds 14 structures containing compounds belonging to this series, complexed with various CA isoforms. Some of the PDB entries have several chains; moreover, ligands in several PDB chains have two alternative positions. Within these 14 structures there are 33 instances of tetrafluorobenzenesulfonamides bound to Zn. This includes molecules bound to different chains, as well as their alternative conformations.

Seven conformations out of 33 are bound to Zn in the CA II isoform. In the 4ww6 PDB entry one of the conformers is bound outside of the binding site; therefore, this conformation is not included in this analysis. While the non-fluorinated benzylsulfonamides tend to have the phenyl plane nearly aligned with as illustrated vide supra, in 3 cases out of 7 the fluorinated phenyl ring is aligned with , and in the rest of 4 cases it is aligned with (conformations c and d in Fig. 15.1, correspondingly). Notably, and rotamers are somewhat tilted with respect to each other (Fig. 15.2). This is most likely due to the bulkiness of the fluorines, and this tilt could be important for the drug design.

Fig. 15.2

Two superposed CA II structures with different tetrafluorobenzenesulfonamide rotamers (4ht0: pink protein and purple ligand and 5lle: salmon protein and orange ligand). Zinc is shown as a grayish sphere. Possibly important water molecule in 4ht0 (small red sphere) is also shown

Examination of Fig. 15.2 shows that the conformations of protein sidechains near the ligand are quite similar in both cases; therefore, answering a question of why one rotamer is preferred over the other one is not trivial. Interestingly, preliminary exploration of rotamer interaction energies with the protein using CHARMM/CHARMm force fields for the and rotamers of 4ht0 ligand shows a high energetic preference (by at least 10 kcal∕mol, and mostly due to van der Waals clashes) for the rotamer. This raises a question of the reason of the existence of the aligned rotamer in 4ht0. The most rational explanation would be the presence of the water molecule near the rotamer, almost at the same position where one of the fluorines of the rotamer is situated. Probably this water molecule, which makes a hydrogen bond with the sulfonamide nitrogen, is forcing the phenyl plane into the rotamer. Modeling of the rotamer in 4ht0 without the relevant explicit water molecule or without constraining of the dihedral angle would be quite challenging. An even bigger challenge is predicting which rotamer, or , will be prevalent for a particular ligand. Because of the apparent role of the water influencing the rotamer choice, it comes as no surprise then that the rotamer preference can be directly correlated with the enthalpy or entropy being the driving force for binding [36].

Interestingly, different CA isoforms show different rotamer preferences for this series of ligands. Three ligand instances in CA I exhibit only the rotamer, and in five instances in CA XIII the ligand adopts exclusively the rotamer. In the case of CA XII, the situation is mixed: out of 17 instances, 10 and 7 belong to the and rotamers, respectively. Moreover, different rotamers happen to be in the same PDB (4ht2, 5msa, and 5msb). This behavior is puzzling especially since hydrogenated analogs of these compounds (benzenesulfonamides) essentially have a preference for the rotamer (Fig. 15.1). Incidentally, in 4ww6 the ligand of this series bound off-site (i.e., not bound to Zn) has the rotamer.

This analysis shows the importance of rotamers when targeting different CA isozymes. It might well be that the rotamer preferences for the ligands could be solved only when entropy and enthalpy interplay for a given ligand/receptor is fully understood; therefore, these inhibitor/receptor complexes are good systems to test proposed thermodynamic energy component calculation algorithms. At the very least, when performing docking validation tests, one should also be aware of how well the key torsional angles are reproduced.
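Checking how well a key torsional angle is reproduced requires only the coordinates of the four atoms defining it. The following is a minimal Python sketch, not taken from any of the cited programs: `dihedral_deg` computes the signed dihedral angle from four points, and `within_window` tests whether a docked angle lies within a chosen tolerance of a reference value while handling the wrap-around at ±180°. The function names, the default tolerance, and the test coordinates are illustrative assumptions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) for the atom chain p0-p1-p2-p3."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)        # normal of the plane p0, p1, p2
    n2 = _cross(b1, b2)        # normal of the plane p1, p2, p3
    y = _dot(b1, _cross(n1, n2))
    x = math.sqrt(_dot(b1, b1)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def angular_deviation(angle, reference):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (angle - reference) % 360.0
    return min(d, 360.0 - d)


def within_window(angle, reference, half_width=15.0):
    """True if `angle` is within +/- half_width degrees of `reference` (hypothetical tolerance)."""
    return angular_deviation(angle, reference) <= half_width


# Illustrative coordinates only (not taken from any PDB entry).
gauche = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

In a validation run, the same function would be applied to the experimental and the docked ligand coordinates, and the deviation reported alongside the usual positional error of the pose.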

The flexibility of the protein often needs to be taken into account. Fortunately, CA does not undergo gross conformational changes upon inhibitor binding [37]. However, one should always watch out for the rotamers of some sidechains such as Gln, Asn, and His, in which the X-ray crystallography cannot distinguish between C, N, and O atoms [27], so there could always be ambiguity in atom assignment.

To further illustrate the problems arising during docking into CA, we performed a docking of the 3sbi ligand into the CA II receptor from the same PDB id [24]. The pyrimidine tail of the ligand is partly hydrophobic, partly hydrophilic (structure shown in Fig. 15.3), yet it is bound to an extremely hydrophobic site lined by residues Phe131, Val135, Leu198, Pro202, and Leu204. A majority of CA inhibitors of sufficient length present in the Protein Data Bank are bound to that site, hence we consider this an exemplary test. The docking used the CHARMM [35] and CHARMm [33] force fields for the protein and the ligand, respectively. The solvent effects were modeled using a simplistic distance-dependent dielectric approximation \(\varepsilon = 4r_{ij}\) [38], in which the electrostatic interaction between two charges is additionally divided by the distance between the charges (with the additional scaling factor). This essentially mimics electrostatic shielding by water molecules between two remote charges, but does not include other effects such as desolvation or the entropy of the solvent. Also, due to reasons which were already discussed, the sulfonamide atom was fixed in space at the correct position, and the torsional angle of the bond connecting the phenyl ring with the sulfonamide group was constrained to ±15° around the experimental value. Since we wanted to explore the conformational space within the whole binding site, the genetic algorithm was switched off. One hundred conformations were generated using this setup. In addition, to explore the conformations near the X-ray conformation, an additional 20 docked minima were generated where all torsional angles had ±15° constraints around the experimental values.
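The effect of the distance-dependent dielectric can be seen in a few lines of Python. This is only a sketch of the approximation described above, not the code of the docking program: the function names are hypothetical, and the conversion constant of about 332.06 kcal·Å/(mol·e²) is the value commonly used in molecular mechanics, assumed here for illustration. With \(\varepsilon = 4r_{ij}\) the Coulomb pair energy decays as \(1/r_{ij}^{2}\) instead of \(1/r_{ij}\).

```python
# Approximate Coulomb conversion constant in kcal*Angstrom/(mol*e^2); an assumed value.
COULOMB_KCAL = 332.06


def coulomb_energy(q_i, q_j, r_ij, eps):
    """Coulomb pair energy (kcal/mol) for charges in e, distance in Angstrom."""
    return COULOMB_KCAL * q_i * q_j / (eps * r_ij)


def coulomb_energy_ddd(q_i, q_j, r_ij, scale=4.0):
    """Same pair energy with a distance-dependent dielectric eps = scale * r_ij."""
    return coulomb_energy(q_i, q_j, r_ij, eps=scale * r_ij)


# Doubling the distance reduces the energy fourfold rather than twofold.
e_near = coulomb_energy_ddd(1.0, -1.0, 3.0)
e_far = coulomb_energy_ddd(1.0, -1.0, 6.0)
```

Because the screening depends only on the interatomic distance, the sketch makes the limitation noted in the text explicit: nothing in the expression accounts for desolvation or for solvent entropy.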

Fig. 15.3

120 docked conformations of 3sbi ligand (structure depicted on the left), docked with the genetic algorithm switched off to enhance sampling of the binding pocket. The conformations are colored by the rank according to the docked energy: from blue (the best) to red (the worst). The X-ray conformation is shown in green. Note the self-folded conformations in the lower right of the picture. See text for more explanations

The 120 docked conformations (sometimes also called “poses”) are shown in Fig. 15.3. Each conformation is colored according to the docked score, blue being the best and red being the worst. Perhaps strikingly, the poses close to the native X-ray conformation, pictured in green, have a relatively poor score. The best scores belong to the self-folded conformations of the ligand (Fig. 15.3, lower right) in which the pyrimidine ring is stacked against the phenyl ring. While the feasibility of the self-folded conformations will not be discussed here, we will note in passing that these conformations probably would be penalized because they would have to push one or two water molecules out of the binding site. Nevertheless, ligands have been found to bind to CA in a self-stacked binding mode, but only for fluorinated compounds [36, 39].

For the other, presumably more realistic, non-self-folded conformations, the scoring function seems to underestimate the hydrophobic effect which attracts the ligand tail into the hydrophobic pocket. The funnel-like binding site of CA is hydrophobic only in one part of the pocket, and mostly hydrophilic in the rest; therefore, the docking program tends to bind the ligand pyrimidine nitrogens to the opposite, hydrophilic part of the binding pocket. The hydrophobic effect of the ligand bound in its proper location is therefore underestimated by the force field (including the apparently inadequate distance-dependent dielectric approximation for the estimation of the solvent effects).

The hydrophobicity and hydrophilicity of the CA II binding site are illustrated in Fig. 15.4. The picture was generated using the Voronota program [40] by drawing Voronoi cell faces between the receptor and the conformations of the ligand in Fig. 15.3. A Voronoi cell is the region of space occupied by an individual atom, and is enclosed by planes (or spheres because of the different radii of the atoms, to be exact) which divide the distance between the atoms exactly in half. Alternatively, the Voronoi cell of an atom can be understood as the zone in space consisting of the points that are closer to that atom than to any other. The yellow and blue colors signify hydrophobic and hydrophilic surfaces. The hydrophobicity of the subpocket which holds the tail of the ligand is very obvious, and the mismatch between the hydrophobicity of the pocket and the hydrophilic nitrogen atom pointing downwards is also apparent.
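The "closest atom" definition of a Voronoi cell translates directly into code. The sketch below is a plain illustration of that definition in Python and makes no claim about how Voronota is implemented: `owner_atom` returns the index of the atom whose cell contains a given point. The optional `radii` argument, which subtracts each atom's radius from the distance, is one common way to account for atoms of different size and is an assumption of this sketch; the coordinates and radii are hypothetical.

```python
import math


def owner_atom(point, centers, radii=None):
    """Index of the atom whose Voronoi cell contains `point`.

    Without `radii`, the owner is simply the nearest atom center.
    With `radii`, the distance to each atom surface is used instead
    (an assumed convention for atoms of unequal size).
    """
    if radii is None:
        radii = [0.0] * len(centers)
    distances = [math.dist(point, c) - r for c, r in zip(centers, radii)]
    return distances.index(min(distances))


# Two hypothetical atoms 4 Angstrom apart on the x axis.
centers = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
probe = (1.9, 0.0, 0.0)

plain_owner = owner_atom(probe, centers)                       # nearest center
weighted_owner = owner_atom(probe, centers, radii=[1.0, 2.5])  # nearest surface
```

A face between two cells is then the set of points for which two atoms tie for ownership, which is the surface drawn between receptor and ligand atoms in the figure.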

Fig. 15.4

The binding site of CA II, shown as Voronoi cell faces drawn between the CA II receptor and the ligand conformations from Fig. 15.3. The faces are colored based on the hydrophobicity (yellow) or hydrophilicity (cyan) of the receptor atoms. Two views are shown (a and b). The X-ray conformation of the 3sbi ligand is shown as colored sticks

The concept of mismatching hydrophobicities and hydrophilicities between the receptor and the ligand is further explored in Fig. 15.5 using Voronoi cells. This time, only the X-ray conformation is analyzed. The Voronoi cell faces are colored based on the hydrophobicity/hydrophilicity of the atoms which are separated by that face. The red and blue colors signify a match between the atom types (both receptor and ligand atoms are hydrophobic or hydrophilic, respectively), and the yellow color represents a mismatch: one atom is hydrophobic and the other is hydrophilic. Note that the pyrimidine ring nitrogens cause a surface mismatch inside the hydrophobic pocket (cf. Fig. 15.4). Perhaps it is then not surprising that the docking tries to place the ligand tail in the more hydrophilic parts of the binding site. A “perfect match” between the ligand and the protein formally would have to contain no “mismatching” atom types, with an important caveat: this scheme does not show interactions with the bulk solvent.
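The coloring rule can be written down as a small classifier. The Python sketch below follows the convention stated in the caption of Fig. 15.5 (carbons are treated as hydrophobic, heteroatoms as hydrophilic); the function names, the area-weighted mismatch fraction, and the contact list are hypothetical and serve only to illustrate the bookkeeping.

```python
def is_hydrophobic(element):
    """Carbon counts as hydrophobic; every other element as hydrophilic."""
    return element.upper() == "C"


def classify_contact(ligand_element, receptor_element):
    """Label a ligand/receptor contact face by the types of the two atoms."""
    lig = is_hydrophobic(ligand_element)
    rec = is_hydrophobic(receptor_element)
    if lig and rec:
        return "hydrophobic match"   # red faces
    if not lig and not rec:
        return "hydrophilic match"   # blue faces
    return "mismatch"                # yellow faces


def mismatch_fraction(contacts):
    """Fraction of the total contact area that belongs to mismatched faces.

    `contacts` is an iterable of (ligand_element, receptor_element, area).
    """
    total = sum(area for _, _, area in contacts)
    mismatched = sum(area for lig, rec, area in contacts
                     if classify_contact(lig, rec) == "mismatch")
    return mismatched / total if total > 0 else 0.0


# Made-up contact areas (square Angstrom), for illustration only.
example_contacts = [("N", "C", 5.0), ("C", "C", 10.0), ("N", "O", 5.0)]
fraction = mismatch_fraction(example_contacts)
```

As the text cautions, such a count ignores faces exposed to bulk solvent, so a low mismatch fraction alone does not indicate a favorable pose.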

Fig. 15.5

Voronoi cell faces between the X-ray structure of the 3sbi ligand (shown as sticks) and the CA II receptor. The faces are colored based on the hydrophilicity of the ligand and receptor atoms. Red signifies that both atoms (ligand and receptor) are hydrophobic (i.e., carbons), blue signifies that both are hydrophilic (heteroatoms), and yellow denotes a “mismatch”: one atom is hydrophilic, and the other atom is hydrophobic. Two views are shown (a and b)

First of all, in the “open” surface area on the top of the ligand in Fig. 15.5 the ligand is interacting not with the protein but with the bulk solvent (water). Since water is by definition a hydrophilic solvent, for the formation of a good hydrogen bonding network with water it is advantageous that at least some of the ligand atoms which are exposed to the solvent are hydrophilic, such as one of the pyrimidine nitrogens in the 3sbi ligand. If the free (unbound) ligand in water makes hydrogen bonds, it is important that these hydrogen bonds are not lost when the ligand is bound to the protein [41], i.e., the hydrogen bonds should also be formed between the ligand and the protein upon desolvation of the ligand and/or protein. Also, the hydrophobic surfaces of both the unbound ligand and the protein (which interact poorly with water) will improve the binding energy if they match when bound. Many sophisticated scoring functions try to address desolvation and other issues of ligand binding to the protein [42, 43].

It should be kept in mind that the whole framework of interactions valid for one receptor could change, sometimes dramatically, if the ligand is modified, or if one CA isoform is replaced by another isoform. A simple change of the rotamer or a change of the overall binding mode can have far reaching enthalpic and entropic effects on the binding affinities [36].

Several papers by our group, for example [36, 44–46], use intrinsic binding affinities in their analysis of inhibitor binding. These are very important for ionizable inhibitors such as sulfonamides because they lose a proton when binding to zinc. In short, because of the involvement of protons in the enzymatic reaction, the observed binding affinities are dependent on the pH of the reaction environment. The intrinsic binding constants \(K_{b,int}\) are pH- and buffer-independent and are calculated as the observed binding constant \(K_{b,obs}\) divided by the fractions of the deprotonated inhibitor \(f_{\mathrm{SA}^{-}}\) and the protonated Zn-bound water form of CA \(f_{\mathrm{CAZnH_{2}O}}\) [36]:

$$ K_{b,int} = \frac{K_{b,obs}}{f_{\mathrm{SA}^{-}} \, f_{\mathrm{CAZnH_{2}O}}} \tag{15.1} $$

One of the consequences of this is that the deprotonated form of the inhibitor is the active form. Hence, more acidic sulfonamides will tend to be better inhibitors, because a larger fraction of them will be in the anionic form in solution. Moreover, because of the independence from the pH, the thermodynamic analysis of the binding, especially when considering reaction enthalpy and entropy, is more robust and more meaningful. Therefore it would be advantageous to use intrinsic thermodynamic parameters in the computational approaches covered in this chapter.
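Equation (15.1) is straightforward to evaluate once the two fractions are known. In the Python sketch below the fractions are obtained from the Henderson–Hasselbalch relation for a single ionizable group on the inhibitor and on the Zn-bound water; this way of computing the fractions is an assumption of the sketch and is not spelled out in the text, and the function and parameter names as well as all numerical values are hypothetical.

```python
def fraction_deprotonated_inhibitor(pH, pKa_inhibitor):
    """Fraction of the sulfonamide present in the anionic (deprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pKa_inhibitor - pH))


def fraction_zn_water(pH, pKa_zn_water):
    """Fraction of CA carrying a protonated Zn-bound water (not hydroxide)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa_zn_water))


def intrinsic_binding_constant(k_obs, pH, pKa_inhibitor, pKa_zn_water):
    """Eq. (15.1): observed constant divided by the two binding-ready fractions."""
    f_sa = fraction_deprotonated_inhibitor(pH, pKa_inhibitor)
    f_ca = fraction_zn_water(pH, pKa_zn_water)
    return k_obs / (f_sa * f_ca)


# Made-up numbers: when pH equals both pKa values, each fraction is 0.5,
# so the intrinsic constant is four times the observed one.
k_int = intrinsic_binding_constant(1.0e6, pH=7.0, pKa_inhibitor=7.0, pKa_zn_water=7.0)
```

Since both fractions are at most 1, the intrinsic constant is never smaller than the observed one, and the correction grows as the pH moves away from the range in which both binding-ready forms are populated.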

3 Molecular Dynamics

The molecular dynamics (MD) method, along with some related methods such as the Monte Carlo approach, offers a detailed picture of the evolution of molecular systems in time. While able to give very useful insights, it is also much more computationally demanding than docking. A quick Scopus search showed up to 200 papers in which MD was applied to study carbonic anhydrases. Without going into too much detail, we will briefly mention some of the MD applications.

In many papers, MD was used after the molecular docking to further improve (refine) the predicted pose of the ligand, and to allow for the more thorough relaxation of the protein. For example, Özgeriş et al. used MD to further explore docking results of 2-aminotetralins and tacrine into CA I and acetylcholinesterase (AChE): MD was employed to investigate the effect of the ligand on the protein rigidity, and the interactions within the binding site of both receptors [47]. Similarly, Costa et al. used MD to examine interactions between the CA VA binding site residues and docked compounds from essential oils and acetazolamide [48].

An important application of molecular dynamics is exploration of the dynamics of the protein. For example, Prakash et al. investigated unfolding of CA IX in urea solutions of various concentrations [49]. Maupin and Voth explored variability of orientations of His64 sidechain in CA II [50]. MD was also used to explore ligand tail conformations [51].

To perform successful MD simulations, a reasonable set of force field parameters for zinc (as well as for other components of the protein/ligand complex) is necessary. CA has been used to derive force field parameters for zinc bound to the protein, and the parameters were further validated using molecular dynamics simulations [52]. The obtained improved parameters are used to perform simulations not only on CA but also on other zinc proteins. Often parameter derivation also requires the use of ab initio (quantum mechanical) calculations as well as validation using MD calculations, so these two methods are employed together, e.g., in a paper by Bernadat et al. [53].

A large number of papers investigate the catalytic reaction occurring in the CA binding site, or aim to better understand the nature of the ligand–protein binding. For example, Chen et al. investigated diffusion in the CA active site using a Markov-state model and coarse-grained MD simulations [54]. Maupin et al. used the multistate empirical valence bond (MS-EVB) method combined with MD to investigate proton transfer in CA [55, 56]. MD was used to gain a deeper understanding of the hydrophobic effect [57] or the enthalpy/entropy compensation in protein–ligand binding [58]. Ganguly et al. combined MD simulations, quantum mechanics/molecular mechanics (QM/MM) geometry optimizations, and QM/MM free energy simulations on a small protein which mimics CA to investigate hydrolysis of the p-nitrophenylacetate substrate [59]. Paul et al. used molecular dynamics and QM/MM calculations to explore intramolecular proton transfer reaction parameters in human CA II [60]. Koziol et al. used MD and ab initio density functional theory (DFT) calculations to investigate hydration using a Zn-bound tris(imidazolyl) calix[6]arene aqua complex as a CA binding site mimetic [61]. This far from exhaustive list of applications shows that MD is a powerful tool for exploration of CA properties, as are quantum mechanics based calculations, which are briefly covered in the next section.

4 Quantum Mechanics

Similar in number to MD-related papers, Scopus showed up to 200 hits which combine “quantum” or “ab initio” calculations and “Carbonic Anhydrase” as the search terms. Quantum mechanical (QM) calculations have already been mentioned above, as they were often used in combination with other methods. In contrast with the methods based on force fields, QM calculations include the actual electronic structure of the molecules which are being investigated, and generally are very CPU and/or memory demanding. A variation of QM calculations in which a smaller part of the system is represented by nuclei and electrons governed by quantum theory, and the rest of the system is described using empirical force fields, is called QM/MM (quantum mechanics/molecular mechanics). Because QM allows for a relatively accurate representation of atoms, it is used to explore the reaction mechanisms in great detail. Below we will very briefly mention some representative QM applications.

Jiao and Rempe used density functional theory (DFT) calculations coupled with a continuum model of the surrounding environment to understand the factors determining the \(pK_a\) of zinc-bound water in CA [62]. In several papers QM calculations were used to design and explore the behavior of biomimetics, i.e., compounds which mimic the active site of CA. Ma et al. used DFT calculations to study the mechanism of hydrolysis by Co-(1,4,7,10-tetrazacyclododecane) in order to model the activity of cobalt containing CA [63]. QM was also used to derive force field parameters for zinc-containing systems, for example, for the AMBER [64] and OPLS-AA [53] force fields. It can also be used to clarify details of the binding of inhibitors. For example, Pecina et al. used QM/MM calculations to understand the differences between the binding of two carborane sulfonamide inhibitors [65]. Ghiasi et al. used a QM approach to investigate the thermodynamic properties of a fullerene based inhibitor bound to the CA binding site [66]. As computer resources become cheaper and the QM methods improve, more simulations of this type can be expected in the future.

5 Quantitative Structure–Activity Relationship (QSAR)

The methods mentioned above have a common feature: they are structure-based approaches. They employ a known or a modeled structure of the receptor, and they also use the experimental or calculated structure of the ligand bound inside the receptor. In contrast, the quantitative structure–activity relationship (QSAR) method usually (with some exceptions) does not require the receptor structure to be known, and is based on the ligand structure. For this reason, it is sometimes described as a ligand-based approach.

QSAR is an area of computational research that builds virtual models to predict quantities such as the binding affinity or the toxic potential of existing or hypothetical molecules [67]. QSAR finds the parameters of the compounds that govern their biological activities and elucidates their mechanism of action [68]. Both these aspects of QSAR greatly help in modifying the structures of the compounds, leading to compounds of high therapeutic value [69]. The determination of binding energies in QSAR studies is by no means simple. Free energies of binding depend on the ligand–protein interactions as well as on the loss of energy associated with stripping solvent molecules off the small-molecule ligand while moving from the aqueous environment of a cell or a body fluid to a protein binding pocket during the binding process [67]. The first stage of QSAR modeling is descriptor calculation. A molecular descriptor “is the final result of a logic and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment” [70]. Molecular descriptors are calculated for chemical compounds and used to develop QSAR models for predicting the biological activities of novel compounds [71]. Feature selection is an important but still poorly solved problem in QSAR modeling [72]. In the second stage the most relevant descriptors for the model must be selected, and finally the model must be fitted using the selected features/descriptors to obtain a QSAR model with optimum predictive ability.
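The selection and fitting stages can be illustrated with a deliberately small Python sketch. It assumes that the descriptors have already been calculated, selects the single descriptor with the highest absolute Pearson correlation with the activity, and fits a one-variable least-squares line. Real QSAR studies use many descriptors, more elaborate selection schemes, and external validation; the descriptor names and all numbers below are invented for illustration and do not come from any cited work.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_descriptor(table, activities):
    """Feature selection: name of the descriptor best correlated with activity."""
    return max(table, key=lambda name: abs(pearson_r(table[name], activities)))


def fit_line(xs, ys):
    """Least-squares fit of activity = slope * descriptor + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented training set: five compounds, two precomputed descriptors.
descriptors = {
    "descriptor_a": [0.5, 1.0, 1.5, 2.0, 2.5],
    "descriptor_b": [2.0, 1.0, 3.0, 1.0, 2.0],
}
activities = [2.0, 3.0, 4.0, 5.0, 6.0]

best = select_descriptor(descriptors, activities)
slope, intercept = fit_line(descriptors[best], activities)
predicted = slope * 3.0 + intercept  # prediction for a new, hypothetical compound
```

The invented data are exactly linear in one descriptor so that the outcome is easy to verify by hand; with real data the quality of the fit would have to be assessed on compounds not used for training.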

Some papers deal with descriptors and their fitness for CA. It was found that quantum descriptors are critical for the antiproliferative activity of pyrazolo[4,3-e][1,2,4]triazine sulfonamides against human MCF-7 cells [73]. This can indicate a more complex mechanism of cytotoxicity than the inhibition of the CA IX and CA XII isozymes [73]. A new ad hoc descriptor T(OH..Cl), defined as the sum of the topological distances between the hydroxyl groups and chlorine atoms in the molecule, was designed to improve the quality of the CA XII QSAR affinity model [74].
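A descriptor of this kind can be computed from the molecular graph alone. The Python sketch below takes the topological distance to be the number of bonds on the shortest path between two atoms and sums it over all hydroxyl/chlorine pairs. Measuring from the hydroxyl oxygen, passing the atom indices in by hand, and the example molecule are assumptions of this sketch; the cited work may define the details differently.

```python
from collections import deque


def bond_distance(adjacency, start, goal):
    """Number of bonds on the shortest path between two atoms (breadth-first search)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, dist = queue.popleft()
        if atom == goal:
            return dist
        for neighbor in adjacency[atom]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError("atoms are not connected")


def sum_oh_cl_distances(adjacency, hydroxyl_oxygens, chlorines):
    """Sum of topological distances over all (hydroxyl O, Cl) atom pairs."""
    return sum(bond_distance(adjacency, o, cl)
               for o in hydroxyl_oxygens
               for cl in chlorines)


# Heavy-atom graph of 4-chlorophenol, used only as an example:
# ring carbons 0-5, hydroxyl oxygen 6 on carbon 0, chlorine 7 on carbon 3.
adjacency = {
    0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4, 7],
    4: [3, 5], 5: [4, 0], 6: [0], 7: [3],
}
t_oh_cl = sum_oh_cl_distances(adjacency, hydroxyl_oxygens=[6], chlorines=[7])
```

For molecules with several hydroxyl groups or chlorine atoms the double loop adds one term per pair, so the value grows both with the number of such groups and with their separation.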

A series of QSAR papers deal with the analysis of inhibitor binding to CA and propose modifications of the compounds. All compounds more active than acetazolamide have hydrophobic groups in the phenyl substituted part and for this reason have a higher activity than acetazolamide towards CA II [75]. If there are hydrogen bond acceptors on one side of the acetazolamide pyrazole ring, then the activity of compounds towards CA II can be increased [75]. The sum of the topological distances between the hydroxyl groups and chlorine atoms in the molecule is important for benzenesulfonamide affinity towards CA XII. Through 3D-QSAR it was shown that most of the benzenesulfonamide selectivity against CA isoforms is caused by the benzene ring substituents [74]. It was suggested that placing a dual moiety with hydrophobic/hydrogen bond acceptor properties on compounds, at a 13.6 Å distance, can result in higher selectivity of the compounds for isoforms CA II and CA IX, and that placing a dual moiety with hydrophobic/hydrogen bond acceptor or hydrophobic/hydrogen bond donor properties on compounds, at a 9.6 Å distance, can improve the selectivity of the compounds towards isoform CA IX [76]. Investigated molecules with a larger number of oxo groups show better interactions with enzymes and receptors, which could be the consequence of their stronger hydrophilicity [77]. The lack of a halide group to fulfill the hydrophobic feature in the designed pharmacophore model was seen as the main reason behind the imperfect CA inhibitor activity of some compounds [9]. It was suggested that very close and very distant hydrophobic moieties are not in favor of ligand–receptor binding [78]. In a calibration set, the bond order and molecular orbital bonding have a greater influence on the aromatic/heterocyclic sulfonamide inhibitor activity value on β-CA [79]. The presence of the methyl, sulfur atom, and amino thiadiazole groups is not favorable to inhibitor activity on β-carbonic anhydrase [79].

The clinically important Plasmodium falciparum CA (PfCA) is a popular QSAR target because its X-ray structure is not available. Homology modeling, on the other hand, was feasible. One QSAR model concluded that the average free valence of H atoms and the polarizabilities are favorable to PfCA inhibitory activity [80]. A large percentage in weight of CHNS, C, and F atom fragments seems to be favorable for PfCA inhibitory activity, in contrast to fragment [80]. In the absence of a 3D protein structure and of sufficient experimental data for the PfCA target, QSAR models were developed for PfCA inhibitors [81]. The 2D-QSAR analysis suggested the importance of electro-topological, electronic, extended topochemical atom, and spatial (Jurs) indices for modeling the inhibitory activity against PfCA [81].

It was found that formal (negative) charge and molecular polar surface area of benzoic acid analogs overwhelm the other correlations with CA III affinity constants [82]. A QSAR model for ureido-substituted benzenesulfonamides concluded that the maximum charge of H in bonds is unfavorable to CA IX inhibitory activity, that a low percentage of atoms in aromatic circuits is favorable, that polarity also influences the activity, and that the number of aromatic carbon–nitrogen bonds plays a dominant role [83].

QSAR was also used to propose novel inhibitors. The ZINC database of purchasable compounds [84] was screened for possible new CA II inhibitors, and three compounds were suggested as candidates, but no experimental assay was performed on them [75]. QSAR was also performed on decorated nanotubes as CA inhibitors. It is possible that the entire CA enzyme interacts with the nanotube, causing the enzyme to denature or preventing access to the active site. Another possibility is that the substituents on the nanotube bind in the active site or at a secondary binding site on the CA surface [85].

In some cases the analysis also used CA structural features to help explain the observed trends. It seems that compounds interact with the isoforms CA I and CA VII in a much less selective way than with the remaining isoforms [78]. The highly potent cliff partner, dorzolamide (used as an antiglaucoma agent), shared the critically important sulfonamide group with its less potent partner, but formed two additional hydrogen bonds with residues Gln92 and Thr200 (PDB IDs: 1kwr, 1cil), resulting in a 670-fold difference in potency [86]. Gln92 and Thr200 were identified as potency-modulating hot spots in CA II [86]. It was concluded that the entry of compounds of different sizes into the CA cavity can be influenced by the bulkiness of the residue in position 131 [76].
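To put the 670-fold activity cliff mentioned above on the logarithmic scale commonly used in QSAR, the short sketch below converts a potency ratio into log units and into an approximate binding free energy difference. It assumes that the potency ratio equals the ratio of the inhibition constants and that the temperature is 298.15 K; the function names are illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_to_log_units(fold):
    """Difference in pKi (log10 units) for a given potency ratio."""
    return math.log10(fold)


def fold_to_ddg(fold, temperature_k=298.15):
    """Free energy difference RT*ln(fold) in kcal/mol (assumes fold = Ki ratio)."""
    return R_KCAL * temperature_k * math.log(fold)


delta_pki = fold_to_log_units(670)
delta_g = fold_to_ddg(670)
print(round(delta_pki, 2), round(delta_g, 2))
```

Under these assumptions the two additional hydrogen bonds correspond to roughly 2.8 log units, or close to 4 kcal/mol.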

CA inhibition data sets have served quite widely as model data for the comparison and improvement of QSAR methods. Compared to other methods, QSAR most effectively helped to quantify the subtle empirical relationship between the structure and the activity towards the CA II target [87]. Without filtration to remove all values from discordant sources/assays (with affinity differences greater than 2.0), no model whatsoever could be obtained for CA II [88]. Template CoMFA (comparative molecular field analysis), which was successfully applied to CA II, also offers ease of use and versatility (having been developed with different applications in mind) that are superior to those of most other computer-aided molecular design methodologies [88]. The results show that for certain purposes genetic algorithm–multiple linear regression is better than stepwise multiple linear regression, and that for others artificial neural networks outperform multiple linear regression models [89]. Two six-descriptor models were created for a data set of 22 benzenesulfonamides, leading to R² as high as 0.99, and the leave-one-out (LOO) technique was used to establish the validity of the models [90]. The mathematical models revealed a poor relationship between the anticonvulsant activity related to CA inhibition and molecular descriptors obtained from DFT and docking calculations, so a QSAR model was developed using Dragon software [91] descriptors [92].
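The leave-one-out validation mentioned above can be sketched in a few lines. The example below is a deliberately simplified, single-descriptor version (ordinary least squares on one variable) and is not the six-descriptor model of [90]; the data are hypothetical and the function names are chosen for illustration. The cross-validated q² is computed here as 1 − PRESS/SS_tot, with SS_tot taken around the mean of the full data set, which is one of several conventions in use.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the fit on the full data set."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q2_loo(xs, ys):
    """Leave-one-out q2: refit without compound i, then predict compound i."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and pKi values for five compounds.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
pki = [5.1, 5.9, 7.2, 7.8, 9.1]
r2 = r_squared(descriptor, pki)
q2 = q2_loo(descriptor, pki)
print(r2, q2)
```

For a least-squares model q² never exceeds R², and the gap between the two tends to widen as the number of descriptors approaches the number of compounds, which is why a high R² alone is not taken as evidence of predictive power.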

6 Concluding Remarks

In this chapter we described some highlights from the computational modeling of CA and its inhibitors. Computer simulations have made it possible to better understand the reasons for the observed binding affinities, to help reveal the reaction mechanisms, and to propose new and improved CA binders. CA is a challenging receptor because of difficulties in the computational treatment of metals, rotamer problems of sulfonamide compounds, and the influence of the water solvent, among other issues. Nevertheless, the volume of theoretical calculations involving CA is increasing each year, reflecting constant progress in designing better models for biomolecules.