Introduction

Living cells operate as factories built from a collection of efficient molecular machines. The success of these factories depends on the efficiency of a particular class of biomolecules, the protein enzymes (Agarwal 2006). Enzymes are complex protein molecules that catalyze chemical reactions, i.e. transformations of one or more substrates into one or more products (Bugg 2004). An integrated view of protein structure, dynamics and function is emerging, in which proteins are considered dynamically active machines whose internal motions are closely linked to functions such as enzyme catalysis (Agarwal 2006). Enzymes exhibit the physico-chemical properties of proteins, including solubility, electrophoretic properties, electrolytic behavior and chemical reactivity (Lee 2006; Bhatia 2018). The amino acid sequence of an enzyme, also called its primary structure, plays an important role in enzyme function, including substrate/cofactor binding and release (Yadav and Tiwari 2015). Thus the degree of biocatalytic activity depends chiefly on the integrity of the enzyme's structure as a protein. The complete, biochemically active enzyme is composed of a protein part (apoenzyme) together with a co-enzyme or a metal ion, and is called a holoenzyme. The co-enzyme may bind covalently or non-covalently to the apoenzyme. When the co-enzyme is tightly and permanently bound to the apoenzyme, it is known as a prosthetic group.

$$ \underbrace{\mathrm{Apoenzyme}}_{\text{(Protein)}} + \underbrace{\mathrm{Prosthetic\ group}}_{\text{(Non-protein)}} = \underbrace{\mathrm{Holoenzyme}}_{\text{(Complete enzyme)}} $$

An International Commission on Enzymes was established in 1956 by the International Union of Biochemistry [now the International Union of Biochemistry and Molecular Biology (IUBMB)] to address the problems of enzyme classification and nomenclature. Enzymes are now named and classified systematically with an EC number, a four-level hierarchical description based on the overall chemical transformation of substrates into products (Cuesta et al. 2015). The EC classification is still made on the basis of the main reaction catalyzed. The EC system, where EC stands for Enzyme Commission, defines six classes of enzymes according to the general type of reaction carried out: (EC 1) oxidoreductases, (EC 2) transferases, (EC 3) hydrolases, (EC 4) lyases, (EC 5) isomerases, and (EC 6) ligases (Kumar et al. 2015).

The function of an enzyme is intrinsically linked to its structure, which determines how it performs substrate binding, catalysis and regulation. Protein enzymes are globular proteins that range in size from <100 to >2000 amino acid residues. These amino acids are arranged into polypeptide chains that are folded and bent to form a specific three-dimensional structure (Robinson 2015). Some of the amino acids in enzymes are involved in binding ligands (substrates, intermediates, products, organic cofactors, metal cofactors or allosteric regulators) and some are actively involved in catalysis by interacting with the substrate, intermediate or product of the reaction (Soding et al. 2005). The structures of enzymes can be elucidated by techniques such as spectroscopic methods, X-ray crystallography and, more recently, multidimensional NMR methods. X-ray crystallography has been the most widely used technique for the structural characterization of enzymes. The first enzyme whose structure was successfully solved by this technique was chicken egg lysozyme, in 1965. NMR spectroscopy is a powerful tool for elucidating the structure–function relationships of enzymes. It yields detailed information about the structure of the enzyme and the specific ligands that bind to it. The structure of the ligands at the binding sites of enzymes and the structure of enzyme–ligand complexes can also be obtained, as well as the dynamics of the ligand and the associated structure of the protein binding site (Monasterio 2014). The aim of this chapter is to present and update existing knowledge about the basic principles of enzymes, namely their proteinaceous nature and substrate binding, together with a detailed description of enzyme classification and structural characterization.

Proteinaceous Nature of Enzymes and Substrate Binding

All enzymes are proteins made up of amino acids linked together by peptide bonds, except for a small group of catalytic RNA molecules (Bhatia 2018). The structure and reactivity of a protein depend on its amino acid sequence, called the primary structure, which is genetically determined by the deoxyribonucleotide sequence of the structural gene that codes for it (Illanes 2008). The deoxyribonucleotide sequence is transcribed into an mRNA molecule. On reaching the ribosome, the mRNA is translated into an amino acid sequence, producing a polypeptide chain. The polypeptide chain finally folds into a three-dimensional structure, called the native structure, which carries the biological functionality (Schumacher et al. 1986; Longo and Combes 1999). The secondary structure is the result of interactions between amino acid residues close together in the primary structure, mainly through hydrogen bonding of the amide groups. For globular proteins such as enzymes, these interactions dictate a predominantly ribbon-like coiled configuration termed the α-helix. The tertiary structure is the result of interactions between amino acid residues located far apart in the primary structure; these produce a compact and twisted configuration in which the surface is rich in polar amino acid residues, while the interior is rich in hydrophobic residues. This tertiary structure is essential for the biological functionality of the protein. Some proteins, commonly regulatory proteins, also have a quaternary structure, which results from the interaction of different polypeptide chains; these chains constitute subunits that can display identical or different functions within a protein complex (Dixon and Webb 1979; Creighton 1993).

In enzymes, the protein (apoenzyme) can be conjugated or associated with other molecules such as a co-enzyme, a co-factor or a prosthetic group (Fig. 1). However, catalysis always occurs in the protein portion of an enzyme. The co-enzyme may bind covalently or non-covalently to the apoenzyme. When the co-enzyme is tightly and permanently bound to the apoenzyme, it is known as a prosthetic group (Yadav and Tiwari 2015). Prosthetic groups may be organic macromolecules, such as carbohydrates (glycoproteins), lipids (lipoproteins) and nucleic acids (nucleoproteins), or simple inorganic entities, such as metal ions. Prosthetic groups are tightly bound (usually covalently) to the apoenzyme and do not dissociate during catalysis (Union of Pure and Applied Chemistry 2005–2009; Illanes 2008). There are also prosthetic groups that are not cofactors (e.g. retinal in light receptors); only those prosthetic groups located in the active site of an enzyme are denoted cofactors. A prosthetic group is therefore distinguished from a coenzyme in that it stays with the enzyme over many catalytic cycles, possibly until the enzyme is degraded. The coenzyme, on the other hand, binds to the enzyme at the beginning of each catalytic cycle and leaves at the end of it (Union of Pure and Applied Chemistry 2005–2009).

Fig. 1 The components of a holoenzyme

Only a small portion of the enzyme, the active site, is involved in catalysis, and it is usually formed by very few amino acid residues. In an enzymatic reaction the substrate binds to the enzyme at the active site, which changes the distribution of electrons in the substrate's chemical bonds and leads to the reactions that form the products. The products are then released from the enzyme, which is ready for the next catalytic cycle. It is the shape and charge properties of the active site that enable the enzyme to bind a specific substrate molecule and that give it its specificity in catalytic activity (Whitehurst and van Oort 2009). According to the early lock-and-key hypothesis, proposed by the German chemist Emil Fischer in 1894, the active site has a unique geometric shape that is complementary to the shape of the substrate molecule that fits into it. However, this rigid hypothesis fails to explain much of the experimental evidence on enzyme biocatalysis (Sonkaria et al. 2004). Later, through techniques such as X-ray crystallography, it became clear that enzymes are flexible rather than rigid structures. In the light of this finding, the induced-fit theory was proposed by Daniel Koshland in 1958; according to this theory, the substrate induces a change in the enzyme conformation on binding, which may orient the catalytic groups favorably for the subsequent reaction. This theory has been used extensively to explain enzyme catalysis (Yousef et al. 2003). Since the active site alone binds the substrate, the rest of the protein acts to stabilize the active site and to provide an appropriate environment for its interaction with the substrate molecule (Robinson 2015). According to transition-state theory, enzyme catalysis rests on transition-state complementarity, that is, the preferential binding of the transition state rather than the substrate or product (Benkovic and Hammes-Schiffer 2003).

Classification of Enzymes

Classifying enzymes into groups according to the type of reaction they catalyze is one way to understand the bonds they create or break. Enzyme classification is under constant development, and one current issue is that the recommendations for classification and nomenclature are ill-suited to several enzyme groups (e.g. carbohydrate-active enzymes), especially enzymes with multiple substrate specificities and isoenzymes. The classification system is continually updated with new enzymes and corrections to existing entries, and detailed recommendations for enzyme classification are provided. Because of the growing complexity in the naming of enzymes, the International Union of Biochemistry [now the International Union of Biochemistry and Molecular Biology (IUBMB)] set up the Enzyme Commission (EC) to provide a systematic approach to enzyme naming; it published its first report in 1961. The sixth edition, published in 1992, contained details of nearly 3200 different enzymes, and supplements published annually have since extended this number to over 5000 (Robinson 2015). The EC number is a four-level hierarchical classification of an enzyme’s overall reaction or function. The first level corresponds to six classes according to the type of reaction carried out: oxidoreductases catalyze oxidation/reduction reactions (EC 1), transferases transfer a chemical group (EC 2), hydrolases hydrolyze chemical bonds (EC 3), lyases cleave chemical bonds by means other than oxidation or hydrolysis (EC 4), isomerases catalyze geometric and structural changes between isomers (EC 5), and ligases join two compounds with the associated hydrolysis of a nucleoside triphosphate (EC 6).
The next two classification levels, the sub-class and the sub-sub-class (levels 2 and 3), depend on various criteria such as the chemical bond cleaved or formed, the reaction center, the chemical group transferred or the cofactor used for catalysis. The fourth and final level is a serial number for each enzyme reaction and reflects substrate specificity. One EC number denotes one overall chemical reaction. Thus several enzymes, which may be non-homologous, can share the same EC number if they catalyze the same overall reaction. For example, the enzyme with the trivial name lactate dehydrogenase has the EC number 1.1.1.27: it is an oxidoreductase (first digit) with the alcohol group of the lactate molecule as the hydrogen donor (second digit) and NAD+ as the hydrogen acceptor (third digit), and it is the 27th enzyme to be categorized within this group (fourth digit). The basic layout of the EC number classification is described in Table 1.
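The four-level layout described above maps naturally onto a small data structure. The following Python sketch is illustrative only: the names `ECNumber`, `parse_ec` and `same_reaction_type` are hypothetical helpers introduced here, not part of any IUBMB tool, and the class table covers only the six classes listed in the text.

```python
from dataclasses import dataclass

# Top-level EC classes exactly as listed in the text (six classes).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


@dataclass(frozen=True)
class ECNumber:
    """Hypothetical container for the four levels of an EC number."""
    enzyme_class: int   # level 1: general type of reaction
    subclass: int       # level 2
    sub_subclass: int   # level 3
    serial: int         # level 4: serial number within the sub-sub-class

    @property
    def class_name(self) -> str:
        return EC_CLASSES[self.enzyme_class]


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 1.1.1.27' or '1.1.1.27'."""
    cleaned = text.strip()
    if cleaned.upper().startswith("EC"):
        cleaned = cleaned[2:].strip()
    parts = cleaned.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {len(parts)}: {text!r}")
    try:
        numbers = [int(part) for part in parts]
    except ValueError:
        raise ValueError(f"non-numeric level in EC number: {text!r}") from None
    if numbers[0] not in EC_CLASSES:
        raise ValueError(f"unknown EC class {numbers[0]} in {text!r}")
    return ECNumber(*numbers)


def same_reaction_type(a: ECNumber, b: ECNumber) -> bool:
    """True when the first three levels agree, so only the serial number differs."""
    return (a.enzyme_class, a.subclass, a.sub_subclass) == (
        b.enzyme_class, b.subclass, b.sub_subclass)


ldh = parse_ec("EC 1.1.1.27")  # lactate dehydrogenase, the example from the text
print(ldh.class_name, ldh.subclass, ldh.sub_subclass, ldh.serial)
```

Incomplete identifiers (for example, entries with a placeholder in the last position) are deliberately rejected by this sketch; a real annotation pipeline would need a policy for them.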

Table 1 The E.C. classification layout of enzymes according to the IUBMB enzyme nomenclature

The EC classification is still made on the basis of the main reaction catalyzed (Cuesta et al. 2015). Nowadays the assignment of EC numbers to enzymes is a routine part of the functional annotation of proteins and protein-coding genes in databases such as UniprotKB (UniProt Consortium 2013) and Ensembl (Kersey et al. 2014), and it has been adopted by the widely used Gene Ontology (GO) (Ashburner et al. 2000). However, changes between EC classes are observed. Some changes are preferred, such as transferases (EC 2) becoming oxidoreductases (EC 1), hydrolases (EC 3) or lyases (EC 4) (Martínez Cuesta et al. 2014). Exchanges between different EC classes suggest that the chemistry of enzymes is more complex than the classification implies, with close relationships between enzymes that have radically different EC numbers. The substrate specificity of an enzyme is represented by the last digit of the EC number, while the first three digits describe the type of reaction. When sequence identity falls below 70%, all four digits of the EC number start to diverge quickly (Rost 2002). This creates an urgent need for alternative methods of sub-grouping enzymes that reflect their function or substrate specificity. The chemistry of related enzyme functions can now be explored using robust computational approaches such as EC-BLAST (Rahman et al. 2014). This tool searches and compares reactions on the basis of bond changes, reaction centers, and the structures of substrates and products (Cuesta et al. 2015; Rausch et al. 2005). For a dataset of protein sequences of known function belonging to different enzyme groups, group-specific features can be extracted to build models, using machine learning algorithms or other computational approaches, that predict the function of an unknown protein sequence or assign a group label to it (Juncker et al. 2009; Ong et al. 2007). 
Table 2 shows the enzyme classification attempts based on sequence similarity, structural similarity and protein descriptors.
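As a rough illustration of the feature-based approach mentioned above, the Python sketch below extracts one simple protein descriptor (amino acid composition) and assigns a group label with a nearest-centroid rule. It is a toy under stated assumptions, not a method from the cited studies: the sequences and the labels `group_a` and `group_b` are invented, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def centroid(vectors: list[list[float]]) -> list[float]:
    """Element-wise mean of equally sized feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(labelled: dict[str, list[str]]) -> dict[str, list[float]]:
    """Build one composition centroid per enzyme group."""
    return {label: centroid([composition(seq) for seq in seqs])
            for label, seqs in labelled.items()}


def predict(model: dict[str, list[float]], sequence: str) -> str:
    """Assign the label whose centroid is closest in Euclidean distance."""
    features = composition(sequence)
    return min(model, key=lambda label: math.dist(features, model[label]))


# Invented toy sequences: one lysine-rich group, one leucine-rich group.
model = train({
    "group_a": ["KKKKAKKE", "KKRKKAKK"],
    "group_b": ["LLLVLLIL", "LVLLILLA"],
})
print(predict(model, "KKKAKRKK"))
```

Published classifiers use far richer descriptors and learning algorithms; the point here is only the shape of the workflow: labelled sequences, feature extraction, a model, and a predicted group label.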

Table 2 Enzyme classification attempts based on sequence similarity, structural similarity and protein descriptors

Structural Characterization of Enzymes

The protein chains of enzyme molecules fold into three-dimensional structures that determine how the enzyme performs substrate binding, catalysis and regulation. Some of the amino acids are involved in binding ligands (substrates, intermediates, products, organic cofactors, metal cofactors or allosteric regulators) and some are actively involved in catalysis by interacting with the substrate, intermediate or product of the reaction (Soding et al. 2005). Thus the catalytic activity of enzymes depends on the integrity of their native protein conformation. The structures of enzymes can be elucidated by techniques such as spectroscopic methods, X-ray crystallography and, more recently, multidimensional NMR methods.

X-ray Crystallography

X-ray crystallography has been the most widely used technique for obtaining three-dimensional structures of proteins, and of enzymes in particular. Knowledge of three-dimensional structures is essential for understanding reaction mechanisms at the atomic level (Feiten et al. 2017). One of the pioneers of enzyme crystallography was David Blow (1931–2004); he shared the Wolf Prize in Chemistry in 1987 for this research with David Phillips (1924–1999), who first successfully solved the structure of chicken egg lysozyme in 1965, to a resolution of 2 Å (Helliwell 2017). The Wolf Prize citation read “for their contributions to protein X-ray crystallography and to the elucidation of structures of enzymes and their mechanisms of action”. The diffraction of X-rays by a single protein molecule is too weak to be measured (Rhodes 2006). Protein crystals are therefore used for X-ray structure determination to amplify the signal. A protein crystal contains many copies of the molecule neatly arranged in a highly ordered, regular three-dimensional array, or crystal lattice (Rhodes 2006). The suitability of enzyme crystals for structure determination depends on their ability to diffract X-rays. In the experimental setup (Fig. 2), a narrow beam of monochromatic X-rays of suitable wavelength is directed at the crystal; the beam either passes straight through the crystal, between the enzyme molecules, or hits the electron clouds of the atoms in the enzyme molecules. The molecules, arranged side by side in a periodic way, form a lattice from which waves diffracted in the same direction reinforce each other to produce diffraction maxima that can be recorded by sensitive detectors (Petsko and Ringe 2004). 
Enzyme crystals are almost invariably frozen during X-ray data collection. This is achieved by directing a cold stream of nitrogen gas onto the crystal after soaking it in a solution called a “cryoprotectant”, so that vitrified water, rather than crystalline ice, forms on freezing. Freezing makes the crystal more tolerant of radiation damage and usually yields diffraction data of higher quality and higher resolution, and hence more accurate structural information (Ilari and Savino 2008). In addition, freezing may sometimes help to trap substrates or other molecules bound to the enzyme so that they become part of the structure, which is fundamental for structure–function studies (Rhodes 2000).

Fig. 2 Structural characterization of enzymes by X-ray crystallography

‘Atomic’ resolution, i.e. a resolution of 1.2 Å or better, allows atoms to be placed with fewer geometrical restraints and gives a better picture of the protein structure. Advances in X-ray sources and cryo-crystallography have led to increasing numbers of structures solved at these high resolutions (Kleywegt et al. 1996). The three-dimensional representation of the protein displayed in a molecular structure viewer is a model, built by the crystallographer to be chemically realistic and to match the observed electron density as closely as possible. The resolution of a crystal structure is measured in angstroms and refers to the minimum distance between two points that can be distinguished. Although a large number of quality assessment methods are available, resolution is a straightforward and robust parameter for assessing the quality of a protein structure model (Kleywegt et al. 2004).
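The link between diffraction geometry and resolution can be made concrete with Bragg's law, nλ = 2d sin θ, a standard relation that this chapter does not state explicitly. The Python sketch below converts a scattering angle 2θ into the corresponding lattice spacing d, and back. The function names are hypothetical, and the wavelength of 1.0 Å used in the example is an arbitrary illustrative value, not a figure from the text.

```python
import math


def bragg_spacing(wavelength_angstrom: float, two_theta_deg: float, order: int = 1) -> float:
    """Spacing d (in angstroms) from Bragg's law: n * wavelength = 2 * d * sin(theta)."""
    theta = math.radians(two_theta_deg / 2.0)
    sin_theta = math.sin(theta)
    if sin_theta <= 0.0:
        raise ValueError("two_theta_deg must lie between 0 and 180 degrees")
    return order * wavelength_angstrom / (2.0 * sin_theta)


def two_theta_for_spacing(wavelength_angstrom: float, spacing_angstrom: float) -> float:
    """Scattering angle 2*theta (in degrees) at which a first-order spacing d diffracts."""
    ratio = wavelength_angstrom / (2.0 * spacing_angstrom)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(ratio))


# Assumed wavelength of 1.0 angstrom (illustrative only).
wavelength = 1.0
angle_for_atomic = two_theta_for_spacing(wavelength, 1.2)  # angle needed for 1.2 A data
angle_for_2A = two_theta_for_spacing(wavelength, 2.0)      # angle needed for 2 A data
print(f"2-theta for 1.2 A: {angle_for_atomic:.1f} deg; for 2.0 A: {angle_for_2A:.1f} deg")
```

Finer detail (smaller d) diffracts to wider angles, which is why high-resolution data require well-ordered crystals that still diffract measurably far from the direct beam.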

Nuclear Magnetic Resonance (NMR) Spectroscopy

NMR spectroscopy is a powerful tool for elucidating the structure–function relationships of substrates, peptides and proteins, and of enzymes in particular. It yields detailed information about the structure of the enzyme and the specific ligands that bind to it. The structure of ligands at the binding sites of enzymes and the structure of enzyme–ligand complexes can also be obtained, as well as the dynamics of the ligand and the associated structure of the protein binding site. The tertiary structures of proteins can now be obtained in solution, independently of diffraction data, by homonuclear and heteronuclear multidimensional NMR. In principle one can investigate the magnetic nuclei of each of the atoms within the enzyme molecule (1H, 13C, 15N, …), of ligands that bind to the enzyme (1H, 19F, 31P, 13C, …), or of the environment of the active site (solvent 1H2O, 2D2O, 23Na, 39K, 35Cl, …) (Monasterio 2014). Until recently, NMR spectroscopy yielded structures only of proteins and protein complexes of small and medium size (~30 to 40 kDa). Major breakthroughs in the recent past, especially in isotope-labeling techniques, have enabled the NMR characterization of large protein systems with molecular weights of hundreds of kDa. This has provided unique insights into the binding, dynamic and allosteric properties of enzymes (Huang and Kalodimos 2017).

A useful approach to studying enzyme structure by protein NMR is to observe the resonances of histidine. The C-2 and C-5 proton resonances lie downfield of the aromatic protons (Markley 1975). The classical use of these properties was with the small enzyme (Mr = 23,500) RNAase (Meadows and Jardetzky 1986) and the large enzyme (Mr = 237,000) pyruvate kinase (Meshitsuka et al. 1981). The C-2 proton resonance is especially sensitive to the ionization state of the imidazole nitrogens, so the pKa of each individual histidine within the native enzyme can be obtained from titration studies. The binding of a ligand or metal ion to a specific histidine or histidines can change the magnetic environment (chemical shift) of the resonance and alter the pKa. This application of NMR has been useful for a limited number of enzymes. Enzymes enriched with 13C and 15N have been used to exploit the wider range of chemical shifts of these nuclei, which enhances spectral dispersion and increases the possibility of resolving more resonances. Detailed structural and dynamic studies of larger proteins have been carried out with 13C and 15N isotope labels through NMR and the nuclear Overhauser effect (Redfield et al. 1989). Studies of this type are now routine for determining the structure and dynamics of enzymes by multidimensional NMR (Kevin et al. 1998; Bachovchin 2001). An alternative approach is to use a reporter group such as 19F on the enzyme or on the substrate to obtain information about enzyme structure and the effects of ligand binding on the enzyme (Geric 1981; Danielson and Falke 1996). The 19F nucleus is 83% as sensitive as 1H and has a large range of chemical shifts; in addition, there are no background 19F resonances to cause interference. The 19F reporter groups can be incorporated by different methods. A fluorinated amino acid, e.g. 
fluorotyrosine or fluoroalanine, can be added to the growth medium and incorporated into the protein (Sykes and Weiner 1980). The labeled amino acids, i.e. the tyrosines or alanines containing 19F, will then each exhibit a resonance. In the heterodimer of tubulin, the principal protein of microtubules, fluorotyrosine can be incorporated as the C-terminal amino acid of the α-subunit through the reaction catalyzed by tubulin–tyrosine ligase (Monasterio et al. 1995). An alternative approach is to label the enzyme covalently at a specific residue with a fluorine-containing reagent such as trifluoroacetic anhydride, trifluoroacetyl iodide or 3-bromo-1,1,1-trifluoropropanone. To serve as a “reporter” of changes in enzyme structure, the chemical shift and/or the line width (1/T2) of the 19F label must reflect ligand binding and/or catalysis. If the 19F resonance is sensitive to conformational changes in the enzyme, then site-specific modification of groups at the active site will be reflected by changes in the 19F resonance. The reporter-group method can also be applied with other labels such as 2H or 13C, although most other labels are less sensitive than fluorine. A potential strength of these labels is that replacing 1H with 2H, or 12C with 13C, has very little effect, if any, on the protein itself. Reporter groups yield information about the environment of the group but not about specific structural features of the enzyme. Comparative structural changes can be studied by photochemically induced dynamic nuclear polarization (photo-CIDNP), which originates from free radical reactions. This has been developed as a sensitive method for measuring structural changes on the surface of proteins (Kaptein 1982; Berliner 1989). Photo-CIDNP requires a modified spectrometer and a suitable light source (laser) to probe surface changes. 
This technique has the advantage of high sensitivity, and it yields general conformation information (Monasterio 2014).
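The histidine titration experiment described earlier in this section can be sketched numerically. Assuming fast exchange between the protonated and deprotonated imidazole, the observed chemical shift is a population-weighted average governed by the Henderson–Hasselbalch relation. This model, the function names, and all chemical-shift values below are assumptions chosen for illustration; none of them comes from the studies cited in the text.

```python
def observed_shift(ph: float, pka: float,
                   delta_protonated: float, delta_deprotonated: float) -> float:
    """Population-weighted chemical shift (ppm) under fast exchange."""
    ratio = 10.0 ** (ph - pka)  # [deprotonated] / [protonated]
    return (delta_protonated + delta_deprotonated * ratio) / (1.0 + ratio)


def fit_pka(ph_values, shifts, delta_protonated, delta_deprotonated,
            low=4.0, high=9.0, step=0.01):
    """Grid search for the pKa that minimizes the squared error of the titration curve."""
    best_pka, best_error = low, float("inf")
    n_steps = int(round((high - low) / step))
    for i in range(n_steps + 1):
        candidate = low + i * step
        error = sum(
            (observed_shift(ph, candidate, delta_protonated, delta_deprotonated) - shift) ** 2
            for ph, shift in zip(ph_values, shifts)
        )
        if error < best_error:
            best_pka, best_error = candidate, error
    return best_pka


# Synthetic titration data generated from an assumed pKa of 6.5 and
# illustrative limiting shifts of 8.7 ppm (protonated) and 7.7 ppm (deprotonated).
ph_points = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
synthetic = [observed_shift(ph, 6.5, 8.7, 7.7) for ph in ph_points]
print(round(fit_pka(ph_points, synthetic, 8.7, 7.7), 2))
```

With real spectra the limiting shifts would normally be fitted together with the pKa, and a shift in the fitted pKa on adding a ligand or metal ion would point to binding at or near that histidine.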

Conclusions

Enzymes are proteins responsible for the catalysis of biochemical reactions. The information-rich EC number assigned by the Enzyme Commission persists as a simple identifier for classification. However, robust approaches for comparing catalytic reactions quantitatively or for predicting enzyme mechanisms accurately are only beginning to appear. Bond changes and reaction centers need to be combined further with structural information about substrates, products and mechanisms to capture the essence of enzyme chemistry in a functional classification.

X-ray crystallography and NMR are the most widely used techniques for the structural characterization of proteins, and of enzymes in particular. Recent technical advances in crystallography, together with better computational programs, have made solving enzyme crystal structures much more rapid. Modern NMR spectroscopy techniques make extensive use of isotopically enriched proteins and should prove a powerful approach to the structural characterization of proteins, particularly enzymes, in the future. Further technological advances are needed to establish NMR as the primary tool for obtaining atomic structures of challenging systems of even higher complexity. The accumulating data on enzyme structures, together with novel approaches, particularly genome projects and bioinformatics, are expected to increase our understanding of enzyme function and mechanisms in the future.