Key Points

Cytochrome P450 enzymes (CYPs) are critical in the metabolism of medications and endogenous substances related to health and diseases.

Proteomics and metabolomics technologies are needed to identify the disease biomarkers and optimize treatment strategies.

Integration of proteomics and metabolomics can facilitate a CYP-based personalized medicine approach.

1 Introduction

Cytochrome P450 enzymes (CYPs) represent a diverse family of heme-thiolate proteins involved in the metabolism of a wide range of endogenous compounds and xenobiotics. These enzymes play a crucial role in detoxification, drug metabolism, and the production or breakdown of various physiological compounds, such as steroids, fatty acids, and bile acids. The complex interplay between CYPs and their substrates is fundamental to our understanding of drug response, drug-drug interactions, and potential adverse drug reactions [1, 2]. Alterations in the expression or activity of specific CYP enzymes can influence the metabolome, ultimately affecting the response to drugs or other xenobiotics. For example, polymorphisms in CYP genes make some individuals more susceptible to adverse drug reactions, while others experience sub-therapeutic effects [3, 4]. Furthermore, in disease states such as cancer, inflammation, or neurodegenerative disorders, changes in CYP activity may disrupt the balance of key metabolic pathways, further complicating treatment strategies and prognosis.

Proteomics involves the study of protein expression, modification, and interactions, providing insights into the complex machinery of cellular functions [5, 6]. Conversely, metabolomics provides a detailed snapshot of the metabolic state of a cell, tissue, or organism, reflecting the outcome of both genetic and environmental influences [7]. In recent years, there has been a keen interest in applying proteomics and metabolomics approaches to better understand the role of CYP enzymes in health and disease [8,9,10]. Combining these two robust methodologies allows for the dissection of the complex network of biochemical pathways that underpin the roles of CYPs in diverse physiological and pathological contexts. The integration of proteomics and metabolomics presents enormous opportunities for CYP-related research, addressing challenges like data interpretation, sample limitation, costs, and time. These combined methodologies, as described in various studies, facilitate a detailed exploration of protein functionality, as metabolomics complements proteomic insights. Current advancements include combining proteomic and metabolomic data analysis and even integrating methodologies from the sample preparation phase [11]. For patients on multiple medications, this combined approach affords a detailed understanding of drug interactions, potentially optimizing therapeutic interventions through a detailed analysis of protein levels and functionalities in a consolidated platform. The roles of specific CYP enzymes in a range of disease conditions, including liver diseases, cancer, and neurological disorders like Parkinson's disease and autism spectrum disorder, have been explored [12,13,14,15,16]. The outcomes of these studies offer critical data on disease pathology as well as suggest possible avenues for therapeutic or diagnostic applications. 
For instance, altered levels of certain CYPs in specific cancers can serve as potential biomarkers, facilitating early detection or predicting outcomes. Additionally, understanding the metabolism of drugs by certain CYPs can inform dosing regimens or predict possible drug interactions, ensuring patient safety and maximizing therapeutic efficacy.

Numerous studies have emphasized the use of proteomics and metabolomics in understanding diverse biochemical pathways [17,18,19,20]. However, a holistic review specifically focusing on the interplay of these technologies with CYP enzymes remains absent. To address this gap, this review sheds light on the pivotal roles of proteomics and metabolomics in leveraging insights into CYPs in different physiological and pathological contexts. The objective is to underscore the present status and potential of proteomics and metabolomics in CYP research and highlight their significance as therapeutic and diagnostic markers in optimizing therapeutics and patient safety.

2 Literature Search Strategy

This narrative review was developed through a focused search using specific keywords and electronic databases until July 15, 2024. Various combinations of keywords such as “cytochrome P450 enzymes,” “proteomics,” “metabolomics,” “drug metabolism,” “integration,” “liquid chromatography-mass spectrometry,” “disease conditions,” “metabolites,” and “pathological conditions” were employed to gather relevant articles. Alternative spellings, variations, and abbreviations of these keywords were also considered. Databases like PubMed, Medline, Embase, and Google Scholar were utilized. Only original research articles in English with a focus on personalized medicine, proteomics, and metabolomics of cytochrome P450 enzymes were included, excluding conference abstracts or unpublished prints. All authors independently performed the literature search, evaluated articles, and reconciled resources. Any minor inconsistencies were resolved through consensus.

3 Cytochrome P450 Enzymes

CYP enzymes are a superfamily of heme proteins that are primarily involved in the biotransformation of xenobiotics including medications, environmental toxins, health supplements, and dietary components. CYP isoforms are widely present in mammals, invertebrates, and microorganisms [21]. Although deactivation (detoxification) leading to the formation of more water-soluble and easily excretable forms is the primary goal of CYP-mediated conversion, a significant number of CYP enzymes are also involved in the biosynthesis and catabolism of endogenous substances including vitamins, hormones, and fatty acids [22]. In addition, the formation of toxic metabolites, also known as bioactivation, from xenobiotics or endobiotics is an outcome of CYP-mediated metabolism [22]. Currently, 57 CYP enzymes are categorized into 18 families and 43 subfamilies based on their amino acid sequence similarity. A sequence similarity of ≥ 40% places an enzyme into the same family, whereas ≥ 55% similarity categorizes an enzyme into the same subfamily [23]. The CYP enzymes from families 1 to 4 are known to have a primary role in xenobiotic metabolism. In contrast, the CYP enzymes from the remaining families (e.g., 7, 11, 17, 19, 24, 51) are solely involved in the anabolism and catabolism of physiological compounds [1, 24]. In general, the CYP enzyme amino acid sequences and functions are highly conserved between species. However, certain CYP orthologs can significantly differ in their substrate specificity because of amino acid differences [23, 24]. Among the xenobiotic-metabolizing enzymes in families 1 to 4, CYP3A4, CYP2C8, CYP2C9, CYP2D6, and CYP1A2 have major roles in terms of number of drugs metabolized by them [25]. Metabolism of environmental toxicants is primarily catalyzed by CYP1 family enzymes such as CYP1A1, CYP1A2, and CYP1B1. 
In contrast, CYP11A1, CYP17A1, and CYP19A1 are examples of steroidogenic CYP enzymes that are involved in the metabolism of hormones such as testosterone and estrogen [22]. Interestingly, certain CYP enzymes do not have established xenobiotic or physiological functions and are often termed orphan CYP enzymes [26]. CYP2S1 and CYP2U1 are examples of orphan CYP enzymes that are still being explored for their role in xenobiotic and/or endobiotic metabolism [27, 28].
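The family and subfamily thresholds described above can be expressed as a small decision rule. The following is a minimal sketch rather than an official nomenclature tool: the function names are hypothetical, and the toy `percent_identity` helper simply counts identical positions in two pre-aligned sequences, which is a simplification of how sequence similarity is computed in practice.

```python
FAMILY_THRESHOLD = 40.0      # percent similarity for the same family (from the text)
SUBFAMILY_THRESHOLD = 55.0   # percent similarity for the same subfamily (from the text)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences.

    Simplifying assumption: gaps ('-') never count as matches and the
    alignment length is used as the denominator.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def classify_relationship(similarity_percent: float) -> str:
    """Apply the >= 40% (family) and >= 55% (subfamily) rule."""
    if similarity_percent >= SUBFAMILY_THRESHOLD:
        return "same subfamily"
    if similarity_percent >= FAMILY_THRESHOLD:
        return "same family"
    return "different family"


if __name__ == "__main__":
    for value in (72.0, 48.5, 31.0):
        print(value, "->", classify_relationship(value))
```

Because the subfamily check runs first, a pair that meets the 55% threshold is reported as the same subfamily, which also implies the same family.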

CYP enzymes are ubiquitously present in humans; however, their expression levels and relative importance in the metabolism of drugs and endogenous substances vary across tissues. The liver and intestine are considered the most critical sites of CYP-mediated xenobiotic metabolism because of their physiological role and anatomical localization [29]. As such, the liver, adrenals, and small intestine have the highest amount of CYP protein per milligram of microsomal protein. Approximately 60% of the total CYP enzymes are expressed in the human liver [21, 30, 31]. Induction of CYP protein levels and inhibition of CYP catalytic functions by a wide variety of chemical compounds are hallmarks of most CYP enzymes. For example, CYP3A4 expression is induced through the pregnane X receptor by commonly used medications (e.g., dexamethasone) and over-the-counter supplements (e.g., St. John’s wort) [32]. Because CYP enzymes are frequently induced, CYP protein levels are highly variable in the population and can significantly influence drug response and disease outcomes. Thus, the determination of CYP protein levels is important in the overall framework of drug discovery, development, and clinical application. Traditionally, CYP protein levels have been determined using gel electrophoresis followed by immunoblots [33]. However, this process is laborious and often leads to inconsistency from laboratory to laboratory for several reasons, for example, the use of different antibodies and the need for CYP protein standards. The sensitivity of the secondary antibody-substrate-based biochemical reactions on the immunoblots is another major source of variability in CYP protein quantification [34, 35]. In contrast, drug metabolism reactions employ a wide range of analytical methods such as chromatography, ultraviolet-visible spectrophotometry, mass spectrometry, and nuclear magnetic resonance spectroscopy. 
The identification and measurement of metabolites, rather than monitoring substrate depletion, have been the gold standard of drug metabolism reactions [32]. However, carrying out reactions with multiple metabolites for each CYP isoform is laborious and time-consuming. These limitations in protein level determinations and metabolic function identification necessitate the use of more high-throughput and precise methods such as proteomics and metabolomics.

4 Proteomics in CYP Enzymes

Proteomics entails the comprehensive investigation of diverse protein attributes, including expression levels, post-translational modifications, and protein-protein interactions. The primary aim is a comprehensive understanding of disease pathogenesis or of the complex mechanisms occurring within cellular structures at the protein level [5, 6, 36]. The significance of CYP enzymes spans major scientific domains such as biochemistry, pharmacology, and toxicology, which has heightened interest in CYP proteomics. This interest stems from the pivotal roles CYP enzymes play in diverse biological processes, such as drug metabolism, xenobiotic detoxification, and endogenous compound transformation [8, 37].

CYP proteomics allows for a multidimensional approach to understand the dynamics, abundance, diversity, and functional significance of CYP enzymes within complex biological systems. A stepwise workflow of proteomics in biomarker identification and other applications is described in Fig. 1. The application of advanced proteomic methodologies enables investigators to recognize isoform-specific responses, explore various protein interactions, elucidate regulatory mechanisms, and identify potential biomarkers associated with disease susceptibility or drug response [12, 38, 39]. This insight is essential for the advancement of personalized medicine and the development of safer and more effective therapies [40].

Fig. 1

Proteomics workflow of cytochrome P450 (CYP) enzymes and their applications. 2D-DGE 2-D gel electrophoresis, 2D-DIGE 2-D differential gel electrophoresis, ELISA enzyme-linked immunosorbent assay, HPLC high-performance liquid chromatography, LC-MS liquid chromatography-mass spectrometry, MALDI matrix-assisted laser desorption/ionization, TOF time of flight

The array of methodologies employed in CYP proteomics encompasses a broad spectrum of techniques aimed at elucidating the structure, function, and post-translational modifications of CYP enzymes. These methods include mass spectrometry (MS), 2D gel electrophoresis, Western blotting [41, 42], and activity assays. Two-dimensional gel electrophoresis involves the separation of proteins based on their isoelectric point (pI) and molecular weight while Western blotting utilizes specific antibodies to detect and quantify CYP proteins, thereby revealing insights into their abundance and expression patterns within a sample. Western blotting, though widely used, comes with its set of limitations due to the reliance on indirect protein quantification methods. One notable limitation lies in the necessity of having a specific primary antibody targeting the protein under investigation. This prerequisite can pose a hurdle, as not all proteins have readily available antibodies, thereby restricting the scope of analysis. Moreover, extracting precise quantitative information from Western blots can prove to be a daunting task due to various factors such as variability in antibody binding efficiency and non-linear detection ranges. These challenges underscore the necessity of considering alternative or complementary methods for protein quantification, particularly in research and diagnostic settings where precision and reliability are paramount. MS is a pivotal tool for CYP proteomics, facilitating the identification, quantification, and characterization of CYP proteins. Mass spectrometry surpasses Western blotting and other quantification methods because of its capacity to provide high-throughput and comprehensive analysis of CYP enzymes within complex biological samples [38, 43, 44]. MS offers an unbiased examination of the proteome, permitting simultaneous quantification of multiple CYP isoforms, whereas Western blotting requires predefined antibodies for specific epitopes. 
Additionally, MS excels in its ability to detect subtle modifications, such as phosphorylation and glycosylation, offering a holistic understanding of CYP expression, function, and regulation [45, 46].

MS may be coupled with other techniques to enable the analysis of CYPs within biological matrices. For example, in vivo or in vitro stable isotope labeling, which involves growing cultures in a specific medium containing a heavy isotope such as 15N, can be coupled with LC-MS/MS methods [47]. Moreover, isobaric tags for relative and absolute quantification (iTRAQ) may be coupled with LC-MS for quantitative analysis of various CYP enzymes [48]. The selection of an appropriate approach to investigate CYP enzyme proteomics depends on the specific research goals, the characteristics of the samples, and the desired information to be obtained. For instance, a strategy based on mass spectrometry was applied to concurrently detect and distinguish various human CYP isoforms. These included closely related isoforms such as CYP3A4, CYP3A5, and CYP3A7, as well as CYP2C8, CYP2C9, CYP2C18, and CYP2C19, along with CYP4F2, CYP4F3, CYP4F11, and CYP4F12. Moreover, the use of titanium dioxide resin in conjunction with tandem mass spectrometry facilitated the identification of eight phosphorylation sites on human CYP enzymes, seven of which were previously unreported [44]. In another study, the absolute quantification (AQUA) methodology was combined with protein multiple reaction monitoring (MRM) to quantify human CYP2D6 across a cohort of 30 genotyped liver samples. The study detected approximately 30 femtomoles of CYP2D6 per milligram of microsomal protein. The observed values spanned a wide range, from negligible levels up to nearly 80 femtomoles per milligram. Notably, these values were 5–10 times higher than the previously accepted value of approximately 5 femtomoles per milligram. Grangeon et al. (2021) investigated the protein expression of 16 key isoforms within the CYP superfamily across human organ donors, emphasizing their involvement in pre-systemic drug biotransformation across the duodenum, jejunum, and ileum. These proteins were identified and quantified using LC-MS/MS [38]. Chen et al. highlighted the efficacy of bioinformatics data analysis in LC-MS/MS experiments for elucidating phenotypic patterns linked to disease advancement, uncovering complex protein regulatory mechanisms at the molecular level, and establishing connections between these aspects [39].
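The arithmetic behind stable isotope-based absolute quantification can be illustrated with a short sketch. It assumes the general isotope-dilution principle, in which the native (light) peptide signal is compared with a known amount of spiked heavy-labeled peptide; the function names and the peak areas are hypothetical and are not taken from the cited study.

```python
def aqua_abundance(light_area: float, heavy_area: float,
                   heavy_spike_fmol: float, protein_mg: float) -> float:
    """Native peptide amount per mg of microsomal protein (fmol/mg).

    Isotope-dilution assumption: the light/heavy peak-area ratio equals
    the molar ratio of native to spiked heavy-labeled peptide.
    """
    if heavy_area <= 0 or protein_mg <= 0:
        raise ValueError("heavy_area and protein_mg must be positive")
    return (light_area / heavy_area) * heavy_spike_fmol / protein_mg


def fold_difference(measured: float, reference: float) -> float:
    """How many times higher the measured value is than a reference value."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return measured / reference


if __name__ == "__main__":
    # Hypothetical peak areas and spike amount, chosen only for illustration.
    amount = aqua_abundance(light_area=1.5e5, heavy_area=1.0e5,
                            heavy_spike_fmol=10.0, protein_mg=0.5)
    print(f"{amount:.1f} fmol/mg")
    # Values quoted in the text: about 30 versus about 5 per milligram.
    print(f"{fold_difference(30.0, 5.0):.1f}-fold")
```

With the values quoted in the text, the second calculation gives a 6-fold difference, which falls within the reported 5–10-fold range.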

While mass spectrometry (MS) offers remarkable advantages in proteomics, it also comes with notable limitations that warrant consideration. A major drawback is the challenge of low dynamic range [49], where signals from abundant proteins overshadow those from less abundant ones, potentially masking crucial proteins of interest. Additionally, MS relies on peptide-based inference for protein identification, which can introduce inaccuracies when multiple proteins share similar peptides. The substantial cost associated with acquiring and maintaining MS instruments also poses a barrier to entry for smaller research laboratories engaged in proteomic studies. Moreover, the operation of MS demands expertise, and the interpretation of data requires sophisticated software, rendering it less user-friendly compared to other analytical techniques.
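The peptide-inference limitation can be made concrete with a sketch that separates peptides unique to one protein from peptides shared between proteins. This is a simplified illustration: the function name is hypothetical, and the peptide strings below are placeholders rather than real CYP sequences.

```python
from collections import Counter


def unique_peptides(peptides_by_protein: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, for each protein, the peptides that map to that protein only.

    Peptides observed in more than one protein cannot distinguish between
    those proteins and are therefore excluded.
    """
    counts = Counter(
        peptide
        for peptides in peptides_by_protein.values()
        for peptide in peptides
    )
    return {
        protein: {p for p in peptides if counts[p] == 1}
        for protein, peptides in peptides_by_protein.items()
    }


if __name__ == "__main__":
    # Placeholder peptide strings, not real sequences.
    digest = {
        "ISOFORM_A": {"PEPTIDEA", "SHAREDK"},
        "ISOFORM_B": {"PEPTIDEB", "SHAREDK"},
    }
    for protein, peptides in unique_peptides(digest).items():
        print(protein, sorted(peptides))
```

In this toy digest, only the non-shared peptide of each isoform could support an isoform-specific identification; a signal from the shared peptide alone would be ambiguous.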

5 Metabolomics in CYP Enzymes

Metabolomics is a systems biology-based ‘omics’ technology that enables real-time comprehensive profiling of small-molecule metabolites in biological systems under specific conditions [50]. Metabolomics can be defined as the evaluation or analysis of the metabolome (i.e., molecules with a molecular weight < 1000 Da) in samples obtained from biological systems [51, 52]. Two types of methods are available in metabolomics, namely targeted (i.e., measurement of specific metabolites) and untargeted (i.e., comprehensive analysis of all measurable metabolites, known and unknown) metabolomics [52,53,54,55]. Orphan CYPs are groups of proteins without assigned specific cellular or biochemical functions [56]. Metabolomics of orphan CYPs is an example of an untargeted metabolomics approach. Generally, predicting the function(s) of new proteins can be difficult, and such proteins are often called promiscuous proteins [56]. Orphan CYPs can be deorphanized by exploiting the impact of environmental (or phenotypical) factors. For instance, a physiological or molecular change can be identified using LC-MS/MS [56, 57]. Metabolomics data from biofluids and tissues can effectively characterize CYP-mediated metabolism. Subsequently, the metabolomic phenotypes highlight the therapeutic outcomes and toxicity of drugs [58].

Metabolomics studies usually involve a few steps, as shown in Fig. 2. Published examples of CYP metabolomics show that several analytical methods have been used for profiling endogenous metabolites in biofluids, namely MS, UV, Fourier transform infrared (FTIR), nuclear magnetic resonance (NMR), and gas chromatography (GC) [59, 60]. Traditionally, the most frequently used methods for metabolomics are LC-MS, LC-MS/MS, and NMR. NMR is a non-destructive, highly reproducible method. All metabolites can be detected at once without the requirement of pre-separation and chemical modifications; it is relatively fast, and the concentration of metabolites can be determined [54, 61]. The high sensitivity, versatility, reproducibility, and more cost-effective instrumentation of MS analysis give it an edge over NMR and other methods. In particular, its ability to detect metabolites at very low concentrations makes it a method of choice for biofluids, which often contain metabolites at picomolar to micromolar concentrations. In addition, MS has the capacity to differentiate metabolite isomers [62,63,64,65,66].

Fig. 2

Cytochrome P450 (CYP) metabolomics flowchart and its applications. LC-MS liquid chromatography-mass spectrometry, NMR nuclear magnetic resonance, FTIR Fourier transform infrared, UV ultraviolet, GC-MS gas chromatography-mass spectrometry

Pharmacometabolomics or drug metabolomics allows elucidation of drug effects, enhances the prediction of variation in drug response among phenotypes, and provides useful information about treatment outcomes [19, 67]. Hence, drug metabolomics gives information about pre- and after-dose metabolite profiles in a biofluid. It identifies possible metabolites associated with detected adverse/side effects and gives room for modification of therapy based on a patient’s characteristics, such as metabolism, genetic variation, and non-drug/environmental intervention [68,69,70]. The conventional strategy to identify the best chemotherapy for cancer treatment relies on the pathological characterization of the disease without considering the underlying individual biochemical variations and the potential of suboptimal clinical outcomes. Metabolomics shows promise in addressing these limitations by identifying patterns in metabolite expression, alterations in biochemical pathways, and novel biomarkers. This approach can characterize metabolic subtypes in cancers more effectively than traditional proteomic methods [71]. Pharmacometabolomics finds its roots in early animal studies, where concerns about result reproducibility surfaced because of notable differences in the metabolic profiles of biofluids among different animal groups. Initially, the proficiency of drug dosing technicians was brought into question because of the observed variations in metabolic profiles among animal groups. However, subsequent investigations revealed that these differences in drug metabolites between animals and humans were genuine, highlighting the potential applications of pharmacometabolomics in personalized medicine [59, 60, 67]. While pharmacogenomics has been pivotal in advancing personalized treatment and served as a cornerstone of precision medicine, its limitations are becoming apparent [72]. 
One significant drawback is its tendency to overlook the impact of environmental factors on the variability of drug disposition, encompassing liberation, absorption, distribution, metabolism, and excretion (LADME), across diverse patient populations [72]. Reports have shown an interplay among genes, the environment (xenobiotics, gut microbiota, polypharmacy), and variability in drug responses among individuals. Most recently, pharmacometabolomics has been considered a complementary approach to pharmacogenomics for predicting and evaluating drug metabolism based on the different metabolic phenotypes (i.e., metabotypes) of individuals. A metabotype reflects the combined genetic, physiological, chemical, and environmental influences on an individual's response to drug treatment [72].

Studies indicate that approximately half of phase I drug metabolism in clinical settings can be attributed to the activity of cytochrome P450 3A (CYP3A) enzymes. Polymorphisms affecting CYP3A activity have been directly associated with drug-drug interactions and adverse reactions to medications. Thus, the incidence of unwanted drug effects depends significantly on an individual's CYP3A activity [73, 74]. Based on this background, Lee et al. (2019) utilized a sensitive, accurate, fast, and efficient gas chromatography-triple-quadrupole mass spectrometry (GC-MS/MS) technique to simultaneously determine the concentrations of five endogenous steroids (i.e., cortisol, 6β-hydroxycortisol, cortisone, and 6β-hydroxycortisone in urine, and 4β-hydroxycholesterol in plasma) in healthy Korean subjects. This validated method is suitable for quantifying the levels of endogenous metabolic markers that can be used to predict CYP3A activity and associated drug-drug interactions [9]. In another study, Shin and colleagues (2013) identified some metabolic markers that can be used to determine the activity of CYP3A enzymes. The researchers intravenously administered midazolam to 24 subjects at varying doses, following pretreatment with once-daily doses of different drugs [75]. The regimens were 1 mg midazolam alone (as control), 1 mg midazolam after 4 days of 400 mg ketoconazole (CYP3A inhibition), and 2.5 mg midazolam after 10 days of 600 mg rifampicin (CYP3A induction). They observed that production of 11-deoxycortisol, pregnenolone, 7β-hydroxycholesterol, 24-hydroxycholesterol, and several other endogenous steroids was inhibited by ketoconazole, while dehydroepiandrosterone, testosterone, and 6β-hydroxycortisol, among others, were induced by rifampicin. 
These findings suggest that endogenous metabolites can be used to reliably predict the activity of CYP3A5, particularly the CYP3A5*3 genotype/variant, on the metabolism and drug-drug interaction of midazolam [75].
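One way such endogenous markers are commonly summarized is as a metabolite-to-parent ratio, for example 6β-hydroxycortisol to cortisol, compared between a control phase and an inhibited or induced phase. The sketch below assumes that ratio as the index; the cited studies may have used different or additional calculations, and all concentrations and function names here are hypothetical.

```python
def metabolic_ratio(metabolite_conc: float, parent_conc: float) -> float:
    """Metabolite-to-parent concentration ratio (both in the same units)."""
    if parent_conc <= 0:
        raise ValueError("parent concentration must be positive")
    return metabolite_conc / parent_conc


def ratio_fold_change(control_ratio: float, treated_ratio: float) -> float:
    """Fold change of the ratio after treatment relative to control."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return treated_ratio / control_ratio


if __name__ == "__main__":
    # Hypothetical urinary concentrations (arbitrary units), illustration only.
    control = metabolic_ratio(metabolite_conc=200.0, parent_conc=50.0)
    inhibited = metabolic_ratio(metabolite_conc=50.0, parent_conc=50.0)
    induced = metabolic_ratio(metabolite_conc=800.0, parent_conc=50.0)
    print("inhibition fold change:", ratio_fold_change(control, inhibited))
    print("induction fold change:", ratio_fold_change(control, induced))
```

A fold change below 1 would be consistent with reduced CYP3A activity and a fold change above 1 with increased activity, under the assumption that the ratio tracks enzyme activity.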

Likewise, Chen et al. (2021) reported a targeted study on CYP-mediated metabolism of arachidonic acid (AA), which was different from the two known pathways mediated by cyclooxygenase and lipoxygenase. This discovery was made through LC-MS/MS analysis of a mouse mammary tumor virus-polyomavirus middle T antigen (MMTV-PyMT) model that can spontaneously develop breast cancer. By comparing the normal mice with MMTV-PyMT mice, they observed a significant difference in the concentrations of AA and the two classes of major metabolites, the hydroxyeicosatetraenoic acids (HETEs) and epoxyeicosatrienoic acids (EETs), particularly 12-HETE, 19-HETE, and 8,9-EET in plasma and tumor tissues. Arachidonic acid and the metabolites were noted as potential endogenous markers for the diagnosis and treatment of breast cancer because they can both kill cancer cells and inhibit the interstitial milieu of the tumor [76]. Miller and colleagues reported targeted metabolomics of warfarin in a personalized therapy using LC-MS/MS. This approach was used to quantify the level of the enantiomers R and S, corresponding hydroxywarfarins, and the glucuronides mediated by some CYPs. In this study, 24 unique metabolites were identified in the urine of the subjects, and warfarin toxicity was reported. The report showed that (S)-7-OH-warfarin is the major factor. Conversely, hydroxylation of warfarin to (R)-warfarin was spread with varying levels of participation of CYP1A2 and multiple CYP2C enzymes. Direct glucuronidation of warfarin and hydroxywarfarin occurred in varied ratios and was primarily mediated by CYP enzymes. The main (>70%) excreted metabolites are the 6- and 7-OH-warfarin glucuronides unlike 4′-OH- and 8-OH-warfarin glucuronides, which were small amounts. Other minor metabolites reported include warfarin, 4′-OH-warfarin, 8-OH-warfarin, and their corresponding glucuronides [77, 78].

Fusobacterium nucleatum, a gram-negative obligate anaerobe bacterium, plays a significant role in the development of colorectal cancer. Reports showed that F. nucleatum regulates epithelial-mesenchymal transition (EMT) and metastasis, which activate TLR4/Keap1/NRF2 signaling pathways to increase the levels of CYP2J2 and 12(13)epoxy-9Z-octadecenoic-9,10,12,13-d4 acid (12,13-EpOME) metabolite that can serve as a potential biomarker and therapeutic agent, respectively, for colorectal cancer. An upregulation of CYP monooxygenases was observed especially for CYP2J2 and the metabolite 12,13-EpOME in colorectal cancer patients and mouse models with an increased spread and migration of colorectal cancer cells. Furthermore, metagenomic sequencing revealed increased levels of F. nucleatum and 12,13- EpOME in the feces and serum of colorectal cancer patients, respectively. In addition, high levels of F. nucleatum correlate with high levels of CYP2J2 in tumor tissues, resulting in a decrease in the survival rate of patients with stage III/IV colorectal cancer [79]. Murakami et al. (2019) reported an inverse relationship in the levels of 18-oxocortisol and 18-hydroxycortisol between aldosterone-producing adenomas (APAs) and CYP11B1 using high-resolution mass spectrometry imaging (MSI) such as matrix-assisted laser desorption/ionization (MALDI)-Fourier transform-ion cyclotron resonance MSI [80]. The metabolic profiles were analyzed using genotype/phenotype data, and an inverse relationship was observed between CYP11B1/cortisol derivatives and the effects on clinical outcome, providing useful information about the effects, clinical features, and conversion of cholesterol to steroids of APAs [80]. In another investigation, Murakami and colleagues reported a significant correlation in the levels of 29 metabolites identified with the severity of cortisol hypersecretion in patients with cortisol-producing adrenal adenoma (CPA) using mass spectrometry imaging (MSI) [81]. 
Also, excess levels of cortisol and elevated expression of CYP11B1 in the tumor correlate with serotonin, while a high level of 13 fatty acids is directly proportional to the size of the tumor and inversely to the nine polyunsaturated fatty acid levels, such as phosphatidic acid [81].

6 Integration of Proteomics and Metabolomics in CYP Research

Proteomics and metabolomics are powerful tools used independently in different areas of CYP-related research. However, integrating multiple omics, such as proteomics and metabolomics, offers a promising avenue for large-scale biomarker detection. This integrated approach is not only cost-effective but also provides more comprehensive data compared to conducting separate investigations across several individuals [82]. Moreover, this integration facilitates the diagnosis, monitoring, and treatment of complex diseases such as type 2 diabetes, Alzheimer's disease, obesity, schizophrenia, and autism, which typically involve multiple genomic loci [83, 84]. This approach gives room for exploring the advantages of integration of omics technology to overcome the limitations of individual omics to provide broader perspectives and deeper insights into disease mechanisms [84].

There are vast opportunities to combine these two approaches to address several challenges in data interpretation, sample requirements, prohibitive cost, and time requirements. In related biomedical research areas, proteomics and metabolomics have been integrated at different levels of the workflow [11, 85,86,87]. In most instances, proteomics and metabolomics are combined during the data analysis phase, i.e., after the data for differentially expressed proteins (DEPs) and differentially expressed metabolites (DEMs) have been obtained independently [11]. Although the examples are fewer in number, proteomics and metabolomics have also been integrated right from the sample preparation steps. Gioria et al. (2016) demonstrated combined cell extraction steps followed by 2D gel-based proteomics and MS-based metabolomics analysis, and finally combined bioinformatics analyses of the peptide and metabolite data [11]. In the integrated sample preparation process, extracting peptides and metabolites through the same solvent system is the key. Because the availability of biological samples is often a limiting step in achieving the desired goals of a study, using the same protein aliquot for both analyses is an excellent advancement in this area. The use of multiple omics technologies on hepatocyte-like cells and microscale biochip platforms facilitated the evaluation of CYP-based drug metabolism and hepatic cellular maturation [88]. Despite these advances, there are ample future opportunities to integrate proteomics and metabolomics, especially in instrumentation where the same analytical platform can sequentially measure proteomes and metabolomes as part of the same run.
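Integration at the data analysis phase can be sketched as follows: DEPs and DEMs are first derived independently and then paired through an enzyme-to-metabolite map. This is a deliberately minimal illustration. The log2 fold-change cutoff, the intensities, and the mapping are hypothetical, and a real analysis would add statistical testing and multiple-testing correction.

```python
import math
from statistics import mean

LOG2FC_CUTOFF = 1.0  # hypothetical cutoff: at least a 2-fold change


def log2_fold_change(control: list[float], treated: list[float]) -> float:
    """log2 of the ratio of mean treated to mean control intensity."""
    return math.log2(mean(treated) / mean(control))


def differential(features: dict[str, tuple[list[float], list[float]]],
                 cutoff: float = LOG2FC_CUTOFF) -> dict[str, float]:
    """Keep features whose absolute log2 fold change meets the cutoff."""
    selected = {}
    for name, (control, treated) in features.items():
        lfc = log2_fold_change(control, treated)
        if abs(lfc) >= cutoff:
            selected[name] = lfc
    return selected


def integrate(deps: dict[str, float], dems: dict[str, float],
              enzyme_to_metabolite: dict[str, str]) -> list[tuple[str, str, bool]]:
    """Pair each DEP with its mapped DEM and flag same-direction changes."""
    pairs = []
    for enzyme, metabolite in enzyme_to_metabolite.items():
        if enzyme in deps and metabolite in dems:
            concordant = (deps[enzyme] > 0) == (dems[metabolite] > 0)
            pairs.append((enzyme, metabolite, concordant))
    return pairs


if __name__ == "__main__":
    # Hypothetical intensities (control replicates, treated replicates).
    proteins = {"CYP3A4": ([10, 12, 11], [40, 44, 48]),
                "CYP2J2": ([20, 21, 19], [22, 20, 21])}
    metabolites = {"6beta-hydroxycortisol": ([5, 5, 5], [20, 20, 20]),
                   "8,9-EET": ([8, 8, 8], [8, 9, 7])}
    mapping = {"CYP3A4": "6beta-hydroxycortisol", "CYP2J2": "8,9-EET"}
    print(integrate(differential(proteins), differential(metabolites), mapping))
```

Only enzyme-metabolite pairs in which both members pass the cutoff are reported, which mirrors the idea of combining two independently derived result lists.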

Though several areas of research and therapy can benefit from the integration of proteomics and metabolomics, we will highlight some of the interesting aspects that can be at the forefront of CYP-related applications. Figure 3 depicts examples of areas that can benefit from the integration of CYP proteomics and metabolomics. For example, analyzing hepatic or intestinal biopsy tissues will potentially provide information on CYP enzyme interaction with different physiological substrates or cocktails of drugs that a patient is taking. It can provide information on CYP enzyme levels in a particular organ along with substrate-specificity towards certain endogenous substances or medications. This approach can be highly useful for orphan CYP enzymes which need extensive efforts in searching substrates and functionality of proteins [26, 56]. Similarly, protein profiling and metabolic function quantitation can be achieved in patients with polypharmacy. In certain patient populations, the use of multiple medications and health supplements leads to altered CYP protein expression and function [89, 90]. Combined data output from proteomics and metabolomics can provide information about the induction or inhibition of CYP enzymes as an outcome of single drug action or as a net result from crosstalk of multiple drugs and supplements. In traditional metabolism studies, the effects of multiple drugs on the metabolism of a single substrate remain uncaptured or tedious to determine. Thus, drug-drug interaction outcomes will be more physiologically relevant through profiling protein levels and functions in the combined platform.
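The reasoning above, in which protein levels and metabolic function are read together to infer induction or inhibition, can be sketched as a simple rule. The tolerance, the category labels, and the function name are hypothetical; this is an illustration of the logic, not a validated classifier.

```python
def classify_cyp_effect(protein_fc: float, activity_fc: float,
                        tolerance: float = 0.25) -> str:
    """Interpret paired fold changes in CYP protein level and metabolic activity.

    protein_fc and activity_fc are treated/control ratios; tolerance is the
    fractional change regarded as "unchanged" (a hypothetical choice).
    """
    def direction(fold_change: float) -> str:
        if fold_change > 1 + tolerance:
            return "up"
        if fold_change < 1 - tolerance:
            return "down"
        return "unchanged"

    protein, activity = direction(protein_fc), direction(activity_fc)
    if protein == "up" and activity == "up":
        return "consistent with induction"
    if protein == "unchanged" and activity == "down":
        return "consistent with inhibition"
    if protein == "unchanged" and activity == "unchanged":
        return "no net change"
    return "mixed or other mechanism"


if __name__ == "__main__":
    print(classify_cyp_effect(protein_fc=3.0, activity_fc=2.5))
    print(classify_cyp_effect(protein_fc=1.0, activity_fc=0.3))
    print(classify_cyp_effect(protein_fc=2.0, activity_fc=0.5))
```

The last case, with protein increased but activity decreased, is the kind of net result from multiple drugs that neither dataset could explain alone.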

Fig. 3 Integration of cytochrome P450 (CYP) proteomics and metabolomics in drug response. ELISA enzyme-linked immunosorbent assay, 2D-PAGE two-dimensional polyacrylamide gel electrophoresis, LC-MS liquid chromatography mass spectrometry, GC-MS gas chromatography mass spectrometry, NMR nuclear magnetic resonance

Biomarker identification is another area that increasingly benefits from the integration of proteomics and metabolomics. Identifying biomarkers by monitoring the fate of physiological substances is crucial to selecting a drug therapy with an improved outcome. Metabolomics and proteomics together provide complementary data that can strengthen the accuracy of prescribed therapy [91]. Pathological pathways related to CYP epoxygenases (e.g., CYP2J2, CYP2C19) have been identified as potential targets in triple-negative breast cancer. The data from breast cancer tissue proteomics and oxylipin metabolomics demonstrate that arachidonic acid-derived products and epoxyeicosatrienoic acid metabolites can be biomarkers for triple-negative breast cancer [92]. Similarly, toxicity markers related to CYPs can be identified through an integrated platform. CYP2E1 protein levels and activity were determined through proteomic and metabolomic data integration in donor livers following normothermic machine perfusion [93].

Personalized medicine is an aspect of therapeutics that can benefit immensely from the combination of proteomics and metabolomics. Personalized medicine is a treatment strategy developed based on characteristics such as genetic makeup and environmental factors (diet, comedications, cigarette smoke, alcohol) to maximize beneficial drug outcomes and reduce adverse effects [25, 32]. Since CYP enzymes are heavily affected by these factors, characterizing their protein expression and function through combined proteomic and metabolomic approaches provides both the data and their validation. CYP polymorphisms leading to amino acid changes can be captured through proteomics and validated through metabolomic data. For example, CYP2J2 polymorphisms can alter the protein composition of the enzyme and, correspondingly, arachidonic acid metabolite levels [94]. Similarly, altered CYP1A2 levels following exposure to cigarette smoke [95] can be determined by LC-MS-based proteomic approaches and then correlated with xenobiotic or endobiotic metabolism measured by metabolomics.

Integrating proteomics and metabolomics in CYP research at the individual level enables the synthesis and evaluation of data to investigate disease mechanisms and identify patient-specific biomarkers. This approach helps address treatment challenges associated with individual physiological phenotypes [84]. On a general note, a proof-of-concept study from 2015 involved functional and structural measurement of brain connectivity in a multi-omics study generated data in a single human in 23 separate individuals over a period of 18 months using magnetic resonance imaging, physical health, psychological function, gene expression, and metabolomics. A link between inflammation and weight gain was observed, and some metabolic pathways did not return to baseline after weight loss [96]. In another study, Chen and colleagues employed an integrated omics approach using proteomics, metabolomics, and other measurements to carry out genomic analyses in a patient with an elevated risk of type 2 diabetes. The authors observed a link between genes that mediate insulin signaling and response and downregulation by RNA-seq in an individual with respiratory syncytial virus (RSV) infection using LC-MS/MS proteomics. There was also a correlation between the onset of RSV infection, an increased blood glucose level, and a decrease in the insulin response pathway [82]. However, such integration in a single individual requires multiple hypotheses and a large sample size before the data can be generalized and correlated to a larger population [84]. Overall, the integration of proteomics and metabolomics at the sample analysis and data analysis stages can advance biomedical research as well as optimize therapeutic outcomes.

The advent of high-throughput technologies has given rise to various omics disciplines such as proteomics, metabolomics, metagenomics, genomics, transcriptomics, and epigenomics, each with its own complexity in terms of spatial and/or temporal characteristics [83]. As a result, there are several challenges associated with the integration of two or more of these technologies, particularly in application to individualized medicine. Since multiple technological approaches are usually required in disease diagnosis, there are challenges with harmonizing, interpreting, and implementing data from multi-omics sources, which will require robust new analytical and statistical approaches to integrate these divergent datasets and to establish standard quality control metrics [84]. Challenges in adopting integrated omics in clinical practice include ensuring reproducibility in statistical methodologies, harmonizing disparate datasets, scalability in electronic health records, and the complexity of analyzing different data types [97]. Sophisticated multi-dimensional models such as neural networks [98], Bayesian models [99], and dimensionality reduction [100] are needed to analyze multiple datasets simultaneously. However, the clinical success of these methods remains to be seen, as they are not designed to characterize an individual, which limits their use for clinical purposes such as precision medicine. Another major challenge to the integration of omics is the difficulty in applying results obtained in one population to individuals in another population [101, 102].

Another challenge associated with the analysis of integrated omics such as proteomics and metabolomics is the amount of storage required for all data associated with the analyses. Data from a single individual may be manageable on a terabyte scale (10^12 bytes), whereas hundreds or thousands of terabytes are needed for many individuals. Cloud computing has reduced this challenge by making more elastic computation and storage available to hospitals and healthcare facilities, which also promotes data reproducibility. Even with strict adherence to standardized protocols for sample preparation and other analyses, poor data reproducibility may arise from variability in experimental conditions or technical issues during analysis, which can lead to inconsistent results among replicates. Proteomics and metabolomics are complex methods that generate large amounts of data, which often require sophisticated bioinformatics analysis and are difficult to interpret, making the identification of statistically significant biomarkers very difficult [103, 104].
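
The scale of the storage requirement can be made concrete with simple arithmetic. The sketch below assumes, purely for illustration, one terabyte (10^12 bytes) of raw and processed data per individual; the function names and the optional replication factor (number of stored copies) are hypothetical.

```python
TERABYTE = 10 ** 12  # decimal terabyte, in bytes


def cohort_storage_bytes(n_individuals, tb_per_individual=1.0, replication_factor=1):
    """Estimated storage for a cohort.

    tb_per_individual and replication_factor are illustrative assumptions.
    """
    return int(n_individuals * tb_per_individual * TERABYTE * replication_factor)


def human_readable(n_bytes):
    """Format a byte count using decimal (SI) units."""
    value = float(n_bytes)
    for unit in ("B", "kB", "MB", "GB", "TB", "PB"):
        if value < 1000 or unit == "PB":
            return f"{value:.1f} {unit}"
        value /= 1000


# One individual stays at the terabyte scale; a 1000-person cohort reaches a petabyte.
print(human_readable(cohort_storage_bytes(1)))
print(human_readable(cohort_storage_bytes(1000)))
```

Under these assumptions, storing more than one copy of each dataset multiplies the total accordingly, which is part of why elastic cloud storage is attractive for cohort-scale studies.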

Similarly, variability among individuals may increase the inherent error rate in data collection when integrated omics approaches are used, which ultimately affects the accuracy of the data and the validation process, particularly when rare structural variants that are difficult to detect are involved. Samples may vary because of differences in collection, handling procedures, and storage, and poor sample quality can result in data inconsistency during proteomics and metabolomics analyses, which can prevent the identification and discovery of biomarkers [105]. For instance, in addition to the rigorous and extensive data analysis required, differentiating protein biomarkers in Alzheimer's disease patients from those in healthy controls is very difficult. The resulting lack of specificity and sensitivity makes it difficult to develop diagnostic tools for early disease detection and monitoring [105,106,107].
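
One common way to screen for the replicate inconsistency described above is to compute a coefficient of variation (CV) for each measured protein or metabolite across technical replicates and to flag features that exceed a cutoff. The following is a minimal sketch; the function names, feature labels, and the 20% cutoff are illustrative assumptions rather than a standard prescribed by the cited work.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m * 100


def flag_unstable_features(replicates, max_cv_percent=20.0):
    """Return {feature: CV%} for features whose replicate CV exceeds the cutoff.

    replicates: {feature_name: [replicate measurements]} (hypothetical layout).
    max_cv_percent is an illustrative threshold, not a recommended value.
    """
    flagged = {}
    for feature, values in replicates.items():
        if len(values) < 2:
            continue  # a single measurement has no spread to assess
        cv = coefficient_of_variation(values)
        if cv > max_cv_percent:
            flagged[feature] = round(cv, 1)
    return flagged
```

Features flagged in this way would typically be re-measured or excluded before any biomarker comparison between patient groups.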

7 Use of CYP Proteomics and Metabolomics in Disease Conditions

Proteomics and metabolomics are essential tools for investigating disease conditions. Proteomics examines the abundance, interactions, and modifications of proteins, informing the study of disease pathways, biomarker identification, and therapeutic targets. Conversely, metabolomics evaluates small-molecule metabolites, providing a dynamic snapshot of biochemical processes and aiding in disease diagnosis, progression tracking, and personalized treatment strategies [7, 108,109,110,111]. Table 1 summarizes examples of the application of proteomics and metabolomics in different disease conditions.

Table 1 Cytochrome P450 (CYP) proteomics and metabolomics in various pathological conditions

CYP2D6 gene polymorphisms have demonstrated a correlation with heightened susceptibility to hepatocellular carcinoma (HCC). To elucidate the underlying mechanisms through which the CYP2D6*10 (100C > T) polymorphism confers this susceptibility, Hu and colleagues employed a label-free global proteome profiling approach [12]. A notable reduction in the expression of the CYP2D6*10 TT genotype was observed, leading to an enhanced predisposition to HCC. Conversely, individuals possessing the TT genotype demonstrated a significant 69.2% decrease in their susceptibility to HCC, thereby providing compelling evidence that the TT genotype exerts a protective influence against the initiation of HCC. To elucidate the protective mechanisms conferred by CYP1A enzymes against hyperoxic lung injury (HLI), omics platforms, namely microarray and reverse phase proteomic array, were employed to characterize genotype-specific disparities in the transcriptome and proteome during acute HLI [13]. Wild-type (WT), Cyp1a1−/−, and Cyp1a2−/− C57BL/6J mice were exposed to hyperoxia (FiO2 > 0.95) for 48 h. Differentially expressed genes (DEGs) were found to be implicated in apoptosis, DNA repair, and estrogen response pathways, contributing to differences in HLI susceptibility between Cyp1a1−/− and Cyp1a2−/− mice. Elsewhere, CYP expression patterns were examined in the colons of individuals with Crohn's disease and ulcerative colitis. Notably, colon samples from Crohn's disease patients exhibited an approximately two-fold upregulation in expression for all studied enzymes, while samples from the ulcerative colitis group displayed reduced expression of most examined CYP enzymes [112]. In a study aimed at identifying CYP enzyme biomarkers to predict the prognosis of anal squamous cell carcinoma (ASCC), researchers focused on detecting copy number variants (CNVs) and examining their correlation with disease-free survival (DFS).
The study results revealed the identification of 29 genes, the duplication of which was linked to DFS. Importantly, one of the most significant findings was the duplications within the CYP2D locus, encompassing genes such as CYP2D6, CYP2D7P, and CYP2D8P [113]. The proteomics and metabolomics of CYP enzymes, especially CYP2D6, have a significant impact on the personalized treatment outcomes of schizophrenia [114].

An LC-MS/MS-based targeted metabolomics approach systematically profiled eicosanoids in a colon cancer model induced by azoxymethane (AOM) and dextran sulfate sodium (DSS) in C57BL/6 mice [115]. The study revealed increased levels of epoxygenated fatty acids (EpFA), produced by CYP monooxygenases, in both plasma and colon of AOM/DSS-induced colon cancer mice. Moreover, overexpression of CYP monooxygenases was observed in colon tumor tissues and cells, and pharmacological inhibition or genetic ablation of CYP monooxygenases suppressed colon tumorigenesis in the AOM/DSS model. A combined approach involving quantitative metabolomics and DNA methylation analysis of primary motor cortex tissue from Parkinson's disease (PD) patients was utilized to uncover the biochemical changes associated with the onset of the disease. The study revealed that bile acid metabolism was the principal biochemical pathway disrupted in the post-mortem brains of PD patients compared to the control group [14]. To understand the development and progression of bladder cancer, a comprehensive and unbiased metabolomic profiling of tissue specimens from both bladder cancer patients and control individuals was carried out using mass spectrometry-based techniques. The results of this analysis unveiled a significant alteration in phase I/II metabolism, suggesting a potential impact of DNA methylation on disrupting xenobiotic metabolism in bladder cancer. Additionally, there was a notable decrease in the expression levels of CYP1A1 and CYP1B1 in a specific group of bladder cancer specimens compared to adjacent benign tissues [116].

To explore the involvement of neuroinflammatory mechanisms in autism spectrum disorder (ASD) development, a study conducted comparative proteomic and metabolic profiling of urine samples from children with ASD and healthy controls [15]. An integrated proteome and metabolome analysis revealed significant enrichment of six signaling pathways in ASD patients. These include three pathways associated with compromised neuroinflammation, namely, glutathione metabolism, metabolism of xenobiotics by CYP enzymes, and transendothelial migration of leukocytes. Interindividual variation in the effects of anesthesia is a major challenge in the medical sciences. The proteomic and metabolic knowledge of CYPs related to general anesthetics is assisting the advancement of pharmacogenomics applications and personalization [117]. Recently, an integrated omics investigation explored metabolic alterations in nonalcoholic steatohepatitis (NASH) [118]. Metabolomics and lipidomics analyses were performed on plasma, complemented by liver proteomics, to assess the metabolic profiles of NASH patients. From the results, elevated liver expression of pivotal proteins involved in fatty acid transport and lipid droplets in NASH patients was observed, along with distinctive lipidomic remodeling. Notably, increased expression of key glycolysis-associated proteins and higher levels of glycolytic output were observed, alongside the accumulation of branched-chain amino acids, aromatic amino acids, purines, and bile acids in NASH patients. Collectively, these omics strategies empower a deeper understanding of disease mechanisms and guide innovative approaches for improved patient care.

The integration of CYP proteomics and metabolomics in disease conditions represents a significant advancement in understanding the molecular mechanisms underlying various disorders. By examining protein profiles, interactions, modifications, and small-molecule metabolites, these approaches offer valuable insights into disease diagnosis, progression, and treatment strategies. However, while the studies cited demonstrate the potential of proteomics and metabolomics in elucidating the role of CYP enzymes in various diseases, there are some limitations to consider. For instance, the complexity of analyzing multiple omics datasets simultaneously poses analytical challenges, including the reproducibility of statistical methodologies and scalability for integration into clinical practice. Additionally, the variability in CYP expression patterns across different diseases underscores the need for further validation studies and standardization of analytical techniques to ensure robust and reliable results. Despite these challenges, the integration of CYP proteomics and metabolomics holds promise for advancing personalized medicine and improving patient outcomes in the future.

8 Conclusions

In recent years, combining proteomics and metabolomics has emerged as a powerful approach to uncover the complexities surrounding CYP enzymes and their multifaceted roles in diverse physiological and disease states. This has advanced our knowledge of the metabolic characterization of various pathological conditions, ranging from inflammatory diseases to carcinomas and neurodegenerative disorders [5, 119, 120]. Among the key findings, the complex connection between CYP enzymes and distinct metabolic markers associated with specific diseases was particularly notable. For instance, the association of specific CYPs with colorectal cancer, renal impairment, and tuberculosis offers promising avenues for diagnostics and therapeutic interventions [74, 79, 121]. Moreover, the diverse effects of CYPs, as shown by their upregulation or downregulation in various diseases, highlight their potential role as biomarkers for an array of diseases. This also reveals potential paths for drug development, optimization of therapy, and personalized medicine approaches, underscoring the diverse roles CYPs play in drug metabolism and pharmacogenomics.

While significant strides have been made in the field of CYP proteomics and metabolomics, it is critical to acknowledge that significant challenges persist. For instance, many CYPs undergo post-translational modifications, which makes their identification challenging. Furthermore, some CYPs are present in such minute quantities that standard analytical tools are incapable of detecting these enzymes. The vast array of reactions driven by CYP enzymes leads to a wide variety of metabolites. These metabolites can differ greatly in their concentrations, complicating their simultaneous detection. When integrating data from proteomics and metabolomics, handling such vast datasets can be computationally demanding [122,123,124]. To address these challenges, advanced mass spectrometry, targeted proteomics, and sophisticated bioinformatics tools that harness the power of machine learning (ML) and artificial intelligence (AI) are essential. Synchronizing experimental designs and employing integrative platforms can streamline data analysis.

Future directions will likely focus on integrating proteomics and metabolomics to elucidate the mechanistic pathways through which CYPs modulate disease states, with a view to harnessing their therapeutic potential. Additionally, prioritizing translational research to bridge the gap between preclinical studies and patient care will be crucial. As we stand at the threshold of breakthroughs that could reshape our knowledge of these enzymes, it is anticipated that the integration of proteomics and metabolomics will continue to uncover avenues for innovative therapeutic strategies and novel biomarker discoveries. This could set the stage for a new era in personalized medicine where individual protein and metabolic status will determine treatment strategies. In summary, proteomics and metabolomics technologies, either individually or through integrated platforms, can help decipher the roles of CYP enzymes in health and diseases and can guide therapeutic decisions through a comprehensive understanding of proteome and metabolome profiles. The significant challenges in data acquisition, data management, and integration of technologies need to be overcome before proteomics and metabolomics of CYP enzymes can be applied clinically in personalized medicine.