1 Introduction

The innate immune system serves as the first line of defense against invading pathogens and restricts infections by neutralizing potential immunological threats [1,2,3]. This defense system consists of various specialized components that perform specific tasks across the human body. Innate immune components tasked with early detection and clearance of an invading pathogen are triggered within hours of detection of an immunogen in the body [4]. Dual oxidase 1 (DUOX1) is a prominent membrane protein expressed in airway epithelial cells that functions as an NADPH oxidase enzyme [5, 6]. DUOX1 is an important innate immune component of the respiratory tract. Through generation of hydrogen peroxide (H2O2), it is involved in lactoperoxidase-mediated innate immunity at mucosal surfaces. The generated H2O2 is necessary for thyroid peroxidase and lactoperoxidase (LPO) activities [7, 8]. DUOX1 has a peroxidase-like domain at its N terminus, which is potentially responsible for its intrinsic peroxidase activity [6, 9]. The NADPH oxidase activity is calcium dependent, with two potential Ca++ binding domains at positions 828–839 and 864–875 [10].

DUOX1 plays a role in the generation of hypothiocyanite, an antimicrobial oxidant in respiratory surface liquid, by supplying H2O2 at the apical surface of the epithelial lining, where LPO uses it to oxidize thiocyanate [11]. Being a membrane protein, DUOX1 potentially plays its part in tracheal epithelial wound repair when activated by ATP released from injured cells [11]. DUOX1 has been reported to be induced by inflammatory cytokines [12, 13] and bacterial stimuli [14]. DUOX1 activation occurs in bronchial epithelial cells when challenged by Pseudomonas aeruginosa infection. Bacterial flagellum and LPS, along with Type Three Secretion System components, indirectly elicit DUOX1-mediated H2O2 production via TLR-dependent signaling [7, 15].

During respiratory infection, the P. aeruginosa virulence factor pyocyanin (PCN) inhibits H2O2 generation by DUOX1 by actively reducing the availability of its substrate NADPH in the cytoplasm, thus improving the pathogen’s survival in the respiratory cavity [12]. PCN is a redox-active, cytotoxic tricyclic metabolite considered responsible for immune-modulatory, pro-inflammatory, enzyme-inactivating, and pro-apoptotic activities during P. aeruginosa infection [12, 16,17,18,19,20]. PCN is reported to induce apoptosis in neutrophils by interfering with the mitochondrial respiratory chain, releasing reactive oxygen species and cytochrome c into the cytoplasm, and activating mitochondrial acid sphingomyelinase [19, 21]. Peroxidases (microperoxidase-11, LPO, and myeloperoxidase) in the presence of adequate quantities of H2O2 are capable of permanently oxidizing PCN, thus neutralizing its cytotoxicity [12, 22]. DUOX1 serves as an important source of H2O2 in the extra-cellular environment [9]. The synergistic action of DUOX1 and peroxidase enzymes is therefore a potential protective shield against PCN-induced P. aeruginosa virulence, which otherwise usually paves the way for the establishment of persistent chronic infection. Thus, the outcome of this redox warfare between the host and the pathogen determines the course of infection (Fig. 1).

Fig. 1

Host-PA redox duel. (1) Activation of TLR by P. aeruginosa (PA) LPS, flagella, and Type 3 Secretion System components. (2) Expression of inflammatory cytokines and activation of DUOX1. (3) Production of H2O2 in the extra-cellular environment by DUOX1 using NADPH. (4) Generation of hypothiocyanite for killing the infecting pathogen. (5) Secretion of pyocyanin by P. aeruginosa. (6) Peroxidase enzymes (LPO, MPO) oxidize PCN in the presence of H2O2 to detoxify it. (7) Diffusion of PCN across the plasma membrane into the cell. (8) Generation of ROS by PCN using NADPH, thus competing with DUOX1 and consequently inactivating it. PCN also increases the cellular levels of NF-κB, thus affecting gene expression for immune-modulatory action. (9) Cell damage and potential induction of cell death

With the increase in antibiotic resistance and the emergence of pan-drug-resistant P. aeruginosa strains, there is a pressing need to explore unconventional antibacterial approaches [23]. Being a frontline immune component, DUOX1 holds a vital position in protecting the pulmonary cavity from invading pathogens and is thus subjected to immune-evasive and modulatory assaults from them. Comprehensive knowledge of pulmonary infection dynamics, primarily host–pathogen interaction (the way a pathogen manipulates host responses and escapes clearance while simultaneously inflicting injuries on the host), is vital for designing therapeutic and/or preventive interventions. This requires a better understanding of DUOX1 [5]. Without proper annotation, DUOX1 cannot be used as a therapeutic option. In this study, we endeavored to structurally and functionally annotate the DUOX1 gene and protein using servers and tools available in the public domain. The contributions of this study are the prediction of the protein’s three-dimensional structure, physico-chemical properties, and post-translational modifications, and a genetic polymorphism analysis covering disease-associated single-nucleotide variations and their impact on DUOX1 functionality, all by employing in silico approaches. Human DUOX1 was more homologous with the gorilla and chimpanzee orthologs than with those of other mammalian species. The protein harbors a 21 aa long localization signal peptide at its N terminus. Three distinct functional domains were observed based on homology: An_peroxidase, FRQ1, and oxido-reductase domains. Polymorphism analysis revealed > 60 SNPs associated with different cancers with probable damaging effects. No cancer-associated methylated island was observed for DUOX1. The three-dimensional structure was developed via a homology modeling strategy. Proper annotation will help in the characterization of DUOX1 and enhance our understanding of its functionality and biological roles, ultimately leading to its potential therapeutic applications.

This paper is organized as follows. Section 2 describes the methodology adopted for annotation and lists all the tools and servers employed. Section 3 presents the results of the DUOX1 annotation, Sect. 4 discusses them, and Sect. 5 concludes the findings.

2 Methodology

2.1 Sequence Retrieval, Homology, and Phylogenetic Analysis

DUOX1 localization on the genome was determined by searching within the human genome draft sequence on NCBI (http://www.ncbi.nlm.nih.gov/). Phylogenetic analysis provides vital information regarding gene variation among the included species. Peptide sequences of human DUOX1 were obtained from Uniprot (http://www.uniprot.org/) along with those of other available species, i.e., Caenorhabditis elegans, Canis lupus familiaris, Sus scrofa, Rattus norvegicus, Danio rerio, Mus musculus, Poecilia formosa, Oreochromis niloticus, Lepisosteus oculatus, Xiphophorus maculatus, Pan troglodytes, Bos taurus, Callithrix jacchus, Ictidomys tridecemlineatus, Sarcophilus harrisii, Latimeria chalumnae, Oryzias latipes, Cavia porcellus, Pongo abelii, Nomascus leucogenys, Oryctolagus cuniculus, Otolemur garnettii, Papio anubis, Felis catus, Ovis aries, Equus caballus, Loxodonta africana, Takifugu rubripes, Gorilla gorilla gorilla, Myotis lucifugus, Monodelphis domestica, Alligator mississippiensis, Macaca mulatta, Xenopus tropicalis, Bactrocera cucurbitae, and Lygus hesperus. The online multiple sequence alignment tool Clustal Omega was used with default parameter settings to construct a homology (neighbour-joining) tree for determining the evolutionary trend of DUOX1 [24].

2.2 Sequence-Based Physico-chemical Characterization

The ProtParam server (http://web.expasy.org/protparam/) was employed to calculate overall physical and chemical parameters and structural/functional motifs for the human DUOX1 peptide sequence [25]. Signal peptide prediction for DUOX1 was done using the online tool SignalP v4.1 (http://www.cbs.dtu.dk/services/SignalP/) [26]. The ProtScale server (http://web.expasy.org/protscale/) was used to calculate physico-chemical properties, i.e., polarity, % buried residues, average flexibility, bulkiness, hydrophilicity, and relative mutability, for human DUOX1 [27].

2.3 Functional Domain Assessment and Polymorphism Analysis

Functional domains of the DUOX1 protein were studied using the Domain Mapping of Disease Mutations (DMDM) database (http://www.bioinf.umbc.edu/dmdm/) [28]. DMDM employs a Hidden Markov Model-based sequence alignment tool (HMMer) to map genetic polymorphisms (mutations/variations) onto protein domains and visualize them. Default parameter settings were used. Disease association with genetic polymorphism was also checked with the BioMuta v3.0 database (https://hive.biochemistry.gwu.edu/biomuta). BioMuta is a curated database of single-nucleotide variations (SNVs) and their disease associations, with special reference to oncogenesis [29, 30]. The PROVEAN/SIFT and PolyPhen2 prediction tools were used for studying the impact of variations on the protein’s functionality. The SIFT tool predicts the impact of an amino acid substitution on the protein’s function. The prediction is based on the conservation of amino acids in sequence alignments of closely related sequences collected via PSI-BLAST [31]. The PROVEAN tool provides a swift computational approach to predict the effects of an amino acid substitution or indel on the biological function of the protein. It identifies nonsynonymous and/or indel variants predicted to be important for the protein’s functionality using pairwise sequence alignments, and generates precomputed predictions for all single amino acid substitutions and deletions at every position in the query protein [32]. The PolyPhen tool was used to annotate coding nonsynonymous SNPs. It combines a high-quality multiple sequence alignment pipeline with a machine-learning method and is optimized for high-throughput analysis of next-generation sequencing data [33].
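The combined PROVEAN/SIFT labels reported later in the results (e.g., "deleterious/damaging") can be illustrated with a minimal sketch. The cutoffs used here (PROVEAN score at or below −2.5, SIFT score at or below 0.05) are the commonly cited defaults of these tools and are an assumption, since this study does not state them; the function name and the example scores are hypothetical.

```python
# Assumed default cutoffs; they are not stated in this study.
PROVEAN_CUTOFF = -2.5  # score <= cutoff -> "deleterious", otherwise "neutral"
SIFT_CUTOFF = 0.05     # score <= cutoff -> "damaging", otherwise "tolerated"


def classify_variant(provean_score, sift_score):
    """Return a combined PROVEAN/SIFT label for one amino acid substitution."""
    provean_call = "deleterious" if provean_score <= PROVEAN_CUTOFF else "neutral"
    sift_call = "damaging" if sift_score <= SIFT_CUTOFF else "tolerated"
    return f"{provean_call}/{sift_call}"


# Hypothetical scores, for illustration only (not DUOX1 data).
example_scores = {
    "variant_a": (-4.10, 0.01),
    "variant_b": (-3.00, 0.40),
    "variant_c": (-1.20, 0.02),
    "variant_d": (0.50, 0.80),
}

example_labels = {
    name: classify_variant(provean, sift)
    for name, (provean, sift) in example_scores.items()
}
```

The four possible labels correspond to the four PROVEAN/SIFT classes used for the somatic cancer SNVs in Sect. 3.4.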

2.4 Sub-cellular Localization

Sub-cellular localization is important for a better understanding of the functionality of any biomolecule in a biological system. Sub-cellular localization of DUOX1 was evaluated via the CELLO server [34], while topological information from the TMHMM and HMMTOP servers [35] and UniProt [36] was used to determine the transmembrane structural domains within the protein, as inferred from its primary sequence.

2.5 Assessment of Methylation Sites

DNA methylation plays a vital role in the epigenetic regulation of genetic information. Methylation sites in DUOX1 were analyzed using MethyCancer (http://methycancer.psych.ac.cn/MethyCancer.do), provided by the MethyCancer Database of Human DNA Methylation and Cancer [37]. The DUOX1 gene region (45145823–45149381) was analyzed for CpG sites. The MethyCancer tool utilizes six data sources: BIG/UHN [38], Columbia University [39], Human Epigenome Project HEP [40], MethDB [41], CpG130 CGI [42], and UCSC CGI [43]. Default settings were used for the analysis.
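As background for the CpG analysis, the two statistics most often used to describe a candidate CpG island, GC content and the observed/expected CpG ratio, can be computed as in the sketch below. This is only an illustration of the general idea; the criteria actually applied by the MethyCancer data sources are not described here, and the function name and toy sequences are hypothetical.

```python
def cpg_statistics(sequence):
    """Return (gc_fraction, observed_expected_cpg) for a DNA sequence.

    Observed/expected CpG is computed as (#CpG * N) / (#C * #G),
    where N is the sequence length.
    """
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c_count + g_count) / length
    if c_count == 0 or g_count == 0:
        return gc_fraction, 0.0
    observed_expected = (cpg_count * length) / (c_count * g_count)
    return gc_fraction, observed_expected


# Toy sequences for illustration only (not DUOX1 data).
cpg_rich = cpg_statistics("CGCGCGCG")
cpg_poor = cpg_statistics("ATATATAT")
```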

2.6 Prediction of Post-translational Modifications (PTMs)

The Center for Biological Sequence Analysis (CBS) (http://www.cbs.dtu.dk/services/) provides a rich set of servers to predict post-translational modifications. The available servers NetCGlyc 1.0 [44], NetCorona 1.0 [45], NetGlycate 1.0 [46], NetNGlyc 1.0 [47], NetOGlyc 4.0 [48], NetPhos 3.1 [49], and ProP 1.0 [50] were used to predict sites for glycosylation, mannosylation, and phosphorylation in the DUOX1 protein from its sequence.
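N-linked glycosylation prediction starts from the Asn-Xaa-Ser/Thr sequon (Xaa being any residue except proline); NetNGlyc then scores each sequon with neural networks. The sketch below covers only the first, motif-scanning step and is not a substitute for the server's predictions; the function name and toy peptide are hypothetical.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Toy peptide for illustration only (not the DUOX1 sequence).
toy_positions = find_sequons("MNASNPTANGTK")
```

In the toy peptide, the Asn at position 5 is skipped because it is followed by proline.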

2.7 Functional Protein Association Networks

The interacting partners of a biomolecule in a biological system influence its functionality. Without properly understanding the causes and consequences of these interactions, the biological role played by a protein cannot be fully comprehended. The Search Tool for Retrieval of Interacting Genes and proteins (STRING v 10) provides protein–protein interaction and cluster of orthologs analysis servers using its database resources [51]. It was used to identify the interactions and associations involving DUOX1.

2.8 Protein Structure Building

Standard practice for protein structure prediction via homology modeling requires the availability of a suitable template structure with good sequence identity and coverage (query sequence identity > 40% and query coverage > 50%). The Protein Data Bank (PDB) was searched via BLAST for a suitable template meeting these criteria. The lack of such a template in PDB was a major issue in predicting the DUOX1 3D structure. Other factors, i.e., the transmembrane nature and size of the protein (1551 aa), further complicated structure prediction and limited the usefulness of ab initio structure building. The lack of a template was overcome by relaxing the PDB search parameters, with a focus on identifying domainwise similarity among available PDB structures (query sequence identity ≥ 25%). This approach yielded structures with similar domains, which were used as templates for structure prediction via the Swiss Model online protein prediction server [52]. The structure was built in a domain-by-domain manner. The resultant structure was analyzed with the Rampage server, which generates a Ramachandran plot, to evaluate the quality of the predicted structure [53]. The predicted protein structure was visualized via Biovia Discovery Studio v 4.5 [54].
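The template selection criteria above can be sketched as follows. This is an illustration under stated assumptions: identity is computed here over aligned (non-gap) columns and coverage as the share of query residues aligned to a template residue, whereas BLAST reports its own definitions; all function names and the toy alignment are hypothetical.

```python
def identity_and_coverage(aligned_query, aligned_template, query_length):
    """Return (percent identity, percent query coverage) for a pairwise alignment.

    Gaps are written as "-". Identity is taken over columns where both
    sequences have a residue (an assumption; other definitions exist).
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (q, t)
        for q, t in zip(aligned_query, aligned_template)
        if q != "-" and t != "-"
    ]
    if not pairs:
        return 0.0, 0.0
    matches = sum(1 for q, t in pairs if q == t)
    identity = 100 * matches / len(pairs)
    coverage = 100 * len(pairs) / query_length
    return identity, coverage


def meets_standard_criteria(identity, coverage):
    """Standard criteria from the text: identity > 40% and coverage > 50%."""
    return identity > 40 and coverage > 50


def meets_relaxed_criteria(identity):
    """Relaxed, domainwise criterion from the text: identity >= 25%."""
    return identity >= 25


# Toy alignment for illustration only (query of hypothetical length 20).
toy_identity, toy_coverage = identity_and_coverage("ACDEFGHIK-", "ACDQFG-IKL", 20)
```

With the template similarities reported in Sect. 3.9 (28%, 25%, 25%, and 35%), every template passes the relaxed criterion and none passes the standard identity threshold.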

3 Results

3.1 Sequences’ Retrieval and Homology Tree Construction

DUOX1 is located on the q arm of human chromosome 15 (15q21.1), between positions 45,129,994 and 45,165,578. The gene spans a total of 35,585 bp and contains 35 exons, encoding 1551 amino acids. The chromosomal location, the neighbouring genes with their respective positions, and the exon structure of the DUOX1 gene are shown in Fig. 2a–c. The peptide sequence of human DUOX1 (Q9NRD9), along with those of various other species, was subjected to multiple sequence alignment using Clustal Omega, and the alignment was used to generate a homology tree (Fig. 3). Human DUOX1 clustered together with the great apes gorilla (G3R9G4) and chimpanzee (H2Q9C8), and was placed close to the Sumatran orangutan (H2NN45), as members of the Hominidae family. The gibbon, despite being a member of Hominoidea (lesser apes), clustered with the Tasmanian devil and opossum, potentially due to the geographical distribution of these mammals.

Fig. 2

DUOX1 gene location, structure and domains. a Chromosomal localization of DUOX1 gene on q arm of human chromosome 15 highlighted in blue. b DUOX1 gene (blue) is flanked by DUOXA1 and SHF genes (grey). c DUOX1 gene contains 35 exons (green bars) and 11 introns (yellow arrowheads). d DUOX1 contains three domains, namely, An_peroxidase (pfam03098), FRQ1 (COG5126; efhand: pfam00036, EFh: cd00051, EFh: smart00054) and COG4097 (COG4097; Ferric_reduct: pfam01794, UbiB: COG0543, Hmp: COG1018, FAD_binding_8: pfam08022, flavin_oxioreductase: cd06189, FNR_iron_sulfur_bind: cd06215, cyt_b5_reduct_like: cd06183, sulfite_reductase_li: cd06221, FNR_like: cd00322, NOX_Duox_like_FAD_NA: cd06186, FNR_like_3: cd06198 and NAD_binding_6: pfam08030)

Fig. 3

Multiple sequence alignment of DUOX1 gene. DUOX1 gene sequences of Higher apes are denoted by green, Mammals in yellow, Amphibians in dark blue, Arthropods in light blue, and Pisces in red

3.2 Prediction of Signal Peptide and Structural Domains of DUOX1

The SignalP server predicted a cleavage site between positions 21 and 22, as shown in Fig. 4. The server computed C-, S-, and Y-scores for DUOX1. Amino acid 22 exhibited the maximum C- and Y-scores, while the maximum S-score was calculated for position 3; all were greater than the standard cutoff value (0.5). The signal peptide was estimated to be the 1–21 segment of the protein. The Uniprot database provided curated and reviewed topological information regarding the transmembrane domains of DUOX1. According to it, the protein has 7 transmembrane helical domains with an extra-cellular N-terminal domain and a cytoplasmic C-terminal domain. Since the information obtained from Uniprot was manually reviewed, it was preferred over the output of the prediction servers. Both domain prediction servers stumbled on the complex arrangement of the transmembrane domains, as amino acid 1248 is a single residue flanked by two transmembrane domains. The two servers failed to recognize this lone extra-cellular residue and treated it as part of one large transmembrane domain instead of two separate domains.

Fig. 4

Signal peptide detection for DUOX1 protein using peptide sequence. SignalP 4.1 server predicted the signal peptide cleavage site between 21 and 22 aa

3.3 Functional Domain Analysis

DMDM provided a centralized portal for studying the functional domains of DUOX1, as depicted in Fig. 2d and Table 1. Three major domains were observed in the protein: An_peroxidase (pfam03098) at the N terminus (29–557 aa), exposed on the extra-cellular surface; the FRQ1 domain (COG5126; 769–929 aa), located beneath the plasma membrane in the cytoplasm; and a third domain with oxido-reductase activity, COG4097 (1054–1494 aa). The An_peroxidase domain (animal haem peroxidase) is a member of the superfamily cl14561. The FRQ1 domain contains EF-hand motifs, which provide the Ca++ binding sites, and is a member of the superfamily cl25352. The C-terminal domain has overlapping motifs for FAD- and NAD-binding pockets along with oxido-reductase domains. This domain was classified as a member of the superfamily cl06868.

Table 1 Functional domains of DUOX1 protein

3.4 Single-Nucleotide Polymorphism of DUOX1

The DMDM database provided three polymorphisms for DUOX1, p.CYS1026ARG, p.ILE962THR, and p.LEU1178PHE, but could not associate these variations with any disease condition. BioMuta provided a comprehensive overview of DUOX1 genetic polymorphism with cancer type and frequency. The observed SNPs of the DUOX1 gene for each cancer type were: uterine cancer 128, skin cancer 106, kidney cancer 102, breast cancer 98, stomach cancer 96, and lung cancer 82 (Fig. 5a). The three most frequent SNPs occurred at 297 aa (observed in 18 patients), 1393 aa (16 patients), and 975 aa (14 patients) (Fig. 5b). All single-nucleotide polymorphisms and somatic cancer single-nucleotide variants are shown in supplementary Tables 1 and 2, respectively. The functional impact of the somatic cancer single-nucleotide variants (SNVs) of DUOX1, based on PROVEAN/SIFT and PolyPhen2 predictions, is depicted in Fig. 6a, b. PROVEAN/SIFT predicted 62 variants as deleterious/damaging, 12 as deleterious/tolerated, 11 as neutral/damaging, and 44 as neutral/tolerated, while PolyPhen2 predicted 43 variants as benign, 19 as possibly damaging, and 67 as probably damaging. Both tools returned no functional association for the same 70 SNPs. The domainwise SNV frequency distribution showed the highest number of SNVs in An_peroxidase (66: damaging 23, neutral 14, and NA 29), followed by COG4097 (46: damaging 30, neutral 9, and NA 7) and FRQ1 (17: damaging 9, neutral 2, and NA 6). The signal peptide also had an SNV at position 4; although this peptide segment does not play a direct role in the functioning of the protein, it is vital because it marks the sub-cellular location of the protein, i.e., the plasma membrane.
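The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below simply tallies the numbers given in the text; the variable names are illustrative and no additional data are assumed.

```python
# Counts copied from the text of this section.
provean_sift = {
    "deleterious/damaging": 62,
    "deleterious/tolerated": 12,
    "neutral/damaging": 11,
    "neutral/tolerated": 44,
}
polyphen2 = {"benign": 43, "possibly damaging": 19, "probably damaging": 67}
no_association = 70

# Per-domain counts as (damaging, neutral, NA), with the stated totals.
domain_counts = {
    "An_peroxidase": (23, 14, 29),
    "COG4097": (30, 9, 7),
    "FRQ1": (9, 2, 6),
}
stated_domain_totals = {"An_peroxidase": 66, "COG4097": 46, "FRQ1": 17}

classified_by_provean_sift = sum(provean_sift.values())
classified_by_polyphen2 = sum(polyphen2.values())
total_variants = classified_by_provean_sift + no_association
domain_totals = {name: sum(counts) for name, counts in domain_counts.items()}
```

Both tools classify the same number of variants (129), which together with the 70 unclassified variants gives 199, and each per-domain breakdown adds up to its stated total.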

Fig. 5

DUOX1 SNP analysis in cancer patients. a Observed SNPs for DUOX1 gene in response to each cancer type. b Positionwise SNP frequency in cancer patients

Fig. 6

Assessment of Somatic Cancer SNVs’ impact on DUOX1 functionality. a PROVEAN/SIFT prediction tool classified SNVs into five classes, namely: deleterious/damaging 62, deleterious/tolerated 12, neutral/damaging 11, neutral/tolerated 44 and no association 70, while b PolyPhen2 tool classified SNVs into four categories benign 43, possibly damaging 19, probably damaging 67 and no association 70

3.5 Prediction of Hydrophilicity, Accessibility, Polarity, Flexibility, Mutability, and Bulkiness of DUOX1

DUOX1 was subjected to physico-chemical analyses (polarity, % buried residues, average flexibility, bulkiness, hydrophilicity, and relative mutability) using the ProtScale server available on the ExPASy platform. The higher the generated score, the greater the probability for each parameter. The average flexibility values ranged from 0.36 (1209 aa) to 0.512 (160 aa). The bulkiness values ranged from 10.671 (592 aa) to 20.67 (1232 aa), while accessibility ranged from 3.3 (1471 aa) to 7.589 (1268 aa). The least buried residue was 269 aa, with a calculated value of 2.044, and the most buried one was 1388 aa, with a value of 11.8. For polarity, the most polar residue was 1215 aa (34.841) and the least polar one was 1388 aa (0.072). Hydrophilicity was estimated using the Hopp and Woods method, and the values ranged from − 2.011 (1186 aa) to 1.733 (853 aa). The relative mutability calculations showed a highly mutable patch comprising three residues (354, 355, and 356, with values 104.111, 103.111, and 103.556, respectively), while the least mutable residue was 252 aa, with a value of 44.444. The highest and lowest scores computed for each parameter of DUOX1 are shown in Table 2.
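ProtScale-style profiles are obtained by averaging a per-residue scale over a sliding window and assigning the mean to the central residue. The sketch below illustrates this with the Hopp and Woods hydrophilicity scale. The window size of 9 and the uniform weighting are assumptions, as the settings used in this study are not stated, and the function names and toy peptide are hypothetical.

```python
# Hopp and Woods hydrophilicity values (one-letter amino acid codes).
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def window_profile(sequence, scale, window=9):
    """Return (centre_position, mean_score) pairs; positions are 1-based."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    values = [scale[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(values) - window + 1):
        mean_score = sum(values[start:start + window]) / window
        profile.append((start + half + 1, mean_score))
    return profile


def extremes(profile):
    """Return the (position, score) pairs with the lowest and highest score."""
    lowest = min(profile, key=lambda item: item[1])
    highest = max(profile, key=lambda item: item[1])
    return lowest, highest


# Toy peptide for illustration only (not the DUOX1 sequence).
toy_profile = window_profile("RRRLLLRRR", HOPP_WOODS, window=3)
toy_lowest, toy_highest = extremes(toy_profile)
```

Applied to a full-length sequence, `extremes` returns the kind of minimum and maximum positions that are summarized for each parameter in Table 2.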

Table 2 Physico-chemical analyses (polarity, % buried residues, average flexibility, bulkiness, hydrophilicity, and relative mutability) for DUOX1. Scores were computed via the ProtScale server available on the ExPASy platform

3.6 Methylation Sites in DUOX1

MethyCancer tool predicted a 100 bp CpG Island type III at position 45,144,265–45,144,364, but no methylation site or island was observed in DUOX1 gene segment. This upstream island was predicted employing BIG and UHN data sources.
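The coordinates involved can be checked with simple interval arithmetic. The sketch below assumes 1-based inclusive coordinates on chromosome 15 and uses the analysed DUOX1 region given in Sect. 2.5; the function names are illustrative.

```python
def span_length(start, end):
    """Length in bp of a 1-based inclusive interval."""
    return end - start + 1


def overlaps(first, second):
    """True if two (start, end) inclusive intervals share at least one base."""
    return first[0] <= second[1] and second[0] <= first[1]


cpg_island = (45_144_265, 45_144_364)       # reported CpG island
analysed_region = (45_145_823, 45_149_381)  # DUOX1 region analysed in Sect. 2.5

island_length = span_length(*cpg_island)
region_length = span_length(*analysed_region)
island_in_region = overlaps(cpg_island, analysed_region)
# Number of bases strictly between the island and the analysed region.
gap_to_region = analysed_region[0] - cpg_island[1] - 1
```

Under these assumptions the island is 100 bp long, does not overlap the analysed region, and ends 1,458 bp before it begins.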

3.7 Post-translational Modifications of DUOX1

The CBS prediction servers predicted multiple post-translational modifications for DUOX1. The cutoff for each parameter was set at 0.5 by default, and higher values indicated higher probability. The propeptide signal cleavage site was calculated to be between 21 and 22 aa using the ProP 1.0 prediction server. The NetCorona 1.0 server estimated two proteinase cleavage sites at positions 245 and 376 aa (Fig. 7a). C-mannosylation sites were assessed using the NetCGlyc 1.0 server, which found no mannosylation site in the protein. The NetGlycate 1.0 server estimated a total of 13 glycation sites, while 5 N-linked glycosylation sites and 15 O-linked glycosylation sites were calculated by the NetNGlyc 1.0 and NetOGlyc 4.0 servers, respectively (Fig. 7b, c). Overall, 123 phosphorylation sites were predicted by the NetPhos 3.1 server. Serine residues hosted the highest number of phosphorylation sites (74), while threonine had 41 and tyrosine harbored only 8 sites (Fig. 7d). Sites with their respective kinases are shown in Table 3.

Fig. 7

Predicted post-translational modifications of DUOX1 protein. The cut-off value was set at 0.5, with higher values having greater probability. a NetCorona server predicted two cleavage sites for 3CL proteinase. b NetNGlyc server predicted 5 N-glycosylation sites. c NetGlycate server calculated 13 glycation sites. d NetPhos 3.1 server estimated 123 phosphorylation sites

Table 3 Number of phosphorylation sites with respective kinases

3.8 Physiological Parameters of DUOX1

Physiological parameters were predicted using the ProtParam server. DUOX1 has a molecular weight of about 177235.09 Da with the predicted formula C8023H12364N2212O2233S53, a total of 24,885 atoms, and a theoretical pI of 8.12. The estimated half-life of DUOX1 was 30 h in mammalian reticulocytes (in vitro) and > 20 h in yeast (in vivo). The protein was predicted to be slightly unstable, with a calculated instability index of 47.83, and had an aliphatic index of 88.56. The estimated grand average of hydropathicity (GRAVY) was − 0.190.
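Two of the ProtParam quantities reported above have simple closed forms, sketched below for illustration: GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index is X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent. The stability rule (instability index below 40 predicted as stable) follows the ProtParam convention. Function names and the toy peptide are hypothetical, and the atom count is simply summed from the formula given in the text.

```python
# Kyte-Doolittle hydropathy values (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()

    def mole_percent(residue):
        return 100 * seq.count(residue) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def predicted_stable(instability_index):
    """ProtParam convention: an index below 40 is predicted as stable."""
    return instability_index < 40


# Atom count from the formula reported in the text.
duox1_formula = {"C": 8023, "H": 12364, "N": 2212, "O": 2233, "S": 53}
duox1_total_atoms = sum(duox1_formula.values())

# Toy peptide for illustration only (not the DUOX1 sequence).
toy_gravy = gravy("AVIL")
toy_aliphatic_index = aliphatic_index("AVIL")
```

With the reported instability index of 47.83, `predicted_stable` returns `False`, in line with the slightly unstable prediction above.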

3.9 Homology Modeling and Evaluation of Model Stability

The PDB search with relaxed parameters yielded four templates: human myeloperoxidase (3fgp), an EF-hand protein (2q4u), the transmembrane domain of Cylindrospermum stagnale NADPH oxidase 5 (5o0t), and the dehydrogenase domain of Cylindrospermum stagnale NADPH oxidase 5 (5o0x). The DUOX1 protein was divided into four functional domains based on the structure-sequence alignment obtained from the second PDB search. Human myeloperoxidase had 28% sequence similarity with the first domain of the DUOX1 protein (1–700 aa), the EF-hand protein was 25% similar to the second domain (701–1000 aa), the transmembrane domain of NADPH oxidase 5 was 25% similar to the third domain (1001–1260 aa), and the dehydrogenase domain of NADPH oxidase 5 was 35% similar to the last domain (1261–1551 aa). The structure was built in a domainwise manner using the Swiss Model web server (Fig. 8a, b). The figure shows the predicted structures domain by domain, in accordance with the structure building scheme. Three segments (signal peptide 1–21 aa, gap I 581–751 aa, and gap II 931–1086 aa) were not modelled due to the lack of any structural domain and of an available template. A Ramachandran plot was used to evaluate the model quality (Fig. 8c). Because DUOX1 is a membrane-embedded protein, its stability indicators are relatively lower than those of non-membrane proteins.

Fig. 8

DUOX1 protein structure prediction. a DUOX1 structure division in domains for structure prediction; structural domains along with their respective lengths are shown on top of the linear peptide. Modelled domains along with their lengths are marked on the linear peptide, and the respective templates with percent sequence identity are shown beneath it. b Predicted individual structural domains. Domain 1 shown in yellow, length 1–700 aa, modelled segment 22–580 aa; Domain 2 shown in green, length 701–1000 aa, modelled segment 752–930 aa; Domain 3 shown in blue, length 1001–1260 aa, modelled segment 1032–1260 aa; Domain 4 shown in red, length 1261–1551 aa, modelled segment 1261–1551 aa. Three segments (signal peptide 1–21 aa, gap I 581–751 aa and gap II 931–1086 aa) were not modelled due to the lack of any structural domain and of an available template. c Ramachandran plot generated for the DUOX1-constructed structure. The vast majority of the amino acids satisfy the set constraints for the Ψ and Φ angles

3.10 Protein–Protein Interaction and Cluster of Orthologs Analysis for DUOX1

Protein interactions at the cellular level were calculated using the STRING database. The protein–protein interaction analysis yielded a multiple-node interaction network map (Fig. 9). The resultant interaction network contained 11 nodes and 29 edges, with an average node degree of 5.27. The clustering coefficient was 0.786, and the PPI enrichment value was 2.47e-06. DUOX1 is classified into a non-supervised orthologous group (NOG19510) (Fig. 9a). DUOX1 is involved in the following biological processes: reactive oxygen species metabolic process (GO:0072593), superoxide metabolic process (GO:0006801), hydrogen peroxide metabolic process (GO:0042743), positive regulation of immunoglobulin-mediated immune response (GO:0002891), positive regulation of interleukin-10 production (GO:0032733), superoxide anion generation (GO:0042554), positive regulation of defense response to bacterium (GO:1900426), regulation of thyroid hormone generation (GO:2000609), regulation of toll-like receptor 2 signaling pathway (GO:0034135), hydrogen peroxide biosynthetic process (GO:0050665), regulation of inflammatory response (GO:0050727), cytokine production involved in immune response (GO:0002367), regulation of defense response (GO:0031347), respiratory burst (GO:0045730), response to organic cyclic compound (GO:0014070), metabolic process (GO:0008152), regulation of reactive oxygen species metabolic process (GO:2000377), regulation of response to external stimulus (GO:0032101), and innate immune response in mucosa (GO:0002227). DUOX1 was also annotated with the GO Molecular Function terms oxido-reductase activity, acting on NAD(P)H, oxygen as acceptor (GO:0050664); oxido-reductase activity, acting on NAD(P)H (GO:0016651); and superoxide-generating NADPH oxidase activity (GO:0016175); with the GO Cellular Component term NADPH oxidase complex (GO:0043020); and with the KEGG pathways Leishmaniasis (5140), Osteoclast differentiation (4380), Leukocyte transendothelial migration (4670), Phagosome (4145), and Inflammatory bowel disease (5321). All the associated functions and pathways require DUOX1’s peroxidase activity and H2O2 generation.
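The average node degree reported for the network follows directly from the node and edge counts: in an undirected network each edge contributes to the degree of two nodes, so the average degree is 2E/N. A minimal check, with illustrative function names:

```python
def average_node_degree(node_count, edge_count):
    """Average degree of an undirected network: each edge touches two nodes."""
    return 2 * edge_count / node_count


def network_density(node_count, edge_count):
    """Fraction of all possible undirected edges that are present."""
    possible_edges = node_count * (node_count - 1) / 2
    return edge_count / possible_edges


# Node and edge counts reported for the DUOX1 network.
duox1_average_degree = average_node_degree(11, 29)
duox1_density = network_density(11, 29)
```

For 11 nodes and 29 edges this gives 58/11 ≈ 5.27, matching the reported value; the density (29 of 55 possible edges) is an additional descriptive statistic not reported in the text.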

Fig. 9

DUOX1 protein–protein interaction analysis. The STRING server computed an interaction network with 11 nodes and 29 edges, with an average node degree of 5.27. The clustering coefficient was 0.786 and the PPI enrichment value was 2.47e-06. a COG analysis classified DUOX1 into a non-supervised orthologous group (NOG19510). b Protein–protein interaction showed DUOX1 interaction with the regulatory proteins DuoxA1, DuoxA2, NoxA1, and interleukin IL4, along with other proteins

4 Discussion

DUOX1 is one of the major receptor-dependent sources of oxidants in cells [55]. Its prominent role is in innate airway host defense via production of inflammatory mediators and promotion of cell migration and wound healing [11, 16, 55,56,57,58]. It has recently been reported to be involved in multiple processes ranging from innate immunity at mucosal surfaces to regulation of epidermal growth factor receptor signaling [59]. Platelet-derived growth factor directs wound healing and tissue regeneration processes via DUOX1-generated H2O2 [55]. Being localized at the epithelial surface of the airway cavity, DUOX1 forms a first line of defense against invading pathogens and is consequently subject to immune evasion and immunomodulation by them [60, 61]. Hence, its normal functioning is essential for innate protection [62]. In this study, we analyzed the DUOX1 protein using in silico approaches to gain a better understanding of its structure and functionality. The human DUOX1 sequence was subjected to structural analysis. Structure prediction was complicated by three factors: the sheer size of the protein, its transmembrane nature, and the lack of a template. These factors led us to relax the template search criteria and to employ homology modeling with templates of lower sequence similarity (< 40%). Proteins with lower sequence similarity, which had previously been discarded, were considered, and the structure was developed in a domain-by-domain manner. Due to the shorter query coverage of the selected templates, three segments could not be modelled. The modelled domains were subjected to Ramachandran plot analysis. Because DUOX1 is a membrane protein with seven transmembrane domains, structure building was difficult and the overall quality of the predicted structure was lower. Two of the transmembrane domains, 1227–1247 and 1249–1269, are separated by a single extra-cellular amino acid at position 1248. This caused a problem in transmembrane domain prediction: all the prediction servers used failed to recognize this lone amino acid as extra-cellular and hence treated 1227–1269 as one continuous transmembrane domain. The sensitivity of the prediction servers needs to be enhanced to avoid such problems. ProtParam computations predicted the protein, a massive membrane protein, to be unstable, with lower aliphatic and stability indexes. The prominent kinases, with their respective numbers of predicted phosphorylation sites, were PKC 45, cdc2 26, PKA 22, and CKII 13, while 69 potential sites were predicted for non-specific kinases. In the overall enzyme structure, the two terminal domains, peroxidase and NOX (oxido-reductase), are linked through the transmembrane and EF-hand Ca++ binding regions. Ferredoxin (i.e., FAD and NADPH)-binding domains are characteristic of flavoprotein modules. Electron transfer proceeds from NADPH to FAD and then via heme to oxygen, resulting in the generation of H2O2 [28]. All the interacting proteins obtained from the STRING database analysis point to DUOX1’s involvement in NADPH oxidation along with other oxidases, i.e., NOX, and their respective activation factors, e.g., DUOXA1 and DUOXA2, along with cytokines and other associated proteins.

Components of cigarette smoke, e.g., acrolein, have been reported to inhibit DUOX1 [63]. DUOX1 is silenced in metastatic cancer, which also leads to resistance against epidermal growth factor receptor tyrosine kinase inhibitors [64]. Thus, one could assume that hypermethylation of the DUOX1 promoter and/or gene could serve as a potential marker for oncogenesis in the lungs [65]. However, our in silico results did not yield any potential methylation sites in the DUOX1 gene region on chromosome 15. The observed upstream CpG island might act as a cis-regulatory element and affect gene expression. Another potential reason for the lack of such sites and the silencing of DUOX1 in tumors could be the sequence variations arising as SNPs and SNVs. Conversely, higher DUOX1 expression levels have been reported to be associated with better survival in patients [66]. The improved survival might be attributed to prolonged exposure to oxidative stress after receiving ionizing radiation therapy [67]. The SNP analysis by DMDM provided three point mutations; when checked against the BioMuta data, all three polymorphisms were present in tumor samples. The silencing of DUOX1 in tumors might therefore be associated with the observed SNPs rather than with epigenetic changes. DUOX1 has been reported to play a significant role in keratinocyte differentiation in kidneys [68]. As mentioned in the results, kidney cancers exhibited the third highest number of observed DUOX1 SNPs; hence, there could be an association between the two phenomena. DUOX1 plays a similar role in human fetal lung epithelial cells [11, 69], and lung cancer, among others, likewise had a high SNP frequency for the DUOX1 gene in the SNV analysis data. This high SNP frequency could also be related to DUOX1 expression levels as observed in NCBI gene expression data (Gene ID: 53905). DUOX1 is highly expressed in esophagus, skin, lung, and thyroid tissues in comparison with other tissues, which exhibit a lower degree of expression [70]. These mutations might serve as prognostic biomarkers.

As discussed previously, the potential involvement of DUOX1 in immune cell recruitment, wound repair, tissue regeneration, and, above all, antimicrobial defense in the respiratory cavity signifies its importance in combating infections, with special reference to pulmonary infections caused by P. aeruginosa. The pulmonary microenvironment in cystic fibrotic lungs provides a good case study for infection establishment and immune evasion by P. aeruginosa. PCN secreted by the pathogen inhibits DUOX1 and upregulates the pro-inflammatory response, thus causing injuries. Ironically, overexpressed DUOX1 alone aggravates cystic fibrotic pathogenesis by releasing an abundance of ROS into the pulmonary cavity, and it has been suggested as a potential drug target [12, 71]. Overexpression of DUOX1 coupled with peroxidase enzymes in cases of P. aeruginosa infection could potentially enable the immune system to detoxify PCN before its entry into epithelial cells and neutrophils, thus avoiding its otherwise severe cytotoxic effects. At the least, site-directed delivery of Microperoxidase-11 could potentially reduce PCN-mediated pathogenicity in pulmonary infections and could help manage the progression of cystic fibrotic pulmonary disease. Detailed studies are required to fully understand the potential therapeutic and prognostic aspects of the DUOX1 gene.

5 Conclusion

We report the use of in silico approaches for rapid and reliable structural and functional annotation of the human Dual Oxidase 1 protein. The adopted approach, based on the DUOX1 sequence, yielded structural as well as functional information, i.e., motifs and domains, evolutionary significance, potential post-translational modifications, and sub-cellular localization. The disease-associated genetic polymorphisms might potentially be employed as biomarkers for the studied disease conditions and hence need further investigation covering both clinical and therapeutic aspects. DUOX1 is an important component of innate immunity against pulmonary infections, and thus the predictions reported here need to be validated in the wet lab to strengthen DUOX1-based antimicrobial protection in the face of the growing antibiotic resistance crisis. In conclusion, DUOX1 is a stable transmembrane protein, potentially involved in multiple aspects of pulmonary health. Further studies are needed to improve our understanding of its potential applied role in pulmonary health conditions.