WD40 protein—definition and architecture

The formation of large heteromeric protein complexes depends on regulatory interactions that, in living systems, are largely coordinated by scaffolding proteins. The WD40 proteins, also known as WD-repeat proteins, are one such group and are a prominent feature of diverse protein-protein interactions (Gibson 2009). These proteins are very abundant in eukaryotes but rare in prokaryotes (e.g. Thermomonospora curvata and the cyanobacterium Synechocystis; Stirnimann et al. 2010). The SMART database predicts 349 WD40 domain-containing proteins in humans (Letunic et al. 2009). In plants such as Arabidopsis thaliana, Oryza sativa and Setaria italica, more than 200 putative WD40 domain-containing proteins have been predicted (Van Nocker and Ludwig 2003; Ouyang et al. 2012), which can help in understanding their cellular interaction networks and their roles in the molecular machinery of the cell. A phylogenetic tree of representative WD40 genes from these plants (including some functionally characterized genes from Arabidopsis) reveals three major WD40 groups (Fig. 1).

Fig. 1

Phylogenetic tree of WD40 repeat proteins. The deduced full-length amino acid sequences of 141 WD40 genes (81 from Arabidopsis thaliana, 59 from Oryza sativa and 1 from Setaria italica) were aligned with ClustalW, and the phylogenetic tree was constructed in MEGA 5.05 by the Neighbor-Joining (NJ) method with 1,000 bootstrap replicates. The evolutionary distances were computed using the Poisson correction method, and the optimal tree, with a sum of branch lengths of 77.88547303, is shown. Genes are identified by their accession numbers, and the names of functionally characterized genes are given in parentheses
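As a minimal sketch of the distance measure named in the caption, the Poisson correction converts the proportion p of differing amino acid sites between two aligned sequences into a distance d = -ln(1 - p). The function names and the toy sequences below are hypothetical, and skipping gapped columns pair by pair is an assumption, since the caption does not state how gaps were treated. This is an illustration of the formula, not the MEGA implementation.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped
    (pairwise deletion; an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    differences = sum(a != b for a, b in pairs)
    return differences / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration only.
    print(poisson_distance("MKTWD", "MKSWD"))
```

The correction grows faster than p itself, reflecting the increasing chance of multiple substitutions at the same site as sequences diverge.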

Structurally, the WD40 domain is characterized by the presence of several copies of the WD40 repeat, each a unit of 44–60 residues (Fig. 2a). Each unit contains a glycine-histidine (GH) dipeptide about 11–24 residues from its N-terminus and ends with a Trp-Asp (WD) dipeptide at its C-terminus (Neer et al. 1994; Smith et al. 1999). Each repeat folds into a four-stranded anti-parallel β-sheet. The repeats are thought to have arisen through intragenic duplication and recombination events and to have diversified during evolution (Andrade et al. 2001). It was therefore thought that the sequences and structures of repeats within the same protein are more similar to each other than to those from different proteins.
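The repeat definition above can be turned into a rough sequence filter. The following Python sketch checks only the three features stated in the text (unit length of 44–60 residues, a GH dipeptide preceded by 11–24 residues, and a terminal WD dipeptide). The function names are hypothetical, the reading of "11–24 residues from the N-terminus" as the number of residues before GH is an assumption, and real WD40 repeats are far more degenerate than this, so the sketch illustrates the definition rather than serving as a detection tool.

```python
MIN_UNIT, MAX_UNIT = 44, 60      # unit length range given in the text
MIN_GH_OFFSET, MAX_GH_OFFSET = 11, 24  # residues preceding GH (assumed reading)


def is_candidate_unit(unit: str) -> bool:
    """Return True if a segment satisfies the three textbook features."""
    unit = unit.upper()
    if not MIN_UNIT <= len(unit) <= MAX_UNIT:
        return False
    if not unit.endswith("WD"):
        return False
    # GH must start after 11-24 residues (0-based index of G in 11..24).
    return any(
        unit[i:i + 2] == "GH"
        for i in range(MIN_GH_OFFSET, MAX_GH_OFFSET + 1)
    )


def find_candidate_units(seq: str) -> list[tuple[int, int]]:
    """List (start, end) slices of seq that pass is_candidate_unit.

    Overlapping candidates are all reported; no attempt is made to pick
    the best one.
    """
    seq = seq.upper()
    hits = []
    for end in range(2, len(seq) + 1):
        if seq[end - 2:end] != "WD":
            continue
        for length in range(MIN_UNIT, MAX_UNIT + 1):
            start = end - length
            if start >= 0 and is_candidate_unit(seq[start:end]):
                hits.append((start, end))
    return hits


if __name__ == "__main__":
    # Synthetic 50-residue unit, invented for illustration only.
    toy_unit = "A" * 15 + "GH" + "A" * 31 + "WD"
    print(find_candidate_units(toy_unit))
```

In practice, profile-based searches such as those behind the SMART predictions cited above are used instead of exact dipeptide matching, because many genuine repeats lack a strict GH or WD.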

Fig. 2

Structure of WD40 protein. a WD40 proteins contain several repeats of 44–60 amino acid residues; each repeat comprises four antiparallel β-strands, with conserved GH residues towards the N-terminus and WD at the C-terminus. Residues marked with stars are hydrophobic amino acids that are frequently present on the top surface and participate in interactions with other peptides. b Seven-bladed β-propeller structure (the most stable form), rendered with PyMOL (http://www.pymol.org). c Four-stranded antiparallel β-sheet structure

Generally, WD40 domains contain seven repeats, or multiples of seven, which form a highly stable β-propeller structure (Fig. 2b). Proteins with as few as two and as many as sixteen repeats have also been predicted (Van Nocker and Ludwig 2003; Saeki et al. 2006; Mishra et al. 2012). On the basis of geometric modeling, the seven-fold β-propeller was predicted to be the most stable β-sheet geometry, and this form dominates among the resolved WD40 structures and identified WD40 proteins (Murzin 1992). Proteins with fewer than seven repeats form an incomplete β-propeller and require an additional blade contributed by a neighbouring protein to stabilize themselves. For instance, in the two complex structures Seh1-Nup85 and Sec13-Nup145, Seh1 (a component of the nuclear pore complex) and Sec13 (a component of the coat protein complex II) each display a six-bladed open propeller, whereas the two partner nucleoporins (Nup85 and Nup145) each possess a three-stranded β-blade as well as helical trunk and crown modules. The domain invasion motif (DIM) of each Nup protein inserts into the opening between the blades of its partner and thereby completes the seven-bladed WD40 propeller (Hsia et al. 2007; Brohawn et al. 2008).

The role of WD40 proteins is to provide a rigid platform for the interaction of proteins with other cellular components. These interactions help to control several vital cellular functions, such as signaling cascades, cellular transport and apoptosis (Xu and Min 2011). Sequences outside the repeats help to establish protein specificity. WD40 proteins also coordinate downstream events, such as ubiquitination and histone methylation, and are known to function in several plant developmental processes.

The WD-repeat is also called the β-transducin repeat, because the first crystal structure of a WD40 repeat protein was that of bovine β-transducin (one subunit of the trimeric G protein transducin complex; Lambright et al. 1996; Sondek et al. 1996). This Gβ subunit is involved in signal transduction and forms a seven-bladed β-propeller structure containing seven WD40 repeats. Other WD40 proteins in plants include the TBP-Associated Factor (TAFII), CULLIN4-E3 ubiquitin ligase, Transparent Testa Glabra 1 (TTG1), Constitutively Photomorphogenic 1 (COP1), Suppressor of PhyA-105 (SPA1), Anthocyanin11 (AN11) and Pale Aleurone Color1 (PAC1). Among these, COP1 and SPA1 have been reported to be involved in light signaling/photomorphogenesis and flowering (von Arnim et al. 1997; Hoecker et al. 1999; Holm et al. 2002). Among others, GTP binding protein β1 is involved in auxin response and cell division (Ullah et al. 2003), Fasciata2 (FAS2) functions in meristem maintenance (Kaya et al. 2001), Leunig (LUG) functions in floral development (Franks et al. 2002), and Fertilization-Independent Endosperm (FIE) regulates seed development and flowering (Sorensen et al. 2001). Another WD40 protein, Cyclophilin71, which interacts with histone H3, has a role in chromatin-based gene silencing and organogenesis (Li et al. 2007).

Abundance of WD40 domains in eukaryotes and their action as common interactor

Across eukaryotic proteomes, WD40 domains are among the most abundant domain types, along with the RNA recognition motif, zinc fingers, the RING finger and the immunoglobulin domain (Stirnimann et al. 2010). WD40 proteins account for 1% of the total proteome in Homo sapiens, 1.4% in Saccharomyces cerevisiae and >0.8% in Arabidopsis (Stirnimann et al. 2010). WD40 proteins often carry additional functional or catalytic domains, such as those involved in protein degradation (F-box; Really Interesting New Gene finger, RING; and suppressors of cytokine signaling domain, SOCS), microtubule dynamics (Lissencephaly type 1-like Homology, LisH), phospholipid binding (FYVE, a class of zinc finger domains named after Fab1, YOTB/ZK632.12, Vac1 and EEA1) and vesicle coating (CLathrin Heavy chain repeat homology, CLH) (Stirnimann et al. 2010). The functionality of a WD40 protein resides mainly in the residues of the top surface of the domain. WD40 propellers are large domains of about 300 amino acids with three distinct surfaces available for interactions: the top region of the propeller, defined as the face on which the loops connecting the D and A strands of the WD-repeats lie; the bottom region; and the circumference (Stirnimann et al. 2010) (Fig. 2c). The most common interaction site of WD40 proteins is located on the smaller top surface of the propeller (formed by the N-terminal part of each repeat). Three residues per repeat, lying close to the propeller axis, are normally involved in this interaction. They are positioned at the beginning of strand A (mostly two residues) and at the end of strand B (mostly one residue; Xu and Min 2011). Deletion mutagenesis led to the conclusion that there is no apparent folding order for each repeat and that the order in which repeats fold might vary among different WD40 proteins, or even within the same protein (Garcia-Higuera et al. 1998). Single amino acid substitutions in one repeat that affect local folding might thus not have drastic consequences for the overall folding process. Structural clustering of interaction interfaces within structural domains (Aloy et al. 2003) indicates that there are 30 interfaces (distinct families with diverse functions) for 86 distinct WD40 domains, or about 0.35 interfaces per domain. Other common families have much lower values (0.07 or 86/1216 for kinases; 0.06 or 15/242 for PDZ domains; 0.10 or 31/310 for SH3 domains), indicating greater interface diversity for the WD40 domain family. The structures and their interacting peptides are listed in Table 1.
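The interface-per-domain comparison is simple arithmetic and can be recomputed directly from the counts quoted in the text. The short Python sketch below does only that; the variable and function names are hypothetical, and the counts are copied from the paragraph above rather than taken from any database.

```python
# (number of interfaces, number of distinct domains) as quoted in the text
COUNTS = {
    "WD40": (30, 86),
    "kinase": (86, 1216),
    "PDZ": (15, 242),
    "SH3": (31, 310),
}


def interfaces_per_domain(counts: dict) -> dict:
    """Divide the interface count by the domain count for each family."""
    return {
        family: interfaces / domains
        for family, (interfaces, domains) in counts.items()
    }


RATIOS = interfaces_per_domain(COUNTS)

if __name__ == "__main__":
    # Print families from most to least interface-diverse.
    for family, ratio in sorted(RATIOS.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{family:>6}: {ratio:.2f} interfaces per domain")
```

On these counts the WD40 family has more than three times as many interfaces per domain as the next family (SH3), which is the basis for describing it as the most interface-diverse of the four.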

Table 1 WD40 proteins and their binding partners with consensus motifs

Protein–protein and protein–peptide interactions involve the entry site to the central channel of the β-propeller, which corresponds to the ‘supersite’ of the β-propeller superfold, where the majority of interaction partners (including small molecules) bind (Russell et al. 1998). N- or C-terminal extensions of the β-propeller also pack against the entry sites of the central channel, while the core of the channel is not accessible for interactions. WD40 domains can thus act as large interaction platforms for multiple proteins, making them ideally suited to be hubs in cellular interaction networks. Consequently, compared with other domains, proteins containing the WD motif participate in a large number of interaction pairs. Datasets from yeast two-hybrid experiments [i.e. binary interactions (Yu et al. 2008)] and from mass spectrometry/tandem-affinity purification (MS/TAP) experiments [i.e. multi-protein complexes (Collins et al. 2007)] provide evidence for the high degree of interaction mediated by these proteins. Remarkably, the number of WD40 domain-containing complexes found in the MS/TAP datasets far exceeds the corresponding number of binary interactions in the yeast two-hybrid datasets, suggesting that WD40 proteins act as scaffolds for the assembly of larger complexes.

DWD (DDB1-binding WD40) protein

A subset of WD40 proteins has been named DWD [DDB1 (Damaged DNA Binding 1)-binding WD40] on the basis of their interaction with DDB1 and CULLIN4 (CUL4). These proteins contain a stretch of 16 conserved amino acids referred to as the “DWD box” (Angers et al. 2006; Higa et al. 2006). Their key characteristics include variation in the number of motifs, the importance of the Arg residue following the WD dipeptide (Scrima et al. 2008) and the contribution of some additional residues besides this Arg to the ability to bind DDB1. The DDB complex was first identified as a heterodimer involved in DNA repair and consists of the 127 kDa subunit DDB1 and the 48 kDa subunit DDB2. DDB1 is now recognized as the linker protein in cullin4-RING ubiquitin ligase (CRL4) complexes and thus plays additional roles beyond DNA repair. The term DCAF (DDB1-CUL4 Associated Factor) is used for all DDB1 interactors except CUL4. DWD proteins have been reported to act as substrate receptors for CRL4, a family of multi-subunit E3 ligases. The CUL4 E3 ligase is involved in the ubiquitination of target proteins for degradation by the 26S proteasome and also in the UV damage repair mechanism. In our own molecular docking analysis, we found that the circumference of the β-propeller structure of SiWD40 (Setaria italica WD40) was involved in an interaction with a SiCullin4 protein (Mishra et al. 2012). The CRL4s are a large subfamily of CUL4-based E3 ubiquitin ligases. Each consists of three core subunits: CULLIN4 (CUL4), the RING finger protein Regulator of Cullins1 (ROC1)/RING-BOX1 (RBX1), and UV-Damaged DNA Binding Protein1 (DDB1) (Lee and Zhou 2007) (Fig. 3). RING fingers help to transfer ubiquitin directly from the E2 to the substrate by allosterically activating the E2. Two isoforms of CUL4 are expressed from a single gene in Arabidopsis; one (CUL4-L) has 50 additional amino acids at the N-terminus compared with the other (CUL4-S) (Jackson and Xiong 2009).

Fig. 3

Assembly of the Cullin 4 RING E3 ligase (CRL4 ligase). The Cullin 4 E3 is a multisubunit ubiquitin ligase that functions in the proteasomal degradation pathway and in the UV DNA damage repair mechanism. In the cullin4-E3 complex, Damaged DNA Binding1 (DDB1) and DDB1-binding WD40 (DWD) proteins are used as adaptors and substrate receptors, respectively. Here the RING finger protein (RBX1) helps to transfer ubiquitin (Ub) directly from the E2 to the substrate by allosterically activating the E2, and DWD proteins transfer the substrate from E2 to DDB1

In the CRL4 ligase, DDB1 serves as the adaptor protein connecting CUL4/RBX1 to the substrate receptors. Various DWD proteins (also known as DCAFs) serve as these substrate receptors and are crucial for the recognition of a variety of substrates (Lee and Zhou 2007; He et al. 2006). Most DWD proteins contain WD40 repeats, and within the repeats a conserved block of 16 amino acids (the DWD box, also called the DWD motif or DWD domain) forms the docking site for DDB1. Within this 16 amino acid block there are four highly conserved residues: Asp (or Glu) at position 7, Trp (or Tyr) at position 13, Asp (or Glu) at position 14 and Arg (or Lys) at position 16 (Fig. 4). There are hydrophobic residues at positions 1, 2, 10, 12 and 15 and small residues at positions 3, 4 and 5 (He et al. 2006). Database searches using the DWD box predicted as many as 90 DWD proteins in the human genome (He et al. 2006). Likewise, 33 DWD proteins have been predicted in fission yeast, 36 in C. elegans, 75 in Drosophila, 78 in rice, 85 in Arabidopsis and 82 in foxtail millet (He et al. 2006; Lee et al. 2008). In Arabidopsis, 101 DWD motifs were identified in these 85 proteins: 69 proteins have one motif and 16 have two. This agrees with the observation that DWD proteins usually possess one, sometimes two, but rarely three DWD motifs (He et al. 2006). In rice, 96 DWD motifs were identified, with 14 proteins having two motifs and two proteins having three. Moreover, other conserved domains were found in 15 Arabidopsis and 9 rice DWD proteins, which may regulate interaction with substrate proteins.
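The positional description of the DWD box lends itself to a simple window scan. The Python sketch below requires the four highly conserved positions (7, 13, 14 and 16) to match and reports how many of the hydrophobic and small positions also agree. The function names are hypothetical, and the residue sets used for "hydrophobic" and "small" are assumptions, since the text does not define them; the toy sequence is invented. It is a sketch of the consensus as described here, not the search procedure used in the cited studies.

```python
BOX_LENGTH = 16

# Four highly conserved positions (1-based) and their allowed residues.
CONSERVED = {7: set("DE"), 13: set("WY"), 14: set("DE"), 16: set("RK")}

# Positions described only by residue class in the text.
HYDROPHOBIC_POSITIONS = (1, 2, 10, 12, 15)
SMALL_POSITIONS = (3, 4, 5)

# Residue classes: assumed definitions, chosen for illustration only.
HYDROPHOBIC = set("AVILMFWY")
SMALL = set("GASTC")


def matches_conserved(window: str) -> bool:
    """True if a 16-residue window has all four conserved residues."""
    window = window.upper()
    if len(window) != BOX_LENGTH:
        return False
    return all(window[pos - 1] in allowed for pos, allowed in CONSERVED.items())


def soft_score(window: str) -> int:
    """Count matches at the hydrophobic and small positions (maximum 8)."""
    window = window.upper()
    if len(window) != BOX_LENGTH:
        raise ValueError("window must be exactly 16 residues long")
    hydrophobic_hits = sum(window[p - 1] in HYDROPHOBIC for p in HYDROPHOBIC_POSITIONS)
    small_hits = sum(window[p - 1] in SMALL for p in SMALL_POSITIONS)
    return hydrophobic_hits + small_hits


def find_dwd_boxes(seq: str) -> list[int]:
    """Return 0-based start indices of windows passing matches_conserved."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - BOX_LENGTH + 1)
        if matches_conserved(seq[i:i + BOX_LENGTH])
    ]


if __name__ == "__main__":
    toy_box = "LVSGANDNNINFWDVR"  # invented 16-mer that fits the consensus
    toy_seq = "MM" + toy_box + "GG"
    for start in find_dwd_boxes(toy_seq):
        window = toy_seq[start:start + BOX_LENGTH]
        print(start, window, soft_score(window))
```

Treating the four conserved positions as mandatory and the class-based positions as a score keeps the scan tolerant of the variation expected at the less conserved sites; a stricter or looser threshold is a design choice left to the user.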

Fig. 4

Conserved amino acids of DWD motifs in various DWD proteins. Hy denotes a hydrophobic amino acid, Sm a small amino acid and X any amino acid

AtDDB2 and AtCSA-1 (an ortholog of mammalian Cockayne Syndrome type A) encode two related WD40 proteins. Together with CUL4 and RBX1, they form CRL4 complexes [CSA1-DDB2-CUL4A/B-RBX1 (CRL4DDB2/CSA) ligases] that were reported to be crucial components of the UV-B DNA damage repair mechanism in Arabidopsis thaliana (Biedermann and Hellmann 2010; Zhang et al. 2010; Fischer et al. 2011). Here, the CUL4-CSA1 ubiquitin ligase targets CDT1 (chromatin licensing and DNA replication factor 1) for ubiquitination in response to UV irradiation. It also ubiquitinates histone H2A, histone H3 and histone H4 at sites of UV-induced DNA damage (Wang et al. 2006). Histone ubiquitination promotes the removal of these histones from the nucleosome and consequently helps in DNA repair. Hence, ubiquitination of histones does not lead to their degradation but instead assists the repair mechanism.

Role of WD40 protein in development and cellular diversity

A protein complex comprising MYB and bHLH transcription factors together with a WD40 repeat protein regulates the anthocyanin biosynthetic pathway in Antirrhinum majus, Petunia hybrida and Arabidopsis thaliana (de Vetten et al. 1997; Walker et al. 1999; Quattrocchio et al. 1999; Morita et al. 2006; Schwinn et al. 2006). In Arabidopsis, the WD40 repeat protein TTG1 is vital to several aspects of epidermal cell fate: it positively regulates trichome formation, anthocyanin production, seed-coat pigmentation and seed-coat mucilage production, and negatively regulates hypocotyl stomatal-cell identity and root-hair formation (perhaps by initiating the differentiation of alternative cell fates). Ramsay and Glover (2005) suggested that TTG1 functions as a scaffold for protein–protein interactions between the bHLH and MYB proteins. A nucleolar protein with seven WD-repeats, named yaozhe (YAO), is essential for the correct positioning of the first zygotic division plane and plays a significant role in gametogenesis in Arabidopsis. Its counterparts in yeast and human are components of the U3 small nucleolar ribonucleoprotein (snoRNP) complex; hence YAO is probably involved in rRNA processing in plants as well (Li et al. 2010). Another WD40-repeat protein, OsLIS-L1, has five WD40 repeats and contains a LisH domain at the C terminus along with a domain of four WD40 repeats homologous to the β subunit of trimeric G-proteins (Gβ). OsLIS-L1 plays an important role in male gametophyte formation and in elongation of the first internode in rice (Gao et al. 2011). A further WD40 repeat protein from Arabidopsis, NEDD1, was found to support cell division by interacting with the γ-tubulin complex (Zeng et al. 2009). This protein arranges spindle microtubules (MTs) preferentially towards the spindle poles and phragmoplast MTs towards their minus ends, and helps in the organization of spindles during mitosis (Zeng et al. 2009). The An11 family of WD40 proteins is highly conserved between plants and animals. In plants, the An11 protein is known to be involved in the regulation of anthocyanin biosynthesis (de Vetten et al. 1997).

Role in post-translational modification

Embryonic Ectoderm Development (EED) is a WD40 protein required for the methyltransferase activity of the Polycomb Repressive Complex 2 (PRC2; a histone methyltransferase) and directs its binding to the histone H3 tail (Margueron et al. 2009; Xu et al. 2010). The interaction of EED with H3K27me3 marks enhances the histone methyltransferase activity of PRC2, which in turn maintains transcriptional silencing, whereas interaction with H1K26me3 inhibits PRC2 methyltransferase activity (Xu et al. 2010). Chromatin-based gene silencing provides a vital mechanism for the regulation of gene expression. Li et al. (2007) identified a WD40 repeat protein named CYCLOPHILIN71 (CYP71) and showed its role in gene repression and organogenesis in Arabidopsis. CYP71 directly mediates the methylation of histone H3 at lysine 27 (H3K27) at some regulatory genes, including KNAT1 (a knotted1-like gene) and STM (SHOOT MERISTEMLESS), which regulate meristem activity and organogenesis in Arabidopsis. Because CYP71 is highly conserved among eukaryotes ranging from fission yeast to mammals, its homologs may function as universal histone remodeling factors required for epigenetic gene silencing (Zhu et al. 2008). MSI1, a WD40 protein in Arabidopsis, has also been proposed to give rise to pleiotropic phenotypes through epigenetic regulation (Hennig et al. 2003; Alexandre et al. 2009). The WD40 β-propeller also constitutes a class of ubiquitin-binding domains, found especially in members of the F-box family of SCF ubiquitin E3 ligase adaptors (Pashkova et al. 2010).

Role in abiotic stress tolerance

Zhu et al. (2008) reported that an Arabidopsis WD40-repeat protein, HOS15, represses abiotic stress-related genes through histone deacetylation. ABA and abiotic stress treatments increase the expression level of HOS15. After salt and ABA treatment, many stress-responsive genes are upregulated to a higher level in the hos15 mutant than in the wild type. Hence, chromatin remodeling by HOS15 (through deacetylation of histone H4) indicates a role for this gene in abiotic stress tolerance in plants (Zhu et al. 2008). A salt-responsive WD40 protein in rice (LOC_Os08g38880) with five WD40 repeats is regulated by salt stress, and the upstream region of its gene contains several stress-responsive cis-acting elements (Huang et al. 2008). BnSWD1 (Brassica napus salt responsive WD40 1) contains eight WD40 repeats and is conserved in all eukaryotes. This gene is expressed at high levels under salt-stress conditions and is also upregulated after treatment with various hormones, such as abscisic acid, salicylic acid and methyl jasmonate (Lee et al. 2010).

The myoinositol signaling pathway has been linked to stress and developmental processes in plants. The WD40 repeat region of a myoinositol polyphosphate 5-phosphatase (At1g05630) was found to interact with the sucrose nonfermenting-1-related kinase, which has a central role in sugar, metabolic, stress and developmental signaling in Arabidopsis (Ananieva et al. 2008). Further, we have recently shown that a WD40 repeat-containing gene (homologous to the salt-responsive WD40 protein of rice) is upregulated during different abiotic stresses in foxtail millet. It also shows DNA-binding activity with an AP2 domain-containing protein, SiDREB (Mishra et al. 2012).

Conclusions

The WD40 domain is one of the most abundant and highly conserved domains across eukaryotes. Given their rich interaction surfaces, these proteins function as adaptors in many different protein complexes or protein-DNA complexes in very diverse cellular processes. They mediate molecular recognition events mainly through the smaller top surface of the domain, where three residues per repeat contribute to forming transient complexes with other peptides. DWD proteins function as substrate receptors for CUL4-based E3 ligases in the proteasomal degradation of proteins. Some WD40 proteins control post-translational events through the methylation and ubiquitination of histone proteins. In addition, these proteins play roles in development-specific events and abiotic stress tolerance in plants. Because they form large multi-subunit assemblies and lack measurable intrinsic activity (such as catalysis), these proteins are difficult to study. The structural details and functional mechanisms of most WD40 proteins in plants are still unclear, yet they are critical for understanding the roles of these proteins in cellular processes. WD40 domain protein complexes therefore need to be explored further in the future.