Introduction

Throughout evolution plants have developed an extraordinary ability to overcome fluctuating and drastic environmental changes. Their sessile nature has imposed the selection of particular defense strategies allowing them efficient and effective adjustment or acclimation responses to these conditions, as well as skilled mechanisms to tolerate and survive them. The different endurance strategies selected in these organisms are the result of complex structural and interconnected regulatory networks, which have evolved in an intimate relationship with developmental programs. For instance, in many plant species, the reproductive stage waits for favorable climatic conditions to instrument a crucial set of processes for their perpetuation; root architecture modifies according to the availability of water, phosphorus and other nutrients; and orthodox seeds once desiccated can remain dormant for many years without significant loss in viability until they find sufficient water to germinate [1, 2]. This outstanding resourcefulness suggests mechanisms that make them capable of detecting diverse changes in the plant cell milieu, imposed by the external environment or by developmental programs.

Many molecular response mechanisms are efficiently adapted for rapid detection of subtle environmental fluctuations, as can be observed in mechano-sensitivity, ion channels, proton pumps and post-translational protein modifications. The modification of protein structure also seems to be an efficient and effective transducer of a great diversity of signals. This phenomenon is commonly associated with changes in protein conformation produced by phosphorylation or acetylation, or by interactions between proteins or other partners such as nucleic acids or other small molecules acting as substrates, cofactors, and allosteric regulators [3, 4]. However, an adaptation that has received less attention is that related to the intrinsic plasticity found in those proteins that have the ability to present different transient structures depending on the nature of their surroundings.

During the last decade, we have witnessed significant advances in the identification and characterization of many proteins showing intrinsic structural disorder. This has increased our knowledge of their functional relevance, structural properties and dynamics, as well as mechanisms of action (for a review, see [5,6,7]). Intrinsically disordered proteins (IDPs) are widely distributed in all domains of life. Although only a few complete proteomes from the different domains are currently available, various bioinformatic studies agree that Eukaryota proteomes show a higher average level of disorder than those of Bacteria, which in turn present higher disorder than those of Archaea. Interestingly, the predicted disorder in eukaryote proteomes spans a broad range of score values, with both very low and very high disorder [8, 9]. Overall, current information indicates that the level of disorder is higher in eukaryotic organisms than in prokaryotes. Even more important is the observation that protein superfamilies that have undergone massive diversification during evolution present more structural disorder than other families. These data also correlate with the expansion of the number of cell types in an organism, revealing a positive relationship between proteome disorder and organism complexity [10, 11].

The accumulated knowledge on IDPs has revealed their functional versatility resulting from their peculiar properties. For example, IDPs can form ensembles with different structural conformations, allowing variability in the exposed surfaces [7, 12,13,14]. This structural plasticity confers on IDPs the ability to expose different post-translational modification sites and/or recognition motifs, depending on specific conditions, and thus to interact transiently but specifically with proteins or nucleic acids. With this in mind, the central roles that IDPs play in cellular functions are not surprising: they fulfill regulatory and signaling roles and act as scaffold or assembly proteins.

In this review, we present a general panorama of the available knowledge on protein disorder in plants. We have put together this information in the context of fundamental biological processes such as development, metabolism and stress responses, which in spite of the limited number of studies unveil the functional relevance of these proteins in the life of plants. The different IDPs referred to in this work are compiled in Tables 1 and 2.

IDPs distribution in plants

In recent years, the discovery and characterization of proteins with different amounts of structural disorder has revealed their high representation in plants [15,16,17,18]. Large-scale analysis of IDPs and intrinsically disordered regions (IDRs) in Arabidopsis thaliana, a widely used experimental model in plant biology, has shown that approximately 30% of its proteome is mostly disordered [10, 16], whereas Zea mays and Glycine max proteomes contain an even higher proportion of disorder (~50%) [19]. Interestingly, the chloroplast and mitochondrial proteomes show a significantly lower occurrence of disorder (between 2 and 19%) when compared to nuclear proteomes of different plant species. The abundance of disorder in these organellar proteomes is comparable to that of Archaea and bacteria, in accordance with the bacterial origin of the genes encoding their proteins [20]. The IDPs encoded in these organellar genomes are mostly involved in translation, transcription or RNA biosynthesis, and some are structural constituents of ribosomes, having in common the ability to form large complexes or to interact with numerous partners as expected from their intrinsic structural flexibility [5, 20]. It is interesting to note that for those proteins with paralogues of nuclear origin, both copies tend to show similarly low levels of disorder, suggesting again a common extra-nuclear origin or functional constraints [20]. Furthermore, the recent data obtained from the examination of the distribution of genes encoding IDPs in the genomes of A. thaliana and Oryza sativa indicate that they are not randomly arranged and that their organization may result from high recombination rates and chromosomal rearrangements. 
These observations are in accordance with the location of genes for proteins with highly disordered content within recombination hotspots and with their high G + C content; this codon usage is related to the over-representation of specific amino acid residues in IDPs (e.g., Arg, Gly, Ala and Pro) [19].
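The relationship between G + C content and residue composition can be checked with simple arithmetic on the standard genetic code. The following Python sketch is an illustration only and is not part of the analysis reported in [19]; the function name and the choice of Lys and Ile as A/T-rich comparison residues are assumptions made here.

```python
# Codons of the standard genetic code (DNA alphabet) for the residues named
# in the text, plus two comparison residues (Lys, Ile) chosen for illustration.
CODONS = {
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Pro": ["CCT", "CCC", "CCA", "CCG"],
    "Lys": ["AAA", "AAG"],
    "Ile": ["ATT", "ATC", "ATA"],
}


def mean_gc_fraction(codons):
    """Return the fraction of G or C bases across a list of codons."""
    bases = "".join(codons)
    return sum(base in "GC" for base in bases) / len(bases)


# Unweighted average over synonymous codons (real codon usage is not modeled).
gc_by_residue = {aa: mean_gc_fraction(codons) for aa, codons in CODONS.items()}

for aa, gc in gc_by_residue.items():
    print(f"{aa}: {gc:.2f}")
```

Under this unweighted average, the codons for Arg, Gly, Ala and Pro each exceed 70% G + C, whereas those for Lys and Ile fall below 20%, which is consistent with the association described above.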

In silico analyses of the Arabidopsis proteome and of proteins from other plant species have found that IDPs are highly represented in functions related to cell cycle, nucleic acid metabolism, protein synthesis, hormone signaling and regulation of gene expression, development and responses to stress [16, 17, 19, 21,22,23]. This last functional category seems to be particularly associated with plant IDPs, including proteins involved in detection and signaling of external stimuli, chaperone activities and secondary metabolism; all essential functions for the phenotypic plasticity needed for plant adaptation and survival, as will be further discussed in this review. It should be noted that the examples we present in this review cannot be unequivocally classified and they may belong to several functional classes, which alludes to their functional promiscuity.

IDPs in plant development

The study of plant development and the characterization of the mechanisms involved have identified many proteins playing major control roles in this process. Further detailed analyses have revealed the presence of IDRs in some of these proteins. Germination and early seedling development [24], adventitious shoot formation [25], xylem development [26], photomorphogenesis [27], phytohormone signaling and response [28], flowering [29], and vegetative and reproductive growth [30] are some of the processes where IDR-containing proteins appear as key players. Interestingly, the structural plasticity arising from IDRs of several of these IDPs has been shown to be essential for proper function.

Table 1 Plant intrinsically disordered proteins involved in development, metabolism and stress response

TCP (TB1-CYC-PCF1) transcription factors

The appropriate development and function of vegetative (leaves, shoot and roots) and reproductive (flowers) organs is orchestrated by several proteins, which are subjected to adaptable but precise spatio-temporal control, resulting in a timely fine-tuning of cell proliferation, expansion and differentiation [31]. Many of these proteins are transcription factors, some of which contain IDRs of significant length, that, by interacting with other proteins and/or binding to DNA, decode a specific signal in the activation or repression of gene expression. The TCP [from TEOSINTE BRANCHED1 (TB1), CYCLOIDEA (CYC) and PROLIFERATING CELL NUCLEAR ANTIGEN FACTOR1 (PCF1)] protein family consists of plant-specific transcription factors involved in plant shape developmental control. Bioinformatic analyses have shown that these transcription factors are IDPs [30, 32]. TCPs are classified as class I or class II according to the characteristics of their conserved and non-canonical basic helix-loop-helix (bHLH) DNA-binding domain [30, 32]. Class I TCP transcription factors participate in organ shape and growth, pollen development, germination, and inflorescence and flower development [33]. Class II TCPs, in addition to their redundant function in the regulation of lateral organ morphogenesis, also participate in endosperm, cotyledon, leaf, petal and stamen development, as well as other aspects of plant development and other processes [33]. Some of the functions assigned to TCP transcription factors in plant growth and development are a consequence of their involvement in the biosynthesis of some phytohormones, such as brassinosteroids and jasmonic acid, and other metabolites with biological activity such as flavonoids [21]. Analysis of the 24 Arabidopsis TCP protein sequences has shown a differential structural disorder content between the two TCP classes, with class I being more disordered than class II [30].
Biochemical analysis of TCP8, a class I TCP, shows three IDRs of more than 50 residues in length containing a cluster of serine residues, at least one of which is phosphorylated [30]. In addition, the IDR located in the TCP8 C-terminal region corresponds to a trans-activation domain (TAD), which is required for the formation of high-order TCP8 homo-oligomers [30]. The identification of molecular recognition features (MoRFs) in the TCP8 TAD [30] and evidence of its requirement to bind TCP15 and PNM1, a pentatricopeptide repeat protein [34], are consistent with TCPs’ function as mediators of different stimuli or signals (Fig. 1a). Furthermore, they demonstrate the importance of IDRs as protein domains that confer the ability to recognize various partners, a feature needed for precise and flexible control.

Fig. 1
Schematic representation of two examples of plant proteins containing IDRs that participate in developmental and metabolic processes. a TCP8 is a plant-specific transcription factor involved in plant shape developmental control. TCP8 contains three IDRs (represented by curved lines). In these IDRs, there are conserved serine residues, at least one of which is phosphorylated (small filled blue circle in the middle IDR). The IDR at the C-terminal region corresponds to a trans-activation domain (TAD) required for the formation of TCP8 homo-oligomers. This TAD is also required to bind different partners, such as TCP15 or PNM1 (red irregular oval). The IDR at the amino-terminal region (purple irregular line) is part of the TCP8 DNA-binding domain; this disordered region gains structure when TCP8 binds to DNA. b CP12 plays a key role in the regulation of the Calvin cycle by translating changes in light availability into the modulation of GAPDH and PRK enzyme activities. CP12 is a scaffold protein (represented by curved lines at the top of this panel) that forms a ternary complex with GAPDH (blue and red irregular ovals) and PRK (brown irregular oval) (GAPDH-CP12-PRK) (represented by the association of the three components at the bottom of the panel). During the formation of the GAPDH-CP12-PRK complex, GAPDH associates with CP12 by conformational selection. Upon this interaction, the CP12 N-terminal region remains in a fuzzy state, serving as a linker that facilitates the interaction with PRK. Once the complex is formed, it dimerizes to form a native complex in which there are two dimers of PRK, two tetramers of GAPDH and two monomers of CP12 (figure at the bottom right of this panel). Using this mechanism, it seems that CP12 is able to modulate GAPDH and PRK activities.

NAC (NAM-ATAF-CUC2) transcription factors

Another fundamental aspect of plant development is the maintenance of the shoot apical meristem (SAM). NAC (NAM/ATAF/CUC2) transcription factors constitute one of the largest families described in plants that, in addition to their involvement in other processes, control key aspects of SAM maintenance [35]. A conserved and folded DNA-binding domain defines these transcription factors; however, an additional feature of some NAC transcription factors is the presence of a variable and disordered TAD [24]. This characteristic has been experimentally confirmed for several NAC TADs [21, 36], such as ANAC019, involved in germination and early seedling development; HvNAC005 and HvNAC013, in senescence; NTL8, ANAC013, NAP, ANAC046 and SOG1 in germination and senescence; CUC1 in adventitious shoot formation and ANAC012 in xylem fiber development [24, 25, 37, 38]. It is known that TADs from HvNAC013 and ANAC046 interact with the RST (RCD1-SRO-TAF4) multi-binding domain of the hub protein RCD1 (RADICAL-INDUCED CELL DEATH1), a regulator of developmental, hormonal and stress responses [37]. In contrast to the folding-upon-binding phenomenon, no structural rearrangement of the two disordered TADs occurs upon binding to RCD1 [37], indicating that these ensembles might function as fuzzy complexes.

Elongated hypocotyl (HY5), bZIP transcription factor

Light is absolutely required for plant life, and its presence or absence causes developmental reprogramming. The light-dependent modulation of plant development is known as photomorphogenesis. This developmental program leads to cotyledon expansion, hypocotyl shortening and chloroplast development [27]. HY5 (ELONGATED HYPOCOTYL 5) is a bZIP transcription factor that positively regulates photomorphogenesis [39]. Disorder within the N-terminal region of HY5, responsible for the interaction with its negative regulator COP1, a multifunctional E3 ubiquitin ligase, has been demonstrated by various biophysical methods including limited proteolysis, mass spectrometry, circular dichroism (CD) and nuclear magnetic resonance (NMR) [27]. It is proposed that this disordered character might modulate the interaction with its partners, although functional characterization is still needed.

Cryptochromes (CRYs), blue light receptors

Plants are able to sense light quality (or wavelength) using different proteins such as phytochromes, phototropins and cryptochromes (CRYs). Cryptochromes are blue light receptors that control developmental processes such as seedling de-etiolation, growth by elongation and initiation of flowering [40, 41]. CRYs consist of two domains: a conserved light-sensing N-terminal photolyase-homologous region (PHR) of about 500 residues, and a C-terminal tail of variable sequence and length (CRY C-terminal Extension, CCE) [42, 43]. The CCE tail interacts with the PHR domain in a globular, well-defined structure. Light activation of the Arabidopsis receptors CRY1 and CRY2 releases the CCE tail from the PHR, inducing the unfolding of the tail and allowing the interaction of both the PHR and the CCE with other proteins (e.g., COP1 and SPA1, a suppressor of phytochrome A1) to promote the blue light signal transduction pathway [44,45,46]. The light-induced disordered state of CRY receptors has been characterized by several biophysical methods such as limited proteolysis, CD, NMR and X-ray crystallography [47,48,49]. It is possible that plant CRYs use their disordered CCE region to efficiently recognize diverse binding partners through high-specificity/low-affinity interactions, potentially expanding the repertoire of plant signaling pathways coordinated by light [17].

HDC1 (histone deacetylase complex 1)

Regulation of chromatin accessibility is an important event of gene expression control, fundamental in developmental processes to fulfill the cell requirements within its organismal context. This process depends on the action of multiprotein complexes that control different modifications in DNA and histones [50]. One of these complexes is the histone deacetylase complex (HDAC), which in plants consists of histone deacetylases, co-repressors and histone-binding proteins [51]. HDC1 (HISTONE DEACETYLASE COMPLEX1) is a protein component of Arabidopsis HDAC containing a disordered N-terminal region [52, 53]. Interestingly, an HDC1 knockout mutant shows impaired leaf growth and delayed flowering, demonstrating its participation in plant development [52]. As expected for an IDP, HDC1 interacts with a wide variety of partners (HDA6, HDA19, SNL3, SNL2, SAP18, ING2 and MSI1) [53]. Deletion of the N-terminal disordered region considerably weakens HDC1 interaction with those proteins. This result, together with evidence obtained from complementation experiments, shows that the HDC1 N-terminal IDR plays a significant role in the coordination of flowering and petiole development [53].

BRI1 and BKI1, brassinosteroids signaling proteins

Brassinosteroids (BRs) are plant hormones that control a variety of growth and developmental processes, such as vascular differentiation, leaf development, stem elongation, flowering, senescence, stomatal development and male fertility [54,55,56]. BRs are perceived at the cell surface by BRI1 (BRASSINOSTEROID INSENSITIVE 1), a leucine-rich repeat receptor-like kinase (LRR-RLK), and its co-receptor BAK1 (BRI1-ASSOCIATED RECEPTOR KINASE 1) [57]. In the absence of BRs, the cytosolic kinase activity of BRI1 is maintained at low levels by auto-inhibition through its C terminus and by interacting with the repressor protein BKI1 (BRI1 KINASE INHIBITOR 1) [58]. When BRs bind to the extracellular domain of BRI1, the intracellular kinase domain is activated through auto- and trans-phosphorylation. BKI1 is then phosphorylated by BRI1 and released to the cytosol [59]. In contrast to animal LRR toll-like receptors, the extracellular region of the BR receptor contains a superhelix of twenty-five twisted LRRs; moreover, a ~70 amino acid ‘island’ domain has been localized between LRRs 21 and 22, which together constitute a hormone binding region. BR binding causes a conformational change in the BRI1 receptor that leads to its auto-phosphorylation. Remarkably, the ‘island’ domain connects to the LRR core through two long disordered loops that become fully ordered upon binding to the steroid ligand. This makes the receptor competent to interact with other proteins, a conversion that may be necessary for receptor activation. It has been proposed that the BRI1 IDR may be an LRR receptor adaptation for efficient detection of small ligands [60]. Further participation of protein structural disorder is evident in this BR sensing protein ensemble, as the BKI1 C-terminal region presents high levels of disorder, particularly at the BRI1 interacting motif (BIM).
It is interesting to note that even though angiosperm BKI1 orthologues are highly diverse, the BIM IDR shows a high degree of conservation [61]. This, together with the finding that the absence of the IDR leads to increased BR sensitivity, establishes its relevance in BR signaling in plants [61].

Luminidependens (LD), a plant prion

The most diverse group of plants corresponds to the flowering plants (angiosperms). Flowering needs to be precisely controlled to generate flowers in an optimal time frame, where environmental conditions match with the presence of pollinators to promote fertilization and reproduction processes [62]. Flowering often follows vernalization, a process achieved after a prolonged period of cold (winter), which ensures flowering in the spring [63]. Interestingly, Chakrabortee and collaborators found that a high proportion of proteins related to flowering in Arabidopsis are predicted to contain prion-like domains (PrDs) [29]. Some of these proteins are involved in transcription or regulation of RNA stability in the autonomous flowering pathway: Luminidependens (LD), Flowering Locus PA (FPA), Flowering Locus Y (FY) and Flowering Locus CA (FCA) [29]. Prions are proteins that retain the molecular memory of the cell because they are able to adopt different conformations and can be self-perpetuating [64]. PrDs are enriched in glutamine, asparagine, glycine, proline, serine and tyrosine and it has been shown that they are intrinsically disordered [65, 66]. LD is the first protein reported to have prion-like properties in plants, and can fully complement the activity of the Sup35 PrD, a well-characterized yeast prion [29]. As expected for a prion-like protein, LD protein shows a high level of structural disorder (64.6% according to PONDR, this work) [67], indicating that it is an IDP, even though this property has not been experimentally tested. Notably, LD orthologues from different plant species (Z. mays, O. sativa, Phaseolus vulgaris and Physcomitrella patens) also show a high percentage of disorder (51–66%, this work) [67]. As mentioned above, LD, along with a substantial percentage of Arabidopsis PrD-containing proteins, participates in flowering processes.
This suggests that these proteins may play adaptive roles in the plant environmental memory required for fast responses to changing conditions, fine-tuning reproductive functions and consequently plant species preservation.
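Disorder percentages such as those quoted above are typically derived from per-residue predictor scores. The Python sketch below assumes the common convention of calling a residue disordered when its score is at least 0.5; that threshold, the function names and the toy inputs are assumptions for illustration, not the procedure used with PONDR [67]. A second helper computes the fraction of residues that belong to the PrD-enriched set named in the text (Gln, Asn, Gly, Pro, Ser, Tyr).

```python
# One-letter codes for Gln, Asn, Gly, Pro, Ser and Tyr (residues named in the text).
PRD_ENRICHED = set("QNGPSY")


def percent_disorder(scores, threshold=0.5):
    """Percentage of residues whose disorder score is >= threshold.

    The 0.5 cutoff is an assumed convention, not a value taken from the text.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(score >= threshold for score in scores)
    return 100.0 * disordered / len(scores)


def prd_residue_fraction(sequence):
    """Fraction of residues in a protein sequence that are Q, N, G, P, S or Y."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    return sum(residue in PRD_ENRICHED for residue in sequence) / len(sequence)


# Toy inputs for illustration only; these are not real protein data.
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2]
toy_sequence = "QQNGSYAKLE"

print(percent_disorder(toy_scores))        # 3 of 5 residues -> 60.0
print(prd_residue_fraction(toy_sequence))  # 6 of 10 residues -> 0.6
```

A composition score of this kind only flags candidate regions; as noted above for LD, structural disorder still has to be tested experimentally.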

GRAS (GAI-RGA-SCR) transcription factors

The plant-specific GRAS [GIBBERELLIC ACID INSENSITIVE (GAI), REPRESSOR OF GAI (RGA), SCARECROW (SCR)] protein family is essential in diverse developmental processes, acting as integrators of signals from different plant growth regulatory inputs (for an extensive review refer to [68]). GRAS proteins modulate gene expression through interaction with different transcription factors, thereby controlling their activities. Along with the conserved and folded GRAS domain, GRAS proteins are characterized by a disordered N-domain enriched in MoRFs [69]. Remarkably, the predicted MoRFs exclusively reside in the N-domain conserved motifs that define each subfamily, suggesting that structural disorder permits interactions with different proteins [17]. As has been established for other unstructured proteins, GRAS IDRs containing MoRFs experience disorder-to-order transitions when interacting with their ligands [17, 68,69,70]. GRAS proteins are classified in ten subfamilies. One of these subfamilies, composed of DELLA (Asp-Glu-Leu-Leu-Ala) proteins, is particularly important for hormonal regulation because DELLA proteins participate as negative regulators of gibberellic acid (GA)-induced plant growth. These are negatively regulated under increasing GA, as GA binds to its receptor (GID1), prompting the interaction of the GID1-GA complex with the disordered N-domain of DELLAs. This, in turn, promotes the degradation of the DELLA proteins through the ubiquitin–proteasome pathway, resulting in derepression of plant growth [71]. This interaction is mediated by the conserved DELLA and VHYNP motifs localized in an IDR that, upon binding to the GID1/GA complex, experiences a disorder-to-order transition [70]. The participation of GRAS IDRs in this signaling pathway highlights their prevalence and function among hub network proteins, operating as integrators of environmental and developmental cues in plants.

MAP65-1, a microtubule associated protein

MAP65-1 is a microtubule (MT)-bundling protein implicated in central spindle formation and cytokinesis in animals, yeast and plants [72]. The Arabidopsis genome has nine genes encoding MAP65 proteins [73]. All these proteins have an N-terminal dimerization domain and an MT-binding domain. The MT-binding domain is localized in the second half of the MAP65-1 protein. The N-terminal region of this part of the MAP65-1 protein contains a conserved sequence responsible for MT binding, whereas its C-terminal region is more variable and predicted to be disordered [74, 75]. It was recently shown that Arabidopsis MAP65-1 is phosphorylated by Aurora α-kinases at two amino acid residues within its C-terminal disordered tail. The phosphorylation of these residues causes its detachment from MTs, leading to cell cycle progression. This suggests that the unfolded structure in MAP65-1 is required to modulate the accessibility of the two phosphorylatable residues to Aurora kinases, hence ensuring appropriate cell proliferation during plant development [75].

NRPE1, the largest subunit of Pol V

The RNA-directed DNA methylation (RdDM) pathway may act to repress the transcription of transposable elements to maintain genome integrity, mostly during critical plant development stages [76]. In A. thaliana, the canonical RdDM pathway is characterized by the participation of heterochromatic 24 nt small RNAs (hc-siRNAs), which are mainly produced by the interplay between RNA POLYMERASE IV (POLIV) and RNA-DEPENDENT RNA POLYMERASE 2 (RDR2). These enzymes generate a double-stranded RNA that is subsequently trimmed into a 24 nt duplex by a type III ribonuclease, DICER-LIKE 3 (DCL3) [77, 78]. The generated hc-siRNAs are then methylated by HEN1 at the 3’ end of each strand [79] to be exported to the cytoplasm, where one strand associates with the ARGONAUTE 4 (AGO4) complex [80]. The complex is then imported to the nucleus, where hc-siRNA pairs may bind by base complementarity to a scaffold long non-coding RNA produced by RNA POLYMERASE V (POLV) [81]. The association of AGO4 in the silencing complex allows a physical interaction between this protein and the POLV carboxy-terminal domain (CTD) via AGO hooks (described below), aided by the function of KTF1/SPT5L (Suppressor of Ty insertion 5-like, a homologue of the SPT5 Pol II-associated elongation factor) [82]. This triggers the recruitment of a plethora of proteins which remove active chromatin marks and establish repressive ones, such as DNA methylation, DNA and histone modifications and chromatin remodeling features (reviewed extensively in [83]).

A peculiarity of the RdDM pathway in plants is the participation of two plant-exclusive RNA polymerases, POLIV and POLV. The catalytic domain of these polymerases is highly conserved, but their specific activities are conferred by their largest subunits: NRPD1 for POLIV, and NRPE1 for POLV [76, 84]. These subunits possess a characteristic carboxy-terminal domain which, in the case of NRPE1, contains a region rich in GW, WG and GWG amino acid residue arrangements, known as AGO hooks [84, 85]. This region constitutes an AGO-binding platform necessary for the interaction between NRPE1 and AGO4 and the consequent small RNA-directed DNA methylation [86]. Besides NRPE1, AGO hooks are also present in other AGO-binding proteins with up to 45 repeats. Along with their repetitive character, AGO-binding platforms have been predicted to be IDRs [87]. Interestingly, whereas the AGO-binding platform of NRPE1 orthologues is highly divergent in the primary sequence, the intrinsic disorder and the presence of AGO hooks are hallmarks of AGO-binding platforms across NRPE1s. These characteristics also extend to other AGO-binding proteins such as SPT5L, suggesting that this repetitive disordered structure is required to interact with a broad repertoire of targets, presumably regardless of sequence conservation [84]. Moreover, the evolutionary analyses reported by Trujillo et al. [84] suggest that this repetitive disordered array has been conserved to allow rapid sequence divergence while maintaining key functions in these proteins.
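As a rough illustration of how GW, WG and GWG arrangements can be located in a protein sequence, the following Python sketch scans a sequence with a regular expression. The function name and the toy sequence are hypothetical, and this simple pattern scan is an assumption made for illustration, not the method used in [84, 87].

```python
import re

# GWG is listed first so that a GWG arrangement is reported as one motif
# instead of being matched as a shorter GW.
AGO_HOOK = re.compile(r"GWG|GW|WG")


def find_ago_hooks(sequence):
    """Return (start_index, motif) for each non-overlapping GWG, GW or WG match."""
    return [(match.start(), match.group())
            for match in AGO_HOOK.finditer(sequence.upper())]


# Toy C-terminal fragment for illustration only; not a real NRPE1 sequence.
toy_cterm = "SSGWGNDSWGKPAGWSE"
hooks = find_ago_hooks(toy_cterm)

print(hooks)       # [(2, 'GWG'), (8, 'WG'), (13, 'GW')]
print(len(hooks))  # 3
```

Because the scan depends only on the motif and not on the flanking residues, it would report the same hooks in platforms that are highly divergent in primary sequence, in line with the observation above.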

Protein disorder in plant metabolism

Large-scale computational approaches have found that IDP functions seem to be more common in signaling and regulation processes, whereas structural order is more frequent in proteins involved in catalysis, in binding of small ligands and in membrane proteins (channels or transporters) [88]. However, this dichotomy contrasts with the description of some enzymes containing IDRs in loops or tails, which participate in the modification of protein conformation upon substrate binding, and thus expose catalytic residues and contribute to catalysis [89,90,91]. Furthermore, one must consider the role of some IDRs as sites for post-translational modifications, acting as switches of activation/inactivation or as modulators of their own activity. Many of these IDR-containing proteins are involved in the fundamental housekeeping of the plant.

In this section, we will describe those IDPs known to participate in different aspects of plant metabolism; some of them involved in photosynthesis, in metal binding or in antioxidant mechanisms.

Chloroplast protein 12 (CP12)

Few studies have investigated the role of protein structural disorder in the plant photosynthetic machinery. However, with the advancement of the characterization of proteins implicated in this process, more data are emerging showing the impact of intrinsic disorder in this essential plant function. An example of this is the chloroplast protein 12 (CP12), a well-characterized scaffold protein that forms a ternary complex with glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and phosphoribulokinase (PRK), named the GAPDH-CP12-PRK complex. CP12, present in most photosynthetic organisms, also regulates GAPDH and PRK activities [92, 93]. CP12 is a small protein (8.5 kDa) encoded in the nuclear genome and translocated to chloroplasts; although it contains cysteine residues, it has been shown to have all the properties of an IDP. Because its degree of disorder is higher in vascular plant orthologues than in eukaryotic algae, it has been proposed that CP12 has evolved to become more flexible, which correlates with its increased multifunctionality [94, 95]. In the plant kingdom, CP12 proteins share common features; however, their N termini, in addition to being highly disordered, show high sequence variability [95, 96]. During the formation of the GAPDH-CP12 or PRK-CP12 binary complexes, CP12 structural disorder remains, in particular in its N-terminal region, indicating that these are fuzzy complexes. These observations have suggested that the fuzziness of this association could facilitate the binding of either GAPDH or PRK [97, 98]. The integration of the different lines of evidence suggests a model for the formation of the GAPDH-CP12-PRK complex, in which GAPDH associates with CP12 by conformational selection, first recognizing specific conformation(s) in CP12 to establish the binding. Upon this interaction event, the CP12 N-terminal region remains in a fuzzy state, acting as a linker to facilitate the association with PRK.
Once the GAPDH-CP12-PRK complex is formed, it dimerizes to form the native complex, composed finally of two dimers of PRK, two tetramers of GAPDH and, probably, two monomers of CP12 [93, 96, 97] (Fig. 1b).

CP12 plays a key role in the regulation of the Calvin cycle, transducing changes in light availability such as those occurring during the day–night transition. This event leads to the generation of a hyperoxidant state, which is detected by the two cysteine residues in the CP12 C terminus forming a disulfide bridge. This leads to a conformational change in CP12, resulting in its N-terminal region folding into an α-helix [96], which subsequently prevents the entrance of the NADPH cofactor in the GAPDH catalytic site. In the night-to-day transition, the conformation is reversed; the disulfide bridge is reduced by thioredoxin, permitting NADPH entry and resulting in GAPDH activation. This inhibiting effect exerted by CP12 also occurs on the PRK enzyme, as part of the complex. Interestingly, accumulating evidence indicates that CP12 assembles in larger supramolecular complexes, as happens in Chlamydomonas reinhardtii, where the GAPDH-CP12-PRK complex associates with aldolase [92], thus suggesting additional roles in other metabolic processes [23]. From the different lines of evidence, it can be concluded that CP12, as with some other IDPs, has a moonlighting activity, being able to act as a scaffold for GAPDH and PRK [93], as a regulator of these enzyme activities, and as a protective shield against oxidative damage [23, 99, 100].

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH)

GAPDH plays a central role in glycolysis and gluconeogenesis. In vascular plants, GAPDH can exist as heterotetramers of two GapA and two GapB (A2B2) subunits, as homotetramers of four GapA subunits (A4) or as hexadecamers of eight GapA and eight GapB subunits (A8B8). Interestingly, the GapB subunit also contains a C terminus highly similar to the CP12 C-terminal IDR [101]. The presence of two cysteine residues in this region permits photosynthetic NADPH-dependent GAPDH containing the GapB subunits to detect redox changes. Oxidative conditions induce the formation of a disulfide bridge in its CP12-like C terminus, promoting the NAD-dependent arrangement of higher homo-oligomers that result in auto-inhibition of its NADPH-dependent catalytic activity. This conformational change and complex formation is needed for the reduction of 1,3-bisphosphoglycerate to produce glyceraldehyde-3-phosphate [101,102,103]. This intrinsically disordered feature of GapB confers on A2B2 GAPDH a CP12-autonomous regulation by the redox status of the cell.

Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) activase

Rubisco (Ribulose-1,5-bisphosphate carboxylase/oxygenase), the most abundant protein on Earth [104, 105], is an enzyme responsible for fixing atmospheric CO2 into RuBP (ribulose 1,5-bisphosphate) to produce two phosphoglycerate molecules. The activity of this enzyme depends on the binding of Mg2+ ions and the carbamylation of a lysine residue located in its active site; however, the binding of RuBP can reduce the efficiency of carbamylation, and consequently the activation of the enzyme [106]. Nature has solved this limitation through proteins known as Rubisco activases, which, because of their ATPase and chaperone activity, allow Rubisco carbamylation by removing RuBP from the active site and giving access to CO2 molecules. Photosynthetic organisms present two Rubisco activase isoforms (α and β) [107] containing a C-terminal extension (20–50 amino acid residues) which is predicted as an intrinsically disordered region [23]. As is the case for CP12, this IDR contains two highly conserved cysteine residues in α isoforms [108, 109], responsible for the light regulation of Rubisco activase. This control is achieved by the action of thioredoxin f on the two cysteine residues, such that upon oxidation, inhibition of α isoform activity by light is abolished [110, 111]. The Rubisco activase function can be recovered by the reduction of the C-terminal disulfide bridge by thioredoxin f, depending on the redox status of the chloroplasts [112]. Interestingly, in spite of the functional or structural differences among Rubisco activases in diverse photosynthetic organisms, their C-terminal IDRs have been conserved; for example, in the case of cyanobacteria, they are involved in carboxysome targeting [23, 113]. Overall, the evidence strongly suggests that intrinsic disorder in Rubisco activase is a conserved feature responsible for its functional versatility as an ATPase, a chaperone and a fine-tuning regulator that has contributed to the broad adaptability of the photosynthetic process.

Manganese stabilizing protein (MSP)

Plants capture sunlight through the light-harvesting complex (LHC) or antenna complex as part of Photosystem II (PSII). This complex of proteins and pigments is embedded in thylakoid membranes and connects the antenna to the chlorophylls in the reaction center. The photons captured by PSII initiate a chain of redox states through electron transfer reactions needed for the oxidation of two water molecules to O2. This photolysis reaction takes place in the oxygen-evolving complex (OEC), one of the PSII subunits. The different polypeptides of PSII are needed for an efficient O2 evolution; in particular, three extrinsic proteins of 17, 23, and 33 kDa, which are located on the luminal side of PSII. This last protein, also termed manganese stabilizing protein (MSP), is required to maintain stability and an efficient cycling of the four oxidizing manganese atoms [114,115,116,117]. MSP lacks a compact structure and is composed of 55% turns and random coils. These properties, together with its amino acid composition and other features, establish its intrinsic structural disorder. In vitro experiments suggest that the structural flexibility of this protein is required for its function, possibly by facilitating effective protein–protein interactions as an integral member of PSII [118]. Moreover, it has been shown that conserved charged amino acid residues in MSP are important for the retention of Cl− ions, to maintain their concentration at the levels needed for the effective redox reactions of the manganese cluster [119]. Again, MSP exemplifies the participation of protein structural disorder as an essential attribute to achieve precise and opportune roles in a complex system able to adjust to the changing environment.

Alb3, a thylakoid membrane protein

The membrane insertase protein Alb3 controls the insertion, folding and assembly of a diverse group of proteins into the thylakoid membrane of chloroplasts. Alb3 interacts with chloroplast signal recognition particles (cpSRP) in the thylakoid membrane through its C-terminal intrinsically disordered region. This IDR has two conserved positively charged motifs needed for the association with cpSRPs, which follows a coupled binding and folding mechanism [120]. Once the Alb3-cpSRP complex is formed, it participates in the post-translational insertion of the light-harvesting chlorophyll a/b-binding protein (LHCP), a highly abundant protein in thylakoid membranes. The insertion of LHCP into these membranes strictly requires the involvement of cpSRP and Alb3. Alb3 is also needed for the targeting and insertion of cytochrome b6 into the thylakoid membrane [121]. Cytochrome b6 is a largely disordered protein in aqueous solution, but by interaction with lipids from the membrane it folds into an α-helical structure just before its membrane insertion [122]. An additional function assigned to the Alb3 C-terminal IDR is the light-dependent modulation of Alb3 stability [123].

Polyphenol oxidases (PPOs)

Tyrosinases and catecholases from plants and fungi are generally named polyphenol oxidases (PPOs). In plants, PPOs mediate the production of melanin, responsible for the brown color in fruits when they suffer damage. They are nuclear-encoded and are transported to the chloroplast thylakoid lumen, where they can be in a soluble form or in a weak association to the thylakoid membranes. They are activated by the proteolytic cleavage of their C-terminal region. Using bioinformatic approaches to analyze multiple plant PPO sequences, it was found that the region between the N-terminal and C-terminal corresponds to a disordered linker essential to establish those conditions in which the PPOs are processed and may be activated [124]. This prediction suggests that the PPO IDR may acquire certain levels of order depending on the environment. Although experimental data are needed, the presence of a conserved phosphorylation site within this IDR suggests auto-regulation of PPO activities and/or that PPOs have roles as signaling molecules [124].

Phosphoenolpyruvate carboxylase (PEPC)

Carbon assimilation is not only accomplished by the activity of Rubisco, but also by PEPC (Phosphoenolpyruvate carboxylase), a ubiquitous enzyme in plants. PEPC also plays a critical role in plants with C4 photosynthesis and crassulacean acid metabolism (CAM), by producing oxaloacetate from HCO3− [125]. Two types of PEPC enzymes have been described in plants, known as plant-type PEPC (PTPC) and a distantly related bacterial-type PEPC (BTPC). The BTPC enzymes show low sequence identity with PTPCs, they lack the typical serine phosphorylation motif located in the PTPC N-terminal region and they are encoded in all plant genomes sequenced to date. Of particular interest is the fact that BTPCs contain an insertion of approximately 142 amino acid residues predicted as a structurally disordered region. This IDR seems to be highly divergent and a distinctive characteristic of BTPCs. PEPC enzymes are organized as oligomers, from which two classes have been identified: class-1 oligomers consisting of PTPC homotetramers, and class-2 complexes corresponding to heterotetramers composed of three PTPC and one BTPC subunits [125]. Recently, it was demonstrated that the BTPC IDR from the castor oil plant (Ricinus communis) is needed for its association with the PTPC subunit in a class-2 PEPC complex. Furthermore, because the N-terminal phosphorylation motif conserved in PTPCs is absent from BTPCs, these enzymes were thought to be non-phosphorylatable. However, it has been shown that RcBTPC is phosphorylated in vivo at least at two serine residues. One of these modifications occurs at serine-451, a highly conserved target residue located within the IDR of these proteins. This event exerts a regulatory role, causing the inhibition of the catalytic activity of the enzyme within the class-2 PEPC complex [126].

UreG G protein

GTP-binding proteins (G proteins) are GTPases that catalyze the hydrolysis of GTP to yield GDP and inorganic phosphate. The structure of the catalytic domain of this enzyme is usually a β-sheet delimited by flexible regions of α-helices and loops. The binding of GTP or GDP activates (in the case of GTP) or inactivates (in the case of GDP) these GTPases, associations stabilized by the binding of specific protein regulators that promote conformational modifications. UreG is a bacterial-type G protein involved in urease maturation that has been demonstrated to belong to the class of intrinsically disordered enzymes. The structural disorder in UreG is mostly concentrated in a region of ~50 residues localized in the center of its protein sequence, which seems to influence the structure of the GTP-binding pocket [127,128,129]. In plants, one gene shows sequence similarity with bacterial UreG GTPase, which functions as a urease accessory protein, promoting optimal urease activation by allowing nickel or zinc incorporation in its active site and the GTP-dependent CO2 transfer required for lysine carbamylation. This protein has been characterized in soybean (G. max), where it has shown a differential binding affinity to Ni2+ and Zn2+. Furthermore, it has the highest affinity for Zn2+ described to date for any UreG protein. This observation suggests a role for UreG as a Zn2+ accumulator protein that may modulate the available levels of this metal in the cell. Analysis of its quaternary structure indicates that UreG is monomeric in solution and that dimers can be formed and stabilized upon Zn2+ binding, due to conformational rearrangements in the protein. The association with Zn2+ decreases the levels of secondary structure, but perhaps stabilizes the subsequent dimerization by facilitating the folding of the active site domain. However, this binding alone is not enough to yield a high UreG activity, suggesting that additional factors are needed to achieve its optimal GTPase activity [130]. UreG further illustrates the functional versatility conferred by intrinsic disorder to proteins with catalytic and regulatory roles in plant metabolism.

Jaburetox, an intrinsically disordered insecticidal polypeptide

Ureases are nickel-dependent metallo-enzymes that catalyze the hydrolysis of urea into ammonia and CO2 [131]. It was discovered that canatoxin, considered an isoform of a jack bean urease (from seeds of Canavalia ensiformis), corresponds to a 10-kDa peptide (JBU) produced from urease hydrolysis by cathepsin-like enzymes. This JBU peptide is toxic to mammals, fungi and insects. One of the major urease isoforms from jack bean seeds shows toxicity to hemipteran insects independent of its ureolytic activity; instead, its effect is produced by the action of digestive enzymes present in the insect gut [132, 133]. This entomotoxic activity is caused by an internal peptide (pepcanatox) produced by this hydrolysis. Jaburetox, a recombinant version of the in vivo generated peptide, is derived from the N-terminal sequence of the C. ensiformis urease isoform and possesses a potent insecticidal effect on crop pests [133, 134]. A motif present on this peptide is also found in pore-forming and neurotoxic peptides which present membrane-disturbing activities [135]. A large hydrodynamic radius, together with light scattering, CD and NMR spectroscopic data, shows that Jaburetox is a monomeric disordered peptide with an α-helix motif near its N terminus and two turn-like structures, one in the central region and one near the C terminus of the peptide. It is suggested that the Jaburetox IDP might act as a membrane protein or as a scaffold protein, but evidence for this is still lacking; a comprehensive view of its insecticidal activity therefore remains elusive [136].

Protein structural disorder in plant abiotic stress responses

Prediction of structural intrinsic disorder from plant proteomes reveals a noteworthy participation of IDPs in plant responses to their environment and to stress conditions. However, not many abiotic stress response proteins have been confirmed as IDPs and there is limited information about their function. Here, we compile those stress-responsive IDPs for which there is evidence of function and structural organization.

Table 2 Intrinsically disordered LEA proteins

Late embryogenesis abundant (LEA) proteins

Late embryogenesis abundant (LEA) proteins belong to an emblematic group of IDPs distinctively involved in plant stress responses, notably, in adverse conditions of low water availability. LEA proteins can be classified into seven groups or families based on amino acid sequence similarity, although nomenclature can vary. In this review, we will follow that proposed by Battaglia et al. [137], who also report the presence of distinctive motifs for each family, some of which correspond to MoRFs [138, 139]. LEA proteins do not show significant sequence similarity with any other proteins of known function, making their characterization a challenging task. LEA proteins are considered ubiquitous in the Viridiplantae kingdom because they have been found in angiosperms, gymnosperms, non-vascular plants and algae [137, 140]. Although for some time they were considered exclusive to plants, interestingly they have also been detected in other organisms including insects, nematodes, crustaceans, rotifers and bacteria [141,142,143,144,145]. In all cases, their abundance is related to water deficit, but some also respond to other stress conditions. In general, LEA proteins are highly hydrophilic with a high content of glycine residues or other small amino acids, and they are usually deficient in tryptophan and cysteine residues; all characteristics of IDPs [137, 146, 147]. These properties are conserved in a wider group of water deficit response proteins, the ‘hydrophilins’, which are conserved across all domains of life [148]. As is documented for other IDPs, LEA proteins possess key qualities that enable them to perform more than one function; this ‘moonlighting’ characteristic will be described below. As in the case of IDPs involved in development and metabolism, the plasticity and molecular flexibility of LEA proteins appear to be central to their function (Fig. 2A).
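The compositional bias described above (abundant glycine, scarce tryptophan and cysteine) can be checked directly on a sequence. The following is a minimal sketch in Python; the function name `residue_fractions` and the toy sequence are hypothetical illustrations, not taken from any of the cited studies, and no threshold for classifying a protein as a LEA protein or a hydrophilin is implied.

```python
from collections import Counter

def residue_fractions(sequence, residues="GWC"):
    """Return the fraction of each requested residue (one-letter code)."""
    # Remove whitespace and normalise case so pasted sequences work.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in residues}

# Hypothetical toy sequence, used only to show the call.
toy = "MGGKGSEGHGTGEKKGGLMDKIKEKLPGQH"
for aa, frac in residue_fractions(toy).items():
    print(f"{aa}: {frac:.1%}")
```

A real analysis would run this over curated LEA sequences and compare the fractions against a proteome-wide background.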

Fig. 2

Schematic representation of two examples of plant IDPs that participate in abiotic and biotic stress responses. A LEA proteins (represented as purple curved lines) belong to a representative group of plant IDPs involved in plant abiotic stress responses. LEA proteins are able to prevent the inactivation of reporter enzymes under in vitro partial dehydration and freeze–thaw treatments. One action mechanism supported by different lines of evidence indicates that LEA proteins function as chaperones during water deficit (a) by interacting with their protein target(s) (green irregular ovals) and preventing the damage (denaturation represented by green irregular lines emerging from the green ovals) caused by the effects of low water availability in the cell. The possibility that LEA proteins may bind and recognize their targets by conformational selection under water deficit has been suggested by in vitro data. In addition, there is evidence indicating that LEA proteins are able to stabilize membrane (double blue circles) integrity (b) during water deficit, by interaction through the amphipathic regions present in some LEA proteins. It has been suggested that LEA proteins might achieve more stable conformations upon membrane association, and that this interaction induces LEA protein folding. An additional attribute of at least some LEA proteins is their ability to bind metal ions (Fe3+, Ni2+, Cu2+, Co2+ and Zn2+) (small gray filled circles), which in some cases allows them to scavenge reactive oxygen species (c). For some LEA proteins, metal binding promotes a reduction in the content of structural disorder; however, this is not a common observation. In this panel, continuous arrows represent the protective effect of LEA proteins, whereas discontinuous arrows indicate the consequent damage produced by stress in the absence of these proteins.
B Biotic stress produced by plant pathogens has led to the selection of refined mechanisms to detect their presence and to mount complex inducible responses to efficiently counteract their attack. The participation of IDPs along the different steps of pathogen invasion, from their perception to the plant defense response, has been documented. The RbohD protein (green curved lines), which belongs to the NADPH oxidase family, represents an example of this. This protein, partially integrated in the membrane, is responsible for the early generation of ROS, upstream of calcium and phosphorylation signaling. The RbohD cytoplasmic N terminus possesses an IDR, which contains EF-hand motifs involved in calcium binding. The malleable nature of this region results in extended conformational changes induced by the synergistic effect of calcium binding and its phosphorylation, which in turn modulates the interaction with small GTPase proteins (orange irregular oval); a process needed to set up the plant protection response against pathogens.

One of the most general functions across the LEA group is an ability to protect the integrity of other enzymes. This has been demonstrated using several non-plant reporter enzymes with in vitro partial dehydration and freeze–thaw treatments, whereby the presence of LEAs prevents inactivation, denaturation and consequent aggregation of enzymes such as lactate (LDH) and malate dehydrogenases (MDH), citrate synthase (CS), β-glucosidase G (βglG), and glucose oxidase/peroxidase (GOD/POD) [137, 146, 149,150,151,152,153,154,155]. In the case of group 3 LEA proteins from Pisum sativum (PsLEAm), this protective effect has been demonstrated on plant proteins such as mitochondrial rhodanese and fumarase [156]. The protective activities resemble those of small heat shock proteins (sHSPs), which prevent protein aggregation upon heat shock treatments in the absence of ATP [157]. Hence, it appears that LEA proteins may function as chaperones during water deficit stress. These observations suggest a protective role specifically against protein damage caused when water limitation inhibits cellular functions. Furthermore, it appears that this unique function cannot be provided by other types of chaperones [149, 158] (Cuevas-Velazquez et al. unpublished).

The different lines of evidence from in vitro enzyme assays have led to two main hypotheses to explain the LEA protein protecting activity. Because high concentrations of LEA proteins are able to prevent inactivation and aggregation of other proteins, it has been proposed that they may act as ‘molecular shields’. Given their large hydrodynamic radius in aqueous solution, they may create a protein molecular net, thereby promoting the alignment of their hydrophilic amino acid residues around the surface of a target protein, and in this way, prevent the loss of its bulk water and consequent changes in its native structure [159, 160]. However, there is also evidence showing that small amounts of LEA proteins (down to 1:1–1:5 ratios of LEA:reporter enzyme) are also capable of protecting target proteins to a similar degree [149, 161,162,163,164,165,166]. This indicates that LEA proteins may function in a ‘chaperone-like’ mode, where interaction is required to select and protect their targets, binding as monomers or oligomers [149, 161, 167, 168] (Fig. 2A(a)). This hypothesis is supported by evidence indicating that these disordered proteins can fold into α-helices under high osmolarity or high macromolecular crowding, prevalent conditions under water deficit, which would lead to a decrease in their hydrodynamic radius [138, 139, 147, 169]. Crucially, this property seems to be associated with their chaperone-like activity [139]. With this in mind, and considering the role of conformational plasticity [138, 139, 147, 169,170,171], it is possible that LEA proteins may bind and recognize their targets following a mechanism that resembles conformational selection under water deficit, the natural conditions under which they accumulate in the cell. Although similar lines of evidence have been obtained for LEA proteins from different groups (LEA2, LEA3, LEA4, LEA7), further experimentation is needed to support these alternative mechanisms. In particular, it is imperative to establish strategies to obtain in vivo data that could help provide a more comprehensive view of their action mechanisms.

The intrinsic disorder of LEA proteins may also function in the stabilization of membrane integrity under stress (Fig. 2A(b)). Some LEA proteins, mainly those from groups 2–4, are able to bind in vitro to lipid vesicles [172,173,174,175,176]. In some cases, these vesicles have been produced using phospholipids and galactolipids common to plant chloroplast and mitochondrial envelopes [169, 177], or they have been obtained from thylakoid membrane fractions of spinach leaf tissue [178]. Interestingly, α-helix folding upon vesicle binding has also been shown for some LEA proteins [179, 180]. For group 2 LEA proteins (dehydrins), it has been found that the K-segment, a distinctive motif of this family, is necessary for liposome binding, which is consistent with its amphipathic nature [178, 180, 181] (Fig. 2A(b)). Distinctive motifs of LEA3 and LEA4 proteins also present amphipathic properties, which help explain their ability to bind to lipid vesicle surfaces [137, 182, 183].

Unfortunately, to date there remains no evidence of any of these activities in vivo. Despite this, in vitro functions correlate with the accumulation of LEA proteins in plant tissues under low water potentials induced by dehydration or by cold or freezing temperatures; conditions in which enzymes can be inactivated and membrane injuries are produced. Interestingly, some Arabidopsis LEA proteins are required for plant optimal adjustment to cold, water deficit and/or salinity, as can be inferred from the phenotypes produced by mutants lacking genes encoding LEA proteins from group 1 [184], group 2 [185], group 3 [169] and group 4 [186]. Additionally, the acquisition of tolerance to water limitation or low temperatures by the overexpression of several LEA proteins (from groups 2–4 and 7) in different plant species strongly supports their role as protector molecules under these stress conditions [187,188,189,190,191,192,193,194,195,196,197].

As can be seen with IDPs involved in plant development and metabolism, dynamic structural order can be attained by interaction with metal ions. LEA proteins have also been shown to bind metal ions (Fe3+, Ni2+, Cu2+, Co2+ and Zn2+) and scavenge reactive oxygen species (reviewed in [198, 199]). LEA proteins showing high affinity to these metals include LEA2 or dehydrins, LEA3 and LEA4 [191, 198,199,200]. Acid dehydrins (RAB17 and VCaB45) are able to bind calcium, possibly to modulate intracellular calcium levels, thereby acting as ionic buffers during water deficit: a hypothesis that still needs to be tested [201, 202]. The metal binding properties in these proteins have been attributed to the abundance of histidine residues or to the presence of metal binding motifs (HX3H or HH) [203]. Importantly, for some LEA proteins, it is known that metal binding can promote the gain of an ordered conformation [204].
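The two histidine motifs mentioned above are simple enough to locate with a regular expression. The sketch below assumes that HX3H means a histidine, any three residues, then a histidine, and that HH means two adjacent histidines; the function name `find_metal_motifs` and the example sequence are hypothetical. A hit only marks a candidate site and says nothing about whether a metal is actually bound.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "HX3H": re.compile(r"(?=(H.{3}H))"),  # His, any three residues, His
    "HH": re.compile(r"(?=(HH))"),        # two adjacent histidines
}

def find_metal_motifs(sequence):
    """Return (motif name, 1-based start position, matched residues) tuples."""
    seq = "".join(sequence.split()).upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    # Order hits by their position along the sequence.
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical toy sequence, used only to show the call.
for hit in find_metal_motifs("AHKGEHHA"):
    print(hit)
```

Positions are reported 1-based to match the usual residue numbering in protein sequences.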

Remarkably, a group 2 LEA protein (ITP, iron transport protein) has been shown to carry iron in vivo and bind iron in vitro (Fig. 2A(c)). This protein was found associated with iron in phloem exudate from R. communis L. [205]. ITP also binds Ni2+, Cu2+, Zn2+ and Mn2+ in vitro, preferentially binding to Fe3+ but not to Fe2+. This indicates that this LEA protein may function as a phloem micronutrient transport protein [205], opening up the possibility that this novel function may exist for other LEA proteins or IDPs able to bind iron or other micronutrients.

For some group 2 proteins, it has been shown that their phosphorylation is required for the metal association to occur [202, 206, 207]. Group 2 and group 4 proteins can also circumvent the production of reactive oxygen species (ROS), given their capacity to bind metals able to promote ROS generation (Fig. 2A(c)). Evidence for this activity has been obtained in vitro and in vivo [200, 204, 208]. This mechanism could be advantageous under abiotic stress conditions such as water deficit, when ROS production and sensitivity to secondary stresses are exacerbated.

The multifunctionality of LEA proteins and the role of metal ions are reinforced by data indicating that group 2 and group 7 LEA proteins can also bind nucleic acids. Group 2 LEA proteins (CuCOR15, VvDHN1a and WCI16) have been shown to associate with DNA and RNA. In the case of CuCOR15 and WCI16, this occurs in the presence of physiological concentrations of Zn2+ [209,210,211]. This evidence suggests that nucleic acids need similar protection from the effects of water limitation.

DNA binding has also been demonstrated for a group 7 LEA protein (ASR1, ABA [Abscisic acid] stress ripening 1), a widely occurring plant LEA protein that does not exist in A. thaliana. Strikingly, in addition to its protein protective role, ASR1 can also function as a transcription factor. It has been shown that ASR1 is able to bind to the regulatory regions of genes related to cell wall synthesis and remodeling, as well as genes encoding membrane channels implicated in water and solute trafficking [212]. Grape ASR1, VvMSA, recognizes specific sites in the regulatory region of the hexose transport 1 (Ht1) gene [213], and ASR orthologues are also involved in sugar and amino acid accumulation in species such as maize and potato [214, 215].

Phosphorylation of IDRs in some LEA groups may also play a role in LEA protein function. Members of groups 2, 4 and 6 contain phosphorylatable motifs, and in vivo and in vitro phosphorylation has been verified for group 2 LEA proteins (dehydrins/DHNs) in Arabidopsis, wheat, maize, and other plants [201, 207, 216,217,218,219,220,221,222]. Although the role of this modification is not well understood, for group 2 LEA proteins it may be needed to modulate membrane interaction and lipid phase transition [178, 180], as well as nuclear localization [223, 224]. However, although the phosphorylation pattern correlates with tolerance to water limitation, it is unknown whether this modification is required to modulate LEA protein protective activity and/or target selectivity by allowing the display of different binding motifs and/or MoRFs.

The multifunctionality found in vitro for the different LEA proteins is often compatible with their in vivo intracellular localization, suggesting that there may be both subcellular specialization as well as redundancy. LEA proteins from all groups have been localized to cytosol, nucleus, mitochondria and chloroplast [137, 199, 225]. However, at least in the case of group 3 LEA proteins, the most diverse LEA family, not all its members show the same localization. Some are found in cytosol or nucleus, others in the chloroplast and some others only in mitochondria [226]. This implies a requirement and possible functional specificity of LEA proteins during the plant stress response.

Further evidence of their deep functional divergence, as well as their ubiquity, can be seen in the high conservation of most LEA families throughout the Plantae kingdom’s evolution. LEA proteins from groups 1–4 can be detected in genomes from the most recent angiosperms through to the bryophytes, including the liverwort Marchantia, the most basal plant model described to date. Group 6 and 7 LEA proteins have been found only in seed plants, and in the case of group 7 LEA proteins, do not seem to be present in all phyla [137, 155, 186, 227, 228]. The broad distribution and conservation of these plant IDPs throughout evolution illustrate not only the relevance of these proteins for the organisms in this kingdom, but also the importance of disorder for the various functions they achieve.

The ubiquity of LEA proteins across land plants is testament to their versatility. Even though LEA protein action mechanisms remain elusive, the intrinsically disordered nature of these proteins matches their apparent ‘moonlighting’ character, as exhibited by diverse data where the same LEA protein is able to protect proteins and membranes, and bind metals and/or nucleic acids (see Table 2). These characteristics are compatible with their ability to utilize the same or overlapping regions to exert distinct effects and to switch functions by adopting different conformations upon binding [229].

Small heat shock proteins (sHSPs)

Small heat shock proteins (sHSPs) are ubiquitous molecular chaperones, which play important roles in protein homeostasis and in plant responses to stress. sHSPs are classified in 11 subfamilies, six localized to cytoplasm/nucleus and five to organelles. These chaperones bind diverse partially unfolded polypeptides, maintaining their refolding capacity until they can return to their native structure with the help of other chaperone proteins, such as HSP70. In this way, sHSPs protect cells from the loss of essential proteins and from the penalties caused by protein aggregation. Commonly these proteins respond to high temperatures, but also to other stress conditions, and some may also be produced even under optimal growth conditions. In contrast to other molecular chaperones, sHSPs form large and dynamic oligomers with different stoichiometry. All sHSPs contain a core α-crystallin domain bordered by a short C-terminal region and an N-terminal extension of variable length and sequence (for review see [157, 230]). Both regions participate in the recognition of, and binding to, clients and in the formation of their oligomers (for review see [230, 231]). It has been proposed that during heat stress the oligomeric sHSPs undergo conformational rearrangements leading to their dissociation. These structural changes enable the interaction of these chaperones with hydrophobic patches in the partially denatured clients, subsequently forming large soluble complexes, protecting protein clients from further damage. Biochemical and biophysical evidence indicates that the intrinsically disordered N-terminal arm is able to present different interaction sites, revealing a mechanism to efficiently protect the integrity of many different substrates in the cell [157, 232]. Although many questions still remain unanswered regarding mechanistic details and in vivo evidence is required, sHSPs offer a view of the need for structural plasticity and promiscuity to maintain cell functions during stress.

Glycine-rich RNA-binding proteins (GR-RBPs)

Although LEA proteins are important to the plant cold stress response, other IDPs are also known to play a protective role. Plants exposed to low temperatures experience a slowing down or even a pause of their metabolic processes and this may result, directly or indirectly, in damage to macromolecules and cellular structures [233]. Among the proteins synthesized to overcome the impairment that cold and other abiotic and biotic stresses cause to macromolecules are the so-called glycine-rich RNA-binding proteins (GR-RBPs) [234, 235]. Some of the functions characterized for GR-RBPs are the facilitation of mRNA transport and participation in splicing and translation: roles mediated by their RNA chaperone activity [236, 237]. GR-RBPs contain an RNA recognition motif (RRM) in the N-terminal region and a disordered glycine-rich region (GR) at their C-terminal end, and they can be classified into eight groups, each one with apparently different roles [238, 239]. In Arabidopsis, AtGR-RBP7, in addition to being a circadian regulator and promoter of flowering and mRNA splicing, accumulates in response to cold stress [236, 237, 240,241,242,243,244]. Deletion of AtGR-RBP7 leads to low-temperature sensitive phenotypes, highlighting its role in optimal plant adjustment to cold stress [245]. NMR analysis confirms the structural disorder of the GR domain for NtGR-RBP1, an AtGR-RBP7 orthologue from Nicotiana tabacum [246]. As expected, NtGR-RBP1 is shown to bind RNA and single-stranded DNA through the RRM. Furthermore, the NtGR-RBP1 GR interacts transiently with its RRM domain, promoting self-association to effectively increase its local concentration, and hence its affinity for nucleic acids. These findings suggest a mechanism for the unfolding of non-native structures in RNA by NtGR-RBP1, which may be involved in enhancing its RNA chaperone activity [246].

Vesicle-inducing protein in plastid 1 (VIPP1)

The integrity of thylakoid membranes is crucially important for photosynthesis and chloroplast functions. Multiple reports have shown the participation of a protein called VIPP1 (Vesicle-inducing protein in plastid 1) in thylakoid membrane biogenesis and maintenance during drought, heat and osmotic stress [247, 248], not only in cyanobacteria and green algae, but also in vascular plants [249–251]. The evolutionary emergence of this protein seems to be specific to oxygenic photosynthetic organisms [251]. Recent evidence suggests that while VIPP1 may have multiple roles in plastids, it strongly protects the chloroplast envelope [252]. The N-terminal region of VIPP1 presents high sequence similarity to its bacterial orthologue PspA (Phage shock protein A) [253], which plays a central role in the well-characterized bacterial Psp system, involved in the protection of membrane integrity under various stresses [254]. During membrane damage, PspA and VIPP1 bind to membranes, forming high-order oligomeric effector complexes able to repair the inner membrane and conserve its integrity [255, 256]. Interestingly, this occurs despite the absence of transmembrane domains in these proteins [252, 253]. CD spectroscopy studies show that PspA and VIPP1 N-terminal peptides are disordered in solution and fold upon membrane association, as occurs in a typical membrane amphipathic helix [257, 258]. The membrane binding of these proteins depends on differences in stored curvature elastic stress, a feature of damaged membranes [259]. These observations suggest that the folding transition associated with PspA and VIPP1 N-terminal membrane binding might act as a stress-sensing mechanism controlling the effector function of these proteins.

During the evolution of photosynthetic organisms, the PspA orthologue VIPP1 has acquired an additional C-terminal tail (Vc) that also presents the characteristics of an intrinsically disordered region [260]. Using live imaging in Arabidopsis, with GFP (green fluorescent protein) translational fusions of VIPP1 or VIPP1 lacking Vc (VIPP1ΔVc), it was shown that Vc enables VIPP1 to form oligomeric effector complexes along cell envelopes, whereas VIPP1ΔVc leads to the formation of irregular aggregates of VIPP1 particles. The expression of VIPP1ΔVc complemented the vipp1 knockout mutation in Arabidopsis, but the complemented plants were sensitive to heat shock. Furthermore, transgenic plants over-expressing wild-type VIPP1 showed enhanced tolerance to heat shock. Arabidopsis vipp1 knockout mutants show reduced content and other structural defects of thylakoid membranes, as well as reduced photosynthetic activity. In addition to its role in membrane biogenesis, it has been proposed that VIPP1 may also function as a lipid transfer protein, delivering structural lipids into thylakoid or envelope membranes [253]. Overall, these data suggest that the involvement of the Vc disordered region in the formation of the oligomeric effector complexes might be relevant for the control of VIPP1 association/dissociation states. Under conditions of membrane stress, this IDR may permit the insertion of the VIPP1 amphipathic helix into the lipid bilayer to relax the curvature elastic stress in membranes [259].

Dehydration-responsive element binding protein 2A (DREB2A)

Dehydration-responsive element binding protein 2A (DREB2A) is a key transcription factor for drought and heat stress tolerance in Arabidopsis. DREB2A induces the expression of dehydration- and heat stress-responsive genes [261]. This transcription factor contains several IDRs that allow it to interact with multiple proteins, a characteristic consistent with interactome data showing that DREB2A is a hub protein with 26 nodes [21]. DREB2A may interact with its negative regulators DRIP1 and DRIP2 (DREB2A-interacting protein 1 and 2), with ribosomal proteins such as RPL15 (ribosomal protein L15), with other transcription factors such as RCD1 (Radical-induced cell death 1), and with the transcription co-regulator MED25 (Mediator 25), among others [21]. It has been shown that MED25 binds to one of the DREB2A IDRs and that this interaction results in a gain of ordered structure in this region. Similarly, the binding of DREB2A to its canonical DNA sequence also leads to an increase in the secondary structure of the protein. Data also show that DREB2A conformational changes induced by DNA binding reduce its interaction with the MED25 ACID domain, which does not exclude the possibility that this modification may promote its association with another nearby Mediator subunit [262]. RCD1 controls DREB2A function, and is itself rapidly removed during abiotic stress [263]. O’Shea et al. [264] showed by NMR spectroscopy that DREB2A undergoes coupled folding and binding with α-helix formation upon interaction with RCD1.

bZIP28, a transcription factor in the unfolded protein response

Under adverse conditions such as heat stress, pathogenesis or inhibition of protein glycosylation [265–267], the demand for protein folding can exceed the capacity of protein homeostasis systems. This results in an increase of misfolded or unfolded proteins in the endoplasmic reticulum (ER) lumen. This series of events leads to ER stress, which subsequently induces the unfolded protein response (UPR) to meet the demand for protein folding and degradation [268]. Two branches of the UPR signaling pathway have been described in plants: one involving the membrane-associated basic leucine zipper (bZIP) transcription factors and the other involving a bifunctional protein, with kinase and ribonuclease activities, known as inositol-requiring enzyme 1 (IRE1), which functions as an RNA splicing factor [269]. In Arabidopsis, bZIP28 is an ER membrane-associated transcription factor; its N-terminal region contains a transcriptional activation domain oriented towards the cytoplasm, while its disordered C-terminal tail localizes to the ER lumen [270]. It is proposed that bZIP28 senses ER stress through its interaction with the ER chaperone BiP (binding immunoglobulin protein), a master regulator of the ER stress sensors. Under non-stress conditions, BiP binds to the IDRs present in the lumen-facing tail of bZIP28 and retains it in the ER. Upon stress, BiP is competed away from bZIP28 by the accumulation of misfolded proteins in the ER, releasing bZIP28 and allowing it to exit the ER and move towards the Golgi apparatus [271]. There, bZIP28 is cleaved by proteases, releasing its transcriptional activation domain, which is translocated to the nucleus to up-regulate stress response genes [271]. The bZIP28 IDR represents one additional example of the role of IDRs in controlling signaling in plant stress responses.

Protein structural disorder in plant biotic stress responses

From germination to reproduction, plants confront a great diversity of parasitic organisms that can cause disease. These pathogens include viruses, bacteria, fungi, nematodes and insects that exploit resources and replication systems in plants [272]. Infection by these organisms has driven plants to evolve refined mechanisms to detect their presence and to mount complex inducible responses to efficiently counteract their attack. As in other plant processes, plant defense systems are tightly regulated, many of them through the participation of kinases and phosphatases that modulate the phosphorylation status of key control proteins [272–274]. Marín and Ott [22] have reported the prediction and extensive compilation of different IDPs involved in this process. Because this information was published recently, we include here only a summary of the material for which functional and structural evidence is available.

Plants are able to specifically recognize their aggressors through receptors localized at the cell membrane. These receptors include LRR-RLKs, a common class of receptors in plants, where intrinsic disorder is present. An example of this is the aforementioned BAK1, an RLK that in this process functions as a co-receptor of two of the best-characterized pathogen LRR-RLK receptors, FLS2 (Flagellin-sensing 2) and EFR (EF-Tu receptor) [275]. The relevance of the BAK1 IDR C-region resides in its ability to discriminate between two signal transduction pathways, even though the same phosphorylation site (Tyr-610) inside this region is involved in both brassinosteroid sensing and the pathogen defense response [276]. It is well established that plant perception of pathogens is accompanied by an oxidative burst, where RbohD plays a central role (Fig. 2b). This protein belongs to the NADPH oxidase family, responsible for the early generation of ROS, upstream of calcium and protein phosphorylation signaling. Several lines of experimental evidence support the presence of an IDR in the RbohD cytoplasmic N terminus, a region that contains an EF-hand motif involved in calcium binding. The malleable nature of this region results in extended conformational changes induced by the synergistic effect of calcium binding and phosphorylation, which in turn modulate the interaction with small GTPase proteins [277, 278]; these are fundamental events in setting up protection responses to pathogenic agents (Fig. 2b). Following perception at the cell envelope, the signaling process continues in the cytosol, where different molecules play relevant roles. One of these is the HSP90 molecular chaperone which, given its refolding capacity, is an essential participant in many signaling pathways in plants and animals [279].
The structural organization of this chaperone shows an N-terminal region with an ATPase domain and a linker region composed of charged residues that connects its middle domain with the dimerization region localized at the C terminus. Interestingly, the N-terminal domain undergoes consecutive conformational changes upon ATP binding, leading to the formation of a transient dimer with different co-chaperone partners. The association of HSP90 with the RAR1 co-chaperone results in an order-to-disorder transition of this ATPase domain, which enables its movement to allow the entrance of the catalytic loop localized at the middle HSP90 region [280]. These interaction events are essential for the competence of RAR1 function, which together with the SGT1 co-chaperone, is needed to activate the majority of R proteins, detectors of pathogen effector molecules, by mediating NLR (nucleotide-binding leucine-rich repeat receptor) function [281]. This signaling pathway flows towards a MAP kinase cascade, whose activation ends in the phosphorylation of transcription factors (e.g., WRKY33) that induce the expression of defense genes. Two of these MAP kinases, MEK and MEKK1, show long disordered regions in their N termini that, in the case of MEKK1, have been shown to play a regulatory role; removal of this region results in a constitutively active kinase [22, 282]. The reprogramming of those genes encoding the proteins that will counteract pathogen incursion needs the action of transcription factors (TFs). Various TF families are involved in this process including MYC, MYB, TGA, WRKY and ERF. Among the TFs known to have a role in the plant pathogen response are MYC2, MYB30, TGA3, WRKY1, WRKY4, WRKY52, WRKY53 and ERF. All these proteins contain, in addition to their DNA-binding domains, IDR-containing linker domains with regulatory functions [21].
Some of these linker domain IDRs have been shown to interact with co-transcription factors that might contribute to the modulation of the spatio-temporal expression of target genes and to the selectivity required to distinguish the identity of particular pathogens (for review see [283]).

Computational analyses using available plant genome sequences predict the presence of significant structural disorder in many more proteins implicated in plant pathogen responses. However, as yet there is no experimental support for their structure or function. Hence, new discoveries await our curiosity and creativity.
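As a rough illustration of how a composition-based disorder scan might work, the following minimal Python sketch flags sequence windows enriched in residues commonly described as disorder-promoting (A, R, G, Q, S, P, E, K). It is not the method used in the studies cited here: the residue set, window size, threshold, function names and the toy sequence are all assumptions made for this example, and dedicated predictors use far richer models.

```python
# Assumption: a commonly cited set of disorder-promoting residues,
# used here only as a crude compositional proxy for disorder.
DISORDER_PROMOTING = frozenset("ARGQSPEK")


def disorder_promoting_fraction(seq, window=15):
    """Fraction of disorder-promoting residues in each sliding window."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    flags = [1 if aa in DISORDER_PROMOTING else 0 for aa in seq]
    count = sum(flags[:window])
    fractions = [count / window]
    # Slide the window one residue at a time, updating the running count.
    for i in range(window, len(seq)):
        count += flags[i] - flags[i - window]
        fractions.append(count / window)
    return fractions


def candidate_regions(seq, window=15, threshold=0.7):
    """Merge windows at or above the threshold into (start, end) ranges.

    Indices are 0-based and the end is exclusive.
    """
    regions = []
    for start, frac in enumerate(disorder_promoting_fraction(seq, window)):
        if frac >= threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent window: extend the last region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Hypothetical toy sequence (not a real protein): an order-prone
# N-terminal half followed by a glycine-rich C-terminal half.
toy = "MWCFIYVLNWCFIYVL" + "GSGSGGRGGSGGEGGK"
print(candidate_regions(toy, window=8, threshold=0.75))
```

A scan of this kind only nominates candidate regions; as noted above, such predictions still require experimental support for both structure and function.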

Conclusions and future directions

Plants provide a clear picture of the importance of intrinsic disorder in eukaryote protein function. The structural flexibility and molecular promiscuity afforded to a wealth of plant proteins with intrinsically disordered domains have ensured pivotal and multifunctional roles in core processes, including development and metabolism as well as biotic and abiotic stress responses. Technical and experimental barriers to the study of IDPs have limited IDP research in planta, and up to now there has been a strong reliance on interpretation and extrapolation from in vitro analyses, particularly for those proteins that are highly disordered and function under stress. It is hoped that the recent explosion in molecular genetic technologies will pave the way for further exploration of the in vivo mechanisms and interactions of plant IDPs. We are only beginning to understand their place in the story of plant evolution and their essential functions in life as a whole.