Introduction

The first report about a mammalian metallothionein (MT) dates back more than 50 years [1]. It was the high cadmium content of horse kidney cortex that caught the researchers’ attention and led to the identification of the first “naturally” CdII-binding protein. Owing to its additional richness in Cys thiolates, the name “metallothionein” was born. Considerably younger are the plant MTs, which were discovered around 25 years ago [2]. Here it was the observation that during the earliest germination stage of Triticum aestivum (common bread wheat) 20–25% of the total Cys incorporated into nascent proteins is found in just a single protein, which was accordingly denoted “early cysteine-labeled”, or Ec, protein. In the following years, this peculiar wheat Ec-1 protein was investigated in more detail [3, 4], and at the beginning of the 1990s the overexpression of another plant MT, the MT2 protein from Pisum sativum (garden pea), was attempted in Escherichia coli but was hampered by partial proteolytic digestion [5]. In the following 10 years, more and more plant MT gene sequences were reported and gene expression studies were performed; however, no investigations on the actual proteins were performed. This only started to change in the early 2000s, when the use of larger purification tags such as glutathione S-transferase (GST) became more common and proteolytic digestion of especially the linker regions of these rather small proteins in E. coli could be prevented [6, 7]. It is thus appropriate to say that the study of plant MT proteins is still a quite recent research field. But what discriminates the members of the plant MT family from the mammalian forms? The original criteria used to designate Cys- and metal-ion-rich proteins as MTs were a low molecular mass (approximately 5–10 kDa), no aromatic amino acids, a characteristic Cys distribution pattern, e.g. 
CysCys and CysXaaCys motifs (with Xaa meaning any amino acid other than Cys), and the characteristic optical features of metal–thiolate complexes, e.g. a shoulder of the peptide backbone transitions around 250 nm indicative of CdII–thiolate clusters [8, 9]. These criteria were only met by the Ec proteins, provided that the two highly conserved His residues found in their sequences were disregarded (Fig. 1).

Fig. 1

Amino acid sequence alignment of representative members of the mammalian and the plant metallothionein (MT) families. The plant MTs are additionally divided into four subfamilies comprising the MT1, MT2, MT3, and Ec proteins. Cys residues are highlighted with a black background, aromatic amino acids with a grey background. His residues are accentuated with a black frame. Sequences denoted with an asterisk represent exceptions to the otherwise highly conserved Cys distribution pattern within a plant MT subfamily. In Arabidopsis thaliana MT1A, the linker region is additionally reduced to just seven amino acids

For the plant MT1, MT2, and MT3 proteins the presence of the aromatic amino acids Tyr and Phe caused their classification as MT-like. Only upon introduction of the presently used classification system in 1999 were all plant MT forms combined in MT family 15, the plant MTs [10]. Whereas the mammalian MTs invariably contain 20 Cys residues, the Cys content in plant MTs is lower and shows a greater variety across the different subfamilies. With 17 Cys residues, the Ec or MT4 proteins have the highest content of thiolate groups, followed by the MT2 forms with 14, the MT1 forms with 12, and finally the MT3 proteins with only 10 Cys residues (Fig. 1). The Cys content within a plant MT subfamily can differ slightly and some such variants are depicted in Fig. 1 and marked with an asterisk. However, the length of the amino acid sequences and hence the mass of the proteins is generally higher than that observed for the mammalian MTs. Whereas the latter have masses around 6.0 kDa without metal ions, the largest plant MTs are found in the MT2 subfamily (approximately 7.9 kDa), followed by the Ec (approximately 7.7 kDa) and MT1 (approximately 7.6 kDa) forms, and finally the MT3 proteins (approximately 6.8 kDa). The observed mass differences mainly originate from the so-called linker regions, amino acid stretches connecting the Cys-rich regions of the proteins but being devoid of Cys residues themselves. In the mammalian forms these linker regions comprise only three amino acids, but in the plant MT forms they are considerably longer (Fig. 1). It is within these linker regions where the aromatic amino acids so untypical for the “classical” MTs are located. The plant MT1, MT2, and MT3 forms feature two Cys-rich regions. The six Cys residues located at the C-terminus in each subfamily are arranged in a highly conserved pattern (CxCxxxCxCxxCxC). 
In contrast, the number and distribution pattern of the Cys residues in the N-terminal part are characteristic for each subfamily and are used as the main distinguishing factor for subfamily discrimination. The Ec proteins stick out as they contain three Cys-rich regions, all clearly separated in sequence by linker regions. These linker regions are considerably shorter than the ones found in the plant MT1, MT2, and MT3 proteins, but with 12–15 amino acids they are nevertheless distinctively longer than in the mammalian forms. Noticeable are also the two fully conserved His residues in the Ec subfamily mentioned already, which are located close to and within the central Cys-rich region.

The current state of research with respect to gene expression pattern and proposed functions of plant MTs was summarized recently [11, 12] and will not be part of this review. Instead, after an update has been given on the metal ion contents observed in the few plant MTs studied in the form of the translated proteins so far, the main focus will be set on the structural aspects that can be derived from sequence analyses and spectroscopic measurements, and also aspects concerning the structural and functional role of the peculiar linker sequences typical for the plant MT forms will be addressed.

Metal ion contents

The divalent metal ions ZnII and CdII are bound to MTs mainly in tetrahedral tetrathiolate coordination spheres. From two three-dimensional structures we know that one or more Cys residues can also be replaced by His residues without altering the coordination number or geometry [13, 14]. Two basic metal–thiolate structures are known: the MII4Cys11 cluster of the α-domain of vertebrate and echinoderm MTs [15–23] and the MII3Cys9 cluster of the β-domain of vertebrate, crustacean, and echinoderm MTs (Fig. 2) [15–17, 20–25].

Fig. 2

The two metal–thiolate cluster structures formed with divalent metal ions in, e.g., the vertebrate MTs: a MII4Cys11 cluster of the α-domain and b MII3Cys9 cluster of the β-domain. Only the coordinating sulfur atoms of the Cys residues are shown

The latter cluster type is also part of the βE-domain of wheat Ec-1 [14]. Recently, an MII2Cys6 cluster was characterized by NMR spectroscopy; this cluster was unprecedented for the MT superfamily and has so far been found only in the plant Ec proteins [26]. On the basis of the metal ion to Cys ratios observed in these cluster structures, i.e. 1:2.75 and 1:3, and assuming that tetrahedral tetrathiolate coordination spheres also prevail in plant MTs, one can estimate the expected ZnII and CdII contents from the number of Cys residues in the sequence, as given in Table 1.
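The estimate described here amounts to dividing the number of potential ligands by the two ligand-to-metal ratios of the known clusters. The following Python lines are a minimal sketch of that arithmetic only; the function name `divalent_range` and the dictionary layout are illustrative choices and not part of the original work, and the values in Table 1 may have been rounded differently.

```python
# Ligand-per-metal ratios of the two known divalent clusters named in the text:
# M(II)4Cys11 -> 11/4 = 2.75 and M(II)3Cys9 -> 9/3 = 3.
LIGANDS_PER_METAL = (11 / 4, 9 / 3)


def divalent_range(n_ligands):
    """Return (low, high) estimate of bound Zn(II)/Cd(II) ions.

    Assumes every metal ion sits in a tetrahedral site built only from
    the counted ligands, as stated in the text.
    """
    estimates = [n_ligands / ratio for ratio in LIGANDS_PER_METAL]
    return min(estimates), max(estimates)


# Cys counts per subfamily as quoted in the Introduction.
cys_counts = {"MT1": 12, "MT2": 14, "MT3": 10, "Ec": 17}

for name, n_cys in cys_counts.items():
    low, high = divalent_range(n_cys)
    print(f"{name}: {n_cys} Cys -> {low:.1f} to {high:.1f} divalent metal ions")
```

Calling `divalent_range(10)` reproduces the range of 3.3–3.6 quoted for the MT3 forms, and passing a ligand count that includes His residues (15 or 19) gives the extended ranges of 5.0–5.5 and 6.3–6.9 used in the discussion of cork oak MT2 and the Ec proteins.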

Table 1 Metal ion contents of plant metallothioneins (MTs) calculated on the basis of metal ion to protein ratios found in the metal–thiolate clusters of vertebrate MTs and as determined experimentally for the purified proteins

The same rationale was followed to assess the CuI content of plant MTs. From vertebrate MTs three CuI-to-Cys stoichiometries are known: a Cu6Cys11 species found in the α-domain as well as a Cu4Cys9 and a Cu6Cys9 form in the β-domain [27, 28]. No structural information shedding light on the possible coordination geometries of the CuI ions in these arrangements is available. The X-ray structure of a truncated form of the CuI-specific CUP1 MT from the yeast Saccharomyces cerevisiae reveals a Cu8Cys10 cluster, in which six CuI ions are coordinated by three Cys thiolate groups each and two show a linear coordination environment [29]. On the basis of these cluster forms, the CuI-to-Cys ratios given in Table 1 were calculated. The metal ion to Cys ratios for both the monovalent and the divalent metal ions are provided as a guideline to estimate whether the metal ion contents listed in the literature for the different plant MTs can be brought into agreement with the cluster structures known from other MT species.
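The CuI estimate can be sketched in the same way from the four stoichiometries quoted above. Which of these ratios actually entered Table 1 is not stated in the text, so taking the minimum and maximum over all four is an assumption made here for illustration; the names `CU_CLUSTERS` and `cu_range` are likewise hypothetical.

```python
# Cu(I):Cys stoichiometries quoted in the text, stored as (n_Cu, n_Cys).
CU_CLUSTERS = {
    "Cu6Cys11 (vertebrate alpha-domain)": (6, 11),
    "Cu4Cys9 (vertebrate beta-domain)": (4, 9),
    "Cu6Cys9 (vertebrate beta-domain)": (6, 9),
    "Cu8Cys10 (yeast MT)": (8, 10),
}


def cu_range(n_cys):
    """Return (low, high) Cu(I) estimate for a protein with n_cys Cys residues.

    Assumption: the Cu(I):Cys ratio of each known cluster is simply scaled
    to the Cys count, and the extremes over all four clusters are reported.
    """
    estimates = [n_cys * n_cu / n_s for n_cu, n_s in CU_CLUSTERS.values()]
    return min(estimates), max(estimates)


for label, (n_cu, n_s) in CU_CLUSTERS.items():
    print(f"{label}: {n_cu / n_s:.2f} Cu(I) per Cys")

low, high = cu_range(12)  # 12 Cys, as in the plant MT1 forms
print(f"12 Cys -> {low:.1f} to {high:.1f} Cu(I) ions")
```

The spread between the lowest (Cu4Cys9) and highest (Cu8Cys10) ratio is almost twofold, which is why the predicted CuI ranges are much wider than those for the divalent metal ions.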

As the majority of studies on plant MTs are restricted to the nucleic acid level, e.g. in gene expression studies, the number of investigations dealing with the actual proteins is rather limited in the literature. Except for the seed-specific wheat Ec-1 protein, sufficient amounts of plant MTs for more detailed analyses were only obtained via overexpression in E. coli. Mostly due to difficulties associated with the detection of MTs in the cell extract and to proteolytic digestion, plant MTs are largely overexpressed in the form of fusion proteins, either with a GST tag or with a self-cleavable intein tag. As cleavage of the GST tag requires proteases, e.g. thrombin, and because of the simplicity of handling the entire fusion protein, plant MTs are often investigated in their GST-fusion form (see Table 1). From the plant MT1 subfamily the proteins from Cicer arietinum (chickpea), P. sativum (garden pea), and Triticum durum (durum wheat) have been analysed in sufficient detail to enable the determination of metal ion contents [5, 30–33]. As summarized in Table 1, the ZnII, CdII, and CuI contents are mostly within the predicted range; only the values around 6 for the number of coordinated divalent metal ions in garden pea are slightly higher than what is expected on the basis of the known cluster structures, and the CuI content of 2.3 in one preparation of garden pea MT1 is considerably lower than calculated. For the three different plant MT2 forms studied in this respect so far, the picture is even more heterogeneous. Whereas the values observed for chickpea MT2 fall into the predicted range [34, 35], the metal ion contents measured in samples of Quercus suber (cork oak) MT2 are slightly lower (ZnII and CuI) or higher (CdII) than estimated [7], and the numbers obtained for the GST-fusion protein of Avicennia marina (grey mangrove) are persistently below the expected range [36]. 
Cork oak MT2 contains an additional, not conserved His residue in the Cys-devoid linker region that was suggested to take part in CdII binding but only negligibly in ZnII binding [37]. With this His residue the number of potential CdII binding ligands would increase to 15 and hence the calculated divalent metal ion content would be 5.0–5.5. However, the CdII content of cork oak MT2 would still be higher than estimated. For the plant MT3 subfamily, the ZnII and CdII contents are only available from two forms: Musa acuminata (banana) MT3 and Elaeis guineensis (oil palm) MT3A [6, 38]. The calculated range based on 10 Cys residues is 3.3–3.6 metal ions. Sequence alignments show that in some plant MT3 forms, among them banana MT3 and oil palm MT3A, an additional His residue close to the C-terminal Cys-rich region is present. If it is supposed that this His residue could potentially serve as an additional ligand similar to the situation observed in cork oak MT2, the calculated divalent metal ion content for these MT3 forms would even increase to 4. Accordingly, coordination of three or four divalent metal ions as observed for banana MT3 is well within the predicted range. In contrast, the amount of 1.7 ZnII ions observed for oil palm MT3A is rather low. The Ec proteins constitute the plant MT subfamily with the highest number of Cys residues. As we know from the NMR solution structures of wheat Ec-1, also the two fully conserved His residues participate in metal ion binding. Hence, considering 19 potential coordinating ligands, the predicted metal ion content range in Table 1 would even increase to 6.3–6.9 divalent metal ions. However, among others, the NMR structures clearly show that wheat Ec-1 has the ability to coordinate six ZnII ions, and that the metal ion content is slightly lower than predicted owing to the presence of a ZnCys2His2 site, i.e. a site with a metal ion to ligand ratio of 1:4. 
In light of this, a content of 2.4 ZnII ions as determined for Sesamum indicum (sesame) clearly falls out of the range [39].
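Why wheat Ec-1 binds six rather than the 6.3–6.9 ZnII ions predicted from 19 ligands can be checked by tallying the three binding sites reported for the protein (a Zn2Cys6 cluster, a Zn3Cys9 cluster, and a mononuclear ZnCys2His2 site). The sketch below only does this bookkeeping; the variable names are illustrative.

```python
# Metal-binding sites of wheat Ec-1 as described in the text,
# stored as (n_Zn, n_Cys, n_His).
SITES = {
    "gamma-domain Zn2Cys6 cluster": (2, 6, 0),
    "betaE-domain Zn3Cys9 cluster": (3, 9, 0),
    "betaE-domain mononuclear ZnCys2His2 site": (1, 2, 2),
}

total_zn = sum(zn for zn, _, _ in SITES.values())
total_cys = sum(cys for _, cys, _ in SITES.values())
total_his = sum(his for _, _, his in SITES.values())

# Overall ligand-to-metal ratio of the full-length protein.
ligands_per_zn = (total_cys + total_his) / total_zn

print(f"Zn(II) ions: {total_zn}, Cys: {total_cys}, His: {total_his}")
print(f"ligands per Zn(II): {ligands_per_zn:.2f}")
```

The tally accounts for all 17 Cys and both His residues with six ZnII ions, i.e. about 3.2 ligands per metal ion. This lies above the 2.75–3 ligands per metal ion of the vertebrate clusters because the mononuclear site consumes four ligands for a single ion without any bridging.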

In summary, the spread of metal ion contents observed within a given subfamily alone makes it obvious that there is still a long way to go to understand the metal ion binding properties of plant MTs. Certainly it is premature to judge the accuracy of the different measurements on the basis of the predicted values. As shown, metal ion binding capacities can be increased by recruiting additional ligands (e.g. His, but the carboxy groups of Asp and Glu would also be feasible), or they can be reduced upon formation of alternative cluster structures or even mononuclear binding sites. Table 1, however, can be used as a guide to identify less likely and mostly too low metal ion contents, and additional studies are clearly needed to obtain a more uniform, and hence reliable, picture for each subfamily.

Secondary structural elements

Apart from wheat Ec-1 [14, 26], no three-dimensional structural information about plant MTs is available. This means in particular that the structure and folding of the exceedingly long Cys-free linker regions so typical for the plant MT1, MT2, and MT3 proteins remain enigmatic. However, first conclusions can be drawn from results obtained with circular dichroism (CD), IR, and Raman spectroscopy and also based on theoretical predictions and calculations, if evaluated critically. To obtain a first informative basis, the amino acid sequences of representative members of the four plant MT subfamilies were analysed with the PROF prediction tool of the PredictProtein server (http://www.predictprotein.org) [40, 41]. When the data are evaluated at the 50% probability level, no helical content is predicted in any of the proteins tested, but the contribution of β-sheets is clearly evident and is mostly restricted to the areas of the linker regions. Confining the β-sheet structures to the Cys-free linker regions, which account for approximately 50–55% of the total number of amino acids in the MT1, MT2, and MT3 forms, one obtains overall β-sheet contributions of roughly 30% (Table 2). For a 40% probability level, these values increase to around 40%. Formation of β-sheet structures by residues of the Cys-rich regions is unlikely owing to the steric strain imposed on the protein backbone by the metal–thiolate clusters. For wheat Ec-1 no α-helices and only very low β-sheet contents (below 5%) were predicted, which goes along with results of the structural investigations by NMR spectroscopy of the individual domains.

Table 2 β-sheet and α-helix contents of selected plant MTs as predicted for the linker regions or determined with IR, Raman, or NMR spectroscopy for the entire proteins

Two studies that address the question of secondary structural elements in plant MTs with the aid of CD, IR, or Raman spectroscopy are available in the literature: the investigation of cork oak MT2 and chickpea MT1 [32, 42]. The contents of secondary structural elements in the ZnII and CdII forms of cork oak MT2 were determined with Gaussian fitting of the amide I band in the Raman spectra and the amide III band in the IR spectra. While the virtual absence of helical contributions was confirmed, relatively high amounts of β-sheets (55–64%) were calculated (Table 2). A similar picture was obtained for chickpea Cd5MT1 upon evaluation of the amide I band in the IR spectra, although the contribution of β-sheets was lower (approximately 30%). The CD spectrum of the same species is dominated by a single minimum around 200 nm indicative of a predominantly random coil structure. Calculations revealed additionally around 10% α-helix and 30% β-sheet contributions. Taken together, the results obtained on cork oak MT2 and chickpea MT1 with the different techniques do not seem to correlate well. Hence, it is important to consider the specific weaknesses of all three spectroscopic techniques especially with respect to the study of MTs before making conclusive predictions on the amount of secondary structural elements present. The CD spectral features for α-helices and β-sheets are influenced and at least partially neutralized by the contributions of the metal–thiolate clusters, i.e. the CdII–thiolate cluster of chickpea Cd5MT1 in the above example, tainting the calculated fractions of secondary structural elements with a relatively high uncertainty. The interpretation of the amide I and III bands in the IR and Raman spectra can be biased significantly by the parameters chosen to treat the data, e.g. smoothing and differentiation. Also the fitting procedure to decipher the individual components of the observed band shape strongly depends on the parameters applied, e.g. 
the number of Gaussian peaks used to fit the data, their band widths, and their positions. Another source of ambiguity is the assignment of the fitted Gaussian peaks to individual secondary structural elements. Overlapping spectral contributions to the amide I band in the Raman spectra have, for example, been described for β-sheet and β-turn structures [43]. Specifically, for the amide I band in the IR spectra of mammalian MTs it was shown that contributions usually assigned to α-helical structures, i.e. 1,648–1,660 cm−1, can also be interpreted as random coil, and the wavenumber range typical for antiparallel β-sheets or aggregated strands, i.e. 1,675–1,695 cm−1, can be occupied by contributions from β-turns in MTs [44–46].

Nevertheless and whatever the exact percentages of secondary structural elements are, it is apparent that in the plant MT1, MT2, and MT3 subfamilies virtually no α-helices are present, roughly one third of the respective entire amino acid chain consists of β-sheets and the remainder is made up of a mixture of β-turns and random coils. While the latter two structures are presumably predominantly found in the Cys-rich regions, β-sheets are more conceivable in the linker regions, which is corroborated by the predictions mentioned above. It might be worth speculating about possible arrangements of these β-sheet structures in the linker regions. An arrangement with two longer antiparallel β-strands as depicted in Fig. 3a and predicted for Actinidia deliciosa (kiwi) MT3 [47] would be rather rigid and should keep the N- and C-terminal Cys-rich regions in close proximity.

Fig. 3

Probable β-sheet arrangements within the linker regions of plant MTs. a The entire linker sequence forms a single long antiparallel β-sheet structure (cyan) and hence a rather rigid scaffold to bring the Cys-rich regions (long, grey terminal tubes) into proximity for joint cluster formation (metal ions are depicted as blue spheres). The amino acids of the linker region form a more flexible structure with four shorter β-sheets, which allow a single cluster arrangement (b) or a dumbbell-shaped arrangement with two clusters (c)

A relatively high β-sheet content could be alternatively achieved by multiple shorter β-sheet stretches as similarly found in the 15 amino acid long linker region of a bacterial MT [13]. This would provide the linker region with a greater flexibility, potentially allowing the formation of both a single metal–thiolate cluster (Fig. 3b) and two clusters formed separately by the N- and C-terminal Cys-rich regions (Fig. 3c). While a single cluster has been predicted, e.g. for cork oak MT2 [48] and chickpea MT2 [35], dynamic structures were proposed for, e.g., chickpea MT1 [32] and banana MT3 [38], in which the formation of a single or two separate clusters is dependent on the metal ion load (see below and Fig. 4).

Fig. 4

a The dumbbell-shaped arrangement similar to the arrangement in Fig. 3c can be transformed in a jackknife-like movement into a single domain cluster form upon binding of an additional metal ion (red sphere). b This transformation can be reversed upon removal of the additional metal ion. The grey arrows show the direction of movement of the Cys-rich regions and the red arrow shows the release of the additional metal ion

Influence and potential role of the Cys-devoid linker region

Linker regions in multidomain proteins can have a dual role. As a protein domain is usually considered to constitute an independent folding unit, interdomain linkers can serve as relatively rigid spacers to allow correct domain folding by preventing disturbing interferences with neighbouring domains. Nevertheless, interdomain interactions or formation of specific clefts for, e.g., substrate binding can be crucial for proper protein activity. Hence, a linker can also have the function of a hinge allowing specific movements. Statistical analyses showed that Pro and Lys residues are more frequent in interdomain linker regions [49]. In particular, the occurrence of Pro demands attention, as this amino acid, on the one hand, interrupts the formation of secondary structural elements but, on the other hand, itself introduces conformational rigidity and could assist in orientating two domains relative to each other for proper function [50]. Statistics also revealed that interdomain linkers mostly contain an even number of amino acids ranging from 4 to 12 residues in length [49].

In mammalian MTs the two domains are separated by a short and highly conserved LysLysSer linker. As the linker is not involved in interdomain contacts, it was suggested that it serves as a hinge region allowing a certain degree of domain movement [51]. The positive charge of the linker might also be required for charge neutralization as both domains bear a negative charge in their fully metallated form, i.e. −2 for the α-domain and −3 for the β-domain when coordinated to four or three divalent metal ions, respectively. To study the effect of the length of the linker region, additional amino acid repeats consisting of Pro, Gly, and Asp residues were inserted C-terminal of the LysLysSer linker and metal tolerance assays were performed in the form of yeast complementation assays with accordingly transformed cell lines [51]. Insertion of up to 4 additional amino acids confers the same tolerance against up to 300 μM CdII ions as observed for the unmodified mammalian MT; constructs with longer inserts, i.e. 8, 12, and 16 amino acids, were less effective. As concomitantly also lower levels of the respective MT proteins with the longer linkers were observed, a reduced half-life of these proteins in yeast was proposed and hence a higher propensity of the longer linkers to proteolytic degradation. Thus, from this study it is not possible to decipher if the increased spatial separation of the two domains in the mammalian MT investigated per se decreases the metal ion binding ability of the protein or if only the actual proteolytic cleavage of the linker during protein degradation diminishes the activity in the metal tolerance assay. The second point not addressed is the influence of the charge of the linker residues on the metal ion binding properties. 
Firstly, the insertion of additional linker residues C-terminal of the LysLysSer sequence increases the distance of the α-domain to the +2 charge of the linker and, secondly, the longer inserts of 8, 12, and 16 amino acids contain 2, 3, and 4 Asp residues, respectively, adding to the overall negative charge. On a speculative basis, these additional negative charges might destabilize the metal–thiolate clusters, independently of the linker length. Hence, the observed reduced metal ion binding properties of mammalian MT upon increase of the linker length clearly require further investigation before a general conclusion can be justified. Differences in properties, however, are apparent when comparing the full-length human MT2 protein with its two separate domains in vitro. The fully metal ion loaded full-length protein was found to be more stable towards oxidation and ZnII transfer [52], and remetallation studies with the apo forms and AsIII revealed the faster metallation of the full-length form [53]. In addition, the apparent pKa values of the Cys residues in the ZnII and the CdII forms, which are indicative of the metal ion affinities of the proteins, are lower for the full-length human MT2 compared with the average values obtained from separate measurements with the individual domains [52]. In summary, while the influence of the length of the linker region is less clear as variations of the amino acid composition might have a difficult-to-predict influence, it seems obvious that the physical connection of both domains plays a crucial role for the stability of the protein.

Turning our attention to the plant MTs, one of the most striking features is their unusually long Cys-devoid linker regions, ranging from approximately 30 to 45 amino acids in length for the MT1, MT2, and MT3 forms, whereas the linker between the γ-domain and the βE-domain of the Ec proteins is, with up to 11 residues (when considering the conserved His residues as part of the central Cys-rich region), considerably shorter. Up to now, only a single study providing experimental results on the role of the Cys-free linker regions in plant MTs can be found in the literature [48]. The comparative study showed that both full-length cork oak MT2 and a chimera of the N- and C-terminal Cys-rich regions connected by a shorter eight amino acid long linker are able to form species with the same number of metal ions, i.e. Zn4 and Cu8 forms. Interestingly and despite the identical metal ion binding capacity, the full-length protein is superior to the species with the shorter linker in yeast complementation studies for copper tolerance. The reasons for this might be an influence of the longer linker on the in vivo stability of the full-length protein, a role in targeting of the protein to a specific subcellular location, or even interaction with a so-far-unknown binding partner within the cell [48]. Contrarily, earlier studies raised the possibility that plant MTs are posttranslationally processed to remove the linker region on the basis of trials to isolate the native proteins directly from the plant material [54]. The initial failure to overexpress garden pea MT1 in E. coli supported this view [5]. In analogy to the results obtained with the full-length human MT2 and its separate domains, the apparent pKa values of the Cys residues in full-length wheat Ec-1 are also lower than the average of the values obtained for the separately analysed γ-domain and βE-domain of the protein [26]. 
This averaged value is identical to the pKa values obtained when a 1:1 mixture of the separate domains was subjected to pH titration. Again it seems to be the actual physical connection of the two domains by the linker region that is crucial for the increased metal ion binding stabilities of the full-length protein.

Spatial arrangement of Cys-rich regions

After discussing the secondary structural elements presumably formed by the residues of the linker regions and the importance of these linkers for the properties and functions of (plant) MTs, we will now focus on the spatial arrangement of the Cys-rich regions and hence on the metal–thiolate cluster assemblies formed. Beyond doubt, the most intensely investigated and best understood structure is that of the mammalian MT forms. The N-terminal β-domain and the C-terminal α-domain can be regarded as independent folding units in the presence of suitable metal ions; hence, the final structure is identical in the separately studied domains and the full-length protein [15]. The domains are connected by a LysLysSer sequence as mentioned already, whose linker nature only became clear in the course of the structural studies. Consequently, mammalian MTs are two-domain proteins in the widest sense and their structure is commonly described as dumbbell-shaped. The plant MT1, MT2, and MT3 proteins analogously feature two Cys-rich regions, and their separation in sequence by the long linker regions is obvious (Fig. 1). However, no three-dimensional information about the structures of members from these three plant MT subfamilies is available, and accordingly the question concerning the spatial arrangement of the two Cys-rich regions relative to one another has to be addressed with alternative techniques. In principle, both a dumbbell-shaped two-domain structure as in the mammalian forms and a hairpin-like one-domain arrangement as seen, for example, in a cyanobacterial MT [13] are feasible. The earliest study providing insights into this question describes the overexpression of garden pea MT1 without a purification tag in E. coli, which was accompanied by proteolytic cleavage of the linker region as mentioned already [5]. Accordingly, reverse-phase chromatography at pH 2 revealed two distinct peaks originating from the N- and C-terminal Cys-rich regions, respectively. 
However, only a single peak was observed after size-exclusion chromatography at neutral pH and in the presence of CdII ions. This triggered the obvious conclusion that both peptides were connected by CdII ions even after cleavage of the protein backbone and hence a single metal ion cluster must be present in the protein. In a similar way, the spatial arrangement of the two Cys-rich regions was addressed for chickpea MT2 [35]. Limited proteolytic digestion of the Cd5MT2 form with proteinase K and subsequent size-exclusion chromatography gave a single peak with an elution time corresponding to an apparent molecular mass of around 4.5 kDa, hence the mass of the combined Cys-rich regions. MALDI-TOF measurements of the peak fraction identified solely signals for the two Cys-rich domains, but not for the linker region. Amino acid analysis corroborated the result. Hence, it is safe to conclude that chickpea MT2 also adopts a hairpin-like structure in its fully metallated form. Also for cork oak MT2 a hairpin-like structure with a single metal ion cluster was proposed on the basis of a comparative analysis of CD spectra [48].

The theoretical work performed with the sequence of durum wheat MT1 stands in contrast to the results given above [30]. The authors used both Cys-rich regions, which contain six Cys residues each, separately for homology modelling, yielding the β-domains of sea urchin MT and rat liver MT2, respectively, as the best fitting models. Both models contain a Cd3Cys9 cluster and hence prompted the prediction of a dumbbell-shaped arrangement of the two Cys-rich regions of durum wheat MT1 with two separate Cd3Cys6 clusters. It has to be noted, however, that this result (1) contrasts with the metal ion content of durum wheat MT1, which was determined to be 4 ± 1 CdII ions in the same study and is well in line with results from other plant MT1 proteins and theoretical calculations (Table 1), (2) is hard to reconcile with any metal–thiolate cluster stoichiometries and structures known so far, and (3) is not in line with the hairpin-like spatial arrangement predicted for garden pea MT1. The proposal of different cluster arrangements for the two MT1 proteins is even more surprising as durum wheat MT1 shows 52% sequence identity and 73% similarity to the sequence of garden pea MT1. Considering only the Cys-rich regions, the sequence similarity is even 100%.
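Point (2) can be made concrete with a simple count of bridging and terminal thiolates. The sketch below assumes that every metal ion is tetrahedrally coordinated by four thiolates and that each bridging thiolate connects exactly two metal ions; both assumptions, as well as the function name `thiolate_census`, are introduced here for illustration only. Under these assumptions, a cluster with n metal ions and m Cys residues has 4n − m bridging and 2m − 4n terminal thiolates.

```python
def thiolate_census(n_metal, n_cys, coordination=4):
    """Return (bridging, terminal) thiolate counts for an M_n Cys_m cluster.

    Assumptions: each metal ion binds `coordination` thiolates and every
    bridging thiolate is shared by exactly two metal ions, so that
    coordination * n_metal = terminal + 2 * bridging.
    """
    bridging = coordination * n_metal - n_cys
    terminal = n_cys - bridging
    if bridging < 0 or terminal < 0:
        raise ValueError("stoichiometry incompatible with the assumptions")
    return bridging, terminal


clusters = {
    "M4Cys11 (alpha-domain)": (4, 11),
    "M3Cys9 (beta-domain)": (3, 9),
    "M2Cys6 (gamma-domain of Ec-1)": (2, 6),
    "Cd3Cys6 (proposed for durum wheat MT1)": (3, 6),
}

for label, (n_metal, n_cys) in clusters.items():
    bridging, terminal = thiolate_census(n_metal, n_cys)
    print(f"{label}: {bridging} bridging, {terminal} terminal thiolates")
```

The experimentally known clusters all retain terminal thiolates (six, six, and four, respectively), whereas a tetrahedrally coordinated Cd3Cys6 cluster would require all six thiolates to be bridging. Two such clusters would also bind six CdII ions in total, which is above the 4 ± 1 CdII ions measured in the same study.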

Banana MT3 has the ability to coordinate up to four divalent metal ions [38]. On the basis of stoichiometric considerations, formation of a hairpin-like structure seems reasonable as banana MT3 contains ten Cys residues and one His residue, which in principle allows a metal cluster arrangement similar to the α-domain of the mammalian MTs (MII4Cys11) or the cluster found in the cyanobacterial form (MII4Cys9His2). For both banana MT3 and its His → Ala mutant, also a species coordinating only three ZnII ions was observed and triggered the proposal of an additional substoichiometrically metal-loaded form that could adopt a dumbbell-shaped structure [38]. In this model, the N-terminal region would bind a single ZnII ion via its four Cys residues, whereas a separate Zn2Cys6 cluster could be accommodated in the C-terminal Cys-rich region (Fig. 4a).

This view opens up the possibility of a jackknife-like domain movement that depends on the metal ion load. Coordination of the fourth metal ion would then bring the two separate clusters together to form a single metal cluster arrangement (Fig. 4b). A two-domain, “open” structure was similarly predicted for kiwifruit MT3 in its form binding three divalent metal ions on the basis of theoretical calculations [47].
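The ligand bookkeeping behind the two banana MT3 models can be written out explicitly. The sketch below only counts the ligands and metal ions named in the text; the dictionary layout and all names are illustrative assumptions, and the comparison is based purely on total ligand numbers, not on any structural data.

```python
# Ligand bookkeeping for banana MT3 (illustrative names; counts taken from the text).

banana_mt3 = {"Cys": 10, "His": 1}  # potential metal-coordinating residues

# Reference four-metal clusters mentioned as possible templates.
reference_clusters = {
    "mammalian alpha-domain (M4Cys11)": {"metal": 4, "Cys": 11, "His": 0},
    "cyanobacterial MT (M4Cys9His2)": {"metal": 4, "Cys": 9, "His": 2},
}

def ligand_total(site):
    """Total number of Cys plus His ligands in a site or protein."""
    return site["Cys"] + site["His"]

# Hairpin model: does the total ligand count match a known four-metal cluster?
hairpin_matches = {
    name: ligand_total(cluster) == ligand_total(banana_mt3)
    for name, cluster in reference_clusters.items()
}

# Dumbbell model for the three-metal species: ZnCys4 site plus Zn2Cys6 cluster.
dumbbell_sites = [
    {"metal": 1, "Cys": 4, "His": 0},  # N-terminal region
    {"metal": 2, "Cys": 6, "His": 0},  # C-terminal Cys-rich region
]
dumbbell_metal = sum(site["metal"] for site in dumbbell_sites)
dumbbell_cys = sum(site["Cys"] for site in dumbbell_sites)

print(hairpin_matches)
print("Dumbbell model:", dumbbell_metal, "ZnII ions,", dumbbell_cys, "Cys ligands")
```

Both reference clusters offer 11 ligands in total, matching the 11 potential ligands of banana MT3, although neither reproduces its exact Cys/His composition. The dumbbell model accounts for three ZnII ions using the ten Cys residues alone.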

The fourth plant MT subfamily, the Ec proteins, contains three Cys-rich regions, again clearly separated in sequence by linker stretches, although these are shorter. A number of investigations, and last but not least two NMR solution structures, provide solid evidence that wheat Ec-1 is organized into two metal-binding domains [14, 26, 55, 56]. The smaller N-terminal domain is formed by the N-terminal Cys-rich region and hosts a Zn2Cys6 cluster (Fig. 5).

Fig. 5
Amino acid sequence alignment of representative members of the plant Ec subfamily with Cys and His residues highlighted as in Fig. 1. The residues comprising the N-terminal γ-domain and the C-terminal βE-domain are indicated with an orange ellipsoid and a green ellipsoid, respectively. Below this, the NMR solution structures of the two domains of wheat Ec-1 are shown; no information about the relative orientation of these two domains to each other is available. ZnII ions are depicted as blue spheres and parts of the coordinating Cys and His side chains are shown in stick mode

As mentioned above, such a cluster, while known from yeast transcription factors, e.g. GAL4 [57], is unprecedented for any MT species so far. In continuation of the nomenclature used for the two mammalian MT domains, the N-terminal domain of wheat Ec-1 was designated the γ-domain. The larger domain is formed by both the central and the C-terminal Cys-rich regions and encloses a mononuclear ZnCys2His2 site as well as a Zn3Cys9 cluster with similarity to the one found in the β-domain of, e.g., the mammalian MTs (Fig. 5). This arrangement of metal ion binding sites was designated the extended β-domain, or βE-domain. While in principle coordination of ZnII ions by His residues is known from cyanobacterial MTs [13], the observed mononuclear binding site is again unprecedented for any MT form so far. The Cys residues of the central Cys-rich region participate in the formation of both the mononuclear site and the Zn3Cys9 cluster, giving rise to an interleaved arrangement of the two metal ion binding sites in this domain. Both the γ-domain and the βE-domain have been shown to act as independent folding units [26, 55], and from NMR studies on the full-length protein no interdomain contacts could be deduced [14]. Hence, the overall arrangement of the wheat Ec-1 protein might be described as elongated or dumbbell-shaped. Taken together, these findings show that such an arrangement of the three Cys-rich regions would have been very difficult to predict with theoretical methods, especially given the novelty of the structural motifs. Nevertheless, the basic motif of a two-domain protein formed by the N-terminal Cys-rich region and the combined central and C-terminal Cys-rich regions had already been predicted mainly on the basis of proteolytic digestion experiments [56] and was also contemplated during a study of pH-dependent ZnII release from the full-length protein with ESI-MS [58].

Conclusion

This review of the literature shows that prediction of the spatial arrangement of Cys-rich regions in any new MT species on the basis of theoretical considerations or homology modelling bears a certain degree of ambiguity. On the other hand, the relatively simple experiment of limited proteolytic digestion, coupled with appropriate analysis of the resulting cleavage product(s), can give valuable first insight into the prevailing cluster organization. That said, the actual metal–thiolate cluster structure, and possibly the participation of coordinating ligands other than Cys residues, might still hold some surprises in store.