1 Frustule-Associated Organic Layers

The diatom cell wall, termed the frustule, is an organic–inorganic hybrid material composed of ~90% amorphous, hydrated SiO2 (silica) and ~10% tightly associated and embedded organic material (Kröger and Poulsen 2008). The inorganic skeleton also contains ions of the metals Al, Fe, Ge, Cu, and Zn (Davis and Hildebrand 2008 and references therein), albeit at rather low concentrations ranging from 0.01% in the case of Zn and Cu to about 1% for Al (see Table 1 in Davis and Hildebrand 2008). The role of the inorganic non-silica components in frustule formation and biological function has so far remained largely unexplored. Research on the organic components has focused mainly on understanding their role in the biogenesis of this unique type of cell wall (Hildebrand et al. 2018). During the past 15 years these studies have concentrated on Thalassiosira pseudonana (Fig. 1a), which grows fairly rapidly to high cell densities, is amenable to genetic modification (Poulsen et al. 2006), and was the first diatom with a completely sequenced genome (Armbrust et al. 2004).

Fig. 1

(a) Scanning electron microscopy (SEM) image of the frustule from T. pseudonana. (b) Schematic diatom cell in cross section showing the biosilica components and organic layers in the frustule. Intracellular compartments are not shown

1.1 Arrangement and Biogenesis of the Organic Layers

The general structure of the frustule is shown schematically in Fig. 1b and has been described in detail in Chap. “Structure and Morphogenesis of the Frustule”. Electron microscopy imaging revealed the presence of three structurally distinguishable types of organic material that together cover the entire frustule surface on both the proximal and the distal side. The organic material on the distal side is termed the extracellular matrix (Fig. 1b, purple). The organic layer between the plasma membrane and the proximal side of the frustule is termed the diatotepum (Fig. 1b, red), which is visible in transmission electron microscopy (TEM) analysis of sectioned cells in regions where the protoplast is detached from the frustule (von Stosch 1981; Crawford 1973). In these regions the diatotepum appears as a well delineated layer that is attached to the frustule only at a few specific sites. Each theca is equipped with its own diatotepum, and the two layers do not merge at the overlap region, but rather are separated by the girdle bands of the hypovalve (Fig. 1b, red). The third type of organic material in the frustule is the organic casing, which completely covers each silica element of the frustule (Fig. 1b, blue) (Volcani 1981). The existence of the organic casing is somewhat controversial, because it is difficult to visualize in TEM sections of intact cells due to its tight association with the silica. However, in TEM images of sections of Cylindrotheca fusiformis stained for organic material a distinct layer is visible around each silica element (Reimann et al. 1965; Kröger and Wetherbee 2000). Furthermore, when frustules that had been purified by extensive treatment with hydrolytic enzymes were exposed to HF vapor, the silica was completely removed, leaving behind a casing that retained the characteristic pattern of the silica (Reimann et al. 1966).

With respect to biogenesis of the organic layers, there is clear evidence that the diatotepum and the extracellular matrix become assembled on the cell surface and are not produced inside the silica deposition vesicles (SDVs), where silica formation takes place (von Stosch 1981; Crawford 1973; van de Poll et al. 1999; Kröger and Wetherbee 2000). To date, biogenesis of the organic casing has remained unclear. Due to its very tight association with the silica it seems likely that it is produced within the SDVs. However, TEM analysis of developing SDVs has not yet provided clear evidence for this scenario. It has been proposed that the organic casing might be derived from the SDV membrane (Pickett-Heaps et al. 1990), yet this seems highly unlikely as the organic casing lacks a lipid bilayer-type morphology in TEM analysis.

1.2 Biochemical Composition of the Organic Layers

Following the discovery of the frustule-associated organic layers, biochemical studies have been undertaken to determine their chemical composition. A large variety of frustule isolation strategies have been developed, which likely have had different effects on the preservation of the organic layers and the presence of contaminating cytoplasmic components. Some of the earliest studies subjected the frustule preparations to exhaustive acid hydrolysis at elevated temperatures, conditions that preserve the silica but degrade the main biopolymers (e.g., polysaccharides, proteins, phospholipids). Mixtures of monosaccharides, amino acids, and fatty acids were found in the acid hydrolysates, suggesting that the frustule-associated organic material was composed of polysaccharides, proteins, and lipids (for a review see Volcani 1981). Polysaccharides, which are generally highly abundant in algal cell walls, had early on been suspected by microscopists to be the main constituents due to the affinity of the organic layers for polysaccharide-binding dyes. To date, frustule-associated polysaccharides have been characterized almost exclusively with regard to their monosaccharide compositions, which are rather complex (Lewin 1955; Ford and Percival 1965a, b; Hoagland et al. 2004; Chiovitti et al. 2005; Willis et al. 2013; Tesson and Hildebrand 2013; Gügi et al. 2015). Whether this reflects the presence of a diverse mixture of polysaccharides, each having a relatively simple monomer composition, or only a few polysaccharides with rather diverse monosaccharide units remains unknown. Recently, a frustule-associated, chitin-based meshwork was identified in T. pseudonana, which also contains proteins and is distinct from the chitin fibers that are secreted through specific tube-like openings in the frustules of Thalassiosira and Cyclotella species (Brunner et al. 2009; Blackwell et al. 1967; Herth and Zungenmaier 1977; Herth 1979; Herth and Barthlott 1979).
Using chitin-specific dyes and tagging with green fluorescent protein (GFP), a ring of chitin and a chitin synthase have been localized in the girdle band region of T. pseudonana (Durkin et al. 2009; Wustmann et al. 2020). Many other diatom species encode chitin synthase genes (Durkin et al. 2009; Amato and Ferrante 2018), and it seems therefore likely that chitin is a widespread frustule component.

Since proteins and lipids are the main constituents of all cellular components, it could initially not be ruled out that their presence in frustules resulted from contaminating cytoplasmic material (e.g. plasma membrane, plastid and vacuole fragments). Investigations on frustule-associated lipids are very rare, and so far no unusual lipids were found in frustule preparations (Kates and Volcani 1968; Tesson et al. 2008, 2009; Suroy et al. 2014). The few available reports agree that fatty acid bearing lipids are bona fide components of the frustule, yet lipid bilayer-based structures are absent. Therefore, it has remained unresolved how lipids are linked into the organic layers. In contrast to lipids, frustule-associated proteins have been quite extensively studied. This led to the discovery of entirely new protein families, provided the first insight into protein–silica interactions, and stimulated novel hypotheses on silica morphogenesis. The work on frustule-associated lipids and polysaccharides is still in its infancy, and therefore this review will mainly focus on the role of proteins in frustule formation and function.

The first evidence for the presence of unusual proteins in diatom frustules was obtained by amino acid analysis from Ben Volcani’s group more than 50 years ago. They discovered dihydroxyproline and phosphorylated trimethyl-hydroxylysine, which at that time were novel modified amino acids (Nakajima and Volcani 1969, 1970). Furthermore, the standard amino acids serine and threonine were shown to be generally enriched in diatom frustules (Hecky et al. 1973; Swift and Wheeler 1992). About 25 years after Volcani’s pioneering work, the first intact proteins from diatom frustules, called frustulins, were isolated and characterized (Kröger et al. 1994). The structures, properties, and possible functions of frustulins will be described below. Before the first diatom genome and transcriptome data became available, the discovery of frustule-associated proteins relied entirely on biochemical analysis of sufficiently pure frustule preparations. A straightforward method developed by the Volcani group, which involves lysis of the cells with glass beads followed by repeated differential centrifugation, yielded remarkably pure frustules from some diatom species (e.g., Cylindrotheca fusiformis, Navicula pelliculosa) (Nakajima and Volcani 1969, 1970). When C. fusiformis frustules were treated with ethylenediamine tetraacetate (EDTA) or sodium dodecylsulfate (SDS), only frustulins were found in the extracts (Kröger et al. 1996). In contrast, the SDS extract of T. pseudonana frustules contained many other proteins after the frustules had been extracted with EDTA, urea, and high salt concentrations (Frigeri et al. 2006). It has not yet been systematically investigated by TEM which of the frustule-associated organic layers become fully or partially extracted upon the various extraction procedures. There is indirect evidence from immunofluorescence microscopy that frustulins are part of the extracellular matrix, because they are easily accessible by antibodies. 
In contrast, another family of frustule-associated proteins, termed pleuralins (see below), is only accessible to antibodies after EDTA extraction (Kröger et al. 1997).

Following extraction with EDTA and sodium dodecylsulfate (SDS) at high temperature, the frustules still contain organic components. It is assumed that the vast majority of these are genuine frustule components rather than contaminants, because SDS efficiently solubilizes lipids and proteins. The organic components in SDS-extracted frustules may be highly covalently cross-linked to one another constituting an insoluble organic layer, or they may be partially or fully entrapped inside the silica, or both. It has been speculated that covalent bonds exist between the silica surface, which contains Si-OH and Si-O− groups, and the functional groups of the organic components. Such covalent linkages could be Si-O-X or Si-X groups with X representing a functional group of the organic component (e.g., CHR2, COR, NHR, SR). However, in nuclear magnetic resonance (NMR) spectroscopy analyses of isolated frustules from T. pseudonana and Stephanopyxis turris no evidence for covalent bonds between the organic components and the silica was obtained (Christiansen et al. 2006; Jantschke et al. 2015). NMR spectroscopy on intact N. pelliculosa cells provided evidence for covalent bonds between silica and the organic components (Kinrade et al. 2002), but no such evidence was found in T. pseudonana cells in comparable yet even more sensitive NMR experiments (Gröger et al. 2008). Therefore, to date it is generally agreed that the interactions between biomolecules and silica involve exclusively noncovalent bonds.

To extract the organic components from SDS-treated frustules, three different reagents have been used: HF, NH4F, and NaOH. Adding liquid, anhydrous HF to dried frustules at 0 °C (boiling point of HF is 19.5 °C) converts silica quantitatively to SiF4, which escapes as a gas (boiling point of SiF4: −86 °C). The reaction generates H2O and thus does not proceed under strictly anhydrous conditions, yet HF is in vast excess. During the reaction, O-glycosidically linked sugars are converted to the constituent monosaccharides and phosphoester bonds are cleaved, but N-glycosidic bonds and peptide bonds remain unaffected (Mort and Lamport 1977). Aqueous solutions of HF (30–48%) have also been used effectively for demineralization, but they are prone to hydrolyzing peptide bonds due to the high concentration of water. Nevertheless, peptide data have been obtained from such samples (Frigeri et al. 2006; Nemoto et al. 2020). Aqueous solutions of NaOH completely dissolve silica yielding sodium silicate and at the same time completely hydrolyze proteins. O-glycosidic bonds are stable in NaOH, which has therefore been used to extract polysaccharides from frustules (for example see Chiovitti et al. 2005). The chemically most gentle method for completely dissolving the frustule silica is using a highly concentrated solution of NH4F at pH 5 (adjusted with HF or HCl), which converts SiO2 into water-soluble [SiF6]2−. Peptide bonds, O- and N-glycosidic bonds, and phosphoester bonds are stable under these conditions, enabling the isolation of proteins and probably all other biomolecules in their native chemical structures (Kröger et al. 2002; Poulsen et al. 2003; Poulsen and Kröger 2004). It is assumed that NH4F treatment extracts biomolecules that are tightly adhered to the silica surface or are embedded within the silica and thus are components of the silica-associated organic matrix (Fig. 1b, blue layer).
Biomolecules that are covalently cross-linked into insoluble networks will not be extracted, because the NH4F solution is unable to cleave such cross-links (e.g., glycosidic bonds, ester bonds, isopeptide bonds). Furthermore, it is unlikely that material from the extracellular matrix or diatotepum is present in the NH4F-extract of EDTA- and SDS-extracted frustules, as it should have been extracted by these prior treatments.
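The bond stabilities described above can be collected in a small lookup table. The following Python sketch only transcribes what the text states; entries the text does not mention are left as `None`, aqueous HF is listed separately from anhydrous HF, and the names `BOND_STABILITY` and `methods_preserving` are hypothetical helpers introduced for illustration.

```python
# Bond stability per demineralization method, transcribed from the text.
# True  = bond reported as stable
# False = bond reported as cleaved (or prone to hydrolysis)
# None  = not stated in the text
BOND_STABILITY = {
    "anhydrous HF": {
        "peptide": True,
        "O-glycosidic": False,
        "N-glycosidic": True,
        "phosphoester": False,
    },
    "aqueous HF": {
        "peptide": False,  # prone to hydrolysis due to the high water content
        "O-glycosidic": None,
        "N-glycosidic": None,
        "phosphoester": None,
    },
    "NaOH": {
        "peptide": False,  # proteins are completely hydrolyzed
        "O-glycosidic": True,
        "N-glycosidic": None,
        "phosphoester": None,
    },
    "NH4F, pH 5": {
        "peptide": True,
        "O-glycosidic": True,
        "N-glycosidic": True,
        "phosphoester": True,
    },
}


def methods_preserving(bonds):
    """Return the methods for which every bond in `bonds` is reported stable."""
    return [
        method
        for method, table in BOND_STABILITY.items()
        if all(table.get(bond) is True for bond in bonds)
    ]


if __name__ == "__main__":
    # Which methods leave both peptide and phosphoester bonds intact?
    print(methods_preserving(["peptide", "phosphoester"]))
```

Queried for intact phosphoproteins (peptide and phosphoester bonds), the table returns only the NH4F method, which matches the description of NH4F as the chemically most gentle treatment.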

After dissolving the silica with NH4F, an insoluble residue is obtained, which is termed AFIM (for ammonium fluoride insoluble material). The components in the AFIM originate potentially from any of the three frustule-associated organic layers. They could be meshworks of covalently cross-linked biomolecules, or extremely stable noncovalent assemblies that were not extracted by treatment with EDTA and SDS due to embedment inside the silica. Characterizing the biopolymers that constitute the AFIM is rather difficult due to the insolubility of the material. Solid state 13C-NMR analysis identified β-chitin in the AFIM of T. pseudonana (Brunner et al. 2009). After exhaustive acid hydrolysis, complex mixtures of amino acids and monosaccharides were identified in T. pseudonana and other diatoms, suggesting that (glyco-)proteins and polysaccharides other than chitin are also present in the AFIM (Tesson and Hildebrand 2013; Kotzsch et al. 2016; Pawolski et al. 2018). Treating the T. pseudonana AFIM with anhydrous HF followed by proteomics analysis identified 12 proteins (Kotzsch et al. 2016). It was assumed that extraction of these proteins from the AFIM was due to HF-mediated cleavage of cross-links that were based on O-glycosidic and/or phosphodiester bonds (Kotzsch et al. 2016).

The isolation of most of the frustule-associated biopolymers known to date was achieved using the sequential extraction method shown in Fig. 2. In addition to this biochemical approach, frustule-associated proteins have been identified through transcriptomics analysis of synchronized cells, and via bioinformatics approaches (Mock et al. 2008; Scheffel et al. 2011; Shrestha et al. 2012; Brembu et al. 2017). It needs to be stressed that sequence similarity to and/or mRNA co-expression with a known frustule protein should not be regarded as sufficient proof for the frustule association of a protein, for two reasons. Firstly, there is no clear threshold above which sequence similarity between two proteins becomes a reliable indicator of functional similarity. Secondly, cell cycle synchronization relies on a silicon starvation-replenishment protocol, which not only influences silica biogenesis but also affects the expression of thousands of genes of other metabolic pathways (Brembu et al. 2017). Likewise, not all proteins that are present in extracts obtained from purified frustules are necessarily frustule-associated proteins in vivo, because frustule preparations (as is the case for any subcellular fraction) always contain contaminating proteins (Frigeri et al. 2006; Kotzsch et al. 2016). As a consequence of the above considerations, at least two independent methods should be applied for unequivocal identification of a frustule-associated protein. For example, proteins identified by proteomics analysis of frustule extracts should only be regarded as bona fide frustule proteins if they have been located in the frustule through GFP tagging or immunolocalization, or show an mRNA expression profile that is consistent with frustule biosynthesis, or have high sequence similarity to previously identified frustule proteins. Table 1 lists all currently known frustule-associated proteins that fulfill this criterion.
Additionally, it shows from which diatom species long-chain polyamines (LCPA) have been isolated and characterized.
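The criterion given in the example above can be written as a simple check. The sketch below encodes only that example: proteomics detection in a frustule extract plus at least one of the four supporting lines of evidence. The evidence labels and the function name `is_bona_fide` are hypothetical and chosen for illustration; the sketch makes no attempt to judge the quality of an individual experiment.

```python
# Supporting lines of evidence named in the example in the text.
SUPPORTING_EVIDENCE = {
    "gfp_localization",     # located in the frustule through GFP tagging
    "immunolocalization",   # located in the frustule with antibodies
    "expression_profile",   # mRNA profile consistent with frustule biosynthesis
    "sequence_similarity",  # high similarity to a known frustule protein
}


def is_bona_fide(evidence):
    """Return True if a protein found by proteomics of frustule extracts
    is backed by at least one independent supporting line of evidence."""
    evidence = set(evidence)
    has_proteomics = "proteomics" in evidence
    has_support = bool(evidence & SUPPORTING_EVIDENCE)
    return has_proteomics and has_support


if __name__ == "__main__":
    print(is_bona_fide({"proteomics", "gfp_localization"}))
    print(is_bona_fide({"proteomics"}))
    print(is_bona_fide({"expression_profile", "sequence_similarity"}))
```

Under this reading, co-expression combined with sequence similarity is rejected when no biochemical detection is available, in line with the caution expressed in the text about these two indirect indicators.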

Fig. 2

Extraction of intact proteins and other biomolecules from diatom frustules. The extraction procedures are indicated next to the arrows. AFSC = ammonium fluoride soluble components, AFIM = ammonium fluoride insoluble material

Table 1 Frustule-associated proteins and long-chain polyamines (LCPA) identified in diatoms. The acronyms of the “Frustule extracts” correspond to the nomenclature defined in Fig. 2. The list includes only proteins that have been identified as frustule components through two independent methods. Sp., unidentified species; Ac, Amphora coffeaeformis; Cas, Coscinodiscus asteromphalus; Ca, Craspedostauros australis; Cc, Cyclotella cryptica; Cco, Coscinodiscus concinnus; Cd, Chaetoceros didymium; Cf, Cylindrotheca fusiformis; Cg, Coscinodiscus granii; Cw, Coscinodiscus wailesii; Fs, Fistulifera solaris; Na, Nitzschia angularis; Np, Nitzschia palea; Tp, Thalassiosira pseudonana; PPRP, plant pathogenesis-related protein; TPRP, tetratricopeptide repeat motif containing protein; SPI, serine protease inhibitor; TLSP, trypsin-like serine protease; STPK, serine/threonine protein kinase (note: this protein is identical to tpSTK1, which is located in the ER rather than the frustule (Sheppard et al. 2010)); HTTRP, protein containing an HSP70 domain and a tetratricopeptide repeat domain. References: [1] Kröger et al. 1994, [2] Kröger et al. 1996, [3] Kröger et al. 2002, [4] Poulsen et al. 2003, [5] Kröger et al. 1999, [6] Poulsen and Kröger 2004, [7] Sumper and Brunner 2008, [8] Wenzl et al. 2008, [9] Scheffel et al. 2011, [10] Kotzsch et al. 2016, [11] Kröger et al. 1997, [12] Kröger et al. 2001, [13] Kotzsch et al. 2017, [14] Kröger et al. 2000, [15] Davis and Palenik 2008, [16] Davis et al. 2005, [17] Frigeri et al. 2006, [18] Tesson et al. 2017, [19] Nemoto et al. 2014, [20] Nemoto et al. 2020, [21] Heintze et al. 2020, [22] Buhmann et al. 2014, [23] Lind et al. 1997, [24] Sumper et al. 2005, [25] Pawolski et al. 2018, [26] Sumper and Lehmann 2006

2 Components Involved in Silica Formation

This section will provide an overview of the structures and properties of components that have been shown through in vivo or in vitro experiments to be capable of influencing silica formation. The components include unique proteins, long-chain polyamines (LCPA), and nanopatterned insoluble organic matrices.

2.1 Chemical Structure of LCPA, Silaffins, and Silacidins

The discovery of LCPA as abundant components of frustules was entirely unexpected, because they are distinct from the proteins, polysaccharides, and lipids that were initially believed to represent the main groups of frustule-associated organic components (see above). LCPA have been isolated from the HF-extracts or NH4F-extracts of EDTA/SDS-cleaned frustule preparations in all diatoms that were analyzed. They are linear oligo-propyleneimine chains that are attached either to diaminopropane or diaminobutane (Kröger and Poulsen 2008; Sumper and Brunner 2008) (Fig. 3a). In the case of diaminobutane as the base molecule, LCPA resemble spermidine and spermine, which are ubiquitous polyamines that carry one and two propyleneimine units, respectively. However, diatom LCPA carry 5–20 propylamine units, show heterogeneous methylation of the N-atoms, and quaternary N-atoms can also be present (Kröger and Poulsen 2008; Sumper and Brunner 2008) (Fig. 3a). Each diatom investigated so far contains a species-specific spectrum of LCPA. Diatom genomes encode unusually high numbers of putative spermine and spermidine synthases. Candidate LCPA synthases have been postulated based on the presence of a signal peptide and unusual bi- and tri-functional polypeptides that have aminopropyltransferase, S-adenosylmethionine decarboxylase, and methyltransferase activities, yet they lack experimental verification (Michael 2011; Nemoto et al. 2020). DNP-ssNMR analysis has demonstrated that LCPA are embedded inside the silica of the diatom Stephanopyxis turris rather than being exposed on the surface of the frustule (Jantschke et al. 2015). This result implies that LCPA are present inside SDVs during silica formation, as this is the only conceivable mechanism by which biomolecules could become silica-embedded.
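To make the size difference between the common polyamines and diatom LCPA concrete, the nitrogen atoms per molecule can be counted. The sketch below assumes a linear, unbranched chain in which the diamine base contributes two N-atoms and every attached propylamine unit contributes one more; spermidine (one unit on diaminobutane) and spermine (two units) serve as reference points. Methylation does not change the count. The function name `nitrogen_count` is a hypothetical helper.

```python
def nitrogen_count(propylamine_units, base_nitrogens=2):
    """Number of N-atoms in a linear polyamine.

    Assumes a diamine base (diaminopropane or diaminobutane, two N-atoms)
    extended by `propylamine_units` propylamine units, each adding one N-atom.
    """
    if propylamine_units < 0:
        raise ValueError("propylamine_units must be non-negative")
    return base_nitrogens + propylamine_units


if __name__ == "__main__":
    print("spermidine:", nitrogen_count(1))       # one propylamine unit
    print("spermine:", nitrogen_count(2))         # two propylamine units
    print("LCPA, 5 units:", nitrogen_count(5))
    print("LCPA, 20 units:", nitrogen_count(20))
```

With 5–20 propylamine units, an LCPA molecule therefore carries 7–22 N-atoms, compared with 3 and 4 for spermidine and spermine. Under these assumptions the count is also an upper bound for the number of positive charges per molecule, since not every N-atom needs to be protonated.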

Fig. 3

Examples of oligopropyleneimine bearing molecules from diatoms. (a) LCPA; diaminopropane (blue), spermidine (green), and spermine (red) moieties are highlighted in color. (b) Polyamine-modified lysines; chemical groups attached to the lysine residues are highlighted in red. Note that most but not all primary, secondary, and tertiary N-atoms in LCPA molecules and the modified lysines will be protonated (Bernecker et al. 2010); (# from Wenzl et al. 2004)

Oligo-propyleneimine chains (in the following abbreviated PA) are also present in silaffins, which are a group of diatom-specific proteins that were found in the HF- and NH4F-extracts of SDS-extracted frustules (Kröger and Poulsen 2008) (Table 1). Many of the lysine residues in silaffins carry PA modifications at their ε-amino groups. In the case of C. fusiformis silaffins, the PA residues carry 6–11 propylamine units (Kröger et al. 2001; Poulsen et al. 2003) while they comprise only two propylamine units in T. pseudonana silaffins (Sumper et al. 2007) (Fig. 3b). Generally, a large fraction of a silaffin’s lysine residues carry PA modifications, some lysines remain unmodified, and the remainder are dimethylated or trimethylated at the ε-amino group (Kröger et al. 1999, 2001; Poulsen et al. 2003; Poulsen and Kröger 2004; Sumper et al. 2007). Trimethylated lysine residues in silaffins are hydroxylated at the δ-C-atom, and the hydroxyl group is phosphorylated (Fig. 3b) (Kröger et al. 2001, 2002; Poulsen et al. 2003). Additional phosphorylation sites are serine, threonine, and hydroxyproline residues (Kröger et al. 2002; Poulsen et al. 2003). Further posttranslational modifications include dihydroxylation of proline residues and complex O-glycosylation including sulfated saccharides (Poulsen et al. 2003; Poulsen and Kröger 2004). Besides the extensive addition of functional groups, silaffins are also subject to proteolytic processing in addition to removal of the N-terminal signal peptide for import into the endoplasmic reticulum (Kröger et al. 1999; Poulsen and Kröger 2004). Remarkably, the six silaffins known to date share no significant sequence similarities to each other and are predicted to be intrinsically disordered proteins due to the large predominance of hydrophilic amino acid residues (Scheffel et al. 2011).
Therefore, the defining features of a silaffin are independent of the amino acid sequence and instead comprise (i) the presence of PA-modified lysines, (ii) O-phosphoryl-groups, and (iii) the absence of canonical secondary structure elements. Due to the current scarcity of silaffin sequences and mapped amino acid modification sites, it has remained unknown which features of a silaffin define sites for O-phosphorylation and PA modification. To answer this question, many more silaffin sequences and their tedious biochemical analysis would be required as previously reported for C. fusiformis silaffin-1 and T. pseudonana tpSil3 (Kröger et al. 2001, 2002; Sumper et al. 2007).

In the NH4F-extract of EDTA/SDS-cleaned frustules from T. pseudonana a second type of phosphoprotein, the silacidins, was identified. Silacidins consist of ~25 amino acid residues of which 35% are serine, 45% are glutamate or aspartate, and no positively charged amino acid residues are present (Wenzl et al. 2008). About 60% of the serine residues are phosphorylated, thereby contributing substantially to their extremely high negative charge (Wenzl et al. 2008). Sequence homologs of silacidins have been identified in some but not all diatoms that have been sequenced to date (Kirkham et al. 2017). The C. fusiformis silaffin-1 precursor polypeptide contains an N-terminal domain that shows no sequence similarity to silacidins, but is similar regarding the drastic enrichment of serine, aspartate, and glutamate residues. This acidic domain is cleaved off during maturation of the silaffin-generating domain (Kröger et al. 1999). It is currently unknown whether the acidic N-terminal domain becomes phosphorylated and incorporated into the frustule.
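The composition quoted above allows a rough estimate of the net charge of a silacidin. The Python sketch below is back-of-the-envelope arithmetic under stated assumptions, not a measured value: each aspartate or glutamate side chain is counted as −1, each phosphoserine as between −1 and −2 (the second ionization of a phosphate monoester depends on pH), the chain termini are ignored, and no cationic residues are present, as stated in the text. The function name and parameters are hypothetical.

```python
def silacidin_charge_range(residues=25, ser_frac=0.35,
                           acidic_frac=0.45, phospho_frac=0.60):
    """Estimate the net charge range of a silacidin from its composition.

    Returns (least_negative, most_negative), counting -1 per Asp/Glu and
    -1 (lower bound) or -2 (upper bound) per phosphoserine.
    """
    acidic = residues * acidic_frac                  # Asp + Glu residues
    phosphoserine = residues * ser_frac * phospho_frac
    least_negative = -(acidic + 1 * phosphoserine)
    most_negative = -(acidic + 2 * phosphoserine)
    return least_negative, most_negative


if __name__ == "__main__":
    low, high = silacidin_charge_range()
    print(f"estimated net charge: {low:.1f} to {high:.1f}")
```

With the default values the estimate lies between about −16 and −22 for a peptide of only ~25 residues, which illustrates why silacidins are described as extremely negatively charged.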

Together, LCPA, silaffins, and silacidins represent a complete set of charged polyelectrolytes: LCPA are exclusively polycationic, silacidins are exclusively polyanionic, and silaffins are zwitterionic. The net charge of silaffins is dominated by the ratio of cationic propylamine units vs. anionic phosphoryl groups, although sulfated glycan moieties can also have a significant influence (Poulsen et al. 2003). As described in the following section, the charge balance within silaffins as well as in mixtures of the three components is crucial for their ability to accelerate silica formation from silicic acid solutions in vitro.

2.2 Silica Formation by LCPA, Silaffins, and Silacidins

Silica formation assays have been used to investigate in vitro which of the frustule-associated components are capable of influencing the rate at which silica forms in a supersaturated solution of silicic acid. The assays are usually performed at pH 5.5 (buffered by a sodium acetate/acetic acid mixture) as this is presumed to be close to the pH inside the lumen of the SDVs (Vrieling et al. 1999; Shimizu et al. 2001; Yee et al. 2020). At this pH the spontaneous formation of silica by auto-polycondensation of silicic acid is very slow and results in the formation of small silica nanoparticles (diameters <10 nm) after several hours. However, when silaffin natSil1A was added, large silica nanospheres (diameters 400–700 nm) were formed within only 10 min, which demonstrated that this molecule was able to strongly accelerate silica formation (Kröger et al. 2002). Nonetheless, natSil1A cannot be regarded as a catalyst for silica polycondensation, because it becomes incorporated into the silica nanoparticles and thus is consumed in the process. Therefore, natSil1A should be referred to as an inducer of silica formation. The silica inducing activity of natSil1A critically depends on the presence of both the polyamine moieties and the phosphoryl groups (Kröger et al. 1999, 2002). It appears that the polypeptide backbone of natSil1A has little if any relevance for silica inducing activity, because rapid formation of large silica nanospheres was also achieved using only LCPA and phosphate in the assay (Kröger et al. 2000; Wenzl et al. 2008). In the absence of inorganic phosphate, LCPA by themselves have very little or no silica formation activity (Wenzl et al. 2008; Kotzsch et al. 2017), whereas mixtures of LCPA and silacidin have high silica formation activity (Wenzl et al. 2008).
The efficiency of the silacidin in this system is remarkable, because with respect to phosphate concentration ~1000-fold less silacidin than inorganic phosphate was required to achieve the same silica formation activity (Wenzl et al. 2008). Altogether, the in vitro experiments clearly demonstrated that mixtures of biomolecules with oligo-propyleneimine moieties and polyanionic domains constitute an efficient system for silica formation under biologically relevant conditions (acidic pH, supersaturated silicic acid solution (Martin-Jézéquel et al. 2000; Kumar et al. 2020), time scale of minutes). It should be noted that of all native silaffins known to date, only natSil1A was shown to exhibit intrinsic silica formation activity, whereas the four other silaffins that were tested (natSil2, tpSil1/2H, tpSil1/2L, tpSil3) were inactive on their own. In all these cases the lack of silica forming activity is likely due to a surplus of negative charges within the molecule, which could be clearly demonstrated for natSil2. The glycan chains of natSil2 contain about as many sulfate residues as there are phosphate residues attached to the polypeptide backbone (Poulsen et al. 2003). After complete removal of the glycan chains, the resulting natSil2 derivative, which still contained all phosphate residues, had a silica forming activity that was comparable to the activity of natSil1A (Poulsen et al. 2003). The silica formation activity of deglycosylated natSil2 was eliminated when also the phosphate residues were removed, but was restored by the addition of inorganic phosphate (Poulsen et al. 2003). This set of experiments indicated that the relative ratio of electrical charges between the polycationic oligo-propylamine moieties and the numerous anionic phosphate and sulfate groups determines silica formation activity (Poulsen et al. 2003). This was further confirmed through experiments with mixtures of LCPA and strongly negatively charged silaffins (tpSil3, tpSil1/2H). 
At a constant concentration of LCPA, low or intermediate concentrations of silaffins enhanced the silica formation activity of the mixture (Poulsen and Kröger 2004) (Fig. 4a). However, from intermediate to high silaffin concentrations, the silica forming activity declined with increasing silaffin concentration (Fig. 4a). The bell-shaped dependence of silica formation on silaffin concentration can be explained by the presence of LCPA-silaffin clusters that are held together by ionic interactions (Fig. 4a). At low to intermediate silaffin concentration, oligo-propylamine chains are partially accessible to interact with silicic acid molecules (Fig. 4a), which would become concentrated within these clusters thus promoting condensation reactions between the silicic acid molecules. Furthermore, oligo-propyleneimine chains might accelerate silicic acid polycondensation by performing acid-base catalysis (Kröger and Sandhage 2010). At high silaffin concentrations essentially all oligo-propylamine moieties are screened by the negatively charged domains of silaffins (Fig. 4a), preventing silicic acid binding and catalysis of polycondensation. Remarkably, the silaffin-to-LCPA stoichiometric ratio influences not only the quantity but also the morphology of the silica that is formed. Mixtures of tpSil3 and LCPA produced ~500 nm-thick plates of densely packed <20 nm-sized silica particles and large, polydisperse silica spheres ranging from 900 nm to 4.2 μm in diameter (Fig. 4b) (Poulsen and Kröger 2004). The proportion of silica spheres in the precipitate decreased with increasing tpSil3 concentration (Poulsen and Kröger 2004). Interestingly, a very different silica structure was produced by switching the silaffin, yielding porous silica meshworks in tpSil1/2H-LCPA mixtures (Fig. 4c) (Poulsen and Kröger 2004). The pores were in a biologically relevant size range with diameters between 20 and 200 nm (Fig. 4c) (Poulsen and Kröger 2004).
The mechanism by which silaffin-LCPA mixtures guide the formation of silica structures in vitro has so far remained elusive.
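The qualitative argument for a bell-shaped response can be illustrated with a deliberately simple toy model. It is not taken from the cited studies and is not fitted to the data in Fig. 4a. It assumes that activity requires both cluster formation (amine charges paired with anionic silaffin charges) and free amine groups (unpaired charges that remain accessible to silicic acid), and that the two contributions multiply. The function `toy_activity` and its single parameter, the ratio of anionic to cationic charges, are hypothetical.

```python
def toy_activity(charge_ratio):
    """Relative silica-forming activity in a toy charge-pairing model.

    `charge_ratio` is the ratio of anionic (silaffin) to cationic (LCPA)
    charges at a fixed LCPA concentration. Illustrative only.
    """
    if charge_ratio < 0:
        raise ValueError("charge_ratio must be non-negative")
    paired = min(charge_ratio, 1.0)   # fraction of amine charges held in clusters
    accessible = 1.0 - paired         # fraction still free to bind silicic acid
    return 4.0 * paired * accessible  # scaled so that the maximum equals 1.0


if __name__ == "__main__":
    for ratio in (0.0, 0.25, 0.5, 0.75, 1.0, 2.0):
        print(f"charge ratio {ratio:4.2f} -> relative activity {toy_activity(ratio):.2f}")
```

In this sketch the activity is zero without silaffin (no clusters), rises to a maximum when half of the amine charges are paired, and returns to zero once all amine charges are screened. Real mixtures will deviate from such a symmetric curve, but the model reproduces the qualitative rise and fall described above.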

Fig. 4

Aggregation behavior and silica-precipitating activity of LCPA-silaffin mixtures. (a) Increasing amounts of tpSil3 were added to a constant concentration of T. pseudonana LCPA at pH 5.8. The turbidity of the solution, measured at 660 nm (black dots and black line), increased due to the formation of LCPA-silaffin aggregates. After 10 min incubation with silicic acid the precipitated silica was determined (grey bars). The scheme above the graph shows an interpretation of the aggregation behavior of LCPA-tpSil3 mixtures. The positively charged polyamines are shown as red bars and the blue patches denote negatively charged clusters in tpSil3. (b, c) SEM images of silica formed by mixtures of LCPA with tpSil3 (b) and tpSil1/2H (c). Scale bars: 2 μm (b), 200 nm (c). (a) is from Sumper and Brunner (2008); (b, c) are from Poulsen and Kröger (2004). Reprinted with permission

It has been hypothesized that electrostatically cross-linked networks of LCPA, silaffins, and silacidins (or other polyanionic molecules) are general components of the silica morphogenesis process inside SDVs (Kröger and Poulsen 2008; Sumper and Brunner 2008). The morphogenesis mechanism may involve spontaneous phase separation (Sumper 2002), which has been shown to occur upon mixing of the components in appropriate stoichiometric ratios (Poulsen et al. 2003). Each diatom species appears to be equipped with a specific set of LCPA and silaffins (Kröger et al. 2000), which may thus allow for the formation of species-specific silica patterns. In vitro experiments using mixtures of LCPA, silaffins and silacidins have so far failed to produce silica structures that match the regularly patterned morphologies of diatom biosilica, although in some cases porous silica was obtained (Poulsen et al. 2003; Poulsen and Kröger 2004). This rather limited success is unsurprising as silica morphogenesis in vivo occurs within an extremely flat micro-container, the SDV, which is delimited by a lipid bilayer. It is reasonable to assume that the confined space of the SDV as well as the membrane composition and properties exert a strong influence on silica morphogenesis.

To understand the roles of LCPA, silaffins and silacidins in diatom silica morphogenesis, the in vitro analyses need to be complemented by in vivo studies. Transcriptomics analysis with a synchronized T. pseudonana culture demonstrated that expression of tpSil3 was upregulated during valve formation, whereas tpSil1 expression was downregulated during valve formation and instead coincided with girdle band formation (Shrestha et al. 2012). This result is consistent with the location of the tpSil3-GFP fusion protein, which was found predominantly in the valve and the valve-adjacent girdle band when expressed under control of its own promoter (Scheffel et al. 2011). However, it contradicts the location of tpSil1-GFP (expressed using the tpSil1 promoter), which was exclusively present in valves (Poulsen et al. 2013). These conflicting data can be reconciled by assuming that tpSil1 is stored in an intermediate compartment before being incorporated into the valve SDV. Alternatively, tpSil1 might never enter the SDV but rather become associated with the valve on the cell surface. LCPA are embedded inside the silica (see above) and thus the highest amounts of LCPA are required during biogenesis of valves, which are the largest frustule building blocks. Consistent with this requirement, the expression of several genes encoding putative enzymes catalyzing oligo-propyleneimine biosynthesis (SET domain protein methyltransferases) was enhanced during frustule biosynthesis in Nitzschia palea (Nemoto et al. 2020). So far, the expression of the silacidin gene in synchronized T. pseudonana cells has not been reported. However, it has been demonstrated that on the protein level the ratio of silacidins vs. tpSil1/2L increases when T. pseudonana cells are starved of silicic acid (Richthammer et al. 2011).
Based on this result and the observation that even small amounts of silacidins strongly enhance LCPA-mediated silica formation in vitro (see above), it has been suggested that silacidins are critical for frustule formation under conditions of low silicic acid availability (Richthammer et al. 2011). Knocking down the expression of the silacidin gene in T. pseudonana using antisense RNA had significant effects on the size and silica content of the cells. The frustules of the knockdown mutants contained 21% less silica per surface area than the cells of the “control strain” (i.e. transformant expressing the antibiotic resistance gene but not the silacidin antisense RNA), which is consistent with a role of silacidin in silica biogenesis (Kirkham et al. 2017). Additionally, the knockdown mutants exhibited a 25% increase in valve diameter and a 2.2-fold higher cell volume, and thus were both wider and more elongated than wild type cells (Kirkham et al. 2017). It has been speculated that silacidins promote nutrient uptake by being involved in maintaining a favorable surface-to-volume ratio of diatom cells (Kirkham et al. 2017).
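
The statement that the knockdown cells are both wider and more elongated follows from simple geometry if the cell is approximated as a cylinder, which is an assumption made here for illustration and not part of the cited study. The Python sketch below derives the height change implied by the reported diameter and volume factors, and estimates the accompanying change in surface-to-volume ratio for a hypothetical wild-type cell whose height equals its diameter.

```python
def height_scale_factor(diameter_factor, volume_factor):
    """Height change implied for a cylinder, since V = (pi / 4) * d**2 * h."""
    return volume_factor / diameter_factor ** 2


def surface_to_volume(diameter, height):
    """Surface-to-volume ratio of a closed cylinder: 4/d + 2/h."""
    return 4.0 / diameter + 2.0 / height


# Reported factors for the silacidin knockdown: +25% diameter, 2.2-fold volume.
height_factor = height_scale_factor(diameter_factor=1.25, volume_factor=2.2)

# Hypothetical wild-type cell with diameter = height = 1 (arbitrary units).
sv_wild_type = surface_to_volume(1.0, 1.0)
sv_knockdown = surface_to_volume(1.25, height_factor)
sv_change = sv_knockdown / sv_wild_type

print(f"implied height factor: {height_factor:.3f}")
print(f"relative surface-to-volume ratio: {sv_change:.2f}")
```

Under the cylinder assumption, the height must increase by a factor of about 1.4 to reconcile a 25% wider valve with a 2.2-fold larger volume, and the surface-to-volume ratio of the hypothetical cell drops by a little over 20%.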

2.3 Proteins of the SDV Membrane

The first SDV membrane proteins were discovered only comparatively recently, through two complementary approaches. In one approach chitin-free insoluble organic matrices from T. pseudonana were subjected to proteomics analysis and one of the identified proteins was Silicanin-1 (Sin1, formerly SiMat7) (Kotzsch et al. 2016). GFP-tagging in combination with time lapse confocal fluorescence microscopy demonstrated that Sin1 is co-located with valve and girdle band SDVs (Kotzsch et al. 2017). In the complementary approach, transcriptomics analysis of a synchronized T. pseudonana culture revealed numerous genes with silaffin-like expression patterns, which were then subjected to a bioinformatics screen to identify proteins with both an N-terminal ER-import signal peptide and a single transmembrane domain (Tesson et al. 2017). One of the resulting proteins, termed silicalemma-associated protein 1 (SAP1), was GFP-tagged and found to be associated with valve and girdle band SDVs through confocal fluorescence microscopy (Tesson et al. 2017). In the following the current knowledge on the SDV membrane proteins is summarized.

Sin1 is a predicted type-1 transmembrane protein with an N-terminal signal peptide for co-translational ER import, a ~40 kDa luminal domain, a single transmembrane helix, and a short C-terminal domain (20 amino acids) (Kotzsch et al. 2017). Sin1 is highly conserved among all diatoms from which transcriptome or genome data are available (sequence identity 46–66%). T. pseudonana encodes a protein with 67% sequence identity to Sin1, which was termed Sin2 (Kotzsch et al. 2017) and is also associated with valve and girdle band SDVs (Kotzsch and Kröger, unpublished data). The C-termini of Sin1 and Sin2 are devoid of any sequence motifs with known function, and thus it is unclear whether they interact with components in the cytoplasm. In the SDV lumen Sin1 will likely be engaged in forming supramolecular assemblies, because its luminal domain is prone to homo-multimerize under mildly acidic conditions and to interact with LCPA in vitro (Kotzsch et al. 2017). During SDV development, Sin1 becomes proteolytically processed and the luminal domain is incorporated into the silica-associated organic matrix (Kotzsch et al. 2017). The fate of the cleaved-off membrane anchor and C-terminal domain is unclear (Kotzsch et al. 2017). Sin1 knockout mutants of T. pseudonana exhibited subtle yet quantifiable changes in valve but not girdle band morphology, and the silica content of the frustules was substantially reduced (67–77% of wild type frustules) (Görlich et al. 2019). The reduction in frustule silica content in Sin1 knockout mutants is consistent with the observation that the luminal domain of Sin1 enhances the silica formation activity of LCPA. The mechanism by which Sin1 influences valve morphology remains to be determined.

SAP1 is also a type-1 transmembrane protein, but it does not share sequence similarity with silicanins (Tesson et al. 2017). The luminal part of SAP1 is extremely rich in serine (15.5%), and the cytosolic C-terminus (28 amino acids) does not contain any sequence motifs of known function (Tesson et al. 2017). Two other type-1 transmembrane proteins with similar sequences, termed SAP2 and SAP3, are encoded in the T. pseudonana genome and additional homologs were identified in 11 other diatom species (Tesson et al. 2017). GFP-tagging revealed that SAP3 is also located in valve and girdle band SDVs whereas SAP2-GFP, surprisingly, resided in the cytoplasm (Tesson et al. 2017). During SDV maturation SAP1 was proteolytically processed and the luminal part incorporated into the mature frustule. It is not yet known whether the luminal domains of SAP1 and SAP3 become part of the insoluble organic matrices like Sin1, or become associated with the silica in a different way. It has been speculated that the serine residues of SAPs are phosphorylated and might influence silica morphogenesis through interaction with the oligo-propyleneimine chains of LCPA and silaffins in the SDV lumen (Tesson et al. 2017). Although there is so far no experimental evidence for such interactions and for SAPs being phosphoproteins, knockdown mutants of SAP1 and SAP3 did exhibit alterations in silica morphology (Tesson et al. 2017).

SDVs are acidic compartments (see above), and recent evidence indicates that a vacuolar type H+-ATPase (in the following termed VHA) is responsible for this. GFP-tagging revealed that VHA is located not only in the vacuole of T. pseudonana but also in valve and girdle band SDVs (Yee et al. 2020). When a small molecule inhibitor of VHA, concanamycin A (ConA), was applied in vivo, multiple SAP1-GFP bearing, spherical vesicles accumulated in the cytoplasm, and VHA-GFP was present throughout the cytoplasm with a slight accumulation in the mid-cell region of dividing cells where the valve SDVs are located in unperturbed cells (Yee et al. 2020). Only some of the SAP1-GFP bearing vesicles of ConA-treated cells accumulated the acidotropic dye PDMPO, indicating that their acidification was severely compromised (Yee et al. 2020). It appears that SDVs are unable to form in ConA-treated cells, yet electron microscopy data of sectioned cells are lacking to corroborate this assumption. At low ConA concentrations, some cells were still able to divide, but the morphology of the valve was severely compromised (Yee et al. 2020). Altogether, these data show that the H+-pumping function of VHA is essential for vesicle-mediated delivery of proteins to the SDV, for acidification of the SDV, and possibly also for SDV assembly. Furthermore, the fact that VHA is also located in the vacuolar membrane raises the question whether vesicle-mediated transport exists between the SDVs and the vacuole. This question challenges the prevailing view that SDV biogenesis is exclusively fueled by Golgi-derived vesicles.

Assuming that the molecules described in Sect. 2 are located in the SDV during silica biogenesis, and taking into account the experimental evidence on their possible functions, a picture of the molecular organization in the SDV has emerged (Fig. 5). This picture also includes information presented in other chapters on association of the cytoskeleton (microfilaments, microtubules) with the SDV (see Chap. “Structure and Morphogenesis of the Frustule”) and the presence of a yet unknown silica precursor (SiOX) in the cytoplasm (see Chap. “Silicic Acid Uptake and Storage by Diatom”). Association of microtubules and microfilaments with the SDV requires yet unknown cytoskeleton adapter proteins that bind to both the cytoskeleton filament and the cytoplasmic face of the SDV membrane (Fig. 5). None of the three classes of SDV membrane proteins known to date (silicanins, SAPs, VHA) bears known microfilament or microtubule binding domains. Therefore, these proteins may either not interact with the cytoskeleton, or bind to cytoskeleton adapter proteins. Alternatively, they may directly bind microfilaments and microtubules through novel interaction domains. Regarding transport of the silica precursor SiOX into the SDV lumen, it is currently unknown what molecule is transported and by what mechanism (passive diffusion, facilitated diffusion, or active transport; see Chap. “Silicic Acid Uptake and Storage by Diatom”). It seems certain that SiOX is a silicic acid derivative different from monosilicic acid Si(OH)4, which is taken up by the cell and stabilized in the cytoplasm by an unknown mechanism (Hildebrand 2008). It is anticipated that in addition to the presumed cytoskeleton adapter proteins and the SiOX transporter, the SDV will contain yet unknown components in the lumen, the SDV membrane, and on the cytoplasmic surface. The most straightforward way to identify such components would require developing a method for SDV isolation, which has so far not been achieved.

Fig. 5

Model of an SDV. The model assumes the formation of a silaffin-LCPA-silacidin phase self-assembled through polyvalent ionic interactions. Silica precursor molecules may accumulate within this phase through interactions with the oligo-propyleneimine moieties. Within the silaffin-LCPA-silacidin phase the accumulated precursor molecules are rapidly converted into patterned silica structures (not shown). The membrane-bound silicanin clusters bind LCPA molecules thereby either constituting an independent silica forming phase or interacting with the silaffin-LCPA-silacidin phase. The H+-pumping V-type ATPase is involved in establishing an acidic pH milieu in the SDV. As interaction partners of SAPs are currently unknown, it can only be speculated that they might influence silica morphogenesis via binding to silica-forming phases that are composed of any of the other SDV components

3 Insoluble Organic Matrices

Most, if not all, biominerals leave behind an organic residue when the mineral phase is gently, yet completely dissolved. Diatom silica is no exception. After dissolving SDS/EDTA extracted T. pseudonana frustules with a mildly acidic solution of ammonium fluoride, three structurally different types of organic material have been identified: meshworks, microrings, and microplates (Fig. 6a–c). In the following, the compositions and properties of these ammonium fluoride insoluble materials (AFIM) and their possible role in silica biogenesis are described.

Fig. 6

Ammonium fluoride insoluble organic matrices. (a) Chitin meshwork, (b) microring, and (c) microplate from T. pseudonana. (d) A microplate from C. cryptica, and (e) a higher magnification of the region of the microplate that is framed in (d). (f) A C. cryptica microplate after incubation with silicic acid in the presence of the soluble organic components. (g) Detail of the region framed in (f). The red arrows point to some electron translucent circles resembling cribrum pores; the turquoise rings highlight some areola-like pores. (a–c) SEM images, (d–g) TEM images. Scale bars: 1 μm (a–d), 250 nm (e, g), 500 nm (f). (Modified from Brunner et al. 2009 (a), Scheffel et al. 2011 (b), Kotzsch et al. 2016 (c), Pawolski et al. 2018 (d–g)). Reprinted with permission

3.1 Biochemical Compositions of the AFIM

The meshworks in the T. pseudonana AFIM are mainly composed of chitin and contain a small proportion of associated proteins (Brunner et al. 2009). The microrings and microplates are mainly composed of proteins (~75%) but also contain a substantial amount of carbohydrates (~25%) with a complex monosaccharide composition (Kotzsch et al. 2016). Organic matrices were structurally characterized also in N. pelliculosa (Reimann et al. 1966), C. wailesii (Scheffel et al. 2011), Cyclotella cryptica (Pawolski et al. 2018), Coscinodiscus radiatus, Stephanopyxis turris, Nitzschia curvilineata, Triceratium dubium, Amphora salina (all in Tesson and Hildebrand 2013). Characteristic for these matrices is a fairly regular nano- and microarchitecture that is akin to the biosilica patterns (Fig. 6a–e). For example, in T. pseudonana the stripe patterns of the microrings are identical to the nonporous margins of the girdle bands (Fig. 6b), and the dot patterns in microplates correspond to the fultoportula patterns in the valves (Fig. 6c).

Biochemical analysis of the AFIM has been severely hampered by their insolubility, even in strongly denaturing aqueous solvents. Therefore, their compositions have so far only been characterized using either complete acid hydrolysis or incomplete digestion by anhydrous HF. Amino acid composition analysis has been performed for T. pseudonana (Brunner et al. 2009; Kotzsch et al. 2016) and C. cryptica (Pawolski et al. 2018), and analysis of monosaccharide composition for T. pseudonana (Kotzsch et al. 2016), C. radiatus, N. curvilineata, A. salina, and T. dubium (all in Tesson and Hildebrand 2013). The monosaccharide compositions were rather complex, yet mannose was always a dominant component. The amino acid compositions of the insoluble organic matrices from T. pseudonana and C. cryptica shared a marked predominance of glycine and serine, which is consistent with an early report on the amino acid composition of diatom frustules (Hecky et al. 1973).

A preparation containing microrings and microplates from T. pseudonana is, so far, the only insoluble organic matrix material from which peptide sequences have been obtained. This led to the identification of 12 proteins, of which seven had not been previously identified, and were denoted SiMat1-7 (for silica matrix; Kotzsch et al. 2016). Of the five known proteins, four were cingulins (CinY1, CinY2, CinY3, CinW3), which were previously identified in a bioinformatics search for silaffin-like proteins (Scheffel et al. 2011). The bioinformatics search retrieved two additional cingulins (CinW1, CinW2), for which no peptide data were obtained (Scheffel et al. 2011). One protein, silaffin tpSil1, had previously been identified in the NH4F-extracts (Poulsen and Kröger 2004). GFP-tagging revealed that all cingulins are specifically associated with girdle bands in vivo and are present in the microrings, thus confirming that microrings are girdle band-specific organic matrices (Scheffel et al. 2011). The amino acid sequence of SiMat1 exhibited features that are very similar to Y-type cingulins, and GFP-tagging demonstrated that this protein is located in girdle bands/microrings (Kotzsch et al. 2016). Therefore, SiMat1 has been renamed CinY4 (Kotzsch et al. 2016). SiMat7 was localized to the SDV membrane through GFP-tagging. It was renamed silicanin-1 (Sin1), and its structure and possible functions have been described above.

3.2 Role of Insoluble Organic Matrices in Silica Formation

The structural resemblance of the organic microrings and microplates with the biosilica of girdle bands and valves, respectively, has spurred the hypothesis that insoluble organic matrices might act as templates in silica morphogenesis (Scheffel et al. 2011). Therefore, in vitro mineralization experiments were performed during which surface-adsorbed insoluble organic matrices were exposed to metastable solutions of silicic acid at pH 5.5 in the presence or the absence of soluble components. The filamentous pattern in the microrings from T. pseudonana was enhanced after mineralization due to preferential silica deposition along the filaments (Scheffel et al. 2011). However, no porous silica patterns were formed. Adding synthetic polyamines accelerated silica formation, but had no influence on the silica pattern (Scheffel et al. 2011). In TEM analysis, microplates isolated from C. cryptica revealed patterns reminiscent of the hierarchical patterns of cribrum and areola pores in the valve biosilica (Fig. 6d, e) (Pawolski et al. 2018). In AFM analysis patterns of both pores and globules were observed in the regions where TEM revealed only pore-like patterns (Pawolski et al. 2018). When incubated with silicic acid, silica bearing, hierarchical pore patterns were formed that covered the entire surface in about 35% of the microplates. When the ammonium fluoride soluble material (i.e., proteins and LCPA) was included in the mineralization reaction the amount of microplates containing silicified hierarchical pore patterns over the entire surface doubled (Fig. 6f, g) (Pawolski et al. 2018). This enhancement was exclusively due to the presence of LCPA, whereas the soluble proteins did not contribute to morphogenesis of the pore patterns, but seemed to enhance the amount of deposited silica (Pawolski et al. 2018). This study demonstrated that ammonium fluoride soluble organic components and the insoluble organic matrices act synergistically in mineral morphogenesis in vitro. 
However, only relatively weakly silicified structures were obtained that reconstituted the hierarchical pore patterns but no other features of the C. cryptica biosilica (costae, fultoportulae).

It is currently unknown whether the chitin meshwork identified in T. pseudonana (Brunner et al. 2009) and the chitin-containing margin of C. cryptica microplates (Pawolski et al. 2018) promote silica formation. Secreted chitin fibers from T. pseudonana, which are distinct from the chitin meshwork, were unable to accelerate silica formation at biologically relevant time scales and pH (Spinde et al. 2011). NMR studies revealed that Si-OH groups of silica that formed spontaneously by auto-polycondensation after 24 h form a hydrogen bond specifically with the C6-OH group rather than the C3-OH group of the N-acetylglucosamine unit of chitin (Spinde et al. 2011). This result suggests that in vivo mineralization of the chitin meshwork would require the activity of silica-forming components (e.g., LCPA, silaffins), but chitin itself can establish strong binding to the silica via numerous hydrogen bonds. It is unclear whether the chitin meshwork, microplates, or microrings are involved in silica morphogenesis in vivo. Such a role would require the insoluble organic matrix to be present inside the SDV during silica biogenesis, which remains to be investigated.

4 Components Involved in Frustule Function

The function of the frustule is subject to much speculation. It has been argued that biosynthesis of a silica-based cell wall requires much less energy than would be required for a purely organic cell wall (Raven 1983), and that such a wall would provide increased resistance to parasites (Raven and Waite 2004). The frustule promotes the enzymatic conversion of HCO3− to CO2 by acting as a pH buffer, thus aiding in inorganic carbon acquisition (Milligan and Morel 2002). Due to the porous patterns, frustules are able to influence the propagation of light into the cell, thus providing protection against harmful UV light (De Tommasi et al. 2018) and aiding in the acquisition of light for photosynthesis (Goessling et al. 2018). Due to the intricate architectures with ridges and pores, diatom frustules have extraordinary mechanical properties exhibiting the highest specific strength of any known biological material (Aitken et al. 2016). It has been hypothesized that the frustule acts as armor in the defense against zooplankton predators like copepods (Hamm et al. 2003). At the same time, the architecture, rigidity, and chemistry of the frustule also pose challenges. They impose a unique mode of cell division and growth (see Chap. “Cellular Hallmarks and Regulation of the Diatom Cell Cycle”) during which the contact zone of the two halves of the frustule needs to be repeatedly loosened and re-sealed. Furthermore, most aqueous environments are highly under-saturated with respect to silica (saturation concentration: 2 mM at pH 7), and thus the frustule is thermodynamically driven to dissolve. The following section deals with the biomolecules that appear to play a role in coping with the challenges imposed by the chemistry and architecture of the frustule.
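
The thermodynamic driving force for dissolution can be estimated from the degree of under-saturation. The sketch below is a rough illustration that assumes ideal dilute-solution behavior and uses the 2 mM value quoted above as the saturation concentration. The ambient silicic acid concentration of 10 μM and the temperature of 25 °C are hypothetical example values, not data from this chapter.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def saturation_ratio(ambient_molar, saturation_molar=2e-3):
    """Ratio of ambient silicic acid concentration to the saturation value."""
    if ambient_molar <= 0 or saturation_molar <= 0:
        raise ValueError("concentrations must be positive")
    return ambient_molar / saturation_molar


def dissolution_free_energy(ambient_molar, saturation_molar=2e-3,
                            temperature_k=298.15):
    """Free energy change (J/mol) for silica -> dissolved silicic acid.

    Ideal-solution estimate: dG = R * T * ln(C / C_sat).
    A negative value means dissolution is favorable.
    """
    ratio = saturation_ratio(ambient_molar, saturation_molar)
    return GAS_CONSTANT * temperature_k * math.log(ratio)


# Hypothetical ambient concentration of 10 micromolar silicic acid.
ratio = saturation_ratio(10e-6)
delta_g = dissolution_free_energy(10e-6)

print(f"saturation ratio: {ratio:.4f}")
print(f"free energy of dissolution: {delta_g / 1000:.1f} kJ/mol")
```

For any ambient concentration below saturation the estimate is negative, which is the quantitative form of the statement that the frustule is thermodynamically driven to dissolve.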

4.1 Preventing Silica Dissolution

The dissolution of silica from frustules is enhanced by bacteria due to the secretion of exoproteases (Bidle and Azam 1999). It seems evident that this effect is caused by the degradation of frustule-associated proteins. Likely targets of bacterial exoproteases are the frustulins, which are Ca2+-binding glycoproteins that appear to be present in all diatom species (Kröger et al. 1994, 1996; Nemoto et al. 2020). The characteristic feature of frustulins is the presence of multiple ACR domains (for acidic, cysteine-rich), each containing a conserved C-E/Q-G-D-C-D motif. In C. fusiformis, frustulins form a seemingly continuous coat around all parts of the biosilica (Kröger et al. 1997), and their Ca2+-affinity is required for attachment to the frustule. Frustulins are acidic proteins and thus might impede silica dissolution by neutralizing OH− ions near the silica surface that would otherwise catalyze the cleavage of Si-O-Si bonds. Although this scenario is speculative, the constitutive synthesis and incorporation of frustulins into the frustule (Kröger et al. 1994) is consistent with a neutralizing function.
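
The conserved C-E/Q-G-D-C-D motif translates directly into a sequence pattern, which is how such a motif would typically be located in a protein sequence. The following is a minimal sketch using Python's standard `re` module. The function name and the test sequence are invented for illustration and do not correspond to any real frustulin, and a motif hit alone does not establish the presence of a complete ACR domain.

```python
import re

# C, then E or Q, then G-D-C-D (one-letter amino acid codes).
ACR_MOTIF = re.compile(r"C[EQ]GDCD")


def find_acr_motifs(protein_sequence):
    """Return the 0-based start positions of all motif matches."""
    sequence = protein_sequence.upper()
    return [match.start() for match in ACR_MOTIF.finditer(sequence)]


# Hypothetical sequence with two motif copies and one near miss (CAGDCD).
example_sequence = "MKACEGDCDPSTCQGDCDAACAGDCD"
motif_positions = find_acr_motifs(example_sequence)

print(motif_positions)
```

Counting the hits per sequence gives a quick first indication of how many ACR domains a candidate protein might contain, which would then need to be confirmed by inspecting the full domain.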

Other proteins that encase large parts of frustules are the tyrosine-rich protein AC3362 from A. coffeaeformis (Buhmann et al. 2014) and frustule-associated glycoproteins, termed FACs, from Craspedostauros australis (formerly Stauroneis decipiens) (Lind et al. 1997). The sequences of FACs have not yet been determined, but studies with monoclonal antibodies against FACs pointed to their potential role in adhesion to surfaces (Lind et al. 1997) (see Chap. “Diatom Adhesion and Motility”). Currently there is no evidence for FACs playing a role in protecting the frustule from silica dissolution. AC3362 is a basic rather than acidic protein, and it is part of the ammonium fluoride insoluble matrix in A. coffeaeformis (Buhmann et al. 2014). The protein does not appear to be present in any other diatoms (Buhmann et al. 2014), and is thus unlikely to be involved in a function that is as generic as protecting frustules from silica dissolution. In this context, it is interesting to speculate that silaffins and LCPA, which are believed to be general components of the biosilica-forming machinery (see above), might also play a role in impeding silica dissolution. Mixtures of silaffins and LCPA strongly promote silica formation from silicic acid in vitro (see above), which means that they decrease silica solubility. It will be challenging to prove a component’s involvement in preventing silica dissolution in vivo as such an activity would be difficult to distinguish from promoting silica biogenesis.

4.2 Regulating Epitheca-Hypotheca Contact in the Overlap Region

A functionally very interesting part of the frustule is the region where the epitheca overlaps the hypotheca (Fig. 7a). During the cell cycle the composition of the overlap region undergoes stepwise changes as explained in the following. Immediately before cell division, each theca is equipped with its maximum number of girdle bands and the terminal girdle bands (called pleural bands) face each other (Fig. 7a). During cell division (i.e. cytokinesis and valve biogenesis, see Fig. 1 in Chap. “Cellular Hallmarks and Regulation of the Diatom Cell Cycle”) the contact between the pleural bands is broken and the pleural band of each theca establishes contact with the edge of the newly formed valve (Fig. 7a). Subsequently, the phase of cell growth starts with adding the first girdle band to the new valve. This requires breaking the pleural band-valve contacts and establishing contact between the new girdle band and the pleural band of the epitheca. Cell growth continues in a stepwise manner by successively adding new girdle bands to the hypotheca, and each time the epitheca pleural band exchanges its contact partner in the growing girdle band region of the hypotheca (Fig. 7b). These modes of cell division and growth ensure that the protoplast is always fully encased by silica. It appears that at any stage of frustule development epitheca and hypotheca are glued together, because the frustule remains intact even when the protoplast is removed by extraction with detergent (e.g. 2% SDS at 95 °C). Upon acid treatment, girdle bands become completely separated from each other and from the valve, suggesting that connections between the silica elements in an intact frustule are achieved via biopolymers that are sensitive to acid hydrolysis, like proteins and polysaccharides.

Fig. 7

Contacts between epitheca and hypotheca during the cell cycle. (a, b) Schematic diatom cells are shown in cross section. For simplicity the protoplasts are not shown. Frustule components are shown in gray and in color. Green: girdle bands in the overlap region before exocytosis of a new girdle band/valve; blue: newly formed girdle band/valve; red lines: molecules mediating epitheca–hypotheca contact. (a) Valve formation, (b) girdle band formation. (c) Model of the arrangement and interactions of pleuralins, frustulins, and silaffins in the overlap region of the C. fusiformis frustule (from De Sanctis et al. 2016). Reprinted with permission. The 3D structure of the PSCD4 domain from pleuralin-1 is shown in the upper right corner. Disulfide bonds are shown in orange and the electrostatic surface potentials are red for negative charge and blue for positive charge

Indeed, chitin and specific types of frustule-associated proteins, p150 and pleuralins, have been localized to the overlap region in diatoms using fluorescent probes and antibodies, respectively (Durkin et al. 2009; Wustmann et al. 2020; Kröger et al. 1997; Kröger and Wetherbee 2000; Davis et al. 2005). Chitin was particularly abundant in the overlap region when the cells were stationary due to nutrient limitation or silicon starvation (Durkin et al. 2009; Wustmann et al. 2020). The chitin in the overlap region is likely associated with p150, which is an acidic, 150 kDa protein that contains three chitin binding sites (Davis et al. 2005). Like for chitin, the abundance of p150 strongly increases upon silicon starvation, but in contrast to chitin it is not restricted to the overlap region and appears to be associated with all girdle bands (Davis et al. 2005). The increased abundance of chitin and p150 in the overlap region of stressed, stationary cells suggests that these two components are necessary to stabilize the epitheca–hypotheca contact. Stationary cells neither produce girdle bands nor valves and thus no temporary loosening of the epitheca–hypotheca contact is necessary until the conditions become favorable for cell growth.

Pleuralins are pleural band-associated proteins that have so far only been identified in C. fusiformis (Kröger et al. 1997). Their characteristic feature is the presence of multiple PSCD domains (each ~100 amino acids), each containing five disulfide bridges with proline, serine, cysteine, and aspartate constituting ~50% of the amino acids (Kröger et al. 1997). Due to the high proline content, the PSCD domain is almost completely devoid of canonical secondary structures and has a loosely packed structure (Fig. 7c) (De Sanctis et al. 2016). The localization of pleuralins in nutrient stressed or stationary cells has not been investigated, but their locations at different stages of the cell cycle have been determined using immunogold electron microscopy. This revealed that pleuralins are located only at the pleural bands of the epitheca and completely absent from the hypotheca (Kröger and Wetherbee 2000). During cell division, pleuralins are secreted into the extracellular space between the two daughter cells from where they bind to the pleural bands of the hypotheca (Kröger and Wetherbee 2000). At this stage the hypotheca is primed to become the epitheca of one of the daughter cells as soon as a new valve has been produced (Fig. 7a). It was hypothesized that the presence of pleuralins in C. fusiformis frustules signals completion of the girdle band region and that the theca is now prepared to function as an epitheca (Kröger and Wetherbee 2000). NMR studies have shown that a recombinant PSCD domain interacts with frustulins and silaffins (De Sanctis et al. 2016), which suggests that pleuralins may be directly involved in connecting the pleural bands of the epitheca with the hypotheca (Fig. 7c). The observations that Ca2+ ions induce conformational changes in the PSCD domains (De Sanctis et al. 2016) and that frustulin binding to the frustule requires Ca2+ (Kröger et al. 1994) spur the speculation that Ca2+ concentration changes in the overlap region trigger reversible binding and unbinding between epitheca and hypotheca. However, at this point it is not even known whether pleuralin–frustulin interactions occur across the overlap region or only within the same theca.

4.3 Achieving High Mechanical Performance

A key function of the frustule is mechanical protection against zooplankton grazers, which is believed to provide diatoms with a selective advantage over other algae and to be at least partially responsible for their superior productivity in the natural environment (Hamm et al. 2003). Indeed, under laboratory conditions the copepod Parvocalanus crassirostris ingested T. weissflogii cells that had frustules with strongly reduced silica content at a 2–3 times higher rate than control T. weissflogii cells that had ~3 times higher silica content (Liu et al. 2016). In the natural environment, diatoms have to compromise between the silica content of the cell wall and the need to stay afloat in the photic zone (for planktonic diatoms), as well as the need to restrict the time for silicic acid uptake to still enable high duplication rates. Considering the importance of the armor function of the cell wall, one would expect genes that are crucial for the mechanical performance of frustules to be highly conserved among all diatoms. In agreement with this are the assumed function of frustulins (silica dissolution inhibitors) and their seemingly ubiquitous presence among the diatoms (see above). Sin1, which is present in the SDV membrane and incorporated into the biosilica (see above), is the only other highly conserved frustule protein known so far. Sin1 knockout mutants of T. pseudonana exhibited normal growth rates under laboratory conditions, yet they had reduced biosilica content (70% of wild type) and displayed relatively mild morphological aberrations in their valves (Görlich et al. 2019). Nanoindentation measurements revealed a 3.5-fold reduced strength and stiffness of the frustules of the sin1 knockout mutants (Görlich et al. 2019). Further investigations are required to clarify whether the reduced silica content alone or also the morphological aberrations in the valve are responsible for the decreased mechanical performance of the frustules in sin1 knockout mutants.

5 Conclusion

Diatom cell walls have unique architectures ranging from the patterns of nano- to microsized pores, ribs, tubes, and spines to the bipartite arrangement of their shells. However, diatoms are not the only organisms capable of making and shaping silica. Silica is the taxonomically most widespread biomineral, being produced by a plethora of organisms from all eukaryotic supergroups (Marron et al. 2016). Biosilica formation also exists among prokaryotes (Motomura et al. 2016). This raises the question of whether all silica forming organisms have inherited this physiological capability from a common ancestor, or whether it has evolved multiple times independently. It is still too early to answer this question, as the molecular underpinnings of silica biogenesis have so far only been studied in sufficient detail in diatoms and sponges. Remarkably, both types of organisms employ completely different sets of proteins for this process (Otzen 2012), yet they do share the use of LCPA (Matsunaga et al. 2007). Sponges also contain polyamine-modified biomacromolecules, but it is not known whether these are proteins or other acid hydrolysis-sensitive biopolymers (Matsunaga et al. 2007). Furthermore, polyamine-modified lysines, identical to those from silaffins, have been found in the silica scales of a haptophyte alga (Durak et al. 2016). Linear and branched oligo-propyleneimine chains are also present in bacterial and archaeal species, yet none of these are known to produce silica (Knott 2009). It is therefore tempting to speculate that the first silica forming organisms adapted preexisting LCPA molecules, which originally had other functions, for the novel purpose of controlling cellular silica deposition in a silicic acid-saturated environment. Different groups of silica forming organisms might then have independently evolved proteins to produce silica morphologies for adaptation to different lifestyle requirements.

To enable thorough investigation of the evolutionary history of silica formation in diatoms, in-depth proteomics analyses of diatom frustules and SDVs are required along with the genome sequences from a wide variety of silica forming organisms. Furthermore, the enzyme machineries for LCPA biosynthesis need to be identified and compared among diatoms and other silica forming organisms.

Why is it important to study biological silica formation, and why in diatoms? As is detailed in several other chapters of this book, diatoms are key primary producers in the oceans, and their metabolism and proliferation strictly depend on their ability to biosynthesize silica. Therefore, understanding the mechanism of diatom silica formation and function, and the factors that influence them, is crucial for understanding fluxes of matter and energy in ocean ecosystems. Furthermore, the mechanisms for genetically controlled biomorphogenesis of mineralized structures are, so far, largely unknown for any biomineral forming organism. Diatoms produce biominerals with complex structures that exhibit species-specific characteristics on the nano-, meso-, and microscale (from ~10 nm up to ~100 μm), yet they are biologically simple (single-celled) and readily amenable to genetic manipulation. This makes them prime model systems for studying a cellular machinery that translates a genetic blueprint into a 3D mineral architecture. Finally, the relatively young research field of diatom nanotechnology develops methods for utilizing diatom silica, with its exceptional structures and properties, for many applications ranging from energy harvesting to biomedicine (Losic 2017). A deep understanding of the silica formation mechanisms would enable the generation of engineered diatoms that produce silica-based materials with tailor-made structures and properties (Kröger and Brunner 2014; Kröger et al. 2018). This would represent an entirely new paradigm for materials production through a carbon-neutral and renewable biotechnological process.