Introduction

Olivine, (Mg,Fe)2SiO4, is a nominally anhydrous mineral (NAM), but may contain trace amounts of H+ bonded to structural O2− forming OH groups. This H in olivine and other NAMs is generally quantified as wt. ppm H2O and colloquially described as 'water' given that the relevant thermodynamic variable describing its incorporation is the fugacity of water (fH2O, Kohlstedt 2006). Even at low concentrations (up to hundreds of wt. ppm H2O), water incorporated into olivine, and its magnesian end-member forsterite (Mg2SiO4), can exert considerable control over the physical and chemical properties of the crystals. Small amounts of water can lead to significant changes in strength (“hydrolytic weakening”) (Demouchy et al. 2012; Girard et al. 2013; Fei et al. 2013; Karato and Jung 2003; Mei and Kohlstedt 2000a, 2000b; Karato et al. 1986; Mackwell et al. 1985), texture development (Jung and Karato 2001; Karato et al. 2008), diffusion of cations (Fei et al. 2013, 2018; Costa and Chakraborty 2008; Fei and Katsura 2016), conductivity (Fei et al. 2018; Wang et al. 2006; Sun et al. 2019), elasticity (Zhang and Xia 2021; Jacobsen et al. 2008; Mao et al. 2010) and melting behaviour (Ueki et al. 2020).

The ability to predict and describe these changes to physical properties relies upon the ability to predict the point defect structure of hydrogen in olivine under various conditions. The mechanisms by which hydrogen can be incorporated into olivine have thus been the subject of considerable experimental study (Matveev et al. 2001; Le Losq et al. 2019; Berry et al. 2005, 2007a; Tollan et al. 2018, 2017; Lemaire et al. 2004; Mosenfelder et al. 2006, 2011; Padron-Navarta et al. 2014; Blanchard et al. 2017; Jollands et al. 2021). There are also studies of the distribution of water in natural rocks (see for example Demouchy and Bolfan-Casanova (2016)) though interpretation of these is often difficult as, firstly, they typically only sample the uppermost mantle and, secondly, the high diffusivity of H means that rapid changes to the hydrogen concentration and its distribution could occur during magmatic ascent (Karato et al. 2008; Demouchy and Bolfan-Casanova 2016). Despite many complexities, in general, four types of hydrated point defects have been observed. These are (using Kröger-Vink notation): \((2{\rm H})_{\rm Mg}^{\times}\), \((4{\rm H})_{\rm Si}^{\times}\), \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) and \({\left\{{\text{R}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^{\prime}\right\}}^{\times}\) where R is a trivalent atom such as Fe(III) or Al(III) (Blanchard et al. 2017; Berry et al. 2007a). While some clear trends emerge from the literature, such as \(\text{(4H}{)}_{\text{Si}}^{\times}\) appearing to be favoured at high pressures (> ~ 2 GPa) (Smyth et al. 2006; Xue et al. 2017; Withers and Hirschmann 2008; Mosenfelder et al. 2006), there is no clear overall function describing how hydrogen is distributed in olivine across the upper mantle. 
Each experiment is performed in a limited range of conditions that generally do not cover the entire mantle range and it is hard to determine from the experimental results how different parameters such as pressure, temperature and the amount of water interact with each other to control the distribution of water within the olivine. It is also important to know not just which are the most favoured defects but the concentrations of all H-bearing defects to precise levels. This is because properties such as the diffusion of other cations (e.g., Mg2+ diffusion in the presence of \(\text{(2H}{)}_{\text{Mg}}^{\times}\)) can be affected by defects that are present even at parts-per-trillion levels due to the low concentration of intrinsic defects (Muir et al. 2020).

This problem can be addressed by building a thermodynamic model of hydrogen distribution in forsterite. Using ab-initio calculations the energy of different H-bearing and H-free defects can be determined over a wide range of conditions. Then, thermodynamic relationships can be used to determine which defects are present at what conditions. Such an approach has previously been used to consider the relationship between hydrated Si and Mg vacancies (Walker et al. 2007; Qin et al. 2018) but has not been extended to consider all likely H-bearing defects across a range of mantle conditions. In this work, we build such a thermodynamic model and probe the distribution of hydrogen in forsterite in the presence of Ti and Al. This incorporates two of the important trace elements in olivine (De Hoog et al. 2010) but excludes some potentially important elements, most notably Fe. This is because Fe introduces numerous complications to the thermodynamics and the simpler Fe-free system needs to be understood before additional complications can be considered.

Methods

General method

We calculated the Gibbs free energy change associated with reactions involving a wide range of hydrous and anhydrous point defects in forsterite to construct a thermodynamic model of point defect distribution as a function of pressure, temperature, and chemistry. The enthalpies of isolated defects were evaluated using density functional theory (DFT) while lattice dynamics was used to determine the vibrational entropies. With the addition of analytically determined configurational entropy, these atomic scale calculations allowed us to calculate the free energy change across a series of reactions and use these to build a thermodynamic model. Minimising the Gibbs free energy at any particular set of conditions allows the equilibrium defect distribution under those conditions to be determined.

Water could incorporate into forsterite via a reaction such as:

$${\text{Mg}}_{\text{Mg}}^{\times}+{\text{H}}_{2}\text{O} + \frac{1}{{2}}{\text{Si}}{\text{O}}_{2}\text{=}{\text{(2H)}}_{\text{Mg}}^{\times}+{\frac{1}{{2}}{\text{Mg}}}_{2}{\text{Si}}{\text{O}}_{4}$$
(R0)

This produces defects that bear hydrogen. We shall from now on consider water in forsterite as 'H-bearing' defects such as \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and refer to them as such. Defects lacking H are referred to as 'H-free'.

The possible H-bearing defects examined in this system are \(\text{(2H}{)}_{\text{Mg}}^{\times}\), \(\text{(4H}{)}_{\text{Si}}^{\times}\), \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\), \({\text{H}}_{\text{Mg}}^{\prime}\) (with an \({\text{Al}}_{\text{Mg}}^{\cdot}\) or free), \({\text{H}}_{\text{i}}^{\cdot}\) (interstitial hydrogen bound to an O1, O2 or an O3 oxygen with a neighbouring \({\text{Al}}_{\text{Si}}^{\prime}\) or free) and \(\text{(3H}{)}_{\text{Si}}^{\prime}\) (with an \({\text{Al}}_{\text{Mg}}^{\cdot}\) or free) whereas the important H-free defects are \({\text{V}}_{\text{Mg}}^{\prime\prime}\), \({\text{Mg}}_{\text{i}}^{\cdot\cdot}\) (interstitial magnesium in I1 or I2 sites), \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\), \({\text{Al}}_{\text{Mg}}^{\cdot}\), \({\text{Al}}_{\text{Si}}^{\prime}\) and \({\text{Ti}}_{\text{Si}}^{\times}\).

Determining the parameters of R0 is difficult due to the presence of free water. Determining the free energy of water with sufficient accuracy at high temperature and high pressure is difficult in a DFT framework due largely to its high degree of hydrogen bonding. Instead, the favoured incorporation mechanism for H in forsterite can be determined by calculating the energy differences between the different incorporation mechanisms. This was done for H-bearing defects, intrinsic defects and some selected H-free extrinsic defects. The following reactions were present in the model (all presented in Kröger-Vink notation):

$$\text{2(2H}{)}_{\text{Mg}}^{\times}+{\text{Si}}_{\text{Si}}^{\times}+{\text{3Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow \text{(4H}{)}_{\text{Si}}^{\times}+ \text{2} {\text{Mg}}_{\text{Mg}}^{\times}+ \text{4MgSi} {\text{O}}_{3}$$
(R1)
$$\text{(2H}{)}_{\text{Mg}}^{\times}+{\text{V}}_{\text{Mg}}^{\prime\prime} \rightarrow \text{2}{\text{H}}_{\text{Mg}}^{\prime}$$
(R2)
$$\text{(2H}{)}_{\text{Mg}}^{\times} \rightarrow \text{}{\text{V}}_{\text{Mg}}^{\prime\prime}+{\text{2H}}_{\text{i}}^{\cdot}$$
(R3)
$$\text{(2H}{)}_{\text{Mg}}^{\times}+{\text{Ti}}_{\text{Si}}^{\times} \rightarrow {\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}$$
(R4)
$$\text{3(2H}{)}_{\text{Mg}}^{\times}+{\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}+{\text{Si}}_{\text{Si}}^{\times}+ \text{3} {\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow \text{2}{\text{Mg}}_{\text{Mg}}^{\times}+{\text{2Al}}_{\text{Mg}}^{\cdot}\text{+2(3H}{)}_{\text{Si}}^{\prime}+ \text{4MgSi} {\text{O}}_{3}$$
(R5)
$${\text{Al}}_{\text{Mg}}^{\cdot}+(3H)_{\text{Si}}^{\prime} \rightarrow \{{\text{Al}}_{\text{Mg}}^{\cdot}\text{(3H)}_{\text{Si}}^{\prime}\}^{\times}$$
(R6)
$$\text{(2H}{)}_{\text{Mg}}^{\times}+{\text{Si}}_{\text{Si}}^{\times}+ \text{2} {\text{Mg}}_{\text{Mg}}^{\times}+{\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}+ \text{4MgSi} {\text{O}}_{3} \rightarrow {\text{2Al}}_{\text{Mg}}^{\cdot}+ \text{2} {\text{H}}_{\text{Mg}}^{\prime}+{\text{Si}}_{\text{Si}}^{\times}+ \text{3} {\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4}$$
(R7)
$${\text{Al}}_{\text{Mg}}^{\cdot}+{\text{H}}_{\text{Mg}}^{\prime} \rightarrow \{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^{\prime}\}^{\times}$$
(R8)
$$\text{(2H}{)}_{\text{Mg}}^{\times}+{\text{Si}}_{\text{Si}}^{\times}+{\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}+ \text{3} {\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow \text{2}{\text{Al}}_{\text{Si}}^{\prime}+ \text{2} {\text{H}}_{\text{i}}^{\cdot}+ \text{2} {\text{Mg}}_{\text{Mg}}^{\times}+ \text{4MgSi} {\text{O}}_{3}$$
(R9)
$${\text{Al}}_{\text{Si}}^{\prime}+{\text{H}}_{\text{i}}^{\cdot} \rightarrow \{{\text{Al}}_{\text{Si}}^{\prime}{\text{H}}_{\text{i}}^{\cdot}\}^{\times}$$
(R10)

H-free extrinsic reactions

$${\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}+{\text{2Mg}}_{\text{Mg}}^{\times}+ \text{4MgSi} {\text{O}}_{3} \rightarrow \text{2}{\text{Al}}_{\text{Mg}}^{\cdot}+{\text{V}}_{\text{Mg}}^{\prime\prime}+{\text{Si}}_{\text{Si}}^{\times}+ \text{3} {\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4}$$
(R11)
$${2}{\text{Al}}_{\text{Mg}}^{\cdot}+{\text{V}}_{\text{Mg}}^{\prime\prime} \rightarrow {\left\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Mg}}^{\cdot}{\text{V}}_{\text{Mg}}^{\prime\prime}\right\}}^{\times}$$
(R12)
$${\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime} \rightarrow \{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Si}}^{\prime}\}^{\times}$$
(R13)

H-free intrinsic reactions

$${\text{Mg}}_{\text{Mg}}^{\times} \rightarrow {\text{V}}_{\text{Mg}}^{\prime\prime}+{\text{Mg}}_{\text{i}}^{\cdot\cdot}$$
(R14)
$${\text{O}}_{\text{O}}^{\times} \rightarrow {\text{V}}_{\text{O}}^{\cdot\cdot}{+ \text{O} }_{\text{i}}^{\prime\prime}$$
(R15)
$${\text{Si}}_{\text{Si}}^{\times} \rightarrow {\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}+{\text{Si}}_{\text{i}}^{\cdot\cdot\cdot\cdot}$$
(R16)
$${2}{\text{V}}_{\text{Mg}}^{\prime\prime}+{\text{Si}}_{\text{Si}}^{\times}+ \text{3} {\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow {\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}+ \text{2} {\text{Mg}}_{\text{Mg}}^{\times}+ \text{4MgSi} {\text{O}}_{3}$$
(R17)
$${\text{Mg}}_{\text{Mg}}^{\times}+{\text{O}}_{\text{O}}^{\times}+{\text{MgSi}}{\text{O}}_{3} \rightarrow {\text{V}}_{\text{Mg}}^{\prime\prime}{+{\text{V}}}_{\text{O}}^{\cdot\cdot}+{\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4}$$
(R18)
$${\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow {\text{Mg}}_{\text{i}}^{\cdot\cdot}+{\text{O}}_{\text{i}}^{\prime\prime}+ \text{MgSi} {\text{O}}_{3}$$
(R19)
$${\text{Si}}_{\text{Si}}^{\times}+ \text{2} {\text{O}}_{\text{O}}^{\times}+{\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow {\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}+ \text{2} {\text{V}}_{\text{O}}^{\cdot\cdot}+ \text{2MgSi} {\text{O}}_{3}$$
(R20)
$${\text{Si}}_{\text{Si}}^{\times}+{\text{2Mg}}_{\text{Mg}}^{\times}+ \text{4MgSi} {\text{O}}_{3} \rightarrow {{{2}{\text{V}}}_{\text{Mg}}^{\prime\prime}+ \text{Si} }_{\text{i}}^{\cdot\cdot\cdot\cdot}+{\text{Si}}_{\text{Si}}^{\times}+{\text{3Mg}}_{2}{\text{Si}}{\text{O}}_{4}$$
(R21)
$${\text{Si}}_{\text{Si}}^{\times}+{\text{3Mg}}_{2}{\text{Si}}{\text{O}}_{4} \rightarrow {\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}+ \text{2} {\text{Mg}}_{\text{i}}^{\cdot\cdot}+ \text{4MgSi} {\text{O}}_{3}$$
(R22)
$${\text{2Mg}}_{\text{Mg}}^{\times}+{\text{Si}}_{\text{Si}}^{\times}+{\text{4O}}_{\text{O}}^{\times} \rightarrow {{2}{\text{V}}}_{\text{Mg}}^{\prime\prime}+{\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}+{{4}{\text{V}}}_{\text{O}}^{\cdot\cdot}+{\text{Mg}}_{2}{\text{Si}}{\text{O}}_{4}$$
(R23)

Buffer reaction

$${\text{Mg}}_{2}\mathrm{Si}{\text{O}}_{4} \rightarrow \text{MgO+MgSi}{\text{O}}_{3}$$
(R24)

One important note is that we allowed the number of sites in forsterite to vary if forsterite is created or destroyed (such as in R1 and R7), as explained in Muir et al. (2020), but this has no noticeable effect on the results. In this formulation, water starts (arbitrarily) as \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and then reacts until it reaches its thermodynamically favoured distribution across the various H-bearing sites. In our model, the starting distribution of hydrogen does not matter: the free energy is a state function, so the thermodynamic end point is the same regardless of the path taken to reach it. In a real system, the starting distribution of hydrogen, and thus the source of hydrogen, does matter because of kinetic considerations, but we do not consider kinetics in our model and instead assume that the long timespans of the mantle ensure thermodynamic equilibrium is reached. All reactions have been written as charge balanced to ensure that the overall cell maintains electrostatic neutrality.
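The charge-balance requirement can be checked mechanically. The following is a minimal Python sketch, not part of the original workflow: the species labels, the `EFFECTIVE_CHARGE` table and the helper names are illustrative assumptions, and only R2, R3 and R14 are encoded.

```python
# Effective (Kroger-Vink) charges of the species used in the example reactions.
# Host-lattice species on their own sites carry zero effective charge.
# The labels are hypothetical shorthand, not notation from the paper.
EFFECTIVE_CHARGE = {
    "(2H)_Mg": 0,   # (2H)_Mg^x
    "V_Mg": -2,     # V_Mg''
    "H_Mg": -1,     # H_Mg'
    "H_i": +1,      # H_i^.
    "Mg_i": +2,     # Mg_i^..
    "Mg_Mg": 0,     # Mg_Mg^x
}


def net_charge(side):
    """Sum effective charges for one side, given {species: coefficient}."""
    return sum(coeff * EFFECTIVE_CHARGE[species] for species, coeff in side.items())


def is_charge_balanced(reactants, products):
    """True when both sides of a reaction carry the same net effective charge."""
    return net_charge(reactants) == net_charge(products)


# (reactants, products) for three reactions from the text.
R2 = ({"(2H)_Mg": 1, "V_Mg": 1}, {"H_Mg": 2})
R3 = ({"(2H)_Mg": 1}, {"V_Mg": 1, "H_i": 2})
R14 = ({"Mg_Mg": 1}, {"V_Mg": 1, "Mg_i": 1})

if __name__ == "__main__":
    for name, (lhs, rhs) in {"R2": R2, "R3": R3, "R14": R14}.items():
        print(name, "balanced:", is_charge_balanced(lhs, rhs))
```

The same check extends to the remaining reactions by adding their species to the table; buffer phases such as Mg2SiO4 and MgSiO3 would enter with zero charge.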

The mantle is more closely represented by an enstatite buffer and so all reactions have been written in a system where MgSiO3 is present. These can be converted to a system where MgO is present (or any arbitrary aSiO2 value) by adding the energy of R24 in appropriate amounts.

Al was placed initially as an unbound pair of \({\text{Al}}_{\text{Mg}}^{\cdot}\) and \({\text{Al}}_{\text{Si}}^{\prime}\). Ti was initially placed as a 4+ cation replacing Si, which is likely the major substitution mechanism of Ti in H-free forsterite (Berry et al. 2007b; Hermann et al. 2005). Defects in braces {} are locally associated with each other to form a neutral complex; this is represented in our model by placing them on adjacent sites. The concentrations of intrinsic defects are so low that the configurational entropy gain (which reduces the free energy) from randomly placing them in the crystal is much larger than the enthalpy loss (which also reduces the free energy) due to electrostatics that accompanies association of the defects (Muir et al. 2020). As an example, the pairing energy of {\({\text{V}}_{\text{Mg}}^{\prime\prime}{\text{Mg}}_{\text{i}}^{\cdot\cdot}\)}, the Mg Frenkel pair, was predicted to be 1.9 eV. In a pure system at 1500 K, the enthalpy loss of binding these two defects only exceeds the entropy gain of unbinding them at pair concentrations greater than 3 ppm defects/f.u., which is much larger than any Frenkel defect concentration that we predict. Thus, the intrinsic defects were always treated as isolated defects. For extrinsic defects that are produced in charge-charge pairs at much greater concentrations, we allowed these to form charge-charge associated pairs if thermodynamically favourable through the reactions R6, R8, R10, R12 and R13.

Determining the energy of defective systems

Twenty-four defect-forming reactions are presented above. The aim was to find the concentration of defects that provides the lowest possible energy. Each reaction was assigned a reaction vector (\({x}_{1}\)…\({x}_{24}\)) between 0 and 1 which determines how far each reaction proceeds to the right between all reactants and all products. For any combination of \({x}_{1}\)–\({x}_{24}\), we solved for the free energy (G) and the thermodynamic equilibrium was where this free energy is minimised. Solving for the free energy consists of two parts: the non-configurational energy and the configurational entropy (\({S}_{conf}\)):

$$\Delta G=\sum_{i=1}^{i=24}\Delta {E}_{i}{x}_{i}-T{S}_{conf}$$
(1)

The first half of Eq. 1 involves multiplying the energy (E) of each reaction at the appropriate pressure (P) and temperature (T) by its reaction vector to obtain the non-configurational energy. To determine the energy of each reaction, we calculated the energy of each term in each reaction at a series of P and T points using CASTEP and the Quasi-Harmonic Approximation (QHA) with the details given in the Supplementary Methods. The second half of Eq. 1 involves finding \({S}_{conf}\) for any collection of defects.

\({S}_{conf}\) has many different components and its determination is not straightforward. We used the Gibbs entropy formula:

$$S=-{k}_{B}\sum_{j}{p}_{j}\ln{p}_{j}$$
(2)

where kB is the Boltzmann constant, j represents a specific configuration of defects and pj the probability that that configuration occurs. A configuration refers to a possible way in which defects are arranged across the supercell with a given concentration.

The probability of any specific configuration occurring is:

$${p}_{j}=\frac{1}{Z}{e}^{(-{U}_{j}/{k}_{B}T)}$$
(3)

where Uj is the internal energy of each configuration. Z in Eq. 3 is the canonical partition function:

$$Z=\sum_{j}{e}^{(-{U}_{j}/{k}_{B}T)}$$
(4)
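As an illustration of Eqs. 2 to 4, the short Python sketch below evaluates Boltzmann probabilities and the Gibbs entropy for a small list of configuration energies. The function names and the example energies are assumptions made for illustration only; they are not values or code from this study.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_probabilities(energies_ev, temperature_k):
    """Probabilities p_j = exp(-U_j / k_B T) / Z for a list of energies in eV."""
    beta = 1.0 / (K_B * temperature_k)
    # Shifting by the lowest energy leaves p_j unchanged but avoids overflow.
    u_min = min(energies_ev)
    weights = [math.exp(-(u - u_min) * beta) for u in energies_ev]
    z = sum(weights)  # partition function of the shifted energies
    return [w / z for w in weights]


def gibbs_entropy(probabilities):
    """S = -k_B * sum_j p_j ln p_j, in eV/K (terms with p_j = 0 contribute 0)."""
    return -K_B * sum(p * math.log(p) for p in probabilities if p > 0.0)


if __name__ == "__main__":
    # Hypothetical relative energies (eV) of four configurations.
    example_energies = [0.0, 0.05, 0.05, 0.30]
    p = boltzmann_probabilities(example_energies, 1500.0)
    print("probabilities:", p)
    print("entropy (eV/K):", gibbs_entropy(p))
```

When all configurations have the same energy, the entropy reduces to k_B ln(number of configurations); when one configuration is far lower in energy than the others, it tends to zero.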

Strictly speaking, Eqs. 3 and 4 should be calculated with Gj (the free energy of each configuration) rather than Uj (the internal energy of each configuration). This approximation was made to allow us to calculate the energy of many different configurations quickly, as U is much more straightforward to calculate than G. We discuss this approximation in the supplementary information. Of the defects tested, the largest relative difference between Uj and Gj terms was found in \(\text{(4H}{)}_{\text{Si}}^{\times}\), which has geometrically very different H configurations; at 2000 K, this difference reached 0.22 eV/defect. This approximation does not change the reactions in Table 1, however, as long as the most stable defect does not change with temperature, which was found to be the case for the systems that we tested. It only changes the terms in Eqs. 2, 3 and 4 that reflect the configurational entropy of different internal geometric arrangements of each defect. These terms are proportional to the concentration of the specific defect and are therefore generally very small, so this approximation should not change the overall energy and distribution of hydrogen. We also did our calculations in the dilute limit with all systems fixed to the forsterite unit cell. This means PV terms do not vary, which reduces Hj terms to Uj (H = U + PV).

Table 1 Reaction energies (in eV/f.u) for the hydrated defect reactions (R1-R10) and some other important reactions as a function of pressure and temperature

The concentration of each defect is important because higher concentrations of each defect lead to a larger number of possible configurations (j) (see Eq. 5 below), which will increase S in Eq. 2. Therefore, we need to determine the number of configurations and the relative enthalpy Uj of each. If we consider every possible configuration, this number quickly becomes incalculable. All configurations have a large number of other configurations to which they are identical by the symmetry operations of the crystal space group. Using these relationships, we defined a scheme to group configurations into different types and thus bring the number of configurations down to a calculable number.

We shall thus define a configuration group as a set of configurations where each of the defects of each type is confined to a single type of site (such as all \({\text{V}}_{\text{Mg}}^{\prime\prime}\) on M1 rather than M2 sites) and we shall use these groups instead of individual configurations in Eqs. 1–4. This can be conceptualised by having a single defect of each type and so the different configuration groups simply change which site each defect occupies. With our assumption that defects are independent, every configuration with the defects confined to a single site is identical and contained within our configuration group and configurations where defects occupy multiple sites are included by the partition function. \({\text{V}}_{\text{Mg}}^{\prime\prime}\text{, }{\text{H}}_{\text{Mg}}^{\prime}\text{, (2H}{)}_{\text{Mg}}^{\times}\) and \({\text{Al}}_{\text{Mg}}^{\cdot}\) were confined to M1 and M2 sites, \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\text{, }{\text{Ti}}_{\text{Si}}^{\times}\text{, (4H}{)}_{\text{Si}}^{\times}\text{, (3H}{)}_{\text{Si}}^{\prime}\) and \({\text{Al}}_{\text{Si}}^{\prime}\) to Si sites, \({\text{Mg}}_{\text{i}}^{\cdot\cdot}\) to M1 (as a split interstitial (Muir et al. 2020)) and I2 sites, \({\text{V}}_{\text{O}}^{\cdot\cdot}\) and \({\text{H}}_{\text{i}}^{\cdot}\) to O1, O2 and O3 sites and \({\text{O}}_{\text{i}}^{\prime\prime}\) and \({\text{Si}}_{\text{i}}^{\cdot\cdot\cdot\cdot}\) to I1, I2 and T1–T5 sites (which are defined in Muir et al. (2020)). 
\({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\), \({\left\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Mg}}^{\cdot}{\text{V}}_{\text{Mg}}^{\prime\prime}\right\}}^{\times}\), \(\{{\text{Al}}_{\text{Mg}}^{\cdot}\text{(3H)}_{\text{Si}}^{\prime}\}^{\times}\), \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^{\prime}\}^{\times}\), \(\{{\text{Al}}_{\text{Si}}^{\prime}{\text{H}}_{\text{i}}^{\cdot}\}^{\times}\) and \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Si}}^{\prime}\}^{\times}\) were calculated as pairs/trios with each element of the defect confined to a next or second-next neighbour site. All possible geometries of these pairs/trios were tested and each was considered as a separate configuration group. For defects containing hydrogen atoms, each arrangement of hydrogen (hydrogen bound to a specific type of oxygen, pointing in/out of a vacancy) on each type of site was considered as a separate configuration group. The relative enthalpy of each configuration group was then calculated as a function of pressure. These enthalpies are presented in the supplement: Tables S1–S8 for H-bearing defects, Tables S9 and S10 for H-free defects and Table S11 for isolated defects on different crystallographic sites. For calculating the final free energy of the reactions, the most enthalpically stable configuration was chosen and its free energy calculated at high temperature. The effect of other configurations is confined to the configurational entropy term.

All possible configuration groups were then tabulated and their relative energy Uj assigned by applying energy penalties determined from the relative enthalpies in Tables S1–S11. The energy penalty is the difference between the enthalpy of the defect in its current site with its current hydrogen arrangement and the enthalpy of the defect in its favoured site with its favoured hydrogen arrangement. This assumes that the energy of placing and moving a defect around the crystal is independent of the other defects and that defect-defect interaction terms are minimal.

To calculate the degeneracy (W) of each configuration group, we must first calculate the degeneracy at each site:

$$W=\ln\frac{N!}{a!b!\dots z!}$$
(5)

where N is the total number of sites, and a, b, c, …, z are the numbers of different types of atoms/defects at each site, including a final z term, which is simply (N − a − b − … − y). To solve this numerically, all defect concentrations were written in terms of defects per mol and then the Stirling approximation was used (\(\ln n!\cong n\ln n-n\)), giving:

$$W=N\ln N-N-a\ln a+a-b\ln b+b\dots -z\ln z+z$$
(6)
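The quality of the Stirling form in Eq. 6 can be checked against the exact logarithm of the multinomial coefficient in Eq. 5. The Python sketch below does this with the standard-library `math.lgamma`; the function names and the example site and defect counts are illustrative assumptions, not quantities from this study.

```python
import math


def ln_multiplicity_exact(n_sites, occupancies):
    """Eq. 5: ln( N! / (a! b! ... z!) ), with z = N minus all listed occupancies."""
    counts = list(occupancies) + [n_sites - sum(occupancies)]
    # lgamma(n + 1) equals ln(n!) and stays finite for very large n.
    return math.lgamma(n_sites + 1) - sum(math.lgamma(c + 1) for c in counts)


def ln_multiplicity_stirling(n_sites, occupancies):
    """Eq. 6: the same quantity using ln n! ~ n ln n - n for every factorial."""
    counts = list(occupancies) + [n_sites - sum(occupancies)]

    def stirling(n):
        return n * math.log(n) - n if n > 0 else 0.0

    return stirling(n_sites) - sum(stirling(c) for c in counts)


if __name__ == "__main__":
    # Hypothetical example: 10^6 sites hosting 100 defects of one type
    # and 10 defects of another type.
    exact = ln_multiplicity_exact(10**6, [100, 10])
    approx = ln_multiplicity_stirling(10**6, [100, 10])
    print("exact:", exact, "Stirling:", approx)
    print("relative difference:", abs(approx - exact) / exact)
```

The approximation improves as the individual counts grow, which is why working with defects per mol (very large N, a, b, …) makes the neglected terms insignificant.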

Additional degeneracies from hydrogen arrangement degeneracy and the degeneracy of the bound pairs and trios were derived in a similar way and added to this term.

Knowing the degeneracy and relative energy of all configuration groups, the entropy was calculated using Eqs. 2 and 3 but summed across i, where i is simply a sum across every configuration group (j) appearing a number of times equal to its degeneracy (W). As the number/concentration of defects increases W increases and thus so does i and thus so does S in Eq. 2.

It is important to emphasise how all the reactions R1–R24 end up coupled in determining the free energy, both in this system and in defect-bearing systems in general. While the \(\sum_{i=1}^{i=24}\Delta {E}_{i}{x}_{i}\) term in Eq. 1 depends upon the energetics of each reaction in isolation, \({S}_{conf}\) depends upon the concentration of all defects simultaneously. \({S}_{conf}\) depends only upon the concentration of the defects; the reaction by which they were produced is irrelevant. The energetics of any one reaction proceeding forwards therefore depends upon \({S}_{conf}\) before and after it proceeds and thus also upon the defect concentrations that result from all other reactions. Thus, all reactions must be considered simultaneously. Considering defect-producing reactions one at a time could cause incorrect defect concentrations to be obtained.

Thermodynamic minimisation

For any pressure and temperature, the energy of each defect at those conditions was determined. This was done by projecting first along pressure and then along temperature using 2nd-order polynomials and points at 5, 10 and 15 GPa (uncorrected) and 1000, 1500 and 2000 K. The energy of each defect was then placed into the reactions found in the text and the energy of each reaction (Ei in Eq. 1) determined at those conditions. We then used a series of minimisations to find the distribution of defects that gave the lowest free energy by minimising G in Eq. 1 with respect to \({x}_{1}\)–\({x}_{24}\). In all cases, the water concentration, Ti concentration and Al concentration were fixed for each minimisation.
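The two-step projection (first along pressure, then along temperature) can be sketched with a quadratic through three points. The Python example below is a hedged illustration only: the helper names are hypothetical, the 1000 K row reuses the R1 reaction energies quoted in the Results for 0, 5 and 10 GPa, and the 1500 K and 2000 K rows are placeholder numbers invented to make the example runnable. The study itself fitted defect energies at 5, 10 and 15 GPa (uncorrected).

```python
def quadratic_through(points):
    """Return the unique 2nd-order polynomial through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(x):
        # Lagrange form of the interpolating quadratic.
        l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
        l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
        l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        return y0 * l0 + y1 * l1 + y2 * l2

    return f


def interpolate_energy(grid, pressure_gpa, temperature_k):
    """Project along pressure at each grid temperature, then along temperature.

    grid maps each of three temperatures (K) to three (pressure, energy) pairs.
    """
    at_pressure = [
        (t, quadratic_through(pe_pairs)(pressure_gpa))
        for t, pe_pairs in sorted(grid.items())
    ]
    return quadratic_through(at_pressure)(temperature_k)


# Energies in eV/f.u. Only the 1000 K row comes from the text (R1);
# the other two rows are hypothetical placeholders.
EXAMPLE_GRID = {
    1000.0: [(0.0, -1.337), (5.0, -2.139), (10.0, -3.376)],
    1500.0: [(0.0, -1.2), (5.0, -2.0), (10.0, -3.2)],
    2000.0: [(0.0, -1.0), (5.0, -1.8), (10.0, -3.0)],
}

if __name__ == "__main__":
    print(interpolate_energy(EXAMPLE_GRID, 7.5, 1250.0))
```

A quadratic through three points reproduces the grid values exactly, so the quality of the projection between grid points depends entirely on how close to quadratic the true pressure and temperature dependence is.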

Solving this minimisation is a difficult problem as we were dealing with 24 variables whose values can differ by many orders of magnitude, multiple local minima and a configurational entropy term that has many terms and is difficult to solve analytically. Thus, we developed a bespoke solver that uses a brute force technique. This simply takes each variable (\({x}_{1}\)…\({x}_{24}\)) in order of favourability (most favoured reactions first) and then increases or decreases that variable with a series of steps and continues to do so while G decreases. The step sizes begin at 1 and decrease to \(1\times {10}^{-20}\). Any steps that produce negative defect concentrations were discarded. The variables were cycled through multiple times until the stopping condition was met, namely that a full cycle of variables (\({x}_{1}\)…\({x}_{24}\)) is changed and G fails to vary by more than a cutoff, which was set to 1 peV/system. This was found to be sufficient to give consistent answers. This method relies upon the large energy differences between the different reactions. R1 (the hydrated Si production reaction) is usually the most favoured reaction followed by R4 (the titanoclinohumite production reaction). The progress of other reactions has little effect on R1 and R4, while R1 and R4 have a dominant effect on the progress of other reactions. We tested this method using a range of starting points for our minimisation (such as R1 or R4 or R7 fully to the right) and consistently arrived at the same final minimisation result. Determining concentrations of defects that are below \(1\times {10}^{-20}\) defects/f.u. proved very difficult as we encountered issues with floating point numbers and the precision of our calculations (when the other defects had much higher concentrations) and thus we used this as our baseline cutoff beyond which variables were not minimised. 
Our minimisation process does not present a formal solution and may miss a true energy minimum and small variations in the final concentration of the products. However, it should provide a good guide to how different conditions vary the concentration of the water products.
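The stepping scheme described above resembles a coordinate search with shrinking step sizes. The Python sketch below is a simplified illustration under stated assumptions, not the solver used in this work: the function names are hypothetical, the objective is a two-variable toy function standing in for G, the feasibility test is a simple 0 to 1 box standing in for the check on negative defect concentrations, variables are visited in the order given rather than by favourability, and the smallest step is far coarser than the one quoted in the text.

```python
def coordinate_search(objective, x0, is_feasible, step_sizes, tol=1e-15, max_cycles=1000):
    """Minimise objective(x) by stepping one variable at a time.

    For each variable and each step size (largest first), keep stepping up or
    down while the objective decreases. Infeasible trial points are discarded.
    Stop when a full cycle over all variables lowers the objective by <= tol.
    """
    x = list(x0)
    g = objective(x)
    for _ in range(max_cycles):
        g_at_cycle_start = g
        for k in range(len(x)):
            for step in step_sizes:
                for direction in (1.0, -1.0):
                    while True:
                        trial = list(x)
                        trial[k] += direction * step
                        if not is_feasible(trial):
                            break
                        g_trial = objective(trial)
                        if g_trial < g:
                            x, g = trial, g_trial
                        else:
                            break
        if g_at_cycle_start - g <= tol:
            break
    return x, g


def toy_objective(x):
    """Hypothetical coupled quadratic with its minimum at (0.3, 0.7)."""
    a, b = x[0] - 0.3, x[1] - 0.7
    return a * a + b * b + 0.5 * a * b


def in_unit_box(x):
    """Stand-in feasibility test: every variable must lie between 0 and 1."""
    return all(0.0 <= xi <= 1.0 for xi in x)


STEP_SIZES = [10.0 ** -n for n in range(0, 9)]  # 1, 0.1, ..., 1e-8

if __name__ == "__main__":
    best_x, best_g = coordinate_search(toy_objective, [0.0, 0.0], in_unit_box, STEP_SIZES)
    print(best_x, best_g)
```

Because each variable is adjusted with the others held fixed, several cycles are needed when variables are coupled, which is consistent with the cycling described in the text. Like the original scheme, such a search offers no guarantee of finding a global minimum.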

Units and visualisations

In this work, H concentrations are always presented as wt. ppm normalised to water molecules. The overall water concentration ([H2O]bulk in wt. ppm, 1 wt. ppm = 15.6 H/\({10}^{6}\) Si) is the sum of these concentrations for all H-bearing defects. Ti concentrations are presented as wt. ppm TiO2 and Al concentrations as wt. ppm Al2O3. To convert to wt. ppm of Ti and Al, multiply the concentrations by 0.599 and 0.53, respectively. To account for systematic errors in DFT pressure, all pressures have been corrected with a linear method outlined in the supplementary information. Unless stated, pressures are presented as corrected.
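These conversion factors follow from molar masses. The Python sketch below reproduces them as a check; the function names are hypothetical, the molar masses are standard rounded values rather than numbers taken from this study, and the H/Si conversion assumes pure Mg2SiO4 with one Si per formula unit and a water content small enough that the host mass is unchanged.

```python
# Standard molar masses in g/mol (rounded).
M_H2O = 18.015
M_FORSTERITE = 140.69  # Mg2SiO4, one Si per formula unit
M_TI, M_TIO2 = 47.867, 79.866
M_AL, M_AL2O3 = 26.982, 101.961


def h_per_million_si(wt_ppm_h2o):
    """Convert wt. ppm H2O in forsterite to H atoms per 10^6 Si atoms."""
    mol_h2o_per_mol_forsterite = wt_ppm_h2o * 1e-6 * M_FORSTERITE / M_H2O
    # Two H per H2O, one Si per formula unit, scaled to 10^6 Si.
    return 2.0 * mol_h2o_per_mol_forsterite * 1e6


def ti_from_tio2(wt_ppm_tio2):
    """Convert wt. ppm TiO2 to wt. ppm Ti."""
    return wt_ppm_tio2 * M_TI / M_TIO2


def al_from_al2o3(wt_ppm_al2o3):
    """Convert wt. ppm Al2O3 to wt. ppm Al (two Al per formula unit of oxide)."""
    return wt_ppm_al2o3 * 2.0 * M_AL / M_AL2O3


if __name__ == "__main__":
    print(h_per_million_si(1.0))  # close to the 15.6 quoted in the text
    print(ti_from_tio2(1.0))      # close to 0.599
    print(al_from_al2o3(1.0))     # close to 0.53
```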

Results

Pure forsterite

Energies of our reactions are presented in Table 1 and Table S12. The energies of the different reactions are very different and thus the distribution of H-defects is mostly controlled by two reactions in these systems. Of the reactions involving the conversion of one H-bearing defect into a different H-bearing defect, R1 (the hydrated Si reaction which converts 2 \(\text{(2H}{)}_{\text{Mg}}^{\times}\) into 1 \(\text{(4H}{)}_{\text{Si}}^{\times})\) and R4 (the titanoclinohumite reaction which converts \(\text{(2H}{)}_{\text{Mg}}^{\times}\) into \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\)) are highly favoured, while the other reactions are much less favoured. Of the two most favoured reactions, the titanoclinohumite reaction R4 requires Ti and so in pure forsterite the hydrated Si reaction R1 largely controls the hydrogen distribution. The reactions which produce intrinsic defects (R14–R24) all have very high energies and thus intrinsic defects have low concentrations in these systems.

We thus predict hydrogen in pure forsterite to occupy two defects: \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and \(\text{(4H}{)}_{\text{Si}}^{\times}\). Two other defects have been proposed in the literature (Kohlstedt 2006): \({\text{H}}_{\text{Mg}}^{\prime}\) and \({\text{H}}_{\text{i}}^{\cdot}\), with the latter being an interstitial hydrogen which is bound solely to an oxygen and not to any cationic sites. In the absence of other elements these two defects can be produced by R2 (the Mg vacancy disproportionation reaction which converts a \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and a \({\text{V}}_{\text{Mg}}^{\prime\prime}\) into 2 \({\text{H}}_{\text{Mg}}^{\prime}\)) and R3 (the free hydrogen production reaction which converts \(\text{(2H}{)}_{\text{Mg}}^{\times}\) into a \({\text{V}}_{\text{Mg}}^{\prime\prime}\) and 2 \({\text{H}}_{\text{i}}^{\cdot}\)). FTIR spectra of hydrated pure forsterite tend to show only a broad band centred at around 3160 cm−1, and/or a series of bands at ~ 3500–3620 cm−1 (Berry et al. 2005; Grant et al. 2006) which are attributed to \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and \(\text{(4H}{)}_{\text{Si}}^{\times}\), respectively (for example Lemaire et al. (2004)) with no evidence for the production of \({\text{H}}_{\text{Mg}}^{\prime}\) and \({\text{H}}_{\text{i}}^{\cdot}\). Likewise, we find that \({\text{H}}_{\text{Mg}}^{\prime}\) and \({\text{H}}_{\text{i}}^{\cdot}\) defects are extremely minor products. In none of our Al-free runs did the concentration of \({\text{H}}_{\text{Mg}}^{\prime}\) exceed \(1\times {10}^{-20}\) defects/f.u. (the limit of detectability we set in our model) and \({\text{H}}_{\text{i}}^{\cdot}\) never exceeded \(1\times {10}^{-6}\) defects/f.u. in the presence of Al and \(1\times {10}^{-9}\) defects/f.u. in the absence of Al (see Figure S1 for a plot of some concentrations of \({\text{H}}_{\text{i}}^{\cdot}\)). 
We confirm the unfavourability of these defects by considering them as isolated reactions in Table S13 but, in summary, \({\text{H}}_{\text{i}}^{\cdot}\) and \({\text{H}}_{\text{Mg}}^{\prime}\) are not substantially stable products in pure H-bearing forsterite.

Whether \(\text{(2H}{)}_{\text{Mg}}^{\times}\) or \(\text{(4H}{)}_{\text{Si}}^{\times}\) is the dominant host of structurally bound H in forsterite depends on pressure, temperature and water concentration. Figure 1 shows that increasing pressure strongly encourages the formation of \(\text{(4H}{)}_{\text{Si}}^{\times}\): almost all H is contained in \(\text{(4H}{)}_{\text{Si}}^{\times}\) defects at pressures higher than ~ 3, 4 and 8 GPa at 1000, 1500 and 2000 K, respectively. This is because the hydrated Si reaction (R1), which produces \(\text{(4H}{)}_{\text{Si}}^{\times}\), eliminates an M-site vacancy, which is large, and thus this reaction becomes more favourable with increasing pressure (Table 1); e.g., at 1000 K, the reaction energies for R1 are − 1.337, − 2.139 and − 3.376 eV/f.u. at 0, 5 and 10 GPa, respectively. Previous experimental work has shown that high pressure favours the formation of \(\text{(4H}{)}_{\text{Si}}^{\times}\) (Smyth et al. 2006; Xue et al. 2017; Withers and Hirschmann, 2008; Mosenfelder et al. 2006). It is important to note that, in Fig. 1 and in our work in general, we vary pressure independently at fixed water concentration in the forsterite ([H2O]bulk). Conversely, in experimental work and in reality, increasing the pressure also increases fH2O, and thus the water concentration in forsterite/olivine. These effects are interlinked because, as also shown in Fig. 1, increasing [H2O]bulk also stabilises \(\text{(4H}{)}_{\text{Si}}^{\times}\) over \(\text{(2H}{)}_{\text{Mg}}^{\times}\). 
In general, configurational entropy terms favour \(\text{(2H}{)}_{\text{Mg}}^{\times}\) over \(\text{(4H}{)}_{\text{Si}}^{\times}\) (there are twice as many \(\text{(2H}{)}_{\text{Mg}}^{\times}\) defect sites as \(\text{(4H}{)}_{\text{Si}}^{\times}\) defect sites for the same concentration of water) while enthalpy terms favour \(\text{(4H}{)}_{\text{Si}}^{\times}\) over \(\text{(2H}{)}_{\text{Mg}}^{\times}\) (due to R1 eliminating an unfavourable vacancy when \(\text{(4H}{)}_{\text{Si}}^{\times}\) is produced). With an increasing [H2O]bulk (akin to increasing fH2O in experiments) the configurational entropy terms become less important relative to the enthalpy terms and thus \(\text{(4H}{)}_{\text{Si}}^{\times}\) is favoured. Temperature favours \(\text{(2H}{)}_{\text{Mg}}^{\times}\) over \(\text{(4H}{)}_{\text{Si}}^{\times}\) because it multiplies the configurational entropy term. This is shown in Fig. 1 and alternatively plotted in Figure S2 where increasing the temperature increases the proportion of \(\text{(2H}{)}_{\text{Mg}}^{\times}\) at the same pressure and [H2O]bulk. Across a range of geophysically relevant P and T (1000–2000 K, 0–10 GPa) P is a stronger control than T and \(\text{(4H}{)}_{\text{Si}}^{\times}\) is the favoured H-bearing defect except at low concentrations of water, high temperatures and low pressures. Figure S3 plots the pressure at which \(\text{(4H}{)}_{\text{Si}}^{\times}\) becomes favoured over \(\text{(2H}{)}_{\text{Mg}}^{\times}\) for a variety of temperatures – for example, at 2000 K, \(\text{(4H}{)}_{\text{Si}}^{\times}\) becomes favoured over \(\text{(2H}{)}_{\text{Mg}}^{\times}\) at ~ 8.9 GPa with a [H2O]bulk of 0.1 wt. ppm but ~ 4.0 GPa with a [H2O]bulk of 100 wt. ppm while at 1500 K \(\text{(4H}{)}_{\text{Si}}^{\times}\) is the favoured water defect at all pressures when [H2O]bulk is > 0.3 wt ppm.
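The competition between the two defects described above can be illustrated with a minimal mass-action sketch. It is not the thermodynamic model used in this work. It assumes that R1 converts two \(\text{(2H}{)}_{\text{Mg}}^{\times}\) into one \(\text{(4H}{)}_{\text{Si}}^{\times}\), treats the R1 reaction energies quoted above as effective free energies, ignores site multiplicities, and uses an approximate molar mass for forsterite. The function names are hypothetical.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K
M_FO = 140.69      # approximate molar mass of Mg2SiO4 in g/mol (assumption)
M_H2O = 18.015     # molar mass of H2O in g/mol


def water_per_fu(wt_ppm):
    """Convert wt. ppm H2O to H2O molecules per formula unit (dilute limit)."""
    return wt_ppm * 1e-6 * M_FO / M_H2O


def fraction_in_4h_si(wt_ppm, temperature, reaction_energy):
    """Fraction of bulk water held in (4H)_Si for an assumed R1 mass-action law.

    Assumed law (illustrative only): [4H_Si] = K * [2H_Mg]**2 with
    K = exp(-reaction_energy / (k_B * T)), and the water balance
    [2H_Mg] + 2 * [4H_Si] = W, all per formula unit.
    """
    k_eq = math.exp(-reaction_energy / (K_B * temperature))
    w = water_per_fu(wt_ppm)
    # Positive root of 2*K*x**2 + x - W = 0, written to avoid cancellation.
    x = 2.0 * w / (1.0 + math.sqrt(1.0 + 8.0 * k_eq * w))
    return 1.0 - x / w


if __name__ == "__main__":
    # Reaction energies for R1 at 1000 K quoted in the text (eV/f.u.).
    for pressure, energy in [(0, -1.337), (5, -2.139), (10, -3.376)]:
        for ppm in (1, 100):
            frac = fraction_in_4h_si(ppm, 1000.0, energy)
            print(f"{pressure:>2} GPa, {ppm:>3} wt. ppm: {frac:.4f}")
```

In this sketch the fraction of water in \(\text{(4H}{)}_{\text{Si}}^{\times}\) rises with [H2O]bulk and with a more negative reaction energy, and falls with temperature at fixed energy, which is the qualitative behaviour described in the text.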

Fig. 1

The fraction of H2Obulk that is in \(\text{(4H}{)}_{\text{Si}}^{\times}\) (1 = all water is \(\text{(4H}{)}_{\text{Si}}^{\times}\)) as a function of [H2O]bulk and pressure at different temperatures for pure forsterite. The rest of the water (1 − x, where x is the plotted fraction) is \(\text{(2H}{)}_{\text{Mg}}^{\times}\) except for an extremely small amount (< 0.001 and generally much smaller) of water that exists as \({\text{H}}_{\text{i}}^{\cdot}\) (Figure S1). Note that at 1000 K a different scale is used because the fraction of \(\text{(4H}{)}_{\text{Si}}^{\times}\) does not go below 0.97 in these conditions. All other extrinsic and intrinsic defects have much lower concentrations (< \(1 \times 10^{-10}\) wt. ppm) in all cases. An alternative rendering against pressure and temperature with fixed [H2O]bulk is given in Figure S2. Data used to construct Figs. 1, 2, 3, 5 and 6 are included in the supplementary spreadsheet

Kudoh et al. (2006) present a contrary case. Using electron probe microanalysis, a single crystal of hydrous forsterite with a very high water content (3800 wt. ppm) that was synthesized at high pressure (13.5 GPa, 1573 K) was argued to have water split roughly equally between Mg and Si sites. Under these conditions our model predicts that water should overwhelmingly occupy \(\text{(4H}{)}_{\text{Si}}^{\times}\) sites. We believe, however, that the experimental evidence of Kudoh et al. (2006) is not clear cut. The FTIR spectrum presented in Kudoh et al. (2006) shows a strong triplet in the 3400–3600 cm−1 region, which is generally associated with \(\text{(4H}{)}_{\text{Si}}^{\times}\) vacancies (see for example Le Losq et al. (2019)), while the 3160–3215 cm−1 region generally assigned to \(\text{(2H}{)}_{\text{Mg}}^{\times}\) vacancies is not shown but is likely flat due to its exclusion. A study by Smyth et al. (2006) in similar conditions (12 GPa, 1273–1873 K, up to 8900 wt. ppm water) presents the FTIR spectrum of the full 3000–3600 cm−1 region and again shows strong absorbance in the 3400–3600 cm−1 region and no absorbance in the 3160–3215 cm−1 region. Thus both Smyth et al. (2006) and Kudoh et al. (2006) present FTIR spectra in agreement with our predictions if the 3400–3600 cm−1 region is assigned to \(\text{(4H}{)}_{\text{Si}}^{\times}\) peaks (which it was not in either of these works), as confirmed in Xue et al. (2017) by a combination of Raman, NMR and theoretical calculations. The microprobe analysis of Kudoh et al. (2006), however, suggests the presence of water on Mg vacancies. Assuming that the water calibration of Kudoh et al. (2006) is correct (3800 wt. ppm), not all water can be on the Si vacancies: the calculated occupancy numbers (0.993 for both Mg and Si) fit water distributed roughly equally between \(\text{(4H}{)}_{\text{Si}}^{\times}\) and \(\text{(2H}{)}_{\text{Mg}}^{\times}\), whereas if all water were contained in \(\text{(4H}{)}_{\text{Si}}^{\times}\) vacancies the Si occupancy would be expected to be 0.986. The discrepancy between our model and this experiment, and between the FTIR spectrum and the microprobe results within this experiment, is likely due to the extremely high water concentration. Similar to how entropically unfavoured but enthalpically favoured \(\text{(4H}{)}_{\text{Si}}^{\times}\) is promoted by high water concentrations in our model, entropically unfavoured but (likely) enthalpically favoured grain boundary water is likely promoted by such high concentrations. The presence of grain boundary water, which is absent from our model, would shift the distribution of water between Mg and Si sites in unknown directions, possibly towards Mg sites as detected by the microprobe. The favourability of grain boundary water over bulk water is unknown, but if we assume the enthalpy of grain boundary water is 1 eV lower than that of bulk water then we can estimate this favourability from a balance of this enthalpy difference against the configurational entropy difference (using the number of sites at the grain boundary vs in the bulk). For a circular grain with a diameter of 0.05 mm we estimate that grain boundary water is ~ 45 more favourable than bulk water at 1500 K. This number is very approximate but demonstrates that at such high concentrations of water grain boundary water could be extremely important. This needs to be investigated further, but in general we do not expect such high concentrations of water in the mantle; grain boundary water is thus likely unimportant there and is not present in our model.
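The occupancy argument above is simple arithmetic and can be checked with a short sketch. This is not the procedure of Kudoh et al. (2006). It assumes one Mg vacancy per H2O in \(\text{(2H}{)}_{\text{Mg}}^{\times}\), one Si vacancy per two H2O in \(\text{(4H}{)}_{\text{Si}}^{\times}\), approximate molar masses, and no change in formula mass due to the defects. The function name is hypothetical.

```python
M_FO = 140.69   # approximate molar mass of Mg2SiO4 in g/mol (assumption)
M_H2O = 18.015  # molar mass of H2O in g/mol


def site_occupancies(wt_ppm, fraction_on_si):
    """Return (Mg occupancy, Si occupancy) for a given bulk water content.

    Assumes one Mg vacancy per H2O in (2H)_Mg, one Si vacancy per two H2O
    in (4H)_Si, two Mg sites and one Si site per formula unit, and ignores
    the small change in formula mass caused by the defects.
    """
    water = wt_ppm * 1e-6 * M_FO / M_H2O           # H2O per formula unit
    si_vacancies = fraction_on_si * water / 2.0     # (4H)_Si holds 2 H2O
    mg_vacancies = (1.0 - fraction_on_si) * water   # (2H)_Mg holds 1 H2O
    return 1.0 - mg_vacancies / 2.0, 1.0 - si_vacancies


# All water on Si sites versus water split equally between Mg and Si sites.
mg_all_si, si_all_si = site_occupancies(3800, 1.0)
mg_split, si_split = site_occupancies(3800, 0.5)
print(f"all on Si:   Mg = {mg_all_si:.3f}, Si = {si_all_si:.3f}")
print(f"equal split: Mg = {mg_split:.3f}, Si = {si_split:.3f}")
```

Under these assumptions an equal split gives occupancies of about 0.993 on both sites, while placing all water on Si gives a Si occupancy of about 0.985, close to the value of 0.986 quoted above.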

The effect of Al

There are two major H-bearing defects in the pure forsterite system, \(\text{(4H}{)}_{\text{Si}}^{\times}\) and\(\text{(2H}{)}_{\text{Mg}}^{\times}\). The presence of Al allows the formation of a new major product \({\text{H}}_{\text{Mg}}^{\prime}\) which can be seen by the prominent bands at low water contents in Fig. 2 and high temperatures and low pressures in Fig. 3 (minor products formed in the presence of Al are shown in Figure S4 and S5). The concentration of \({\text{H}}_{\text{Mg}}^{\prime}\) was predicted to reach up to ~ 35 wt. ppm in appropriate conditions (high temperature, low pressure, low [H2O]bulk). \({\text{H}}_{\text{Mg}}^{\prime}\) is produced by converting \(\text{(2H}{)}_{\text{Mg}}^{\times}\) to \({\text{2H}}_{\text{Mg}}^{\prime}\) through R7 (the Al + Mg vacancy disproportionation reaction which converts \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and \({\text{Al}}_{\text{Si}}^{\prime}\) into \({\text{H}}_{\text{Mg}}^{\prime}\) and\({\text{Al}}_{\text{Mg}}^{\cdot}\)). R7 is more favoured than the equivalent Al-free reaction R2 (the Mg-vacancy disproportionation reaction) because it does not require a \({\text{V}}_{\text{Mg}}^{\prime\prime}\) defect which are high in energy and thus difficult to produce. This reaction doubles the amount of H-bearing defects on the Mg sites (as one \(\text{(2H}{)}_{\text{Mg}}^{\times}\) creates two \({\text{H}}_{\text{Mg}}^{\prime}\) defects) but the progress of the hydrated Si reaction R1 which converts \(\text{(2H}{)}_{\text{Mg}}^{\times}\) to \(\text{(4H}{)}_{\text{Si}}^{\times}\) is unaffected by Al and so the ratio of water on Mg sites vs Si sites is unaffected by Al. This can be stated as \(2[\text{(2H}{)}_{\text{Mg}}^{\times}]:[\text{(4H}{)}_{\text{Si}}^{\times}]\) in the Al-free case is the same as\([{\text{H}}_{\text{Mg}}^{\prime}] :[\text{(4H}{)}_{\text{Si}}^{\times}\)] in the Al-bearing case. This behaviour occurs because R1 is a more favourable reaction than R7. 
Thus in both pure and Al-containing forsterite, the hydrated Si reaction, R1, is the controlling reaction for the distribution of water between Mg and Si sites and the presence of Al just changes the ratio\([\text{(2H}{)}_{\text{Mg}}^{\times}]\):\([{\text{H}}_{\mathrm{Mg}}^{\prime}]\). This means that the trends with pressure, temperature and water concentration discussed above for pure forsterite also apply in Al-bearing forsterite. \({\text{H}}_{\text{Mg}}^{\prime}\) is favoured over \(\text{(2H}{)}_{\text{Mg}}^{\times}\) with a low [H2O]bulk or with a high Al concentration (Fig. 2). Pressure has little effect on the \([\text{(2H}{)}_{\text{Mg}}^{\times}]\):\([{\text{H}}_{\mathrm{Mg}}^{\prime}]\) ratio while temperature favours \(\text{(2H}{)}_{\text{Mg}}^{\times}\) over \({\text{H}}_{\mathrm{Mg}}^{\prime}\) but both of these effects are less important than the concentration of water or Al. This can be seen in Fig. 3 where we vary P and T for a variety of water concentrations with a high Al concentration and in all cases \({\text{H}}_{\mathrm{Mg}}^{\prime}\) is favoured and \(\text{(2H}{)}_{\text{Mg}}^{\times}\) is a very small minor product with concentrations generally below 1 × 10–9 defects/f.u.. In circumstances where \(\text{(2H}{)}_{\text{Mg}}^{\times}\) is favoured (\([\text{(2H}{)}_{\text{Mg}}^{\times}]\):\([{\mathrm{H}}_{\mathrm{Mg}}^{\prime}]\) is high, [H2O]bulk is high, [Al2O3] is low), the H distribution in Al-bearing forsterite is near identical to Al-free forsterite.

Fig. 2

Plot of the fraction of H2Obulk that is in each defect (1 = all water is in that defect) for each of the major H-bearing defects in forsterite as a function of [H2O]bulk and (A–C) TiO2 (top row) and (D, E) Al2O3 at 2000 K and 0 GPa. Minor products are shown in Figure S4

Fig. 3

Plot of the fraction of H2Obulk that is in each defect (1 = all water is in that defect) as a function of temperature and pressure with fixed total [H2O]bulk in forsterite containing 500 wt. ppm Al2O3 and 500 wt. ppm TiO2. The three major defects are shown here; minor defects are shown in Figure S5. Due to the high amount of Al, \({\text{H}}_{\text{Mg}}^{\prime}\) is a major defect and \(\text{(2H}{)}_{\text{Mg}}^{\times}\) is a minor defect, but with lower Al \(\text{(2H}{)}_{\text{Mg}}^{\times}\) would become major and \({\text{H}}_{\text{Mg}}^{\prime}\) minor (see Fig. 2), while the trends relative to \(\text{(4H}{)}_{\text{Si}}^{\times}\) and \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) would remain largely unchanged

It was stated above that Al does not affect the distribution of water between Mg and Si sites in forsterite, only converting some \(\text{(2H}{)}_{\text{Mg}}^{\times}\) into \({\text{H}}_{\mathrm{Mg}}^{\prime}\). This behaviour of course only applies because we keep [H2O]bulk fixed; in real situations the addition of Al will increase [H2O]bulk, but a fixed [H2O]bulk is useful for isolating how the distribution of water is affected by the presence of Al.

In some conditions, \({\rm H}_{\rm Mg}^{\prime}\) can react further to produce \({\{{\rm Al}_{\rm Mg}^{\cdot}{\rm H}_{\rm Mg}^{\prime}\}}^\times\) through R8 (the Al hydrated Mg vacancy coupling reaction which couples \({\text{Al}}_{\text{Mg}}^{\cdot}\) and \({\text{H}}_{\text{Mg}}^{\prime}\)) as shown by its appearance as a band at high temperatures and large Al and [H2O]bulk concentrations in Figure S4 and S5. This is always a minor product, however. With [H2O]bulk = 100 wt. ppm the maximum concentration of \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) across our PT space is < 0.1 wt ppm and decreasing the water concentration will decrease this maximum. The reason this product is always minor is because it is favoured increasingly by lower temperatures and higher values of [H2O]bulk. This is common behaviour to all associated defect pairs as they are favoured by enthalpy and disfavoured by Sconf and thus they become increasingly more favoured as temperature decreases or [H2O]bulk increases. These are, however, the same conditions that favour the formation of \(\text{(4H}{)}_{\text{Si}}^{\times}\). Therefore, conditions which favour \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) over \({\text{Al}}_{\text{Mg}}^{\cdot}+{\text{H}}_{\text{Mg}}^{\prime}\) will also favour \(\text{(4H}{)}_{\text{Si}}^{\times}\) over both. \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) is favoured over \({\text{Al}}_{\text{Mg}}^{\cdot}+{\text{H}}_{\text{Mg}}^{\prime}\) only when the concentration of both is very small (generally less than 0.1 wt ppm) but in these cases it is possible that 100% of the \({\text{Al}}_{\text{Mg}}^{\cdot}+{\text{H}}_{\text{Mg}}^{\prime}\) pairs convert to \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\). 
Across our runs we found that \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) consumed between 0 and 100% of the available \({\text{H}}_{\text{Mg}}^{\prime}\) but that generally at adiabatic mantle temperatures \([{\text{H}}_{\text{Mg}}^{\prime}]\) is much larger than \([\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times]\).

Berry et al. (2007a) presented the IR spectra of Al containing forsterite synthesised in the presence of water, where a triplet of peaks in the FTIR spectra at 3344.5, 3350.5 and 3322 cm−1 were assigned to \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\), with the different peaks representing Al on different sites (2 on M1, 1 on M2). Further experiments and theoretical calculations (Blanchard et al. 2017) confirmed these assignments. In the conditions that the crystals were annealed (1.5 GPa, 1673 K, ~ 100 wt. ppm Al2O3, saturated in water with a final concentration of ~ 20 wt. ppm) we predict \([{\text{H}}_{\text{Mg}}^{\prime}]\) to be ~ 7 ppm but \(\{{\text{[{Al}}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times]\) to be < 1 ppb and thus undetectable, i.e., our model is not in line with the experimental results. The crystals were cooled, however, before spectra were recorded, and at 300 K we predict the Al hydrated Mg vacancy coupling reaction R8 to go entirely to the right converting all \({\text{H}}_{\text{Mg}}^{\prime}\) into \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) which would create a detectable amount of\(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\). The exact amount of \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) that is produced will be reliant on the kinetics of the cooling process. This could be tested by recording IR spectra at high temperature where dissociation would be expected to be observed. As can be seen from Table S2 \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\) has energetically viable arrangements where \({\text{Al}}_{\text{Mg}}^{\cdot}\) can be on the M1 and the M2 sites and so the assignment of the FTIR triplets in Berry et al. (2007a) and Blanchard et al. (2017) is plausible.

Al can also promote minor products (pictured in Figure S4 and S5) though the concentration of these products is always predicted to be below 1 wt. ppm. Some \(\text{(3H}{)}_{\text{Si}}^{\prime}\) is produced through R5 (Al + Si vacancy disproportionation reaction which produces \({\text{Al}}_{\text{Mg}}^{\cdot}\) and \(\text{(3H}{)}_{\text{Si}}^{\prime}\) from \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and\({\text{Al}}_{\text{Si}}^{\prime}\)) and a small amount of \({\text{H}}_{\text{i}}^{\cdot}\) is produced through R9 (Al catalysed free hydrogen production which produces \({\text{H}}_{\text{i}}^{\cdot}\) and \({\text{Al}}_{\text{Si}}^{\prime}\) from \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and\({\text{Al}}_{\text{Mg}}^{\cdot}\)), though considerably more \({H}_{i}^{\cdot }\) is produced in the presence of Al than is produced in pure forsterite. We reiterate that \({\text{H}}_{\text{Mg}}^{\prime}\) is a hydrogen confined to a Mg defect whereas \({\text{H}}_{\text{Int}}^{\cdot}\) is an interstitial hydrogen that is not confined to a defect. These two defects should have large differences in diffusional and vibrational properties and thus have different effects on physical properties and IR spectra. These minor products \(\text{(3H}{)}_{\text{Si}}^{\prime}\) and \({\text{H}}_{\text{i}}^{\cdot}\) have concentrations orders of magnitude lower than the main products of\(\text{(2H}{)}_{\text{Mg}}^{\times}\), \({\text{H}}_{\text{Mg}}^{\prime}\) and \(\text{(4H}{)}_{\text{Si}}^{\times}\) and thus will not affect the concentration of these major products to any significant degree. The presence of these minor products will still be important, however, for any processes that rely specifically on these defects. 
Both \(\text{(3H}{)}_{\text{Si}}^{\prime}\) and \({\text{H}}_{\text{i}}^{\cdot}\) are predicted to exist as unassociated defects, as \(\{{\text{Al}}_{\text{Mg}}^{\cdot}\text{(3H}{)}_{\text{Si}}^\prime\}^\times\) and \(\{{\text{Al}}^{\prime}_{\text{Si}}{{\text{H}}}_{\text{i}}^{\cdot}\}^{\times}\) do not form within our detectability limits (\(1 \times 10^{-20}\) defects/f.u.): R6 and R10, which couple the products of R5 and R9 respectively, go entirely to the left.

The H-free Al disproportionation reaction R11 converts \({\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}\) into 2 \({\text{Al}}_{\text{Mg}}^{\cdot}\) and 1 \({\text{V}}_{\text{Mg}}^{\prime\prime}\). In dry conditions, this reaction can produce large amounts of \({\text{V}}_{\text{Mg}}^{\prime\prime}\) leading to an increase of the concentration of \({\text{V}}_{\text{Mg}}^{\prime\prime}\) compared to pure forsterite but this reaction is suppressed by water and is negligible beyond ~ 5 wt. ppm water except for very high concentrations of Al. This can be seen in Fig. 4 where the steep decline in \({\text{[}{\text{V}}}_{\text{Mg}}^{\prime\prime}]\) with increasing [H2O]bulk signifies the suppression of this reaction. Discussion of the \(\left\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Mg}}^{\cdot}{\text{V}}_{\text{Mg}}^{\prime\prime}\right\}\) cluster is given in the supplementary information but generally it is an extremely minor product (concentrations are always below 2 × 10–15 defects/f.u.) which is not expected to be important.

Fig. 4

Plot of the three major intrinsic defects as a function of water concentration at 2000 K and 0 GPa (corrected) with three different crystal chemistries (solid line = pure forsterite, dashed line = 500 wt. ppm TiO2, dotted line = 500 wt. ppm Al2O3). The only major difference induced by crystal chemistry is that Al induces extra \({\text{V}}_{\text{Mg}}^{\prime\prime}\) due to R11, but this effect is suppressed by even low amounts of water. The absolute values of these concentrations are less well constrained than those of extrinsic defects as their concentrations are much smaller, but relative trends with [H2O]bulk are better constrained

Defects interacting with Al can change the \({\text{Al}}_{\text{Mg}}^{\cdot}/{\text{Al}}_{\text{Si}}^{\prime}\) ratio from its initial value of 1 through multiple possible reactions (R5, R7, R9, R11). We find that this effect is mostly controlled by the Al + Mg vacancy disproportionation reaction R7 (which converts \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and \({\text{Al}}_{\text{Si}}^{\prime}\) into \({\text{H}}_{\text{Mg}}^{\prime}\) and \({\text{Al}}_{\text{Mg}}^{\cdot}\)). This ratio is listed as a function of water concentration in Table S14. We find that H-induced changes to this ratio are generally small, with \([{\text{Al}}_{\text{Mg}}^{\cdot}]\) usually being < 0.1% larger than \([{\text{Al}}_{\text{Si}}^{\prime}]\), but at high temperatures, low pressures, low Al contents and high water contents the deviation of the ratio from 1 can become significant. In such conditions, measuring any two of the water content, the Al content and the \({\text{Al}}_{\text{Mg}}^{\cdot}/{\text{Al}}_{\text{Si}}^{\prime}\) ratio would allow the third to be determined.

In the absence of any other reactions Al exists as \({\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}\) pairs which could associate into \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Si}}^{\prime}\}^{\times}\) through R13 (Al pair coupling reaction). We predict that the favourability of this reaction is highly dependent upon pressure, temperature and Al concentration and that associated \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Si}}^{\prime}\}^{\times}\) pairs make up between 1.01 and 99.81% of the total \({\text{Al}}_{\text{Mg}}^{\cdot}+{\text{Al}}_{\text{Si}}^{\prime}\) pairs, with the remaining percentage being unbound. Lower temperatures, higher Al concentrations and higher pressures lead to a greater percentage of pairs being bound as \(\{{\text{Al}}_{\text{Mg}}^{\cdot}{\text{Al}}_{\text{Si}}^{\prime}\}^{\times}\), with water concentration having little effect.

The effect of Ti

As shown in Fig. 2, Ti has a large effect on the distribution of H through the formation of \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) via the titanoclinohumite reaction R4 which is very favourable (Table 1). \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) is favoured versus \(\text{(4H}{)}_{\text{Si}}^{\times}\) at low pressures and versus \(\text{(2H}{)}_{\text{Mg}}^{\times}\) at low temperatures and thus can be the major product at low pressures and temperatures as seen by the large band at low pressures in Fig. 3. With increasing pressure \(\text{(4H}{)}_{\text{Si}}^{\times}\) becomes more stable than \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) as was also the case with \(\text{(2H}{)}_{\text{Mg}}^{\times}\), but \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) is generally stable to a higher pressure against \(\text{(4H}{)}_{\text{Si}}^{\times}\) than\(\text{(2H}{)}_{\text{Mg}}^{\times}\). This is shown in Figure S3 where the presence of Ti increases the pressure at which \(\text{(4H}{)}_{\text{Si}}^{\times}\) becomes the dominant water carrier by up to 2 GPa at 2000 K and by up to 7 GPa at 1500 K. A conversion of \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) to \(\text{(4H}{)}_{\text{Si}}^{\times}\) with pressure has been observed previously as Kohlstedt et al. (1996) showed FTIR peaks associated to \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{^{\prime\prime}}\right\}}^{\times }\) in crystals annealed at 0.3 GPa but only those associated with \(\text{(4H}{)}_{\text{Si}}^{\times}\) in crystals annealed at 5 GPa though these peaks were not interpreted as such within this paper.

The binding energy (energy of associated defect minus the energy of isolated defects) of \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) was calculated to be extremely large (5–6 eV across 0–15 GPa and 0–2000 K) and thus \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) is almost always an associated pair. Only at extremely low concentrations (< 1 ppt wt.) would dissociating this pair be favourable. In our model, we always treated this as an associated pair.
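The link between a large binding energy and near-complete association can be sketched with a simple pairing equilibrium. This is an illustration only, not the calculation performed in this work. It assumes a mass-action law for A + B ⇌ {AB} with equal total amounts of A and B, treats the magnitude of the binding energy as a free energy of association, and ignores orientational and site-multiplicity prefactors. The function name is hypothetical.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def associated_fraction(total_conc, binding_energy, temperature):
    """Fraction of defects bound in pairs for A + B <-> {AB}.

    total_conc: total concentration of A (= total B), per formula unit.
    binding_energy: magnitude of the binding energy in eV (positive = bound).
    Assumed law (illustrative): [AB] = K * [A] * [B], K = exp(E_b / (k_B T)).
    """
    k_eq = math.exp(binding_energy / (K_B * temperature))
    kc = k_eq * total_conc
    s = math.sqrt(1.0 + 4.0 * kc)
    # Free concentration u solves K*u**2 + u - c = 0; the bound fraction
    # 1 - u/c is rewritten as 4*K*c / (1 + s)**2 to stay accurate when
    # K*c is tiny.
    return 4.0 * kc / (1.0 + s) ** 2


if __name__ == "__main__":
    # Binding energy of 5 eV (lower end of the quoted range) at 2000 K.
    for conc in (1e-6, 1e-10, 1e-13, 1e-16):
        frac = associated_fraction(conc, 5.0, 2000.0)
        print(f"{conc:.0e} per f.u.: bound fraction = {frac:.4f}")
```

With a 5 eV binding energy the pair in this sketch only starts to dissociate at extremely low concentrations, even at 2000 K, which is consistent in spirit with treating the defect as always associated.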

As with Al, the site location of Ti is related to the water content. As shown in Table S15 the \({\text{Ti}}_{\text{Si}}^{\times}/{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\) ratio decreases roughly linearly with water content for a given P and T. The ratio varies strongly and nonlinearly with P and T, however, so fitting a universal law is complex and will have overlapping points but with a known P and T the water content can be solved from the \({\text{Ti}}_{\text{Si}}^{\times}/{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\) ratio. This provides both a test of unknown water content if this ratio can be measured and a test of our model if the water content is known.

In Table S16, we compare our model data with a model produced from experimental data (Padron-Navarta and Hermann 2017). We find a good match between their model and our prediction within the experimental region but an increasingly large mismatch outside of this region due to the absence of \(\text{(2H}{)}_{\text{Mg}}^{\times}\) in the experimental measurements. This absence is because the experimental measurements were done at low temperatures (1023–1323 K) where we predict that \(\text{(2H}{)}_{\text{Mg}}^{\times}\) does not form and thus would not be seen in the experiment. This is evidence that extrapolating outside measured T and P regions (and [H2O]bulk and trace element concentration ranges) is very difficult as different H-bearing defects can form when you change these variables. The predicted concentration of \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{^{\prime\prime}}\right\}}^{\times }\), however, always remains within 10 ppm between our model and their model.

Discussion

Water exponents of intrinsic defects

Many experimental studies have related the incorporation of H2O into point defects in olivine and other NAMs using the water fugacity (fH2O) of the system and an exponent (rf), i.e.:

$$\left[{\text{Defect}}\right] \propto {\text{f}}_{{\text{H}}_{2}{\text{O}}}^{\text{rf}}$$
(8)

(Kohlstedt et al. 1996; Withers et al. 2011; Tollan et al. 2017; Withers and Hirschmann, 2007; Rauch and Keppler, 2002; Mierdel and Keppler, 2004; Lu and Keppler, 1997; Bromiley and Keppler, 2004). The experimental method generally involves either varying P at constant T and water activity (aH2O), thus varying fH2O, or varying aH2O (and thus fH2O) at constant P and T, with both methods allowing the relationship between defect concentrations and fH2O to be determined. In simple systems where each product can be described by a single equation (such as \(\text{(4H}{)}_{\text{Si}}^{\times}\) being solely produced by the hydrated Si production reaction R1) and where configurational entropy is unimportant, each product can indeed be described by a single number (rf) which should be relatively insensitive to pressure, temperature and water concentration. In complex systems rf will often vary significantly with conditions such as P and T.

In our calculations, we do not have fH2O but can instead determine

$$\left[{\text{Defect}}\right] \propto {\text{[H}}_{2}{\text{O}}{]}_{\text{bulk}}^{\text{rc}}$$
(9)

Herein, all calculated exponents are rc, as in Eq. 9 rather than rf as in Eq. 8. While Eqs. 9 and 8 have occasionally been treated as the same in previous literature (such as in Fei and Katsura (2016)) they are not, as Eq. 9 has additional implicit configurational entropy mixing terms that are not present in Eq. 8. These terms vary based on the form water takes when adsorbed in the system but stem from the fact that [H2O]bulk is not a real product and these are the terms that mix H2O to whatever configuration it takes in forsterite. The effect of this is varied and rc can be equal to, greater than or less than rf. For Eq. 9 the dominant water defect (the species that contains most of the water and controls the charge balance regime) in forsterite should have an rc that trends towards 1 while the exponents of the other water species are (in a simple system) dependent on how the minor water species relate to the major species. An example demonstrating how to calculate the “ideal” value of rc and the relationship between rf and rc is given in the supplementary information for reaction R1.

Our predicted rc values are given in Table 2. The values of rc are quite variable and often do not have their ideal value. As an example, in the case of pure forsterite with no trace elements except H, the rc value for \(\text{(2H}{)}_{\text{Mg}}^{\times}\) at 0 GPa varies between 0.5 and 0.82 with temperature. The “ideal” value is 0.5 in a \(\text{(4H}{)}_{\text{Si}}^{\times}\) dominated system and 1 in a \(\text{(2H}{)}_{\text{Mg}}^{\times}\) dominated system, and the predicted value varies between these. Thus we predict that water exponents in “real” systems should be heavily dependent upon pressure, temperature and chemical environment, and that they can also vary across common ranges in H2O concentration. Particularly large variations in rc are seen when the major H-bearing defect changes from \(\text{(2H}{)}_{\text{Mg}}^{\times}\) to \(\text{(4H}{)}_{\text{Si}}^{\times}\). This can be seen in Table 2 where at high temperature and low pressure (which favours \(\text{(2H}{)}_{\text{Mg}}^{\times}\)) rc for \(\text{(2H}{)}_{\text{Mg}}^{\times}\) is 0.82 and rc for \(\text{(4H}{)}_{\text{Si}}^{\times}\) is 1.63 (at 2000 K and 0 GPa) but as temperature decreases and pressure increases (which favours \(\text{(4H}{)}_{\text{Si}}^{\times}\)) these values trend towards, and at 10 GPa and 1000 K reach, 0.50 and 1.00, respectively. The exponents for \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and \(\text{(4H}{)}_{\text{Si}}^{\times}\) always have a roughly 1:2 ratio because \([\text{(4H}{)}_{\text{Si}}^{\times}]\) is controlled by the hydrated Si production reaction R1.
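The way rc moves between its ideal limits can be illustrated with a minimal sketch that evaluates the local log-log slope of each defect concentration against [H2O]bulk. It does not reproduce Table 2. It assumes a single mass-action law for R1, \([\text{(4H)}_{\text{Si}}] = K[\text{(2H)}_{\text{Mg}}]^{2}\), with an arbitrary equilibrium constant K and no other defects. The function names are hypothetical.

```python
import math


def defect_concentrations(water, k_eq):
    """Return ([2H_Mg], [4H_Si]) for the assumed law [4H_Si] = K * [2H_Mg]**2
    with water balance [2H_Mg] + 2 * [4H_Si] = water (per formula unit)."""
    x = 2.0 * water / (1.0 + math.sqrt(1.0 + 8.0 * k_eq * water))
    return x, k_eq * x * x


def water_exponents(water, k_eq, step=1e-3):
    """Local slopes d ln[defect] / d ln[H2O]bulk by central difference.

    Returns (rc for (2H)_Mg, rc for (4H)_Si) within this toy model.
    """
    below = defect_concentrations(water * math.exp(-step), k_eq)
    above = defect_concentrations(water * math.exp(step), k_eq)
    return tuple(
        (math.log(hi) - math.log(lo)) / (2.0 * step)
        for lo, hi in zip(below, above)
    )


if __name__ == "__main__":
    water = 1e-4  # H2O per formula unit (arbitrary illustrative value)
    for k_eq in (1e-2, 1e4, 1e12):
        rc_mg, rc_si = water_exponents(water, k_eq)
        print(f"K = {k_eq:.0e}: rc(2H_Mg) = {rc_mg:.2f}, rc(4H_Si) = {rc_si:.2f}")
```

In this sketch the \(\text{(2H}{)}_{\text{Mg}}^{\times}\) exponent moves from 1 towards 0.5 as \(\text{(4H}{)}_{\text{Si}}^{\times}\) takes over, and the \(\text{(4H}{)}_{\text{Si}}^{\times}\) exponent is always twice as large, mirroring the 1:2 ratio noted above.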

Table 2 The water exponent rc determined for each of these systems between 10–1000 wt. ppm water

Exponents of intrinsic defects

The presence of water will also have a large effect on the concentration of intrinsic defects in forsterite. In general, the presence of water is seen to suppress the formation of intrinsic defects. As shown in Fig. 4, increasing [H2O]bulk generally decreases the concentration of defects produced intrinsically. This is because intrinsic defects in forsterite (and generally in minerals) form due to configurational entropy gains upon formation, and these gains become relatively lowered in the presence of H-bearing defects (or other extrinsic defects).

Each atom in Mg2SiO4 has an equivalent vacancy and interstitial defect. Mg vacancies (\({\text{V}}_{\text{Mg}}^{\prime\prime}\)) and Mg interstitials (\({\text{Mg}}_{\text{i}}^{\cdot\cdot}\)) are the most prominent intrinsic defects due to the favourability of the Mg Frenkel reaction (R14) over other intrinsic reactions (Table S12). \({\text{V}}_{\text{O}}^{\cdot\cdot}\) is the next most prominent intrinsic vacancy (formed in conjunction with \({\text{V}}_{\text{Mg}}^{\prime\prime}\) in R18) and then \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) (formed by converting \(2{\text{V}}_{\text{Mg}}^{\prime\prime}\) into \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) in the vacancy Si production reaction R17, a H-free analogue of the hydrated Si production reaction R1). We predict that \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) is produced entirely by R17 and not by any of the other possible \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\)-forming reactions, which have much higher energies (Table S12). Therefore, the concentration of \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) is tied to the concentration of \({\text{V}}_{\text{Mg}}^{\prime\prime}\), which is a reactant in the vacancy Si production reaction R17. Previous thermodynamic models have used the Si Frenkel reaction (R16) as a basis for forming \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) (Stocker and Smyth 1978) and therefore came to different conclusions about the effect of various conditions on \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\). However, we find the formation of \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) via R16 to be extremely unfavourable. No mechanism that we tested which produces \({\text{O}}_{\text{i}}^{\prime\prime}\) or \({\text{Si}}_{\text{i}}^{\cdot\cdot\cdot\cdot}\) was favourable; the concentration of these defects was always below the detection limit (\(1 \times 10^{-20}\) defects/f.u.) and is likely far below this limit, based on the extremely high energies of all reactions that produce these interstitials. We therefore conclude that Si and O interstitials are not present in forsterite to any significant degree. In Costa and Chakraborty (2008), it was predicted that in olivine Si diffuses via a vacancy mechanism, which agrees with our results, but that O diffuses via an interstitial mechanism, which does not. We predict that R18, which produces \({\text{V}}_{\text{O}}^{\cdot\cdot}\) (alongside \({\text{V}}_{\text{Mg}}^{\prime\prime}\)), is always more favoured than the reactions which produce \({\text{O}}_{\text{i}}^{\prime\prime}\) (R15 and R22). Thus we predict that it is difficult to produce \({\text{O}}_{\text{i}}^{\prime\prime}\) in forsterite and that a \({\text{V}}_{\text{O}}^{\cdot\cdot}\) mechanism is more likely for O diffusion unless an external source of \({\text{O}}_{\text{i}}^{\prime\prime}\) is present.

The effect of water on exponents associated with intrinsic defects was previously considered by Kohlstedt (2006). Using similar mass action equations to R0, R1 and R17 (as well as variations on these to consider alternative Mg water sites and Si hydrogen concentrations) and the effect of water on the equilibrium constants of these reactions (as demonstrated in the supplementary information), Kohlstedt postulated that water has no effect on \({\text{V}}_{\text{Mg}}^{\prime\prime}\) and \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) (i.e. rc = rf = 0) in the charge balance regime \(\text{[}{\text{H}}^{\cdot}\text{]=[}{\text{H}}_{\text{Me}}^{\prime}\text{]}\), where \({\text{H}}_{\text{Me}}^{\prime}\) in that work is effectively equivalent to our \({\text{H}}_{\text{Mg}}^{\prime}\). This line of reasoning was extended to O vacancies \({\text{V}}_{\text{O}}^{\cdot\cdot}\) by Fei and Katsura (2016), who used similar arguments to postulate that rc = rf = 0, i.e. \(\left[{\text{V}}_{\text{O}}^{\cdot\cdot}\right]\propto [\text{H}_{2}\text{O}]_{\text{bulk}}^{0}\propto (f\text{H}_{2}\text{O})^{0}\). We find, however, that these defects can generally have negative rc (Table 2) due to the suppressive effect of water on their configurational entropy of formation. The exponent of \({\text{V}}_{\text{Mg}}^{\prime\prime}\) is sometimes positive due to the effects of the free hydrogen production reaction R3, which creates \({\text{V}}_{\text{Mg}}^{\prime\prime}\) and \({\text{H}}_{\text{i}}^{\cdot}\). While R3 only proceeds forwards by a small amount and produces low \([{\text{V}}_{\text{Mg}}^{\prime\prime}]\) compared to the concentration of other extrinsically produced defects, \([{\text{V}}_{\text{Mg}}^{\prime\prime}]\) produced by R3 can be much higher than \([{\text{V}}_{\text{Mg}}^{\prime\prime}]\) produced intrinsically through the Mg Frenkel reaction R14. 
We cannot determine rc for \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) directly because the concentration of \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) is extremely low and often below our detection limit. However, as the concentration of \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) is controlled entirely by the vacancy Si production reaction R17, rc for \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) should be close to twice the rc of \({\text{V}}_{\text{Mg}}^{\prime\prime}\).
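The factor of two can be illustrated with a simple mass action argument. This is a sketch under stated assumptions rather than the full model: the vacancy Si production reaction R17 is taken to consume two \({\text{V}}_{\text{Mg}}^{\prime\prime}\) per \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) as described above, the defects are treated as ideally dilute, and the activities of all other species in R17 are assumed not to depend on water content. \(K_{17}\) is introduced here only as a label for the corresponding equilibrium constant.

\[
[{\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}] \approx K_{17}\,[{\text{V}}_{\text{Mg}}^{\prime\prime}]^{2}
\quad\Rightarrow\quad
r_{c}({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime})
= \frac{\partial \ln [{\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}]}{\partial \ln [\text{H}_{2}\text{O}]_{\text{bulk}}}
\approx 2\,\frac{\partial \ln [{\text{V}}_{\text{Mg}}^{\prime\prime}]}{\partial \ln [\text{H}_{2}\text{O}]_{\text{bulk}}}
= 2\,r_{c}({\text{V}}_{\text{Mg}}^{\prime\prime})
\]

Any dependence of the other activities on water content would add further terms, which is why the relation is expected to hold only approximately.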

We find that water can have significant effects on the concentrations of \({\text{V}}_{\text{Mg}}^{\prime\prime}\), \({\text{V}}_{\text{Si}}^{\prime\prime\prime\prime}\) and \({\text{V}}_{\text{O}}^{\cdot\cdot}\), particularly in the presence of Al and Ti, which is contrary to their predicted ideal behaviour where the concentration of water should have no effect. This is unsurprising as configurational entropy (which causes deviations from these ideal values) will always be important for intrinsic defects, as it is fundamental to their creation. All of the intrinsic defect-forming reactions have high positive enthalpies meaning that they only proceed forwards due to the configurational entropy gain of producing defects. Therefore, in most scenarios, the exponents for intrinsic defects will be heavily sensitive to configurational entropy and will deviate from their ideal values. Effectively, this means that when dealing with intrinsic defects their exponents (rc and rf) are particularly hard to extrapolate across temperature and pressure space and must be measured at the desired conditions.

The distribution of water in upper mantle conditions

As discussed above, the distribution of hydrogen in forsterite is highly complex, with multiple interacting variables that defy simple parameterisation, and each set of pressure and temperature conditions could behave differently. In this section, we show how the H-bearing defects behave under a set of geophysically relevant pressure and temperature conditions, i.e., those of the upper mantle. Figures 5 and 6 (with alternative renderings in Fig. 7 and S6-S9) show the distribution of H-bearing defects along two likely mantle geotherms (taken from Green and Ringwood (1970)) with varying Ti and Al concentrations. In Fig. 7 and S6-S9, the concentrations of Ti and Al are both correlated with depth based on De Hoog et al. (2010), but this is simply an example and in reality there is likely considerable variation in the concentration of both of these elements. In Fig. 5, we show the variation of these two concentrations separately. Figures 5, 6 and 7 have either a water concentration that varies with depth (Figs. 5 and 7), using a fit to natural samples from Demouchy and Bolfan-Casanova (2016) (Table S16), or a fixed water concentration with depth (Fig. 6). While water, Ti and Al concentrations are likely strongly correlated with depth, they are also possibly laterally heterogeneous, and in this way we can demonstrate how varying the concentration of these components varies the distribution of H regardless of depth. \({\text{H}}_{\text{Mg}}^{\prime}\) likely has similar diffusional and other properties to \(\text{(2H}{)}_{\text{Mg}}^{\times}\) and thus we shall consider them together with a combined concentration [H + VMg].

Fig. 5

Plot of log10 of the concentration of water (ratio of water in each defect multiplied by [H2O]bulk in wt. ppm) in each of the three major defects: \(\text{(4H}{)}_{\text{Si}}^{\times}\), hydrated Mg sites (\(\text{(2H}{)}_{\text{Mg}}^{\times}+{\text{H}}_{\text{Mg}}^{\prime}\), labelled as [H + VMg]), and \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\). This is along an oceanic or continental geotherm with water content varying with depth (Table S17) and with varying Ti and Al contents. From the top (Al2O3 = 0 wt. ppm, TiO2 = 0 wt. ppm) going clockwise, TiO2 concentration increases up to 90 degrees (Al2O3 = 0 wt. ppm, TiO2 = 500 wt. ppm), then Al2O3 increases up to 180 degrees (Al2O3 = 500 wt. ppm, TiO2 = 500 wt. ppm), then TiO2 decreases up to 270 degrees (Al2O3 = 500 wt. ppm, TiO2 = 0 wt. ppm) before Al2O3 decreases back to 0. [H + VMg] values have been truncated to − 6, and the \(\text{(4H}{)}_{\text{Si}}^{\times}\) and \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) graphs to − 3; in the truncated region there are often very sharp decreases in concentration until there is effectively none of that defect. Data were calculated at a series of gridpoints (supplementary spreadsheet) that were then interpolated using the spline method of MATLAB's interp2. Very sharp changes are seen around the cardinal directions where the concentration of one extrinsic element (Ti or Al) goes to 0; this is because very large changes in H distribution occur over very small ranges of Al2O3 or TiO2 when they are small. Such small concentrations are poorly visualised in this graph but probably are not important in the mantle

Fig. 6

As Fig. 5 but with fixed [H2O]bulk (given in wt. ppm) irrespective of depth. These graphs vary the concentration of TiO2 only, going from TiO2 = 0.1 wt. ppm at 0 degrees to TiO2 = 500 wt. ppm at 180 degrees, with no Al2O3 present

Fig. 7

Plot of the concentration of the major H-bearing defects as a function of depth along an oceanic geotherm with water concentration varying with depth (Table S17). Solid lines represent a fixed TiO2 and Al2O3 concentration of 500 wt. ppm; the dotted lines represent TiO2 and Al2O3 concentrations that vary with depth (Table S17). The lines have significant roughness, which is likely due to a lack of granularity in our water distribution and geotherm functions; with a fixed water concentration (Figure S6-S9) trends are smoother. The same graph along a continental geotherm is shown in Figure S6 and with varying fixed water concentrations along an oceanic geotherm in Figure S7-S9

Overall, we conclude that depth (and thus pressure) is the most important variable in distributing H in the mantle. In Figs. 5, 6 and 7, an overall trend with depth can be seen regardless of other conditions. Considering first [H + VMg], this has a near-0 value at the surface, rises rapidly with depth, peaks in the mid upper mantle (seen as the central yellow band in Figs. 5 and 6) and then decreases rapidly. This peak, tabulated in Table S18, lies at 100–210 km, with shallower values favoured by higher temperatures. Even at this peak concentration, hydrated Mg vacancies (\(\text{(2H}{)}_{\text{Mg}}^{\times}\) and \({\text{H}}_{\text{Mg}}^{\prime}\)) never become the dominant H-bearing defect. This behaviour is effectively insensitive to Ti concentration, with the main effect of Ti being to reduce the maximum value of [H + VMg] at its peak (Fig. 7 and Table S18). Ti has a strong effect on \([\text{(4H}{)}_{\text{Si}}^{\times}]\), however, and so we shall consider a Ti-poor and a Ti-rich regime. An important value is the saturation concentration, which is [TiO2] ~ 4.42 × [H2O]bulk. This is the concentration at which there is enough Ti for all water molecules to form \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\). With low Ti concentrations ([TiO2] < saturation), \(\text{(4H}{)}_{\text{Si}}^{\times}\) is predicted to be the dominant H-bearing defect throughout the upper mantle and its concentration is nearly equal to [H2O]bulk regardless of depth. Thus if [H2O]bulk increases with depth (Fig. 5) so does \([\text{(4H}{)}_{\text{Si}}^{\times}]\), and if [H2O]bulk is fixed with depth (Fig. 6) \([\text{(4H}{)}_{\text{Si}}^{\times}]\) is also fixed with depth. This can be seen most clearly in the [TiO2] = 0 regions of Fig. 6, where \([\text{(4H}{)}_{\text{Si}}^{\times}]\) does not change with depth. 
In these Ti-poor cases, the minimum concentration of \(\text{(4H}{)}_{\text{Si}}^{\times}\) occurs where [H + VMg] is at its maximum in the mid upper mantle. With a [H2O]bulk of 1/10/100/1000 wt. ppm, the minimum \([\text{(4H}{)}_{\text{Si}}^{\times}]\) is 0.2/6.3/85.4/966 wt. ppm while the maximum \([\text{(4H}{)}_{\text{Si}}^{\times}]\) is 1/10/100/1000 wt. ppm. Thus, in a hypothetical (likely unrealistic) Ti-poor system, variations in \([\text{(4H}{)}_{\text{Si}}^{\times}]\) throughout the conditions of the upper mantle are small and become relatively smaller as [H2O]bulk increases. With high Ti concentrations ([TiO2] > saturation) very different behaviour is seen. In these regions, a large band exists at the top of the upper mantle where \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) is the dominant H-defect and \([\text{(4H}{)}_{\text{Si}}^{\times}]\) = 0 (the blue band in the Ti-rich regions of the \(\text{(4H}{)}_{\text{Si}}^{\times}\) diagram of Figs. 5 and 6). \([\text{(4H}{)}_{\text{Si}}^{\times}]\) remains effectively at 0 until ~ 40 km in oceanic mantle and ~ 80 km in continental mantle and then steadily increases with depth, overtakes the Ti-associated defect to become the dominant H-defect at ~ 100–250 km and reaches its Ti-free value at around 250–300 km along both oceanic and continental geotherms. \([{\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}]\) generally decreases slightly (fixed [H2O]bulk) or increases slightly (variable [H2O]bulk) with depth until ~ 200 km, when its concentration drops rapidly with depth in favour of \([\text{(4H}{)}_{\text{Si}}^{\times}]\).
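The saturation threshold and the Ti-poor figures quoted above can be checked with a short calculation. The sketch below assumes that each \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) defect pairs one Ti with two H (one H2O equivalent), so that the saturation ratio in wt. ppm is the ratio of the molar masses of TiO2 and H2O. With standard atomic masses this gives about 4.43, close to the ~ 4.42 quoted in the text; the small difference presumably reflects rounding or the masses used. The function names are hypothetical and introduced only for this example.

```python
# Standard molar masses in g/mol (Ti 47.867, O 15.999, H 1.008).
M_TIO2 = 47.867 + 2 * 15.999
M_H2O = 2 * 1.008 + 15.999


def tio2_saturation_ppm(h2o_bulk_ppm):
    """wt. ppm TiO2 that supplies one Ti per H2O equivalent (assumed 1:1 molar)."""
    return h2o_bulk_ppm * M_TIO2 / M_H2O


def ti_regime(tio2_ppm, h2o_bulk_ppm):
    """Classify a composition as 'Ti-rich' or 'Ti-poor' relative to saturation."""
    if tio2_ppm > tio2_saturation_ppm(h2o_bulk_ppm):
        return "Ti-rich"
    return "Ti-poor"


# Minimum [(4H)Si] quoted in the text for Ti-poor systems, expressed as a
# fraction of [H2O]bulk (all values in wt. ppm H2O).
bulk = [1.0, 10.0, 100.0, 1000.0]
min_4h_si = [0.2, 6.3, 85.4, 966.0]
min_fraction = [m / b for m, b in zip(min_4h_si, bulk)]

print(round(tio2_saturation_ppm(1.0), 2))
print(ti_regime(500.0, 100.0), ti_regime(500.0, 1000.0))
print([round(f, 3) for f in min_fraction])
```

The fractions rise monotonically with [H2O]bulk, which is the sense in which the relative variation of \([\text{(4H}{)}_{\text{Si}}^{\times}]\) with depth shrinks as water content increases. The regime check also shows that 500 wt. ppm TiO2 is above saturation at 100 wt. ppm water but below it at 1000 wt. ppm.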

This overall trend with depth is quite robust in the face of all other variables.

[Al2O3] has no large effect on the distribution of the major defects, as discussed above and as can be seen in the Ti = 0 sections of Fig. 5, which are largely homogeneous with varying Al content. [TiO2] has a large effect on H distribution, as explained above, but this effect quickly saturates. When [TiO2] < saturation the effect of [TiO2] is largely linear, as \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) is strongly favoured and thus forms linearly with increasing [TiO2]. This can be seen in Figs. 5 and 6 as a smooth colour change along the [TiO2] concentration gradients before the saturation point is reached, after which the colour no longer changes. When [TiO2] > saturation, varying the Ti concentration has no effect, as seen in the regions of Figs. 5 and 6 where [TiO2] varies. At great depths, \(\text{(4H}{)}_{\text{Si}}^{\times}\) is favoured and the value of [TiO2] is largely irrelevant. Increasing [H2O]bulk favours \([\text{(4H}{)}_{\text{Si}}^{\times}]\) and suppresses \([{\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}]\) in the upper mantle, but very high concentrations of water are needed before \([{\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}]\) is suppressed significantly. This can be seen in Fig. 6, where at 10 wt. ppm water the value of \([\text{(4H}{)}_{\text{Si}}^{\times}]\) at shallow depths is very dependent on [TiO2] (the blue band) but at 100 wt. ppm water only very high values of [TiO2] affect \([\text{(4H}{)}_{\text{Si}}^{\times}]\) (the blue band forms at much higher concentrations of [TiO2]). At 1000 wt. ppm, \({\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}\) is effectively suppressed and \([\text{(4H}{)}_{\text{Si}}^{\times}]\) is effectively insensitive to [TiO2]. 
As seen from the difference between H distributions along oceanic and continental geotherms in Fig. 5, temperature has a small effect on H distribution. This mostly affects [H + VMg], which should be much larger in hotter mantle (Table S18), and shows that temperature fluctuations in the mantle can nevertheless be important. All of these effects are small compared to the effect of depth, however, and mostly shift the changes seen with depth rather than suppress them.

Overall, we find that across mantle pressures and temperatures there are multiple different H-bearing defect regimes and that there is no one uniform H distribution that can be applied across the upper mantle. Aside from depth, lateral variations in temperature or in the concentrations of Ti or water could cause large changes in H distribution. This means that when modelling the effect of water on upper mantle forsterite properties, multiple equations are likely required to represent mantle heterogeneity and depth. The large variation in the distribution of H with depth would be difficult to predict from measurements which modify only a single variable, which sample only small regions of P and T space, or which measure only the most prominent H-defects. The coupling of T and P is particularly important in controlling the [H + VMg]: \([\text{(4H}{)}_{\text{Si}}^{\times}]\) ratio, as this ratio is increased by the former and decreased by the latter, and thus the shape of the geotherm will control the shape and depth of the [H + VMg] spike. In upper mantle conditions, however, we find P to be overwhelmingly the most important variable, and thus when considering the H distribution in forsterite a range of pressures must be examined. [H2O]bulk is an important secondary variable (for relative concentrations; for absolute concentrations it is of critical importance) and experimental studies on saturated forsterite may over-represent \([\text{(4H}{)}_{\text{Si}}^{\times}]\) (which is favoured by high [H2O]bulk) when compared to mantle forsterite, which will likely be undersaturated. Properties such as forsterite conductivity (Sun et al. 2019; Fei et al. 2018) and strength (Demouchy and Bolfan-Casanova 2016) are partially functions of [H + VMg] and \([\text{(4H}{)}_{\text{Si}}^{\times}]\), respectively, and have different relations with [H2O]bulk and fH2O under the different conditions that could be encountered in the upper mantle; thus the effect of water on these properties cannot be modelled with a single exponent, as has been previously attempted.

We have not considered the effect of Fe in these systems, but the complex relationship between the defects likely holds in the presence of Fe. The main effect of iron is likely that it introduces a competing reaction which forms \(\{{\text{Fe}}_{\text{Mg}}^{\cdot}{\text{H}}_{\text{Mg}}^\prime\}^\times\). This could be the main H-bearing defect under various conditions (Berry et al. 2007a) but is unlikely to affect the fundamental relationships between [H + VMg], \([\text{(4H}{)}_{\text{Si}}^{\times}]\) and \([{\left\{{\text{Ti}}_{\text{Mg}}^{\cdot\cdot}\text{(2H}{)}_{\text{Si}}^{\prime\prime}\right\}}^{\times}]\) described here, and will likely simply scale the values of these concentrations down while they maintain their internal depth dependence. Iron will also make the system more sensitive to fO2 fluctuations, which have been ignored in this work because they have no obvious effect on forsterite chemistry. The presence of Fe will make this system more complex and thus will reinforce the overall conclusion that the distribution of H in forsterite/olivine is very complex and very sensitive to conditions. Other elements that could be present in the mantle and that interact strongly with hydrogen, such as fluorine (Crepisson et al. 2014) and boron (Kent and Rossman 2002), will further increase complexity and need to be considered further.

Conclusions

In conclusion, we built a thermodynamic model of hydrogen distribution in forsterite from first principles. We find that even for a very simple system (pure forsterite plus three trace elements) a very complex model is required to correctly predict hydrogen distribution. In different pressure, temperature, water fugacity and trace element regimes different H-bearing defects can be favoured and the effect of water fugacity will be different as seen through differing rc and rf values in different pressure and temperature regimes. Extrapolating properties such as the concentration of H-bearing defects and the water exponents (rc and rf) between these regimes is extremely difficult and thus only results in the desired regime should be used when considering H distribution in forsterite. This has effects both on building models for the upper mantle where multiple constraints are required on the effect of water and on experiments where high water fugacities may overstate the effect of \(\text{(4H}{)}_{\text{Si}}^{\times}\) and where derived H-defect concentrations and water exponents may be reflective of only very narrow ranges of pressure and temperature space.

To move towards a full understanding of the effect of water in olivine in the upper mantle, more elements need to be considered. Iron is an obvious choice for consideration, but it is important to establish which elements can form H-bearing defects that can be examined. Elements like Ni are reasonably abundant in olivine but lack obvious associated H-bearing defects, whereas elements like Cr are less abundant but could form H-bearing defects, for example via the 3+ oxidation state of Cr. Each additional element will increase the complexity of the models here, and thus the effect of hydrogen in real olivine in the upper mantle is likely to be very complex, with water likely having different effects on the properties of olivine in different parts of the upper mantle.