In thermodynamic equilibrium, and at temperatures below melting, materials tend to form crystalline states, which possess long-range order and translational symmetry. Understanding the structure of materials is crucial for understanding their properties. However, the prediction of crystal structure has been a long-standing challenge in physical science. Back in 1988, Maddox summarized this problem with the following words [1]:

One of the continuing scandals in the physical sciences is that it remains in general impossible to predict the structure of even the simplest crystalline solids from knowledge of their chemical composition… Solids such as crystalline water (ice) are still thought to lie beyond mortals’ ken.

Over the next few years, programs started appearing that attempted to do just this and, in 1994, Gavezzotti [2] addressed the fundamental question “Are crystal structures predictable?” The answer was again “No.”

Crystal structure prediction (CSP) is particularly necessary when crystal structure information is not readily available. At normal conditions, the crystal structure of most materials can be routinely determined by modern experimental techniques such as X-ray diffraction. However, the same treatment becomes extremely problematic at extreme conditions, and computer simulation becomes essential for obtaining structural information. Crystal structure prediction is of enormous value not only at extreme but also at normal conditions: it is one of the most fundamental problems in materials science and a necessary step in computational materials discovery.

What precisely do we mean by the CSP problem? The simplest and most important case is to find, at given pressure (and temperature) conditions, the stable crystal structure knowing only the chemical formula [3]. Many types of advanced techniques have been proposed to address this problem [4–13] and these are described in a recent book [3]. Among these methods, the USPEX method [13–18], based on an evolutionary algorithm, is the leading one, and has been viewed as a revolution in crystallography [19]. It has led to many exciting discoveries, early examples of which, confirmed by experiment, include the superhard phase of boron with partially ionic bonding [20] and the transparent insulating phase of sodium [21]. In this chapter we give an overview of the modern crystal structure prediction field, and particularly the methodology and a few recent applications based on evolutionary algorithms. Discussions here follow closely those in [13, 15, 18].

1 Methodology

1.1 Energy Landscape

Before talking about the prediction of the crystal structure, let us first consider the energy landscape that needs to be explored. The number of distinct points on the landscape can be estimated as:

$$ C=\left(\begin{array}{c}\hfill V/{\delta}^3\hfill \\ {}\hfill N\hfill \end{array}\right){\displaystyle \prod \left(\begin{array}{c}\hfill N\hfill \\ {}\hfill {n}_i\hfill \end{array}\right)}, $$

where N is the number of atoms in the unit cell of volume V, δ is a relevant discretization parameter (for instance, 1 Å), and n_i is the number of atoms of the ith type in the unit cell. Even for small systems (N ≈ 10), C is astronomically large (roughly 10^N if one uses δ = 1 Å and a typical atomic volume of 10 Å³). Such an enormous number of structures cannot possibly be sampled, even on the most advanced supercomputer, making direct solution of the CSP problem impossible.
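The estimate above can be evaluated directly. The following is a minimal sketch that implements the formula exactly as written; the function name and the example cell (10 atoms of two types, 10 Å³ per atom) are illustrative assumptions, not values taken from the text.

```python
from math import comb, log10, prod

def count_structures(volume, delta, counts):
    """Estimate C = binom(V / delta^3, N) * prod_i binom(N, n_i).

    volume : unit-cell volume in cubic angstroms
    delta  : discretization step in angstroms
    counts : number of atoms of each type, [n_1, n_2, ...]
    """
    n_atoms = sum(counts)
    grid_points = round(volume / delta**3)
    return comb(grid_points, n_atoms) * prod(comb(n_atoms, n) for n in counts)

# Hypothetical cell: 5 + 5 atoms, 10 cubic angstroms per atom, delta = 1 angstrom.
c = count_structures(volume=100.0, delta=1.0, counts=[5, 5])
print(f"C is of the order 10^{log10(c):.0f}")
```

Python integers have arbitrary precision, so the count stays exact even when it far exceeds the range of floating-point numbers.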

The dimensionality of the energy landscape is

$$ d=3N+3, $$

where 3N − 3 degrees of freedom are from N atoms, and the remaining six dimensions are defined by the lattice. CSP is an NP-hard problem, and the difficulty increases exponentially with the dimensionality. Yet great simplification can be achieved if structures are relaxed, i.e., brought to the nearest local energy minima. Relaxation introduces some intrinsic chemical constraints (bond lengths, bond angles, avoidance of unfavorable contacts). Therefore, the intrinsic dimensionality can be reduced:

$$ {d}^{*}=3N+3-\kappa, $$

where κ is the number of correlated dimensions, which can vary greatly according to the intrinsic chemistry of the system. For example, the dimensionality drops sharply, from 99 to 11.6, for Mg16O16, but only slightly, from 39 to 32.5, for Mg4N4H4. Thereby, the reduced complexity for the energy landscape of local minima is

$$ {C}^{*}= \exp \left(\beta {d}^{*}\right). $$

This implies that any efficient search method must include structure relaxation (local optimization). We also note that all global optimization methods rely on the assumption that the reduced energy landscape should have an overall shape (Fig. 1a). An extreme (and, fortunately, unrealistic) case of a golf-course landscape (Fig. 1b) gives an opposite example, where total lack of structure of the landscape will lead any global optimization method to fail.

Fig. 1. Simplified illustration of energy landscape: (a) the general landscape; (b) golf-course like landscape. The landscape (a) could be transformed to a bowl-shaped one without noise by interpolating local minima points as shown by the dashed line, but (b) does not have such a helpful transformation
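As a quick check of the dimensionalities quoted in this section, the two examples can be worked through with d = 3N + 3 and κ = d − d*. The values of d* are those given in the text; the κ values are simply the differences.

$$ \mathrm{Mg}_{16}\mathrm{O}_{16}:\quad N=32,\quad d=3\cdot 32+3=99,\quad {d}^{*}=11.6,\quad \kappa =99-11.6=87.4 $$

$$ \mathrm{Mg}_4\mathrm{N}_4\mathrm{H}_4:\quad N=12,\quad d=3\cdot 12+3=39,\quad {d}^{*}=32.5,\quad \kappa =39-32.5=6.5 $$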

1.2 Global Optimization Methods

As the stable structure corresponds to the global minimum of the free energy surface, crystal structure prediction is mathematically a global optimization problem. Several global optimization algorithms have been devised and used with some success in CSP – for instance, simulated annealing [4, 5], metadynamics [6, 7], genetic algorithms [8], evolutionary algorithms [13], random sampling [9], basin hopping [10], minima hopping [11], and data mining [12].

One either has to start already in a good region of configuration space (so that no effort is wasted on sampling poor regions) or has to use a “self-improving” method that locates, step by step, the best structures. The first group of methods includes metadynamics, simulated annealing, basin hopping, and minima hopping approaches. The second group essentially includes only evolutionary algorithms. Alternatively, data mining approaches use advanced machine learning concepts and predict the structures based on a large database of known crystal structures [12]. Among all these groups of methods, evolutionary algorithms present a particularly attractive approach for solving CSP. The strength of evolutionary simulations is that they do not require any system-specific knowledge except chemical composition, and are self-improving, i.e., in subsequent generations increasingly good structures are found and used to generate new structures. Its power has been evidenced by many recent discoveries in the field of CSP [2027].

1.3 Evolutionary Algorithm

The evolutionary algorithm (EA) mimics Darwinian evolution and employs natural selection of the fittest and such variation operators as genetic heredity and mutations. It can perform well for different types of free energy landscapes. Unlike in genetic algorithms, we represent the coordinates of atoms in the unit cell and lattice vectors by real numbers (rather than binary “0/1” strings) – and therefore our algorithm is not genetic but evolutionary. The search space here is continuous and not discrete as with binary string representation.

The procedure, shown in Fig. 2, is as follows:

  1. Initialization of the first generation, that is, a set of structures satisfying the hard constraints is randomly generated.

  2. Determination of the quality of each member of the population using the so-called fitness function.

  3. Selection of the best members from the current generation as parents, from which the new generation is created by applying specially designed variation operators.

  4. Evaluation of the quality of all new trial solutions (i.e., structures).

  5. Repetition of steps 3 and 4 until pre-specified halting criteria are met.

Fig. 2. The EA implemented in the USPEX code for crystal structure prediction. Several versions of this algorithm, as well as other algorithms such as evolutionary metadynamics [7] and the variable-cell NEB method [28], are also implemented in USPEX

The above algorithm has been implemented in the USPEX (Universal Structure Predictor: Evolutionary Xtallography) code [13–17]. The fitness function mathematically describes the target direction of the global search, which can be either a thermodynamic fitness (to find stable states) or a physical property (to find materials with desired properties).
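The five steps of the procedure can be sketched on a toy problem. The code below is only an illustration under stated assumptions: it minimizes a simple analytic function of a real-valued vector instead of relaxing crystal structures, and the function names, operator probabilities, and population settings are hypothetical rather than those of USPEX. A real search would also relax every trial structure before evaluating its fitness.

```python
import random

def evolve(fitness, dim, pop_size=20, n_parents=8, generations=30, seed=0):
    """Toy evolutionary loop on real-valued vectors (lower fitness is better)."""
    rng = random.Random(seed)
    # Step 1: random first generation (here: points in [-5, 5]^dim).
    population = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Steps 2 and 4: evaluate and rank every member of the population.
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        # Step 3: the best members become parents of the next generation.
        parents = population[:n_parents]
        children = [parents[0][:]]  # the best member survives unchanged
        while len(children) < pop_size:
            if rng.random() < 0.5:
                # Heredity-like operator: mix coordinates of two parents.
                a, b = rng.sample(parents, 2)
                child = [rng.choice(pair) for pair in zip(a, b)]
            else:
                # Mutation: Gaussian displacement of a single parent.
                child = [x + rng.gauss(0.0, 0.3) for x in rng.choice(parents)]
            children.append(child)
        population = children  # Step 5: repeat until the generation limit
    population.sort(key=fitness)
    history.append(fitness(population[0]))
    return population[0], history

def sphere(x):
    """Toy fitness with a single minimum at the origin."""
    return sum(v * v for v in x)

best, history = evolve(sphere, dim=3)
```

Because the best member is always copied into the next generation, the best fitness recorded in `history` can never get worse from one generation to the next.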

1.4 Variation Operators

An essential step in an EA is to pass good genes on to the next generation. In USPEX, this is done via variation operators. In general, the choice of variation operators follows naturally from the representation and the nature of the fitness landscape, and may or may not be inspired by physical processes representing transformations between likely good solutions.

Heredity is a core part of the EA approach, as it allows communication between different trial solutions or classes of solutions by combining parts from different parents. In USPEX, to generate a child from two parents, the algorithm first chooses a plane which is parallel to one lattice plane, and then cuts a slice with a random thickness and random position along the other lattice vector; such slices from two parent structures are then matched to form a child structure. In this process, the number of atoms of each type is adjusted to ensure conservation of chemical composition.

Mutation operators use a single parent to produce a child. Lattice mutation applies a strain matrix with zero-mean Gaussian random strains to the lattice vectors; soft-mode mutation (which we call softmutation for brevity) displaces atoms along the softest mode eigenvectors, or a random linear combination of softest eigenvectors; the permutation operator swaps chemical identities of atoms in randomly selected pairs of unlike atoms.
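Two of the mutation operators described above are simple enough to sketch in a few lines. The following is a minimal illustration, not the USPEX implementation: the strain width `sigma`, the convention of halving the off-diagonal strains, and the function names are assumptions made here.

```python
import random

def lattice_mutation(lattice, sigma=0.1, rng=random):
    """Return lattice vectors transformed by (I + eps), eps a symmetric Gaussian strain."""
    eps = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            e = rng.gauss(0.0, sigma)
            # Assumed convention: off-diagonal strains are shared between (i, j) and (j, i).
            eps[i][j] = eps[j][i] = e if i == j else e / 2.0
    strain = [[(1.0 if i == j else 0.0) + eps[i][j] for j in range(3)] for i in range(3)]
    # Each row of `lattice` is one lattice vector; apply the strain to every row.
    return [[sum(vec[k] * strain[k][j] for k in range(3)) for j in range(3)]
            for vec in lattice]

def permutation(species, n_swaps=1, rng=random):
    """Swap chemical identities in randomly chosen pairs of unlike atoms."""
    species = list(species)
    for _ in range(n_swaps):
        unlike = [(i, j) for i in range(len(species))
                  for j in range(i + 1, len(species)) if species[i] != species[j]]
        if not unlike:
            break  # only one atomic type present, nothing to swap
        i, j = rng.choice(unlike)
        species[i], species[j] = species[j], species[i]
    return species
```

Atomic positions stored in fractional coordinates follow the strained lattice automatically, which is why only the lattice vectors are transformed here.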

1.5 Fingerprints: A Tool to Identify Similar Crystal Structures and to Prevent Premature Convergence

A general challenge for global optimization methods is to avoid getting stuck in a local minimum and thus missing the global minimum. In the context of EA, this happens because good structures tend to produce children that resemble them, and a good low-energy structure (which is still not the global minimum) can come to dominate the population. Such behavior is especially common for energy landscapes with many good local minima, and a successful algorithm should address this problem. The key to preventing it is to control the diversity of the population. This raises a question: how can we detect similar structures and measure the similarity quantitatively?

Direct comparison of atomic coordinates will not work because of translational invariance (i.e., adding a constant vector to the coordinates of all atoms does not change the structure), and because coordinates are expressed in units of the lattice vectors and there are many equivalent ways to choose a unit cell. The free energy difference is not a good parameter either: two completely different structures can have very close energies.

An ideal function characterizing a structure should be (1) derived from the structure itself, rather than its properties; (2) invariant with respect to shifts, rotations, and reflections of the coordinate system; (3) sensitive to different orderings of the atoms; (4) formally related to experiment; (5) robust against numerical errors; and (6) capable of incorporating short-range and long-range order. In USPEX, we use the so-called fingerprint function [29] to describe a crystal structure. Its formulation is very similar to that of the pair distribution function (PDF), which for an elemental solid is

$$ \mathrm{PDF}(R)={\displaystyle \sum_i{\displaystyle \sum_{j\ne i}\frac{1}{4\pi {R}_{ij}^2\frac{N}{V}\Delta}\delta \left(R-{R}_{ij}\right)}}, $$

where R_ij is the distance between atoms i and j, V is the unit cell volume, N is the number of atoms in the unit cell, and Δ is a bin width (in Å). The index i goes over all atoms in the unit cell and the index j goes over all atoms within the cutoff distance from atom i. The PDF at long distances oscillates around the value +1, which is not convenient for our purposes, and we subtract this “background” value for convenience. Generalizing to systems containing more than one atomic type, we introduce the fingerprint as a matrix, the components of which are fingerprint functions for A–B type distances:

$$ {F}_{\mathrm{A}\mathrm{B}}(R)={\displaystyle \sum_{{\mathrm{A}}_i,\mathrm{cell}}{\displaystyle \sum_{{\mathrm{B}}_j}\frac{\delta \left(R-{R}_{ij}\right)}{4\pi {R}_{ij}^2\frac{N_{\mathrm{A}}{N}_{\mathrm{B}}}{V}\Delta}}}-1. $$

One can measure the similarity between two structures by calculating the cosine distance between two fingerprint functions:

$$ {d}_{ij}=0.5\cdot \left(1-\frac{f_i{f}_j}{\left|{f}_i\right|\left|{f}_j\right|}\right). $$

Using this new crystallographic descriptor, we can improve the selection rules and variation operators above. During the selection process, only one copy of each distinct structure is used, and all its duplicates are discarded. Fingerprint theory brings many other benefits (quantification and visualization of energy landscapes, use of ordered fragments of crystal structures, etc.); see [14, 17, 29].
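The cosine distance and the removal of duplicate structures can be illustrated as follows. This is a minimal sketch that assumes each fingerprint has already been discretized into a plain list of numbers; the function names and the tolerance value are hypothetical.

```python
from math import sqrt

def cosine_distance(f_i, f_j):
    """d_ij = 0.5 * (1 - f_i . f_j / (|f_i| |f_j|)) for two discretized fingerprints."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = sqrt(sum(a * a for a in f_i))
    norm_j = sqrt(sum(b * b for b in f_j))
    return 0.5 * (1.0 - dot / (norm_i * norm_j))

def remove_duplicates(fingerprints, tolerance=0.01):
    """Return indices of the structures kept after discarding near-identical ones."""
    kept = []
    for idx, f in enumerate(fingerprints):
        # Keep a structure only if it differs from every structure already kept.
        if all(cosine_distance(f, fingerprints[k]) > tolerance for k in kept):
            kept.append(idx)
    return kept
```

The distance is 0 for identical fingerprints and 1 for exactly opposite ones, so a single tolerance can be used regardless of the size of the system.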

2 New Developments

USPEX has been widely successful in applications to very different kinds of systems, with a high success rate and efficiency. To predict very large and complex crystal structures, the method has been improved in many ways (generation of random symmetric structures, smart variation operators learning about preferable local environments and directed mutations, ageing technique, etc. [17]). Below we give examples of two major developments: prediction of structures from molecular building blocks, and simultaneous optimization of both configurational and compositional space to find novel compounds.

2.1 Predicting Structures from Building Blocks

Molecular crystals are extremely interesting because of their applications as pharmaceuticals, pigments, explosives, and metal-organic frameworks [30]. The periodically conducted blind tests of organic crystal structure prediction, organized by the Cambridge Crystallographic Data Centre (CCDC), have been the focal point for this community and they reflect steady progress in the field [31–36]. The tests show that it is now possible to predict the packing of a small number of rigid molecules, provided there are cheap force fields accurately describing the intermolecular interactions. In these cases, efficiency of the search for the global minimum on the energy landscape is not crucial. However, if one has to use expensive ab initio total energy calculations or study systems with a large number of degrees of freedom (many molecules, especially if they have conformational flexibility, lead to astronomically large numbers of possible structures), efficient search techniques become critically important.

Compared to the prediction of atomic structures, there are several features to be taken into account for molecular crystals:

  1. A typical unit cell contains many more atoms than a normal inorganic structure, which means an explosion of computing costs if all these atoms are treated independently.

  2. Molecules interact with each other by weak forces, such as the van der Waals (vdW) interactions, and the inter-molecular distances are typically larger than those in atomic crystals, which leads to the availability of large empty space.

  3. Most of the molecular compounds are thermodynamically less stable than simpler molecular compounds from which they can be obtained (such as H2O, CO2, CH4, NH3, H2). This means that a fully unconstrained global optimization approach in many cases will produce a mixture of these simple molecules, which are of little interest to organic chemists. To study the packing of the actual molecules of interest it is necessary to fix the intra-molecular connectivity.

  4. Crystal structures tend to be symmetric, and the distribution of structures over symmetry groups is very uneven [37, 38]. For example, 35% of inorganic and 45% of organic materials have the point group 2/m. Compared to inorganic crystals, there is a strong preference of organic crystals for a small number of space groups. Most organic crystals are found to possess the space groups P2₁/c (36.59%), P-1 (16.92%), P2₁2₁2₁ (11.00%), C2/c (6.95%), P2₁ and Pbca (4.24%).

If we start to search for the global minimum with randomly generated structures, it is very likely that most of the time will be spent on exploring those uninteresting disordered structures far away from the target. Fortunately, the prediction of stable complex molecular structures can be achieved under the constraint of fixed molecules (or partially flexible molecules) as building blocks. The truly interesting problem for most organic chemists can be solved by constrained global optimization, finding the most stable packing of molecules with fixed bond connectivity. This will not only make the global optimization process meaningful, but at the same time will simplify it, leading to a drastic reduction of the number of degrees of freedom and of the search space. In order to apply constraints on the EA, we mainly need to modify the initialization of structures and variation operators.

2.1.1 Initialization: Generation of Molecular Structures

It is essential that all newly generated structures consist of molecules with desired bond connectivity. The efficiency can be greatly enhanced by using symmetry (so that different molecules in the unit cell are symmetrically related to each other) in the random generation of new structures – a population of symmetric structures is usually more diverse than a set of fully random (often disordered) structures. Diversity of the population of structures is essential for the success and efficiency of evolutionary simulations.

The initial structures are usually generated randomly, with randomly selected space groups. First, we randomly pick 1 of 230 space groups, and set up a Bravais cell according to the prespecified initial volume with random cell parameters consistent with the space group. Then one molecule is randomly placed on a general Wyckoff position and is multiplied by space group operations. If two or more symmetry-related molecules are found close to each other, we merge them into one molecule that sits on a special Wyckoff position and has averaged coordinates of the molecular center and averaged orientational vectors (or random ones, when the average value is zero). Adding new molecular sites one by one, until the correct number of molecules is reached, we get what we call a random symmetric structure (Fig. 3). During this process we also make sure that no molecules overlap or sit too close to each other.

Fig. 3. Illustration of generating a random symmetric structure with four molecules per cell. For a given space group randomly assigned by the program (in this case, P2₁/c), the Bravais cell is generated, and the molecular center is placed onto a random position (in this case, the general position 4e or 2a + 2d). Molecules are then built at the Wyckoff sites preserving their intramolecular connectivity and with their orientations obeying space group symmetry operations. Molecular geometry often breaks space group symmetry, leading to a subgroup, and we allow this. For clarity of the figure, molecules occupying positions at the corners and faces of the unit cell are shown only once
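The placement of molecular centers on a general Wyckoff position can be sketched for the P2₁/c case shown in Fig. 3. The code below is a simplified illustration, not the USPEX procedure: it uses the four general-position operations of P2₁/c in the standard setting (unique axis b), handles only molecular centers (no orientations), measures distances in fractional units, and rejects placements with close images instead of merging them onto a special position. All names and the distance threshold are assumptions.

```python
import random

# General position 4e of P2_1/c (standard setting, unique axis b), fractional coordinates.
P21C_OPERATIONS = [
    lambda x, y, z: (x, y, z),
    lambda x, y, z: (-x, y + 0.5, -z + 0.5),
    lambda x, y, z: (-x, -y, -z),
    lambda x, y, z: (x, -y + 0.5, z + 0.5),
]

def symmetric_centers(site, operations=P21C_OPERATIONS):
    """Apply all symmetry operations to one site and wrap the images into the cell."""
    return [tuple(c % 1.0 for c in op(*site)) for op in operations]

def min_image_distance(a, b):
    """Periodic distance in fractional units (ignores the cell metric for brevity)."""
    return sum(min(abs(p - q), 1.0 - abs(p - q)) ** 2 for p, q in zip(a, b)) ** 0.5

def random_symmetric_centers(min_distance=0.2, seed=0, max_tries=1000):
    """Draw random general positions until no two symmetry images are too close."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        centers = symmetric_centers((rng.random(), rng.random(), rng.random()))
        if all(min_image_distance(centers[i], centers[j]) >= min_distance
               for i in range(4) for j in range(i + 1, 4)):
            return centers
    raise RuntimeError("no acceptable placement found")
```

A fuller version would convert to Cartesian coordinates with the actual lattice before checking distances, and would build the whole molecule at each center with the orientation dictated by the corresponding symmetry operation.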

2.1.2 Variation Operators

Child structures (new generation) are produced from parent structures (old generation) using one of the following variation operators:

  1. Heredity.

  2. Permutation.

  3. Coordinate mutation.

  4. Lattice mutation (seldom used for molecular crystals).

These are the same as for atomic crystal structures, with the only difference that variation operators act on the geometric centers of the molecules and their orientations, i.e., whole molecules, rather than single atoms, are considered as the minimum building blocks. Since molecules cannot be considered as spherically symmetric point particles, additional variation operators must be introduced:

  5. Rotational mutation of the whole molecules.

  6. Modified softmutation, which must retain molecular connectivity and is thus a hybrid operator of coordinate and rotational mutation.

Figure 4 shows how variation operators work in our algorithm. Below we describe how these variation operators were used in our test cases.

Fig. 4. Variation operators: (a) heredity; (b) coordinate mutation; (c) rotational mutation

2.1.2.1 Heredity

This operator cuts planar slices from each individual and combines these to produce a child structure. In heredity, each molecule is represented by its geometric center (Fig. 4a) and orientation. From each parent, we cut (parallel to a randomly selected coordinate plane of the unit cell) a slab of random thickness (within the bounds of 0.25–0.75 of the cut lattice vector) from a random height in the cell. If the total number of molecules of each type obtained from combining the slabs does not match the desired number of molecules, a corrector step is performed: molecules in excess are removed while molecules in shortage are added; molecules with a higher local degree of order have higher probability to be added and lower probability to be removed. This is equivalent to our original implementation of heredity for atomic crystals.
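A slab-based heredity step of this kind can be sketched as follows. The sketch assumes a single molecular type, represents each molecule as a pair (fractional center, orientation), and simplifies the corrector step to random removal and addition; the text instead weights these choices by the local degree of order. Function and parameter names are hypothetical.

```python
import random

def heredity(parent_a, parent_b, n_molecules, axis=0, rng=random):
    """Combine complementary slabs of two parents into one child structure."""
    thickness = rng.uniform(0.25, 0.75)  # slab thickness as a fraction of the cut vector
    start = rng.random()                 # random height of the slab in the cell

    def in_slab(molecule):
        # Position along the cut axis, measured from the slab origin with wrapping.
        return (molecule[0][axis] - start) % 1.0 < thickness

    child = [m for m in parent_a if in_slab(m)] + [m for m in parent_b if not in_slab(m)]
    # Corrector step (simplified): remove molecules in excess at random ...
    while len(child) > n_molecules:
        child.pop(rng.randrange(len(child)))
    # ... and add missing ones from the unused molecules of both parents.
    spare = [m for m in parent_a + parent_b if m not in child]
    while len(child) < n_molecules and spare:
        child.append(spare.pop(rng.randrange(len(spare))))
    return child
```

In practice the child would also inherit a lattice from its parents and be checked for overlapping molecules before relaxation.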

2.1.2.2 Rotational Mutation

A certain number of randomly selected molecules are rotated by random angles (Fig. 4c). For rigid molecules there are only three variables to define the orientation of the molecules. For flexible molecules, we also allow the mutation of torsional angles of the flexible groups. A large rotation can have a marked effect on global optimization, helping the system to jump out of the current local minimum and find optimal orientational ordering and optimal molecular conformation.
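For a rigid molecule, a rotational mutation amounts to rotating all atoms about the geometric center of the molecule. The following is a minimal sketch using the Rodrigues rotation formula; it works in Cartesian coordinates, the function names are hypothetical, and torsional mutations of flexible groups are not covered.

```python
import math
import random

def rotate_molecule(atoms, axis, angle):
    """Rotate Cartesian atom positions about the geometric center of the molecule."""
    n = len(atoms)
    center = [sum(a[k] for a in atoms) / n for k in range(3)]
    norm = math.sqrt(sum(c * c for c in axis))
    u = [c / norm for c in axis]  # unit rotation axis
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    rotated = []
    for a in atoms:
        v = [a[k] - center[k] for k in range(3)]
        cross = [u[1] * v[2] - u[2] * v[1],
                 u[2] * v[0] - u[0] * v[2],
                 u[0] * v[1] - u[1] * v[0]]
        dot = sum(u[k] * v[k] for k in range(3))
        # Rodrigues formula: v cos(t) + (u x v) sin(t) + u (u . v) (1 - cos(t))
        rotated.append([center[k] + v[k] * cos_t + cross[k] * sin_t
                        + u[k] * dot * (1.0 - cos_t) for k in range(3)])
    return rotated

def rotational_mutation(atoms, rng=random):
    """Rotate a rigid molecule by a random angle about a random axis."""
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]  # isotropic random direction
    return rotate_molecule(atoms, axis, rng.uniform(0.0, 2.0 * math.pi))
```

The rotation leaves all intramolecular distances and the geometric center unchanged, so the molecular connectivity is preserved by construction.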

2.1.2.3 Softmutation

This powerful operator, first introduced for atomic crystals [14], involves atomic displacements along the softest mode eigenvectors, or a random linear combination of the softest eigenvectors. In the context of molecular crystals it becomes a hybrid operator, combining rotational and coordinate mutations. In this case, the eigenvectors are calculated first and then projected onto the translational and rotational degrees of freedom of each molecule, and the resulting changes of molecular positions and orientations are applied, preserving the rigidity of the fixed intra-molecular degrees of freedom. To calculate the normal modes efficiently, we construct the dynamical matrix from bond hardness coefficients [14]. The same structure can be softmutated many times, each time along the eigenvector of a new mode.

2.2 Method for Variable-Composition Searches: Prediction of New Compounds

This extension enables simultaneous prediction of all stable stoichiometries and structures. A pioneering study was done by Jóhannesson et al. [39], who succeeded in predicting stable stoichiometries of alloys within a given structure type. However, a simultaneous search for stable structures and compositions is much more challenging. We are then dealing with a complex landscape consisting of compositional and structural coordinates, which requires a series of modifications of the standard EA approach. This was done in 2008 in the USPEX code (see [40, 41]).

In order to involve the variation of chemical composition, we need to consider the following issues:

  1. The sampling should cover the whole range of compositions of interest.

  2. Proper fitness should be devised to evaluate the quality of structures that have different compositions.

  3. Smart selection rules are needed, based on the fitness function.

  4. Variation operators should allow the variation of stoichiometries.

2.2.1 Fitness: Representation as a Convex Hull

For a system with a given chemical formula, the optimization target involves only the energy per formula unit. If one wants to study a system of compounds with different stoichiometries, the stability can be evaluated by the formation energy with respect to decomposition into mixtures of other compounds. Let us take a simple binary system A–B as an example. The energy of formation of A_xB_(1−x) can be expressed as

$$ {E}_{\mathrm{formation}}={E}_{\mathrm{A}\mathrm{B}}-x{E}_{\mathrm{A}}-\left(1-x\right){E}_{\mathrm{B}}, $$

where E_A and E_B are the energies of the elemental forms of A and B. Clearly E_formation is a function of the compositional ratio x, and its calculation requires the knowledge of E_A, E_B, and E_AB. Compounds that are stable against decomposition into the elements have negative energy of formation. If we plot E_formation(x) for a series of structures/compositions in the A–B system as shown in Fig. 5, any structure with negative E_formation is stable against decomposition into the elements A and B. This is visually easy to detect, as such a structure lies below the line drawn from A to B. However, for a compound A_xB_(1−x) to be thermodynamically stable, this is necessary but not sufficient. A sufficient condition is that the compound is stable against decomposition into any other compounds (not only the elements A and B), i.e., lies below all the possible “decomposition lines.” All thermodynamically stable compounds form a convex hull. The fitness of a structure/composition can be defined as the minimum vertical distance from the convex hull (see Fig. 5).

Fig. 5. Energy of formation as a function of composition. The stable structures need to be below all the possible “decomposition lines,” and form a convex hull. The fitness can be defined as the minimum vertical distance from the convex hull
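The convex hull fitness for a binary system can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a list of (x, E_formation) points that includes both pure elements at zero formation energy, the hull is built with a standard monotone-chain construction, and all function names and numbers in the usage lines are hypothetical.

```python
def formation_energy(e_ab, x, e_a, e_b):
    """E_formation = E_AB - x * E_A - (1 - x) * E_B (all energies per atom)."""
    return e_ab - x * e_a - (1.0 - x) * e_b

def lower_hull(points):
    """Lower convex hull of (x, E_formation) points, ordered from left to right."""
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))  # keep only the lowest energy per composition
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def energy_above_hull(x, e, hull):
    """Vertical distance from (x, e) to the hull; zero for stable compounds."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return e - (e1 + (e2 - e1) * (x - x1) / (x2 - x1))
    raise ValueError("composition outside the range covered by the hull")

# Hypothetical data: elements at x = 0 and x = 1 plus three compounds.
points = [(0.0, 0.0), (1.0, 0.0), (0.5, -1.0), (0.25, -0.2), (0.75, -0.8)]
hull = lower_hull(points)
fitness = [energy_above_hull(x, e, hull) for x, e in points]
```

In this example the compound at x = 0.25 has a negative formation energy but still lies above the line joining the element at x = 0 and the compound at x = 0.5, so it is unstable against decomposition into those two phases.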

2.2.2 Selection

With the fitness available, we can proceed to the selection process. In a standard EA approach we select low-energy structures from the current generation. That is, the current population is considered as the selection pool. For variable-composition calculations, a modified selection rule can be beneficial: we are facing a much more complex search space, and the population size is usually insufficient to represent the diversity of the whole system. Thus we need to build the selection pool from the whole history. At the end of each generation, we update the convex hull and then calculate the fitness for the structures from all previous generations, and rank them after discarding identical structures identified by fingerprints. One common behavior of this data set is that the distribution of “high fitness structures” is very uneven in the compositional space. There might exist many low-energy structures for some particular compositions while only a few structures for other compositions. This indicates that the energy window varies a lot with stoichiometries, and thus a direct selection from the ranking list might bias the search considerably. To revise this, we use a simple rule to set the maximum number of structures for each composition when building the selection pool (Figs. 6 and 7).

Fig. 6. The evolution of the selection pool in USPEX for variable-composition structure prediction of a binary Lennard–Jones system (see Fig. 10 for details)
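The idea of capping the number of structures per composition when building the selection pool can be sketched in a few lines. The text does not specify the exact rule, so the cap, the pool size, and the function name below are assumptions; structures are represented simply as (composition, fitness) pairs, with fitness being the distance above the convex hull.

```python
from collections import defaultdict

def build_selection_pool(structures, pool_size, max_per_composition):
    """Pick the fittest structures, limiting how many may share one composition."""
    taken = defaultdict(int)
    pool = []
    # Walk through the ranking list from the best (lowest) fitness upwards.
    for composition, fitness in sorted(structures, key=lambda s: s[1]):
        if taken[composition] < max_per_composition:
            pool.append((composition, fitness))
            taken[composition] += 1
        if len(pool) == pool_size:
            break
    return pool

# Hypothetical history: AB has many low-energy structures, other compositions few.
history = [("AB", 0.00), ("AB", 0.01), ("AB", 0.02), ("A2B", 0.05), ("AB2", 0.10)]
pool = build_selection_pool(history, pool_size=3, max_per_composition=2)
```

Without the cap, the three best entries would all have the composition AB; with it, the third place goes to the best A2B structure, which keeps the pool compositionally diverse.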

Fig. 7. Illustration of the zebra heredity operator. In the case of variable composition, the child structure obtained from many slices is much more reasonable than the one obtained from traditional two-slice heredity

2.2.3 Variation Operators

Some of the variation operators, like softmutation and permutation, have the same formulation as in the standard EA. Heredity, however, is defined in a slightly different way. First, the chemistry-preserving constraints in the heredity operator should be removed. Second, if we consider two parent structures with quite different stoichiometries, their child structures obtained by normal heredity will very likely have two distinct chemical blocks as shown in Fig. 7, and such structures will be closer to the idea of a two-phase assemblage (a result of decomposition) than to a single phase with a definite chemical composition. To remedy this we cut many slices from both parents (the thickness determined stochastically according to the approximate atomic radii) in a “zebra” pattern; the modified heredity operator is called “zebra heredity.”

Fig. 8. Illustration of transmutation: one can obtain the NaCl-type structure from the simple cubic structure by transmuting half of the atoms

To allow further change of chemical composition, we introduce a “chemical transmutation” operator. This operator turns out to be quite efficient for driving the system from a known minimum to another good minimum in a different area of compositional space.
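In its simplest form, a transmutation changes the chemical identity of some atoms while leaving their positions untouched. The sketch below is an assumption about how such an operator could look, not the USPEX implementation: atoms to be transmuted are chosen at random, and the `fraction` parameter and function name are hypothetical.

```python
import random

def transmutation(species, source, target, fraction=0.5, rng=random):
    """Change a random subset of `source` atoms into `target` atoms."""
    candidates = [i for i, s in enumerate(species) if s == source]
    # Transmute at least one atom whenever any candidate exists.
    n_change = max(1, round(fraction * len(candidates))) if candidates else 0
    mutated = list(species)
    for i in rng.sample(candidates, n_change):
        mutated[i] = target
    return mutated
```

Applied to an elemental structure with `fraction=0.5`, this changes the composition from A to AB. Whether the result is an ordered arrangement such as the NaCl type depends on which atoms are chosen, which is then sorted out by relaxation and selection.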

2.2.4 Implementation and Tests

Taking all the above developments into account, the EA for variable-composition searches can be designed as shown in Fig. 9.

Fig. 9. The flowchart of variable-composition prediction in the USPEX code

An example of a (very difficult) system is given in Fig. 10. Consider a simple binary Lennard–Jones A–B system; the potential for each atomic ij-pair is given by

$$ {U}_{ij}={\varepsilon}_{ij}\left[{\left(\frac{R_{\min }}{R}\right)}^{12}-2{\left(\frac{R_{\min }}{R}\right)}^6\right], $$

where R_min is the distance at which the potential reaches its minimum, and ε is the depth of the minimum. In these simulations we use additive atomic dimensions R_min(BB) = 1.5R_min(AB) = 2R_min(AA) and non-additive energies (to favor compound formation) ε_AB = 1.25; ε_AA = 1.25ε_BB. Odd as it may seem, a binary Lennard–Jones system with a 1:2 ratio of radii exhibits a large number of ground states, including the exotic A14B compound and the well-known AlB2-type structure, and several marginally unstable compositions (such as A8B7, A12B11, A6B7, A3B4, AB2). The correctness of these predictions is illustrated by the fact that a fixed-composition test simulation at AB2 stoichiometry produced results perfectly consistent with the variable-composition runs.

Fig. 10. Variable-composition USPEX simulation of the A_xB_y binary Lennard–Jones system. The upper panel shows the stable compositions (A14B, A8B, A3B, A2B, AB). The lower panel shows some of the stable structures
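The pair potential defined above is straightforward to evaluate. The following minimal sketch implements the formula as written; the function name and the numerical values in the usage lines are illustrative assumptions, not the parameters of the simulation.

```python
def lennard_jones(r, epsilon, r_min):
    """U = epsilon * [(r_min / r)**12 - 2 * (r_min / r)**6]."""
    s6 = (r_min / r) ** 6
    return epsilon * (s6 * s6 - 2.0 * s6)

# Illustrative values: at r = r_min the energy equals minus the well depth.
u_min = lennard_jones(1.5, epsilon=1.25, r_min=1.5)
u_compressed = lennard_jones(1.4, epsilon=1.25, r_min=1.5)
u_stretched = lennard_jones(1.6, epsilon=1.25, r_min=1.5)
```

In this form of the potential, R_min is the position of the minimum itself (not the zero crossing), which is why the attractive term carries a factor of 2.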

Figure 11 shows a practically interesting example of variable-composition simulations: the B–N system at ambient and high (50 GPa) pressure. At ambient pressure, hexagonal BN is thermodynamically stable, and B13N is right at the border between stability and metastability. On increasing pressure, B13N becomes metastable and only BN (in the cubic, diamond-like form) is stable. One can also notice a strong increase in the stability of BN: its enthalpy of formation becomes more negative, from ~ −1.5 eV/atom at 1 atm to ~ −2.2 eV/atom at 50 GPa. Variable-composition calculations are a very powerful tool to explore the chemical reactivity of the elements and its dependence on external conditions, such as pressure.

Fig. 11. Variable-composition USPEX simulation of the B–N system at 1 atm (left) and 50 GPa (right)

3 Applications

As an illustration of constrained global optimization for molecular crystals, we consider a promising material for hydrogen storage, Mg(BH4)2. To illustrate how pressure leads to the formation of new chemical compounds (which are most efficiently predicted by variable-composition searches), we show recent results on the Xe–O and Mg–O systems. In all the calculations, global optimizations were carried out by the USPEX code, and the VASP code [42] was employed for local optimization (i.e., structural relaxation), using the PBE exchange-correlation functional [43] and the PAW method [44].

3.1 Mg(BH4)2 [16, 27]

Lightweight metal borohydrides have recently received much attention owing to their high gravimetric and volumetric hydrogen densities compared to other complex hydrides [45]. Of these, magnesium borohydride, Mg(BH4)2, as a prominent lightweight solid-state hydrogen storage material with a theoretical hydrogen capacity of 14.8 wt%, has been extensively studied at both ambient and high pressure conditions.

3.1.1 Mg(BH4)2 at Ambient Condition

As a test, we first explore the energy landscape of Mg(BH4)2 at ambient conditions, where it has been extensively studied as a template for developing novel hydrogen-storage solutions. Based on the experimental data, the ground-state α and β phases have been assigned space groups P6122 (330 atoms per unit cell) and Fddd (704 atoms per unit cell), and turned out to have unexpectedly complex crystal structures [46–49]. There have been disputes between experimentalists and theoreticians regarding the nature of these ground-state structures [50–52]. Recent theoretical work using the prototype electrostatic ground-state (PEGS) approach then predicted a new body-centered tetragonal phase (with I-4m2 symmetry), which has slightly lower energy than the P6122 phase [50]. Later, based on the prototype structure of Zr(BH4)4, another orthorhombic phase with F222 symmetry was found to have even lower energy than all previously proposed structures [52].

In general, the previous theoretical discoveries of novel Mg(BH4)2 phases were made either by ad hoc extensive searching or by chemical intuition. USPEX, in contrast, does not rely on any prior knowledge except chemical composition, and could be particularly useful for predicting stable crystal structures for these complex metal hydride systems. If we treat the BH4− ion as a molecular group, the search space is dramatically reduced. Within 10 generations (or just 400 structure relaxations), USPEX found the F222 phase (Fig. 12a) as the most stable structure at ambient pressure. Moreover, the I-4m2 structure (Fig. 12b) was also found by USPEX in the same calculation, with an enthalpy less than 1.2 meV/atom above that of the F222 phase. Compared to the previous work, our method is clearly more universal, systematic, and robust, and enables efficient structure prediction for complex molecular systems, both organic and inorganic.

Fig. 12 Mg(BH4)2 polymorphs at ambient conditions found by USPEX. (a) F222 phase; (b) I-4m2 phase

3.1.2 Mg(BH4)2 Under High Pressure

To improve the reversible hydrogen absorption or desorption kinetics, or to obtain new metastable polymorphs, recent studies focused on stabilizing the high-pressure phases of Mg(BH4)2 at ambient pressure. Most recently, new δ, δ′, and ε phases of Mg(BH4)2 were successfully synthesized under pressure [53]. Many of them turned out to retain their structure upon decompression to ambient conditions. Crystal structures of the γ and δ phases were, apparently convincingly, resolved using powder synchrotron X-ray diffraction [53]. Unexpectedly, theoretical phonon calculations showed the P42nm structure (proposed by Filinchuk for the δ phase [53]) to be dynamically unstable at ambient pressure, which means that the exact crystal structure of the δ phase is still unresolved, even for such a simple structure with only 22 atoms per cell, and still less for the poorly characterized δ′ and ε phases. Therefore, the polymorphism and phase diagram of this important compound required further investigation.

According to our prediction, the tetragonal I41/acd and trigonal P-3m1 phases are the most stable ones in structure searches at 2–5 GPa and 10–20 GPa, respectively. Interestingly, within the whole pressure range (up to 20 GPa), we did not find the P42nm structure proposed by Filinchuk et al. [53], but instead found the I41/acd phase with 4 formula units (44 atoms) per cell and the P-4 phase with 2 formula units per cell at pressures below 5 GPa (see Fig. 13). Given that the P42nm structure is dynamically unstable at ambient pressure, and based on our enthalpy calculations, we hypothesized that the I41/acd and P-4 structures might correspond to the experimentally observed δ and δ′ phases. Further investigation confirmed this suggestion, as we will show below.

Fig. 13 (a) Enthalpy curves (relative to the γ phase) of various structures of Mg(BH4)2 as a function of pressure; (b) the I41/acd structure; (c) the P-4 structure. Enthalpies are given per formula unit. The inset in (a) shows the energy per formula unit of the I41/acd, P-4, and P42nm structures (relative to the P42nm structure) at zero pressure, including vdW interactions

I41/acd-Mg(BH4)2 becomes more stable than the γ phase at pressures above 0.7 GPa (Fig. 13). In the room-temperature experiment, a pressure-induced structural transformation is observed for the porous γ phase, and occurs in two steps: the γ phase turns into a diffraction-amorphous phase at 0.4–0.9 GPa and then, at approximately 2.1 GPa, into the δ phase [53]. We note a tiny enthalpy difference between the I41/acd and P-4 structures at pressures around 1 GPa. As pressure increases to 9.8 GPa, the P-3m1 structure becomes the most stable structure, in agreement with earlier predictions [46, 54]. Bil et al. [55] indicated that it is important to treat long-range dispersion interactions to get the ground-state structures of magnesium borohydrides correctly. We have therefore examined the energetic stability of the considered structures with a semi-empirical Grimme correction to DFT energies, stresses, and forces [56]. When this correction is included, the I41/acd and P-4 structures once again come out as more stable than the P42nm structure, by 21.2 kJ/mol and 15.4 kJ/mol, respectively. Energetic stability seems to correlate with the degree of disparity of bond lengths and atomic Bader charges. The P42nm structure has two inequivalent Mg–H distances, 2.26 and 2.07 Å, compared to 2.11 and 2.07 Å in the I41/acd structure, and 2.12 and 2.06 Å in the P-4 structure. As we can see, the more homogeneous the bond lengths, the greater the stability. Bader charges, computed using the code [57], show the same picture: for H atoms we find them to be −0.63e and −0.59e in the P42nm structure, −0.63e and −0.62e in the P-4 structure, and −0.63e and −0.61e in the I41/acd structure. More homogeneous Bader charges and bond lengths in the I41/acd and P-4 structures correlate with their greater thermodynamic stability at ambient pressure, in agreement with proposed correlations between local bonding configurations and energetic stability [52].
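The correlation between bond-length homogeneity and stability can be checked directly from the numbers quoted above. The short sketch below ranks the three structures by the spread of their two Mg–H distances and by their dispersion-corrected stabilization relative to P42nm; the variable names are illustrative, and the "spread" (largest minus smallest distance) is just one simple choice of disparity measure.

```python
# Mg-H distances (Å) and stabilization relative to P42nm (kJ/mol), as quoted in the text
MG_H = {"P42nm": (2.26, 2.07), "I41/acd": (2.11, 2.07), "P-4": (2.12, 2.06)}
STABILIZATION = {"P42nm": 0.0, "I41/acd": 21.2, "P-4": 15.4}

def spread(values):
    """Simple disparity measure: largest minus smallest value."""
    return max(values) - min(values)

# Most homogeneous bond lengths first
by_spread = sorted(MG_H, key=lambda name: spread(MG_H[name]))
# Most stable structure first
by_stability = sorted(STABILIZATION, key=STABILIZATION.get, reverse=True)

for name in by_spread:
    print(f"{name:8s} spread = {spread(MG_H[name]):.2f} Å, "
          f"stabilization = {STABILIZATION[name]:.1f} kJ/mol")
```

Both orderings come out as I41/acd, P-4, P42nm, which is the trend described in the text. Three data points are of course only suggestive.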

Our calculations suggest that the P42nm structure, proposed by experiment for the δ phase, is unstable. This implies that either density functional theory calculations are inaccurate for this system, or experimental structure determination was incorrect. To assess these possibilities, we simulated the XRD patterns of the I41/acd and P-4 structures, and compared them with the experimental XRD pattern of the δ phase at ambient pressure (see Fig. 14a). One observes excellent agreement, both for the positions and the intensities of the peaks (including both strong and weak peaks), of the I41/acd structure with experiment [53]. The situation is very peculiar: two structures, I41/acd and P42nm, have nearly identical XRD patterns, both compatible with experiment – but one, I41/acd, is the true thermodynamic ground state (global minimum of the enthalpy), whereas the other, P42nm, is not even a local minimum of the enthalpy (dynamically unstable structure, incapable of sustaining its own phonons). In this situation, the true structure is clearly I41/acd. This case gives a clear real-life example of the fact that very different structures can have very similar powder XRD patterns, making structure determination from powder data dangerous, and in such cases input from theory is invaluable. The P-4 structure also has a rather similar XRD pattern, but the peak positions are slightly shifted. Comparison with an independent experimental XRD pattern collected at 10 GPa (Fig. 14b) shows that the peak positions and intensities of the I41/acd structure are once again in excellent agreement with the experimental data [46], while the strong peaks of the P-4 structure at 9.9°, 11.6°, and 11.8° obviously deviate from the observed ones. This reinforces our conclusion that the I41/acd structure is the best candidate for the high-pressure δ phase. At pressures below 10 GPa, a mixture of I41/acd and P-4 phases is possible, as the XRD peaks of these two structures are quite similar. We must remember that in the experiment, the δ and δ′ phases are nearly indistinguishable [53]. This example highlights the importance of theoretical simulations in establishing crystal structures when only powder XRD data are available: purely experimental solutions may be dangerous even for simple structures, such as the structure of the δ phase with only six non-hydrogen atoms in the unit cell.

Fig. 14 Simulated XRD patterns of the I41/acd, P-4, and P42nm structures of Mg(BH4)2 with the X-ray wavelength of 0.770518 Å at ambient pressure (a) and 0.36814 Å at 10 GPa (b) in comparison with the corresponding experimental results [46, 53]
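Because the two patterns in Fig. 14 were recorded at different wavelengths, peak positions are easier to compare after conversion to interplanar spacings via Bragg's law, λ = 2d sin θ. The following is a minimal sketch of that conversion; the function names are illustrative, and first-order reflections are assumed.

```python
import math

def d_spacing(two_theta_deg, wavelength):
    """Interplanar spacing d (Å) from a peak position 2θ (degrees), Bragg's law."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))

def two_theta(d, wavelength):
    """Peak position 2θ (degrees) for spacing d (Å) at the given wavelength (Å)."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))

# Wavelengths quoted in the Fig. 14 caption
LAMBDA_AMBIENT = 0.770518   # Å, pattern (a)
LAMBDA_10GPA = 0.36814      # Å, pattern (b)

# Strong P-4 peaks at 10 GPa mentioned in the text, converted to d-spacings
for peak in (9.9, 11.6, 11.8):
    print(f"2θ = {peak:5.1f}°  ->  d = {d_spacing(peak, LAMBDA_10GPA):.3f} Å")
```

The same spacing appears at a roughly twice larger angle in pattern (a) than in pattern (b), simply because the wavelength is about twice as long; `two_theta(d_spacing(9.9, LAMBDA_10GPA), LAMBDA_AMBIENT)` gives the equivalent position on the ambient-pressure scale, ignoring the change of lattice parameters with pressure.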

3.2 Xe–O system [25]

Xenon is a noble gas, chemically inert at ambient conditions. A few xenon fluorides have been found [58–61], with Xe atoms in the oxidation states +2, +4, or +6. Upon application of high pressure, the insulating molecular structure of XeF2 was found to transform into two- and three-dimensional extended solids and to become metallic [61]. Clathrate Xe–H solids were also observed [62]. Two xenon oxides (XeO3, XeO4) [63] are known at atmospheric pressure, but are unstable and decompose explosively above 25°C (XeO3) and −40°C (XeO4) [64]. A crystalline XeO2 phase with local square-planar XeO4 geometry has recently been synthesized at ambient conditions [65].

Growing evidence shows that noble gases, especially Xe, may become much more reactive under pressure [66]. The formation of stable xenon oxides and silicates could explain the missing xenon paradox, i.e., the observation that the amount of Xe in the Earth’s atmosphere is an order of magnitude less than what it would be if all Xe were degassed from the mantle into the atmosphere [67]. One possibility to explain this deficiency is to assume that Xe is largely retained in the Earth’s mantle. In fact, a recent experiment discovered that xenon reacts with SiO2 at high pressures and temperatures [68, 69]. At the same time, recent theoretical investigation showed that no xenon carbides are stable, at least up to the pressure of 200 GPa [70], and experimental and theoretical high pressure work [71] found no tendency for xenon to form alloys with iron or platinum.

Here we address possible stability of xenon oxides using quantum-mechanical calculations of their energetics. We have performed structure prediction simulations for the Xe–O system for the compositions of XeO, XeO2, XeO3, XeO4 at 5, 50, 100, 120, 150, 180, 200, and 220 GPa. Our calculation at 5 GPa yielded lowest-enthalpy structures that always contained the O2 molecules, indicating the tendency for segregation of the elements, and indeed at 5 GPa decomposition was found to be energetically favorable. This suggests that the reaction observed by Sanloup et al. [68, 69] at 0.7–10 GPa was an entropically driven incorporation of Xe impurities into the structure of SiO2, rather than enthalpically-driven formation of a stoichiometric xenon silicate or oxide. Indeed, solid solutions and point defects are stabilized by entropy (rather than enthalpy) [72].

3.2.1 Stable Xe–O Compounds Under High Pressure [25]

Figure 15 shows the enthalpy of formation of all the Xe oxides as a function of pressure. Below 83 GPa all xenon oxides are unstable. At 83 GPa, XeO-Pbcm becomes stable, followed by XeO2-P21/c above 102 GPa and XeO3-P42/mnm above 114 GPa. There is a clear trend of increasing the oxidation number of Xe on increasing pressure.

Fig. 15 (a) Predicted enthalpies of formation of Xe–O compounds at high P and T = 0 K; (b) predicted Gibbs free energy of formation of Xe–O compounds at different temperatures (shifted for clarity by −0.1 eV/atom at each successive temperature) and P = 120 GPa; (c) crystal structure of XeO (Pbcm) at 100 GPa, and its ELF isosurface (ELF = 0.85) on the Xe–O chain; (d) crystal structure of XeO2 (P21/c) at 120 GPa, and its ELF isosurface (ELF = 0.85) on the XeO4 square; (e) crystal structure of XeO3 (Pmmn) at 200 GPa, and its ELF isosurface (ELF = 0.82) on XeO12 anticuboctahedra

A simple and clear analysis of chemical bonding can be carried out using the electron localization function (ELF) [73]. The ELF gives information about the valence electron configuration of an atom in a compound. States with closed-shell electronic configurations (Xe0, 5s2 5p6, and Xe6+, 5s2) will exhibit a spherical ELF distribution, whereas open-shell states (Xe2+, Xe4+) will not. For Xe2+ one p-orbital is empty and the ELF will have a toroidal shape; likewise, Xe4+ can be formed by emptying two p-orbitals, and the ELF will show a two-lobe maximum corresponding to the shape of the lone p-electron pair.

The most stable structure of XeO at 100 GPa has space group Pbcm and eight atoms in the unit cell. As shown in Fig. 15c, Xe atoms are in a twofold (linear) coordination and Xe–O bonds form chains, with O–Xe–O angles of 175.6° and Xe–O–Xe angles of 112.6°. The alternating Xe–O bond lengths are 2.0 and 2.1 Å. The ELF picture shows a toroidal maximum of ELF around each Xe atom, exactly what one should expect for the Xe2+ state.

For XeO2, the stable structure above 102 GPa has space group P21/c and 24 atoms in the unit cell. Xenon atoms have a slightly non-planar square coordination and the structure consists of 1D-ribbons of edge-sharing XeO4-squares (Xe–O distances are 2.0 and 2.1 Å), with four Xe–O bonds and two lone pair maxima forming an octahedron, consistent with the geometry proposed by recent experiments [65]. Just as in XeO, there are no peaks visible in the ELF isosurface along the Xe–O bonds (Fig. 15d). Above 198 GPa it transforms into the XeO2-Cmcm structure.

XeO3 becomes stable at 114 GPa. Its structure has space group P42/mnm and 16 atoms in the unit cell. It is stable against decomposition into Xe and O2 as well as into XeO or XeO2 and O2. The P42/mnm phase is composed of two sublattices: square XeO2 chains, again suggesting the Xe4+ state, and linear chains made of O2 dumbbells. Above 145 GPa, the molecules in the linear –O2–O2– chains are partly dissociated and we observe –O2–O– chains in the C2/c phase, which has 48 atoms per unit cell. Above 198 GPa, the structure transforms to a Pmmn phase with eight atoms per unit cell. In this remarkable structure, the oxygen atoms form anticuboctahedra in which the Xe atoms sit in the center (Fig. 15e). The ELF distribution around the Xe atoms in the Pmmn phase is spherical, which points to the Xe6+ valence state with a spherically symmetric 5s2 valence shell. Again, we observe the tendency of increasing oxidation states under pressure.

Xenon fluorides are stable at ambient conditions, whereas xenon oxides become stable above 83 GPa. Xenon carbides are unstable up to 200 GPa at least [70]. It appears that xenon forms compounds most readily with the most electronegative atoms, and that in turn suggests that ionicity is essential. This is somewhat counterintuitive, given that the xenon atom has a very stable closed valence shell and its electronegativity is rather high. The electronegativity difference (1.4 for Xe–F, 0.8 for Xe–O, and 0.56 for Xe–C) determines the degree of ionicity at ambient conditions. However, ionicity often seems to be enhanced under pressure. Spontaneous ionization under pressure was recently found even in a pure element, boron [20].

3.2.2 Xe–Si–O System in the Earth’s Mantle [25]

Table 1 shows the representative chemical reactions involving xenon oxides and silicates in the Earth's lower mantle at 100 GPa. Xe oxides are only stable above 83 GPa, i.e., at pressures corresponding to the lower mantle. Since in the Earth's mantle metallic Fe should be present [74, 75], stability of Xe oxides needs to be explored in the presence of metallic Fe. In our calculations of phase equilibria, we took into account that at lower mantle conditions Fe has the hcp structure and FeO has the antiferromagnetic inverse NiAs structure [76, 77]. Calculations show that all the predicted xenon oxides are very strong oxidants and will oxidize Fe, producing iron oxide and free xenon (FeO + Xe). Therefore, Xe oxides cannot be present in the lower mantle, where free Fe should exist.

Table 1 Representative chemical reactions involving xenon oxides and silicates in Earth’s lower mantle at 100 GPa

Since xenon oxides are not stable in coexistence with metallic Fe, we investigated the formation of stable xenon silicates under pressure, focusing on XeSiO3 and Xe2SiO4, which contain the least oxidized divalent xenon. All of the investigated compositions were unstable towards decomposition into XeO, XeO2, SiO2, and elemental Xe; Xe2SiO4 (Fig. 16) proved to be one of the least unstable silicates, but is still unstable. In this structure, Xe atoms terminate the silicate perovskite layers, suggesting that xenon could also be stored in perovskite/post-perovskite stacking faults [78] or at grain boundaries or dislocations.

Fig. 16 Crystal structure of the least unstable Xe2SiO4 obtained from USPEX

3.3 Mg–O system [26]

Magnesium oxide (MgO) is one of the most abundant phases in planetary mantles, and understanding its high-pressure behavior is essential for constructing models of the Earth’s and planetary interiors. For a long time, MgO was believed to be among the least polymorphic solids – only the NaCl-type structure has been observed in experiments at pressures up to 227 GPa [79]. Static theoretical calculations have proposed that NaCl-type (B1) MgO would transform into the CsCl-type (B2) structure, with a transition pressure of approximately 490 GPa at 0 K (474 GPa with the inclusion of zero-point vibrations) [80–82]. Calculations also predicted that MgO remains non-metallic up to extremely high pressure (20.7 TPa) [81], making it, to our knowledge, the most difficult mineral to metalize. Thermodynamic equilibria in the Mg–O system at 0.1 MPa have been summarized in previous studies [83–85], concluding that only MgO is a stable composition, though metastable compounds (MgO2, MgO4) can be prepared at very high oxygen fugacities.

Using ab initio variable-composition evolutionary simulations, we explored the entire range of possible stoichiometries for the Mg–O system at pressures up to 850 GPa. In addition to MgO, our calculations find that two extraordinary compounds (MgO2 and Mg3O2) become thermodynamically stable in the regions of high and low oxygen chemical potential at 116 GPa and 500 GPa, respectively. To confirm this and to obtain the most detailed picture, we then focused our search on two separate regions of chemical space: Mg–MgO and MgO–O, respectively. Since the structures in the two regions exhibit different properties, we describe them separately.

3.3.1 MgO2

It is well known that monovalent (H–Cs) and divalent (Be–Ba and Zn–Hg) elements are able to form not only normal oxides but also peroxides and even superoxides [86] (for instance, BaO2 has been well studied at both ambient and high pressure [87, 88]). Our structure prediction calculations identified magnesium peroxide with Pa-3 symmetry and 12 atoms in the unit cell at ambient pressure, which is in good agreement with experimental results [89]. In this cubic phase, Mg is octahedrally coordinated by oxygen atoms (which form O2 dumbbells); see Fig. 17c. However, Pa-3 MgO2 (c-MgO2 from now on) is calculated to have a positive enthalpy of formation from MgO and O2, and is therefore metastable. The calculation shows that, on increasing pressure, c-MgO2 transforms into a tetragonal form with space group I4/mcm (t-MgO2 from now on). In the t-MgO2 phase (Fig. 17d), Mg is eight-coordinate. Here we see the same trend of change from six- to eightfold coordination as in the predicted B1–B2 transition in MgO. However, in MgO2 it happens at a mere 53 GPa, compared to 490 GPa for MgO. Most remarkably, above 116 GPa the t-MgO2 structure has a negative enthalpy of formation from MgO and O2, indicating that t-MgO2 becomes thermodynamically stable. Furthermore, its stability is greatly enhanced by pressure and its enthalpy of formation becomes impressively negative, −0.43 eV/atom, at 500 GPa!

Fig. 17 (a) Convex hull for the MgO–O system at high pressures; (b) the enthalpy of formation of MgO2 as a function of pressure; (c) Pa-3 structure (c-MgO2); (d) I4/mcm structure (t-MgO2)

We also examined the effect of temperature on its stability by performing quasiharmonic free energy calculations using the PHONOPY code [90]. Thermal effects tend to decrease the relative stability of MgO2 by 0.008 meV/(atom·K), which is clearly insufficient to change the sign of the formation free energy (G), and MgO2 remains stable even at extremely high temperatures.
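A back-of-the-envelope estimate shows how small this thermal effect is. The sketch below assumes that the quoted 0.008 meV/(atom·K) can be treated as a constant slope (a linear extrapolation, not a result of the quasiharmonic calculations themselves) and applies it to the formation enthalpy of −0.43 eV/atom quoted above for 500 GPa; the function names are illustrative.

```python
SLOPE_MEV_PER_K = 0.008   # destabilization of MgO2 per kelvin, meV/(atom*K), from the text

def formation_free_energy(dh_ev, temperature_k):
    """Formation free energy (eV/atom) under a constant-slope (linear) assumption."""
    return dh_ev + SLOPE_MEV_PER_K * 1e-3 * temperature_k

def sign_change_temperature(dh_ev):
    """Temperature (K) at which the linearized formation free energy reaches zero."""
    return -dh_ev / (SLOPE_MEV_PER_K * 1e-3)

DH_500GPA = -0.43   # eV/atom, formation enthalpy of t-MgO2 at 500 GPa (from the text)
print(f"dG(5000 K) ~ {formation_free_energy(DH_500GPA, 5000.0):.2f} eV/atom")
print(f"sign change at ~ {sign_change_temperature(DH_500GPA):.0f} K")
```

Under this assumption the formation free energy at 500 GPa would change sign only at several tens of thousands of kelvin. Close to the 116 GPa stability threshold, where the formation enthalpy is near zero, the margin is of course much smaller.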

3.3.2 Mg3O2

For the Mg-rich part of the Mg–O phase diagram (Fig. 18), USPEX shows completely unexpected results. First of all, elemental Mg is predicted to undergo several phase transitions induced by pressure: hcp–bcc–fcc–sh. At ambient conditions, Mg adopts the hcp structure, while bcc-Mg is stable from 50 to 456 GPa, followed by transitions to the fcc and simple hexagonal (sh) phases at 456 GPa and 756 GPa, respectively. These results are in excellent agreement with previous studies [91–93]. Unexpectedly, Mg-rich oxides, such as Mg2O and Mg3O2, begin to show very competitive enthalpies of formation at pressures above 100 GPa. However, they are still not stable against decomposition into Mg and MgO, and their crystal structures can be thought of as a combination of blocks of Mg and B1-MgO. This situation changes qualitatively at 500 GPa, where we find that Mg3O2 becomes thermodynamically stable. This new stable (t-Mg3O2) phase has a very unusual tetragonal structure with the space group P4/mbm. This crystal structure can be viewed as a packing of O atoms and 1D-columns of almost perfect body-centered Mg-cubes. As shown in Fig. 19, there are two types of Mg atoms in the unit cell, Mg1 and Mg2. Here, Mg2 atoms form the cubes, joined into vertical columns and filled by Mg1 atoms.

Fig. 18 (a) Convex hull for the Mg–MgO system at high pressures; (b) the corresponding P–T stability diagram of Mg3O2; (c) ELF isosurfaces of t-Mg3O2 (ELF = 0.83); (d) charge density distribution of t-Mg3O2 viewed along the c-axis showing interstitial charge density maxima

Fig. 19 (a) Crystal structure of t-Mg3O2 at 500 GPa, space group P4/mbm, a = 4.508 Å, c = 2.367 Å, Mg1(0.3494, 0.1506, 0.5); Mg2(0, 0, 0); O(0.8468, 0.6532, 0); (b) 1D-column of body-centered Mg-cubes

Within the cubic columns, one can notice empty (Mg1)2(Mg2)4 clusters with the shape of flattened octahedra, with Mg–Mg distances ranging from 2.08 Å (Mg1–Mg2) to 2.43 Å (Mg2–Mg2). The coordination environments are quite different: each Mg1 is bonded to two Mg1 atoms and eight Mg2 atoms, while each Mg2 atom is bonded to six O atoms (trigonal prismatic coordination) and two O atoms. Oxygen atoms in t-Mg3O2 are coordinated by eight Mg2 atoms.

The ELF distribution in t-Mg3O2 (Fig. 18c) also shows strong charge transfer from Mg to O. However, we surprisingly found a very strong interstitial ELF maximum (ELF = 0.97) located in the center of the Mg-octahedron (Fig. 18d). To obtain more insight we performed Bader analysis. The resulting charges are +1.592e for Mg1, +1.687e for Mg2, −1.817e for O, and −1.311e for the interstitial electron density maximum. Such a strong interstitial electronic accumulation requires an explanation. At high pressure, strong interstitial electron localization was found in some alkali and alkaline-earth elements; for instance, sodium becomes a transparent insulator due to strong core–core orbital overlap [21]. As a measure of the size of the core region we use the Mg2+ ionic radius (0.72 Å [94]), while the size of the valence electronic cloud is represented by the 3s orbital radius (1.28 Å [95]). In Mg3O2, Mg–Mg contacts at 500 GPa (2.08 Å for Mg1–Mg2, 2.37 Å for Mg1–Mg1, and 2.43 Å for Mg2–Mg2) are shorter than the sum of valence orbital radii (2 × 1.28 = 2.56 Å), but longer than the distance at which strong core-valence overlap occurs between neighboring Mg atoms (0.72 + 1.28 = 2.00 Å). Thus, the main reason for strong interstitial electronic localization is the formation of strong multicenter covalent bonds between Mg atoms; the core-valence expulsion (which begins at distances slightly longer than the sum of core and valence radii and increases as the distance decreases) could also play some role in valence electron localization.
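The radius argument above amounts to comparing each Mg–Mg contact with two thresholds. The sketch below simply reproduces that arithmetic with the values quoted in the text; the function name and the labels are illustrative, and the sharp thresholds are a simplification of what is in reality a gradual effect.

```python
CORE_RADIUS = 0.72      # Mg2+ ionic radius, Å (value quoted in the text)
VALENCE_RADIUS = 1.28   # Mg 3s orbital radius, Å (value quoted in the text)

# Mg-Mg contacts in t-Mg3O2 at 500 GPa, Å (from the text)
CONTACTS = {"Mg1-Mg2": 2.08, "Mg1-Mg1": 2.37, "Mg2-Mg2": 2.43}

def overlap_regime(distance):
    """Classify a Mg-Mg contact by which electron shells overlap (simple threshold model)."""
    if distance < CORE_RADIUS + VALENCE_RADIUS:      # below 2.00 Å
        return "core-valence overlap"
    if distance < 2.0 * VALENCE_RADIUS:              # below 2.56 Å
        return "valence-valence overlap"
    return "no significant overlap"

for pair, dist in CONTACTS.items():
    print(f"{pair}: {dist:.2f} Å -> {overlap_regime(dist)}")
```

All three contacts fall between 2.00 Å and 2.56 Å, i.e., in the valence-valence regime, consistent with the interpretation in terms of Mg–Mg covalent bonding rather than dominant core-valence expulsion.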

Strong Mg–Mg covalent bonding is not normally observed; the valence shell of the Mg atom only has a filled 3s2 configuration, unsuitable for strong bonding. Under pressure, the electronic structure of the Mg atom changes (p- and d-levels become significantly populated), and strong covalent bonding can appear as a result of p–d hybridization. There is another way to describe chemical bonding in this unusual compound. We must remember that Mg3O2 is anion-deficient compared with MgO; the extra localized electrons in the interstitial site of the Mg octahedron play the role of anions, screening Mg atoms from each other. These two descriptions are complementary.

3.3.3 Geophysical Implications

What are the implications of these two Mg–O compounds for planetary sciences? The high pressures required for their stability are within the range corresponding to deep planetary interiors. In the interiors of terrestrial planets, reducing conditions dominate, due to the excess of metallic iron. This makes the presence of MgO2 unlikely. However, given the diversity of planetary bodies, it is not impossible to imagine that on some planets strongly oxidized environments can be present at depths corresponding to pressures of 116 GPa and greater (in the Earth this corresponds to depths below ~2,600 km), which would favor the existence of MgO2. At the more usual reducing conditions of planetary interiors, Mg3O2 could exist at pressures above 500 GPa in deep interiors of giant planets. There it can coexist in equilibrium with Fe (but probably not with FeO, according to our DFT and DFT + U calculations of the reaction Fe + 3MgO = FeO + Mg3O2). According to our calculations (Fig. 18b), Mg3O2 can only be stable at temperatures below 1,800 K, which is too cold for deep interiors of giant planets; however, impurities and entropy effects stemming from defects and disorder might extend its stability field to planetary temperatures. The exotic compounds MgO2 and Mg3O2, in addition to their general chemical interest, might be important planet-forming minerals in deep interiors of some planets.

4 Outlook

Evolutionary algorithms, based on physically motivated forms of variation operators and local optimization, are a powerful tool enabling reliable and efficient prediction of stable crystal structures. This method has a wide field of applications in computational materials design (where experiments are time-consuming and expensive) and in studies of matter at extreme conditions (where experiments are very difficult or sometimes beyond the limits of feasibility).
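The ingredients named above (selection, variation operators, and local optimization of every candidate) can be illustrated on a toy problem. The sketch below is emphatically not the USPEX implementation: it minimizes a made-up one-dimensional "energy landscape" with several local minima, and the operators (blending two parents as a stand-in for heredity, a Gaussian displacement as mutation), the population size, and all names are illustrative assumptions.

```python
import math
import random

def energy(x):
    """Toy 1D energy landscape with several local minima (illustrative only)."""
    return 0.1 * x * x + math.sin(3.0 * x)

def relax(x, step=0.01, iters=200):
    """Crude local optimization: steepest descent with a numerical gradient."""
    for _ in range(iters):
        grad = (energy(x + 1e-5) - energy(x - 1e-5)) / 2e-5
        x -= step * grad
    return x

def evolve(pop_size=12, generations=10, seed=0):
    """Return the best candidate found and the best energy of each generation."""
    rng = random.Random(seed)
    # First generation: random candidates, each locally optimized
    pop = [relax(rng.uniform(-10.0, 10.0)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=energy)
        history.append(energy(pop[0]))
        parents = pop[: pop_size // 2]        # selection: keep the fitter half
        children = [pop[0]]                   # elitism: the best survives unchanged
        while len(children) < pop_size:
            if rng.random() < 0.5:            # heredity: blend two parents
                a, b = rng.sample(parents, 2)
                w = rng.random()
                child = w * a + (1.0 - w) * b
            else:                             # mutation: random displacement
                child = rng.choice(parents) + rng.gauss(0.0, 1.0)
            children.append(relax(child))     # every offspring is locally optimized
        pop = children
    pop.sort(key=energy)
    history.append(energy(pop[0]))
    return pop[0], history
```

Because the best candidate is always carried over, the best energy can never get worse from one generation to the next. Local optimization of every candidate is what reduces the search to a comparison of local minima, which is the key to efficiency in real structure searches as well.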

One of the current limitations is the accuracy of today’s ab initio simulations; this is particularly critical for strongly correlated systems and for systems where van der Waals interactions are essential [96] – although for the case of van der Waals bonding good progress has been achieved recently [97, 98]. Note, however, that our method itself does not make any assumptions about the way energies are calculated and can be used in conjunction with any method that is able to provide total energies. Most practical calculations are done at T = 0 K, but temperature can be included as long as the free energy can be calculated efficiently. Difficult cases are aperiodic and disordered systems (for which only the lowest-energy periodic approximants and ordered structures can be predicted at this moment).

We suggest USPEX as the method of choice for crystal structure prediction of systems with up to ~100 atoms/cell, where no information (or just the lattice parameters) is available. Above ~100 atoms/cell, runs become expensive (although still feasible) due to the “curse of dimensionality”, eventually necessitating the use of other ideas within USPEX or another approach. There is, however, hope of enabling structure prediction for very large (>200 atoms/cell) systems. USPEX has been applied to many important problems. Here we highlighted the methodology and some applications in (1) prediction of molecular crystal structures and (2) variable-composition structure predictions. Due to lack of space, we did not describe here the following important advances:

  • Methods to predict structures of nanoparticles [17] and surfaces [99], including variable-cell and variable-composition surface reconstructions.

  • Hybrid optimization approach to optimize physical properties [23, 24] – this technique can be used for practically any physical property, and its variable-composition extension is available in USPEX.

  • Evolutionary metadynamics [7], a powerful hybrid of the evolutionary algorithm USPEX and metadynamics.

One can expect many more applications to follow, both in high-pressure research and in materials design.