1 Introduction

Artificial intelligence (AI) has already demonstrated its power and efficiency through numerous examples. In the natural sciences, AI is improving research efficiency so rapidly that the change has been compared to a fourth wave of scientific methodology. One area where AI algorithms have been introduced and are making a significant impact is computational materials science [1,2,3,4,5]. This field, which primarily simulates the properties and behavior of materials, is gaining attention across various industries, including semiconductors, batteries, light-emitting devices, chemistry, drug development, catalysts, and solar cells [6,7,8,9,10,11,12,13,14].

Density functional theory (DFT) [15, 16], proposed in the 1960s, offers remarkably high accuracy and has become a major method in the field of computational science. However, this accuracy comes at the cost of large computational resources [10, 17, 18]. Recently, AI techniques have been rapidly adopted to address this cost and have achieved considerable results, although they still fall short of DFT accuracy [19].

A fundamental and central task in computational materials science is to calculate the total energy of a given atomic structure. A machine learning potential is a total-energy regression AI model that predicts each atom's contribution to the total energy and sums these contributions [20,21,22,23,24]. This structure mirrors that of the empirical potential, a traditional method for calculating total energy.

We propose a new architecture that considers both atoms and bonds. This approach is physically more intuitive, since the total energy of a material comprises both the energies of the atoms themselves and the energies of their chemical bonds. It is also more natural and closer to how DFT calculations work, and an AI model whose architecture resembles the underlying calculation can be expected to learn more easily. The effect is especially pronounced for non-metallic materials with well-defined chemical bonds, as opposed to metallic materials whose bonds are ambiguous. Any AI model that interprets an atomic structure as a graph and regresses the total energy can adopt this approach and improve its performance.

2 Background and Base Model

Machine Learning Potential. In 2007, Behler and Parrinello first proposed an AI architecture for regressing total energy from atomic positions [20]. First-principles calculations based on density functional theory are among the most widely used methodologies in computational materials science and can compute a wide range of material properties. The total energy is one of the most basic and essential physical quantities in such calculations. It is a function of the atomic structure, i.e., the types and arrangement of the atoms that make up an arbitrary material. For example, an atomic structure containing N atoms is expressed by the atomic numbers \(\textbf{Z}=\{Z_1, Z_2, Z_3, ..., Z_N\}\) and the atomic positions \(\textbf{R}=\{\textbf{r}_1, \textbf{r}_2, \textbf{r}_3, ..., \textbf{r}_N\}\), where each \(\textbf{r}_i\) is a Cartesian coordinate \(\textbf{r}_i=\{x_i, y_i, z_i\}\). Behler and Parrinello adopted the decomposition used by traditional empirical potentials, according to which the total energy \(E_{total}\) is

$$\begin{aligned} E_{total}=\sum _{i}E_{i} \end{aligned}$$
(1)

where \(E_i\) is the contribution of the \(i\)-th atom to the total energy. The architecture of a machine learning potential is shown in Fig. 1; almost all machine learning potentials published so far share this structure. In Fig. 1, \(G_i^{\varphi }\) is an input vector of size \(\varphi \) obtained by transforming the atomic environment with symmetry functions.
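
To make Eq. (1) concrete, the following is a minimal PyTorch sketch of this per-atom architecture. It is only an illustration of the decomposition, not the implementation from [20]; the class name AtomicEnergyModel, the layer sizes, and the activation are our own assumptions.

```python
import torch
import torch.nn as nn

class AtomicEnergyModel(nn.Module):
    """Minimal sketch of a Behler-Parrinello-style potential: one small
    network maps each atom's descriptor G_i to a per-atom energy E_i;
    summing over atoms gives E_total, as in Eq. (1)."""

    def __init__(self, descriptor_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Illustrative network; real models use element-specific networks.
        self.atom_net = nn.Sequential(
            nn.Linear(descriptor_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, g: torch.Tensor) -> torch.Tensor:
        # g: (N, descriptor_dim) symmetry-function descriptors, one row per atom
        e_i = self.atom_net(g).squeeze(-1)  # per-atom contributions E_i, shape (N,)
        return e_i.sum()                    # E_total = sum_i E_i

# Toy usage: 5 atoms with 8-dimensional descriptors
model = AtomicEnergyModel(descriptor_dim=8)
e_total = model(torch.randn(5, 8))
```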

Fig. 1. Basic architecture of a machine learning potential [20]. The gray and pink boxes represent the descriptor and the main network, respectively. Black and dotted arrows indicate the flow of data. The circle represents an operation that sums its input values. The data shape at each step is indicated in blue. (Color figure online)

Base Model. The model proposed in this study is based on GemNet-OC [25] and uses a similar architecture. GemNet-OC is a graph neural network (GNN) that represents an atomic system as a graph G = (V, E), where the node set V represents the atoms and the edge set E contains all pairs of atoms within a certain cutoff distance. The first GNN-like model was proposed in 1997, but GNNs gained popularity only after several works demonstrated their potential for a wide range of graph-related tasks [26]. GemNet-OC builds on Geometric Message Passing Neural Networks (GemNet) [27]: it evolves the two-level message passing scheme proposed in MXMNet into an interaction layer, utilizes both edge and node embeddings, and improves the accuracy of the predicted atomic forces.
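
As an illustration of this graph construction, the sketch below builds the edge set E from Cartesian coordinates with a radial cutoff. It is a simplification under stated assumptions: the function name build_edges is our own, and periodic boundary conditions, which real surface calculations require, are ignored.

```python
import torch

def build_edges(positions: torch.Tensor, cutoff: float):
    """Build the edge set E of an atomic graph G = (V, E): every ordered
    pair of distinct atoms closer than `cutoff` becomes an edge.
    Note: ignores periodic boundary conditions for simplicity."""
    dist = torch.cdist(positions, positions)   # (N, N) pairwise distances
    mask = (dist < cutoff) & (dist > 0.0)      # exclude self-pairs
    src, dst = mask.nonzero(as_tuple=True)     # source/target atom indices
    return torch.stack([src, dst]), dist[src, dst]

# Toy usage: 4 atoms (Cartesian coordinates in angstrom), 3.0 A cutoff
pos = torch.tensor([[0.0, 0.0, 0.0],
                    [1.5, 0.0, 0.0],
                    [0.0, 1.6, 0.0],
                    [5.0, 5.0, 5.0]])          # last atom is isolated
edge_index, edge_dist = build_edges(pos, cutoff=3.0)
```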

3 Datasets

We used the Open Catalyst 2022 (OC22) dataset for model training [19]. The OC22 dataset is designed to enable the development of generalizable machine learning (ML) models for catalysis, particularly for the oxygen and hydrogen evolution reactions and oxide electrocatalysis. It consists of oxide surface structures, both bare and with adsorbed molecules, and includes defects such as atomic substitutions and vacancies. By providing a diverse and representative training set, OC22 aims to support the development of generalized models that can accurately predict catalytic reactions on oxide surfaces. Models trained on this dataset are expected to accelerate the discovery and design of new catalysts for a wide range of applications.

The primary task of OC22 is to regress the total energy, obtained through DFT-based first-principles calculations, from the atomic structure. The dataset is divided into training/validation/test sets, and each set includes both bare surface structures and surface structures with adsorbed molecules. In total, the dataset contains 19,142 bare surface structures and 43,189 adsorbate-surface structures, amounting to 9,854,504 data points. Diversity in surface and adsorption structures was prioritized during construction so that a generalized model can be built.

4 Edge Based Architecture of GemNet-OC

GemNet-OC is a graph neural network (GNN) that represents atomic systems as graphs and, starting from GemNet as a base model, improves the architecture around edge embeddings. In this architecture, nodes and edges are embedded separately, and the edge embeddings are used to regress the forces acting on the atoms. For the total energy, however, an architecture similar to the empirical potential is used, as shown in Eq. (1) and Fig. 1.

However, the total energy can also be described in terms of thermostatistics. Specifically, it can be expressed as

$$\begin{aligned} E_{total}=\varOmega +\sum _{i}\mu _{i} \end{aligned}$$
(2)

where \(\varOmega \) is the formation energy, representing the chemical interactions between atoms, and \(\mu _{i}\) is the energy of a single atom in the thermostatistical sense. The formation energy can be further expressed as \(\varOmega =\sum _{m}e_m\), the sum of the binding energies \(e_m\) of atomic pairs. These binding energies are mapped by the main network from the edges of the atomic structure graph.

This structure has several advantages. First, \(\mu _{i}\) is less sensitive to the placement of atoms than \(E_{i}\), making the model easier to train. Second, the total energy is regressed naturally through the formation energy, computed as the sum of the binding energies. To reflect these formulas, the architecture was modified so that energies are also mapped to edges, as sketched below. The modified architecture is shown in Fig. 2.
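
The following PyTorch sketch illustrates this edge-based readout of Eq. (2). It is a schematic of the idea, not the actual GemNet-EB implementation; the class name NodeEdgeEnergyHead, the single linear heads, and the embedding dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NodeEdgeEnergyHead(nn.Module):
    """Sketch of an edge-based energy readout (Eq. 2): a node head maps
    each atom embedding to mu_i, an edge head maps each edge embedding to
    a binding energy e_m, and E_total = sum_m e_m + sum_i mu_i."""

    def __init__(self, node_dim: int, edge_dim: int):
        super().__init__()
        self.node_head = nn.Linear(node_dim, 1)  # atom embedding -> mu_i
        self.edge_head = nn.Linear(edge_dim, 1)  # edge embedding -> e_m

    def forward(self, node_emb: torch.Tensor, edge_emb: torch.Tensor) -> torch.Tensor:
        mu = self.node_head(node_emb).squeeze(-1)  # (N,) per-atom terms
        e = self.edge_head(edge_emb).squeeze(-1)   # (M,) per-bond terms
        return e.sum() + mu.sum()                  # Omega + sum_i mu_i

# Toy usage: 5 atoms with 128-dim embeddings, 12 edges with 64-dim embeddings
head = NodeEdgeEnergyHead(node_dim=128, edge_dim=64)
e_total = head(torch.randn(5, 128), torch.randn(12, 64))
```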

Fig. 2. Main network of the GemNet-EB architecture. Changes are highlighted in orange. \(\square \) denotes the layer's input, || concatenation, and \(\sigma \) a non-linearity. The message passing block and embedding block have the same architecture as in GemNet-OC [25]. (Color figure online)

5 Results

Due to limited computational resources, only 200,000 training data points (1/40 of the full OC22 dataset) were used, and training ran for 9 epochs. However, since the same dataset was used for all compared models and 200,000 is still a large number of data points, the comparison remains meaningful. Training and validation errors decreased in parallel for all models, indicating neither underfitting nor overfitting (Fig. 3). The inset of Fig. 3b shows that the validation error of GemNet-EB (red) is about 3.9% lower than that of the base model GemNet-OC (green).

Fig. 3. Energy mean absolute error (MAE) on the training set (a) and validation set (b). The x-axis is the epoch and the y-axis is the MAE on a log scale. The inset in (b) zooms in on the red box at the bottom right. (Color figure online)

6 Conclusion

We proposed a new and improved architecture for regressing the total energy from atomic structures. Our newly proposed GemNet-EB model achieved a 3.9% lower validation error than the base model. Further experiments are certainly possible, but more extensive tests were beyond our computational resources and are left to future work. This study is nevertheless significant in that we applied the new formulation of the total energy in Eq. (2) and achieved a lower validation error with it. The large gap in error between the other models and the GemNet base model indicates that the edge embedding of GemNet-OC already plays an important role in total energy regression. By directly mapping bond energies from the edge embeddings, we further improved the accuracy of the model: while edge embeddings are reflected indirectly in the node embeddings through the interaction block, connecting them directly to the total energy regression improved performance.

However, considering that the DFT calculations used in surface structure and adsorption energy studies are accurate to better than 0.01 eV, there is still room for improvement. Since our method is not tied to any specific model or architecture and can be applied to any GNN-based model for atomic structures, the proposed architecture can serve as a foundation for advancing machine learning potentials.