1 Introduction

With the flourishing of nanotechnology, transition-metal-related nanocatalysts have attracted the interest of many scientists due to their high activity, high selectivity and long lifetime. Among the transition-metal-containing catalysts, Mo2C is known for its extreme hardness, thermal stability and high reactivity in heterogeneous catalysis. It has catalytic activity in a wide variety of reactions: hydrogenation [1], hydrocracking [2], Fischer–Tropsch [3], ammonia synthesis and decomposition [4], water–gas shift [5], dehydrogenation [6], isomerization [7] and methanation [8]. In ultra-dispersed form, the molybdenum carbide nanoparticles (MCNPs) were found to have equivalent activity to noble metals in hydrogenation reactions [9] and are promising materials for large-scale industrial applications such as the in situ upgrading of oil sands [10].

Although ab initio or density functional theory (DFT) methods are often the first choice for studying many chemical problems, there are still many cases for which semiempirical methods can be attractive [11]. For example, when the system contains so many atoms that structural features, dynamics or entropy play important roles, neglecting these factors may be a more severe error than choosing a method slightly less accurate than ab initio or DFT methods [11]. Also, technical issues related to energy-level crossings during chemical reactions, and the resulting SCF convergence failures in ab initio or DFT methods, make the application of these methods very difficult in practice. Reactions on nanoparticles, such as the hydrogenation and hydrocracking of hydrocarbons on an MCNP in an in situ oil sands emulsion environment, are problems with both of the above-mentioned characteristics. First, the MCNPs are usually larger than 2 nm in size and are amorphous [12]; therefore, a small cluster model of a few tens of atoms may miss the structural (shape of the active sites) and electronic features (electron delocalization over the entire nanoparticle) of the MCNPs. Periodic models, on the other hand, ignore the finite-size effect [13]. Also, the catalytic reactions in industry usually happen at high temperature in reaction environments, which implies potentially important dynamic and entropic effects. Secondly, we have previously tried to use DFT [14] to study the kinetics of benzene hydrogenation on MCNPs using cluster models large enough to accommodate the benzene and all the incoming hydrogen atoms. Due to difficulties in SCF convergence and the \( N^{3} \)–\( N^{4} \) scaling of computer time with respect to the number of orbitals, our DFT attempts were not fully successful.

Density-functional-based tight-binding (DFTB) is an approximate method [15] based on the second-order expansion of the DFT energy functional with respect to the density, and it requires only a small number of parameters. It has proven to be successful in the calculations of geometries [16], dynamics [17], vibrational frequencies [18] and band structures. Although DFTB may be quantitatively less accurate than DFT, it has been shown to give reliable predictions of structure and energies in many chemical and physical problems such as enzyme-catalyzed reactions, semiconductor etching reactions and surface chemistry.

In this work, the DFTB parameters for the elements Mo, C, H, O and Si are developed for the study of the hydrogenation/hydrocracking of hydrocarbons on MCNPs in their working environments. The goal of the work is to develop and validate a new set of DFTB parameters for the five elements in view of their application to these important systems. The C, H and Mo parameters are critical for the catalyzed hydrocarbon reactions and have been tested in terms of electronic structure, geometry, vibrations and reaction energies. Si and O can be used to model the clay particles (SiO2) in the environment and have been tested electronically and structurally. The parameters have recently been applied in an extensive study of the benzene hydrogenation reaction on MCNPs, including a quantum mechanical/molecular mechanical (QM/MM) model that included 100 benzene molecules to model the solvent and used umbrella sampling to calculate the free-energy profiles and bring out important entropic effects [19]. Reference [19] should be read along with the present paper since it represents the ground-breaking application for which the DFTB parameters were developed.

2 Methodology

The total energy of DFTB is the expansion of the DFT energy expression around the reference density to second order [15, 16, 20]:

$$\begin{aligned} & E\left[ \rho \right] = \mathop \sum \limits_{i}^{\text{occ}} n_{i} \left\langle {\psi_{i} \left| { - \frac{1}{2}\nabla^{2} + \int \frac{{\rho_{0} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} } \right)}}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} } \right|}} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} + v_{\text{xc}} \left[ {\rho_{0} } \right]} \right|\psi_{i} } \right\rangle \\ & \quad - \frac{1}{2}\int \int \frac{{\rho_{0} \rho_{0}^{{\prime }} }}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} '} \right|}} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{'} + E_{\text{xc}} \left[ {\rho_{0} } \right] - \int v_{\text{xc}} \left[ {\rho_{0} } \right]\rho_{0} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} + E_{\text{nn}} \\ & \quad + \frac{1}{2}\int \int \left( {\frac{{\delta \rho \delta \rho^{{\prime }} }}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} } \right|}} + \left. {\frac{{\delta^{2} E_{\text{xc}} }}{{\delta \rho \delta \rho^{\prime}}}} \right|_{{\rho_{0} }} } \right){\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} , \\ \end{aligned}$$
(1)

where \( v_{\text{xc}} \) is the exchange–correlation potential, and \( E_{\text{nn}} \) is the nuclear repulsion energy. Practically, in DFTB, we solve KS-like equations:

$$ \left( {\hat{T} + V_{\text{eff}} } \right)\psi_{i} = \varepsilon_{i} \psi_{i} , $$
(2)

where \( V_{\text{eff}} \) is expressed as the superposition of spherical atom potentials calculated from superposed spherical atomic densities. These spherical atoms, however, are under certain confinement potentials to reduce their spatial extension taking into account the chemical environment of the atoms in target systems. Therefore, determining proper confinement potentials and subsequently solving the KS equations of a spherical pseudoatom:

$$ \left( {\hat{T} - \frac{{Z_{A} }}{r} + \int \frac{{\rho_{0}^{A} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} } \right)}}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} } \right|}}{\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} + v_{\text{xc}} \left[ {\rho_{0}^{A} } \right] + V_{\text{conf}} } \right)\phi_{\nu }^{A} = \varepsilon_{\nu } \phi_{\nu }^{A} , $$
(3)

to obtain the confined atomic orbitals are key steps of generating DFTB parameters (in this context, the Slater–Koster files). Afterward, the KS orbitals in Eq. (2) can now be constructed as a linear combination of these atom-centered confined atomic orbitals, considering only valence orbitals:

$$ \psi_{i} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} } \right) = \mathop \sum \limits_{\mu } c_{\mu i} \phi_{\mu }^{A} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {R}_{A} } \right), $$
(4)

where \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {R}_{A} \) is the position of the nucleus on which orbital \( \phi_{\mu }^{A} \) is centered. Assuming a two-center approximation to the \( V_{\text{eff}} \) in the KS-like Eq. (2), three-center contributions in DFT are reduced to two-center contributions that are centered at the same positions as the basis functions:

$$ V_{\text{eff}} \approx V_{\text{eff}}^{A\left( \mu \right)B\left( \mu \right)} $$
(5)

To ensure correct dissociation energies, the diagonal elements of the first-order KS matrix, \( F_{\mu \mu }^{0} \), are not calculated with the \( V_{\text{eff}} \) of the confined pseudoatom, but using that of free spherical atoms. As a result, the Hamiltonian matrix elements in standard DFTB are:

$$ H_{\mu \nu }^{0} = \left\{ {\begin{array}{*{20}l} {\varepsilon_{\mu }^{\text{free}} ,} \hfill &\quad {\mu = \nu ,} \hfill \\ {\left\langle {\phi_{\mu } } \right|\hat{T} + V_{\text{eff}} \left| {\phi_{\nu } } \right\rangle ,} \hfill &\quad {\mu \in A,\nu \in B, A \ne B} \hfill \\ {0,} \hfill &\quad {\text{otherwise}} \hfill \\ \end{array} } \right. $$
(6)

The Hamiltonian matrix elements and the overlap matrix elements are calculated once and stored in the DFTB parameter files. In the self-consistent charge (SCC) version of the method [21], the Hamiltonian matrix elements are corrected by a charge fluctuation term \( H_{{{{\mu \nu }}}}^{1} \):

$$ H_{\mu \nu } = H_{\mu \nu }^{0} + \frac{1}{2}S_{\mu \nu } \mathop \sum \limits_{\xi }^{N} \left( {\gamma_{\mu \xi } + \gamma_{\nu \xi } } \right){{\Delta }}q_{\xi } = H_{\mu \nu }^{0} + H_{\mu \nu }^{1} $$
(7)

where \( {{\Delta }}q_{\xi } \) is the change in orbital population (charge) relative to that of the neutral free atom. In the DFTB total energy expression, Eq. (1), the sum in the first line is the electronic energy term, which can also be written as:

$$ E_{\text{bnd}} = \mathop \sum \limits_{i}^{\text{occ}} n_{i} \left\langle {\psi_{i} \left| {\widehat{H}_{0} } \right|\psi_{i} } \right\rangle , $$
(8)

where \( \widehat{H}_{0} \) depends only on the reference density \( \rho_{0} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} } \right). \) The sum of the second line in Eq. (1) is the repulsive term:

$$ E_{\text{rep}} = - \frac{1}{2}\int \int \frac{{\rho_{0} \rho_{0}^{\prime } }}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{\prime } } \right|}}{\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{\prime } + E_{\text{xc}} \left[ {\rho_{0} } \right] - \int v_{\text{xc}} \left[ {\rho_{0} } \right]\rho_{0} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} + E_{\text{nn}} $$
(9)

This term is somewhat similar in spirit to the exchange–correlation energy in DFT. It collects all the terms that are inefficient to evaluate analytically. It is called the repulsive energy because it contains the nuclear–nuclear repulsion term \( E_{\text{nn}} \) and is, in total, repulsive. However, it also contains the exchange–correlation energy and the electron–electron Coulomb interactions. This term is usually fitted against DFT or other ab initio methods. The fitting of the repulsive potential is usually the most labor-intensive task in DFTB parameterizations [22]. The sum of the third line in Eq. (1) depends both on the reference density and on the density fluctuation and is a second-order term:

$$ E_{{ 2 {\text{nd}}}} = \frac{1}{2}\int \int \left( {\frac{{\delta \rho \delta \rho^{{\prime }} }}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{{\prime }} } \right|}} + \left. {\frac{{\delta^{2} E_{\text{xc}} }}{{\delta \rho \delta \rho^{{\prime }} }}} \right|_{{\rho_{0} }} } \right){\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r} {\text{d}}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {r}^{'} . $$
(10)

The combination of \( E_{\text{bnd}} \)- and \( E_{{2{\text{nd}}}} \)-related parameters is collectively called the electronic part of SCC-DFTB. Technically, Kohn–Sham and overlap matrix elements are typically tabulated using the Slater–Koster (SK) technique [23] and together with the other parameters stored as text files called Slater–Koster (SK) files. Although the self-consistent charge (SCC) correction can improve the quality of the DFTB calculations, it is known to cause convergence failure for transition-metal systems at high temperature, which is the case for most MCNP-catalyzed reactions. So in this work, the repulsive part is created to work with the standard DFTB without the SCC correction.
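As a minimal illustration of the non-SCC scheme adopted here, the sketch below solves the generalized eigenvalue problem \( H^{0} c = \varepsilon S c \) for a hypothetical two-orbital dimer and sums the occupied eigenvalues to obtain \( E_{\text{bnd}} \) of Eq. (8). The matrix elements are invented numbers rather than values from any Slater–Koster file, the function names are assumptions made for this example, and the closed-form 2 × 2 solution only stands in for the general diagonalization that a DFTB code would perform.

```python
import math


def solve_two_orbital(h11, h22, h12, s12):
    """Eigenvalues (ascending) of H c = eps S c for a 2x2 problem.

    H = [[h11, h12], [h12, h22]] and S = [[1, s12], [s12, 1]].
    Setting det(H - eps S) = 0 gives a quadratic in eps.
    """
    a = 1.0 - s12 * s12
    b = -(h11 + h22 - 2.0 * h12 * s12)
    c = h11 * h22 - h12 * h12
    root = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


def band_energy(eigenvalues, occupations):
    """E_bnd as the occupation-weighted sum of orbital energies (Eq. 8)."""
    return sum(n * e for n, e in zip(occupations, eigenvalues))


# Hypothetical matrix elements in hartree: equal on-site energies,
# one off-diagonal Hamiltonian element and one overlap element.
levels = solve_two_orbital(h11=-0.5, h22=-0.5, h12=-0.3, s12=0.4)

# Two electrons fill the lower (bonding) level; the upper level is empty.
e_bnd = band_energy(levels, occupations=[2, 0])
print("orbital energies:", levels)
print("E_bnd:", e_bnd)
```

For equal on-site energies \( \varepsilon_{0} \), the two levels reduce to \( (\varepsilon_{0} \pm H_{12})/(1 \pm S_{12}) \), which gives a quick check on the quadratic solution.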

The most popular DFTB parameter sets are the “mio” set from Elstner et al. [21] and the “matsci” set from Seifert et al. [24]. The traditional way of developing DFTB parameters is to introduce new elements to one of these two sets. In other words, one needs to make the new elements work with each of the elements in the older set. Instead of expanding the older parameter sets using this traditional approach, we took advantage of a recently developed DFTB parameter set (electronic part only) for the whole periodic table [25]. In this development, a semiautomatic parameterization scheme was used to build the electronic part of the DFTB parameters for the periodic table. A confinement potential with two free parameters was used to tighten the Kohn–Sham (KS) orbitals of each element; the parameters for each element were optimized such that this element can work with most other elements, reaching consistency throughout the entire periodic table.

There are two considerations for us to use this new strategy in our current work. First, the mio [21] and matsci [24] sets were developed many years ago and were optimized to deal with systems of special interest. The new DFTB electronic parameters developed recently by Wahiduzzaman et al. [25] considered many different phases and combinations of the elements in the periodic table and reproduce very well the DFT band structures of the target systems. These electronic parts of the DFTB parameters conveniently laid the foundation for bridging the gap between material systems and organic systems in DFTB. Testing these electronic parts and parameterizing the repulsive part is the best strategy for developing a specific DFTB parameter set for treating inorganic material–organic molecule hybrid systems with good accuracy. Secondly, the traditional way of DFTB parameter development requires only the consistency among the (usually only a few) elements of the systems of interest. As a result, there exist many independent parameter sets that work well by themselves, but could not work with one another. This has greatly hindered the application of the DFTB method. Our new strategy makes it more convenient for future extensions of our parameter set to other elements.

3 Computational details

3.1 Validating the electronic part

The electronic part of the DFTB energy comes directly from the DFTB pseudoatoms and parameters reported recently [25]. In this procedure, the confinement potentials are defined by two free parameters, \( r_{0} \) and σ, and optimized to reproduce the band structures of a variety of crystal systems:

$$ V_{\text{conf}} = \left( {\frac{r}{{r_{0} }}} \right)^{\sigma } , $$
(11)

where \( r_{0} \) is the confinement radius, and σ is another free parameter. The valence shells and electronic configurations of the related pseudoatoms are listed in Table 1. For validation of the electronic part, the band structures of the periodic systems are calculated using the same method as in the literature [25] (DFT/PBE/TZP/ZORA in ADF/BAND [26]); molecular systems are calculated with DFT/PBE/DZVP/GEN-A2 in deMon2k [27].
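The effect of the two parameters in Eq. (11) can be seen with a small numerical sketch. The values of \( r_{0} \) and σ used below are hypothetical and are not the optimized values of Table 1; the function name is likewise an assumption, and atomic units are assumed throughout.

```python
def confinement_potential(r, r0, sigma):
    """V_conf(r) = (r / r0) ** sigma, following Eq. (11).

    r and r0 are distances in the same unit (bohr is assumed here);
    the result is then in atomic units of energy.
    """
    if r < 0.0 or r0 <= 0.0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return (r / r0) ** sigma


# Hypothetical parameters: one confinement radius, a soft and a steep exponent.
r0 = 4.0
for sigma in (2.0, 8.0):
    values = [confinement_potential(r, r0, sigma) for r in (2.0, 4.0, 6.0)]
    print("sigma =", sigma, "V_conf at r = 2, 4, 6 bohr:", values)
```

Inside \( r_{0} \) a larger σ leaves the orbital almost unperturbed, while outside \( r_{0} \) it raises the wall much more steeply; at \( r = r_{0} \) the potential equals one for any σ.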

Table 1 Confinement potentials of the atoms Mo, C, H, O and Si

3.2 Building the repulsive part

The repulsive parts were parameterized against the original Perdew, Burke and Ernzerhof (PBE) generalized gradient approximation (GGA) exchange and correlation functional using the deMon2k [27] code. The DZVP [28] basis set was used for all light atoms. Mo atoms were treated with a relativistic model core potential RMCP [29] and the relevant valence basis set. The GEN-A2 auxiliary set [30] was used for the fitting of the Coulombic potential and the exchange–correlation functionals. These settings have been shown to give very accurate energies and structures for Mo-related systems in our previous benchmarks [14]. For periodic DFT calculations, the VASP code was used with the PBE/PAW [31] method. The kinetic energy cutoff was 500 eV, and the Brillouin zone was sampled with 13 × 13 × 13 k-points. DFTB calculations for parameterization were performed with the deMonNano [32] code. For transition-state searches, the nudged elastic band (NEB) method with the DFTB+ code embedded in the Atomic Simulation Environment (ASE) program [33] was used, with a convergence criterion of 0.10 eV/Å. Each chain of states in the NEB consists of 19 images. The Mo38C19 cluster has been used in our previous work [14]. In the present study, the Mo38C19 substrate cluster was fixed, and all other atoms on the surface were allowed to move.

The repulsive potentials are approximated as the superposition of short-ranged diatomic contributions, and they are calculated as the difference between the DFT energy and the binding energy \( E_{\text{bnd}} \) as follows:

$$ E_{\text{rep}} \left[ {R_{\alpha \beta } } \right] = \left\{ {E_{\text{DFT}} \left( {R_{\alpha \beta } } \right) - E_{\text{bnd}} \left( {R_{\alpha \beta } } \right)} \right\}. $$
(12)

They are then fitted to reference calculations as polynomials using the FIT [34] program written in C with the following expression:

$$P\left( x \right) = \mathop \sum \limits_{i = 2}^{n} c_{i - 2} \left( {r_{\text{cut}} - x} \right)^{i} \quad \left( {{\text{if}}\; r_{\text{cut}} < x, P\left( x \right) = 0} \right),$$
(13)

Here, x is the distance between the atoms and \( r_{\text{cut}} \) is the cutoff distance. \( r_{\text{cut}} \) was chosen to minimize the errors in the bonding region, defined as the range of distances whose energies lie within 0.1 hartree of the energy minimum. The region from the cutoff distance to the dissociation limit was not fitted; instead, it was corrected with the London dispersion energy in benchmarking or production calculations. The spacing between grid points was 0.1 bohr; however, in the critical bonding region, a spacing of 0.05 bohr was used to increase the fitting accuracy.
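Equation (13) and the pairwise superposition of the repulsive energy can be sketched in a few lines of Python. This is an illustration only, not the FIT or deMonNano implementation: the function names, the coefficient values and the restriction to a single element pair are assumptions made for the example.

```python
import math


def repulsive_polynomial(x, coeffs, r_cut):
    """P(x) = sum_{i=2}^{n} c_{i-2} (r_cut - x)**i for x < r_cut, else 0 (Eq. 13)."""
    if x >= r_cut:
        return 0.0
    d = r_cut - x
    # coeffs[k] multiplies the power k + 2, so the lowest power is quadratic.
    return sum(c * d ** (k + 2) for k, c in enumerate(coeffs))


def total_repulsive_energy(positions, coeffs, r_cut):
    """Sum the pair potential over all unique atom pairs of one element type."""
    energy = 0.0
    for a in range(len(positions)):
        for b in range(a + 1, len(positions)):
            distance = math.dist(positions[a], positions[b])
            energy += repulsive_polynomial(distance, coeffs, r_cut)
    return energy


# Hypothetical coefficients (hartree / bohr**i) and cutoff (bohr).
coeffs = [0.1, 0.02]
r_cut = 4.0

# Three atoms on a line, 3 bohr apart; the outer pair lies beyond the cutoff.
chain = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
print("E_rep of the chain:", total_repulsive_energy(chain, coeffs, r_cut))
```

Because the lowest power in the sum is quadratic, both the polynomial and its first derivative vanish at \( r_{\text{cut}} \), so the energy and the forces stay continuous across the cutoff.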

To be more specific, a DFT and a DFTB (with only the electronic part) single-point calculation were performed at each of the grid points. This is straightforward for molecular systems. For periodic systems, we took the experimental unit cell constants and the fractional coordinates of the atoms and multiplied them by a contraction coefficient to scale the distances between the atom pairs to the target values. Since deMonNano has only Γ-point periodicity implemented, a 5 × 5 × 5 supercell was used for the DFTB calculation of Mo (fcc) and a 7 × 7 × 4 supercell for Mo2C (hcp). The calculated energies of both methods for the entire region were extracted and fed into the FIT program. The weights of the data points were adjusted such that the region close to the equilibrium distance had higher weight, and the data points far away from the equilibrium distance had lower weight. A polynomial was then generated by the FIT program as the repulsive potential. This repulsive potential has to be tested by adding it to the SK file and reproducing the DFT energy curve. Finally, these polynomials were converted to splines for use in other DFTB programs, as the spline is the only format that is recognized by many DFTB programs.
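The fitting step described above amounts to a weighted linear least-squares problem in the coefficients of Eq. (13), with the target given by Eq. (12). The sketch below shows one way to set this up; it is not the algorithm of the FIT program, which is not described here. The Gaussian weighting, the synthetic energies and all function names are assumptions chosen only so that the example can be checked against known coefficients.

```python
import math


def repulsive_basis(x, r_cut, n_max):
    """Basis functions (r_cut - x)**i for i = 2..n_max; all zero beyond r_cut."""
    d = max(r_cut - x, 0.0)
    return [d ** i for i in range(2, n_max + 1)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_repulsive(distances, e_dft, e_bnd, weights, r_cut, n_max):
    """Weighted least-squares coefficients c_0..c_{n_max-2} of Eq. (13)."""
    m = n_max - 1
    ata = [[0.0] * m for _ in range(m)]
    aty = [0.0] * m
    for x, ed, eb, w in zip(distances, e_dft, e_bnd, weights):
        row = repulsive_basis(x, r_cut, n_max)
        target = ed - eb  # E_rep = E_DFT - E_bnd, Eq. (12)
        for j in range(m):
            aty[j] += w * row[j] * target
            for k in range(m):
                ata[j][k] += w * row[j] * row[k]
    return solve_linear(ata, aty)  # normal equations (A^T W A) c = A^T W y


# Synthetic data: a known polynomial is hidden in "E_DFT" and then recovered.
true_coeffs = [0.05, -0.01, 0.002]
r_cut, r_eq = 4.0, 2.8
grid = [2.0 + 0.1 * k for k in range(20)]
e_bnd = [-0.1 * x for x in grid]  # invented electronic energies
e_dft = [eb + sum(c * f for c, f in zip(true_coeffs, repulsive_basis(x, r_cut, 4)))
         for x, eb in zip(grid, e_bnd)]
# Assumed weighting: points near the equilibrium distance count most.
weights = [math.exp(-((x - r_eq) / 0.5) ** 2) for x in grid]

fitted = fit_repulsive(grid, e_dft, e_bnd, weights, r_cut, n_max=4)
print("fitted coefficients:", fitted)
```

With real reference data the residuals are not zero, and the choice of weights then controls where the fitted curve follows the DFT curve most closely.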

In order to get the best performance of the new DFTB parameters in modeling the in situ oil upgrading, the molecules used for the fitting must be properly chosen so that they have similar interatomic bonding to the systems of interest. The reference molecules used for the fitting of the repulsive potentials are listed in Table 2.

Table 2 The Slater–Koster files and the reference molecules used for the fitting of repulsive potentials

The limitations of the DFTB method imply that it is hardly possible to develop a parameter set that works universally for all systems. We balanced accuracy and transferability in such a way that our parameters are able to predict the common hydrogenation reaction energies close to chemical accuracy. This was done by shifting the \( E_{\text{rep}} \) curves for C–H and C–C simultaneously by values named \( {\text{SHIFT}}_{\text{C–H}} \) and \( {\text{SHIFT}}_{\text{C–C}} \), respectively. These two parameters were then optimized iteratively to get the best hydrogenation energies for DFTB. For the hydrogenation reactions on Mo2C, the accuracy of the energies on the potential energy surface is largely determined by the movement of the H atom on the Mo atoms, in other words, by the Mo–H parameters. These parameters were optimized by using different reference molecules and searching for the best \( {\text{SHIFT}}_{\text{Mo–H}} \) such that DFTB could reproduce the topology of H atoms on the Mo2C surfaces (threefold hollow site, according to DFT calculations [37]) and the dissociative adsorption [37] energies of DFT. In the end, we found that the longer Mo–H bonds in MoH6 (prism, C3v) gave the best performance; the relevant benchmarks are included in the Results section.

4 Results and discussion

4.1 The electronic part

The electronic part of the parameters, as obtained by Wahiduzzaman et al. [25], produces band structures in close agreement with DFT (PBE) reference data. Examples are given in Fig. 1 for Mo (bcc) and MoC (rocksalt); other data can be found in the supporting information of Ref. [25]. For further validation, the band structures of Mo2C (space group Pbcn) were calculated and compared with the band structures obtained with DFT. Both calculations show that Mo2C has a highly metallic nature with no band gaps. DFTB reproduced very well the overlap of the conduction band and the valence band in Mo2C.

Fig. 1 The band structure of selected Mo and molybdenum carbide phases calculated with DFTB and DFT/PBE; all energies are relative to the Fermi energy. a Mo (bcc), b Mo (fcc) and c MoC (rocksalt) are adapted with permission from (J. Chem. Theory Comput. 2013, 9(9), 4006–4017) Copyright (2014) American Chemical Society. d Mo2C (space group Pbcn) is from this work

As far as carbon is concerned, there are two types of atomic environment in the target systems. One is the carbidic carbon in molybdenum carbide, and the other is the organic carbon in the hydrocarbon molecules. Therefore, the transferability of DFTB to deal with both of these cases was tested. The applicability of our parameters to carbidic carbon is manifested in the calculation of MoC and Mo2C band energies in Fig. 1. The extension of the DFTB parameters to organic systems is confirmed by comparing the molecular orbital energies of a variety of hydrocarbon molecules to first-principles calculations (Fig. 2). We use the occupied molecular orbital energies as our validation protocol for the electronic part because, as Eq. (8) shows, the electronic energy (\( E_{\text{bnd}} \)) is the sum of all the occupied molecular orbital energies. In all the cases tested, our new parameters match the DFT PBE/DZVP results better than the published DFTB parameter sets such as mio-SCC and matsci (Dresden-SCC). Interestingly, the SCC correction does little to improve the calculated molecular orbital energies.

Fig. 2 The occupied molecular orbital energies of selected organic molecules: methane, ethene, ethane and benzene calculated from different levels of theories. H-n represents the nth molecular orbital lower in energy than the highest occupied molecular orbital (HOMO). “Dresden-SCC” refers to the matsci SK set. All calculations used the DFT/PBE/DZVP optimized structure

To validate the quality of the electronic part of the Mo–H, Mo–O and Mo–Si parameters, we also calculated the orbital energies of MoH6 (prism, C3v), MoO3 and MoH5SiH3, respectively, using both DFT/PBE and DFTB. As Fig. 3 shows, DFTB could reproduce the trends of the orbital energies of DFT in all three molecules. For MoO3, DFTB overestimates the LUMO orbital energies compared to DFT and SCC-DFTB. The shift of the orbital energies down by ~0.5 hartree can be attributed to the decrease in the electronic repulsion energies due to the omission of multicenter interactions in DFTB. Although the Mo–H bonds in MoH6 should have some ionic nature, it is surprising that the SCC correction does little to improve the performance of DFTB.

Fig. 3 The molecular orbital energies of a MoH6, b MoO3 and c MoH5SiH3 calculated from different levels of theories; H-n represents the nth molecular orbital lower in energy than the HOMO. All calculations used the PBE/RMCP optimized structure

4.2 Structures

Our DFTB parameterization was tested for its accuracy for structures by calculating the bond lengths of hydrocarbon molecules and some other non-hydrocarbon molecules. Compared to the reference bond distance of DFT/PBE, the error for our DFTB is <1.45 %, slightly better than the mio set (2.30 %). For non-hydrocarbon molecules, our new SK set is almost at the same quality as the matsci set. The RMS error of 5.25 % mostly comes from the MoC (3) calculation. This is not a surprise as standard DFTB is not good at treating open-shell systems. Also, the difference of the Mo–C bond between the reference compound and MoC (3) may also contribute to this error. Mo–C parameters are fitted to the Mo2C (hcp) lattice structure where the C atoms are sitting in the Mo octahedral vacancies. However, in MoC (3), the electronic configuration is 10σ211σ242, with much higher bond order (Table 3).

Table 3 Selected bond distances (Å)

To test the quality of the DFTB parameters for the treatment of organic hydrocarbons on molybdenum carbide clusters, the structures of the Mo3C–benzene pair and the Mo2C–ethylene pair were optimized with DFT/PBE and DFTB. Figure 4 shows the DFTB structures, and Tables 4 and 5 compare the bond lengths of DFT, DFTB and SCC-DFTB. For the Mo3C–benzene pair (Table 4), both the distorted benzene ring and the bonds between the benzene and Mo3C are very close for the two methods. However, the two Mo–C bond distances in Mo3C are much closer to each other in DFT (they differ by 0.23 Å) than in DFTB (they differ by 0.51 Å); this is the main discrepancy between the two methods. Also, DFTB tends to overestimate the dihedral angle between the Mo–Mo–Mo plane and the Mo–Mo–C plane. This is more obvious in the structure of the smaller Mo2C–ethylene complex. The DFTB chemisorption bond distance in the Mo2C–ethylene complex differs from DFT by 0.08 Å, while in the Mo3C–benzene complex the average error of the chemisorption bond distance is <0.004 Å (Table 5).

Fig. 4 The optimized structures of the Mo3C–benzene pair (a) and the Mo2C–ethylene pair (b) with DFTB. The H atoms are in white, C in gray and Mo in cyan

Table 4 The optimized bond lengths of the Mo3C–benzene pair with DFT/PBE and DFTB
Table 5 The optimized bond lengths of the Mo2C–ethylene pair with DFT/PBE and DFTB

4.3 Vibrations

For the benzene molecule and the Mo3C–benzene complex, normal mode analyses were performed based on second derivative calculations. The resulting vibrational frequencies are compared with DFT/PBE results in Fig. 5 (the values of the vibrational frequencies can be found in the supporting information). Generally, the DFTB results agree reasonably well with the DFT/PBE results. For benzene, the largest deviations come from the 700- to 850-cm−1 region, which corresponds to the C–C–C out-of-plane modes, and the 1450- to 1650-cm−1 region, which corresponds to the associated C–C and C–H stretching motions. The average error for benzene is 106.6 cm−1. When benzene is bonded to the Mo3C molecule, the out-of-plane modes are restrained, giving better agreement with the DFT results. The new modes that appear in the region below 400 cm−1 correspond to the associated motions of the Mo3C–benzene complex. In the region above 2800 cm−1, the new C–H stretching modes are caused by the symmetry breaking of benzene after binding with Mo3C. The DFTB results agree quite well with DFT in both of these regions. The average error of DFTB for the Mo3C–benzene complex is 96.7 cm−1.
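The link between second derivatives of the energy and harmonic frequencies can be illustrated with a one-dimensional analogue. The sketch below is not the multidimensional normal mode analysis used in this work, which diagonalizes a mass-weighted Hessian; it treats a single bond-stretch coordinate, and the model potential, force constant and reduced mass are hypothetical values chosen for the example.

```python
import math

HARTREE_TO_CM1 = 219474.63  # one hartree expressed as a wavenumber (cm^-1)
AMU_TO_ME = 1822.888486     # one atomic mass unit in electron masses


def second_derivative(energy, x0, h=1.0e-3):
    """Central finite-difference estimate of d2E/dx2 at x0."""
    return (energy(x0 + h) - 2.0 * energy(x0) + energy(x0 - h)) / (h * h)


def harmonic_wavenumber(force_constant, reduced_mass_amu):
    """Harmonic wavenumber in cm^-1 from k (hartree/bohr^2) and mu (amu)."""
    omega_au = math.sqrt(force_constant / (reduced_mass_amu * AMU_TO_ME))
    return omega_au * HARTREE_TO_CM1


# Hypothetical model bond: harmonic well with k_model around x_e (atomic units).
k_model, x_e = 0.37, 1.4


def model_energy(x):
    return 0.5 * k_model * (x - x_e) ** 2


k_numeric = second_derivative(model_energy, x_e)
print("force constant from finite differences:", k_numeric)
print("harmonic wavenumber (cm^-1):", harmonic_wavenumber(k_numeric, 0.504))
```

The same finite-difference idea extends to many atoms by displacing each Cartesian coordinate in turn, which is where the cost of a full normal mode analysis comes from.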

Fig. 5 Harmonic vibrational frequencies of the Mo3C–benzene complex and the benzene molecule with DFTB and DFT/PBE. The geometries of the molecules have been optimized at the respective levels of theory prior to the normal mode analysis

4.4 Hydrogen dissociative adsorption on molybdenum carbide

The quality of the Mo–H parameters is tested in describing the adsorption of H2 molecules on the Mo-terminated (0001) surface of a Mo38C19 cluster [14] cut from α-Mo2C (space group: Pbcn). Periodic DFT calculations [37] have shown that H2 adsorbs dissociatively on the threefold hollow sites of the (0001) surface of Mo2C; among the Hm (a Mo atom under the Mo–Mo–Mo hollow site), Hc (a C atom under the Mo–Mo–Mo hollow site) and Vc (a vacancy under the Mo–Mo–Mo hollow site) sites, H atoms occupy the Hm and Vc sites. Here, two H atoms are used for the adsorption in order to keep the system in a closed-shell state. We consider all three combinations of the Hm and Vc sites: Vc–Vc, Vc–Hm and Hm–Hm (Fig. 6). The adsorption energies of the three adsorption configurations are compared to DFT in Table 6. The average error in the adsorption energy is only 0.9 kcal/mol, although DFTB does not exactly follow the DFT trend of site preferences. The latter finding is not a surprise because the second-layer atoms, which define the local environment of the sites, are far away from the H atoms. Since the DFTB atom-centered orbitals are known to decay rapidly from the center of the atom, long-range interactions are essentially handled by the UFF-type dispersion term [38]. The PBE functional in DFT, however, contains enough correlation [38] in this range to be sensitive to the electron density difference in the second layer. The correction of PBE by the damped dispersion term [39] may also contribute to the difference between DFT and DFTB. As such, the accuracy of the adsorption energies may be affected if dispersion corrections are accounted for in a different way.

Fig. 6 The three dissociative adsorption configurations of H2 on Mo38C19: Vc–Vc sites, Vc–Hm sites and Hm–Hm sites calculated from DFTB. The Mo38C19 substrate cluster is fixed, and all other atoms are allowed to move

Table 6 The dissociative adsorption energies (kcal/mol) of H2 on the Mo-terminated (0001) surface of a Mo38C19 cluster

4.5 Hydrogenation/hydrocracking reaction energies and the transition states

As this parameter set has been developed mainly to study the hydrogenation and hydrocracking reactions of hydrocarbons, its accuracy in calculating the reaction energies of various hydrogenation/hydrocracking reactions has been tested further. We selected 10 reactions (Table 7), including the hydrogenation of C–C triple bonds (#1) and typical double bonds (#2 and #9), the successive hydrogenation of an aromatic ring (#3, #4 and #5), and the hydrocracking of C–C single bonds in aliphatic (#6, #7 and #10) and cyclic molecules (#8). In most cases, DFTB energies are close to the corresponding DFT/PBE energies, with errors of ~2 kcal/mol. One exception is the hydrogenation of acetylene to ethylene, for which DFT/PBE overestimates the exothermicity of the reaction by ~10 kcal/mol, but DFTB gives reasonable results. For the hydrocracking of ethane and cyclohexane, DFTB has errors of ~5 kcal/mol compared to DFT/PBE (Table 7).
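The bookkeeping behind such a benchmark is simple and can be sketched as follows. The total energies in the example are placeholder numbers invented for illustration, not results from this work, and the function names are assumptions; only the definition of the reaction energy and the hartree to kcal/mol conversion are standard.

```python
HARTREE_TO_KCAL = 627.509  # kcal/mol per hartree


def reaction_energy(total_energies, reactants, products):
    """Reaction energy in kcal/mol: sum E(products) - sum E(reactants).

    total_energies maps species to total energy in hartree; reactants and
    products map species to stoichiometric coefficients.
    """
    e_products = sum(n * total_energies[s] for s, n in products.items())
    e_reactants = sum(n * total_energies[s] for s, n in reactants.items())
    return (e_products - e_reactants) * HARTREE_TO_KCAL


def mean_absolute_error(values, references):
    """Average of |value - reference| over paired entries."""
    return sum(abs(v - r) for v, r in zip(values, references)) / len(values)


# Placeholder total energies in hartree (invented, not computed results).
energies = {"C2H4": -78.50, "H2": -1.16, "C2H6": -79.72}

# Ethylene hydrogenation: C2H4 + H2 -> C2H6
delta_e = reaction_energy(energies,
                          reactants={"C2H4": 1, "H2": 1},
                          products={"C2H6": 1})
print("reaction energy (kcal/mol):", delta_e)
```

Applying `mean_absolute_error` to the DFTB and DFT/PBE columns of a table of reaction energies gives a single figure of merit of the kind quoted in the text.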

Table 7 \( \Delta_{R} E \) of selected hydrogenation reactions (in kcal/mol)

The ability of the DFTB parameters to describe the transition states of hydrogenation reactions on molybdenum carbide clusters is verified by comparing the transition states of the benzene hydrogenation reaction on a fixed Mo38C19 cluster with DFT results. As Fig. 7 shows, our DFTB parameters can reproduce the reaction energies of the benzene hydrogenation reaction, as well as the reaction coordinates and the energy barriers of both hydrogenation and dehydrogenation reactions. As mentioned, further application of the DFTB parameters to the hydrogenation reactions on more realistic models of molybdenum carbide nanoparticles has been reported elsewhere (see Ref. [19]), using a biased molecular dynamics protocol.

Fig. 7 The reaction paths of benzene hydrogenation on a Mo38C19 MCNP calculated by DFT (transition-state search) and DFTB (nudged elastic band). The Mo38C19 substrate cluster is fixed, and all other atoms are allowed to move. “Dispersion” indicates that the UFF dispersion coefficients are included in the DFT and DFTB calculations. The reaction coordinate is defined as the distance between the incoming H atom and the target C atom on the benzene ring

5 Conclusion

In this work, DFTB parameters for the elements Mo, C and H were developed to model catalytic hydrogenation/hydrocracking of hydrocarbons on MCNPs in hydrocarbon and clay-containing environments. This is a DFTB parameter set that is developed for hybrid inorganic material–organic systems. It takes advantage of the electronic part developed by a semiautomated parameterization scheme for DFTB for the periodic table. Comparison between DFT and DFTB of the band structures of various phases of Mo and molybdenum carbide and the molecular orbital energies of hydrocarbons proved the accuracy of the new DFTB parameter set in electronic structure calculations. The repulsive potentials were fitted to polynomials using appropriate reference molecules. Our parameterization was tested for a set of relevant molecules and shows an accuracy of <1.45 % error in bond distances for hydrocarbons and 4.86 % for non-hydrocarbons. It could reproduce the structure and vibrational frequencies of the tested hydrocarbon–molybdenum carbide complex from DFT calculations, as well as the dissociative adsorption of H2 on the molybdenum carbide surface. The energy benchmarking shows that the parameters are of good accuracy for the hydrogenation/hydrocracking reaction energies and transition states of the selected hydrogenation reactions. Of course, as with all semiempirical parameterizations, transferability is limited to systems and properties that are “similar” to the training set. Also, the errors in distances and frequencies limit the predictive capability of the method. We think that our parameters, which cover all of the above properties of molecular and meaningfully large clusters, including the activation energy, represent the state of the art, as does the application to molybdenum carbide nanocatalysts under working conditions reported in Ref. [19].