1 Introduction

During the last decade, correlating contracted (C) Gaussian-type function (GTF) sets in segmented form have been developed for the 103 atoms H through Lr of the periodic table [1–12]. They were determined to represent the space spanned by important accurate atomic natural orbitals (NOs) generated from large-scale configuration interaction (CI) calculations of atoms and were designed to incorporate significant contributions to the electron correlation efficiently. We constructed non-relativistic correlating sets for the atoms up to Xe and relativistic sets for the heavier atoms in which the relativistic effects are considered through the third-order Douglas–Kroll–Hess (DK) approximation [13, 14]. In spite of their compactness, these correlating sets, named Natural Orbital based Segmented Contracted Gaussian (NOSeC) basis sets, show high performance in molecular calculations when used together with literature segmented Hartree–Fock (HF) GTF sets. The NOSeC sets combined with Tatewaki–Koga (TK) HF CGTF sets [15–19], named (DK)-TK/NOSeC-(C)V-nZP (n = D, T, Q), are available through the web site http://setani.sci.hokudai.ac.jp/sapporo/.

In this paper, we construct core–valence correlating basis sets for the second to fifth period p-block elements and the first and second d-block elements. In these atoms, especially in heavy atoms, the core–valence correlation effects as well as the relativistic effects are important to obtain reliable spectroscopic constants. In the InI molecule, for example, the coupled-cluster singles and doubles with non-iterative triples correction (CCSD(T)) calculation with DK-TK/NOSeC-V-QZP gave a deviation of 0.07 Å in the bond length from the experimental value, which is not sufficiently small and is expected to be reduced by considering the core–valence correlation. The core–valence correlation effects are taken into account in the cc-pCVnZ basis sets for the second and third period atoms [20, 21] and for the d-block atoms of the fourth and fifth periods [22, 23]. For the s-block atoms from Li to Ca, Iron et al. [24] proposed core–valence cc-type basis sets. For the p-block atoms, the ANO basis sets were developed for the second to sixth period elements, where the relativistic effects were considered through the DK approximation [25]. Dyall also constructed relativistic core–valence sets for the fourth to sixth period elements, where the DZP, TZP, and QZP basis sets were optimized in Dirac–Hartree–Fock calculations [26–28]. In this paper, we consider the n′s, n′p, and n′d subshells as the core for p-block elements, and the n′s and n′p subshells for d-block elements, where n′ stands for the principal quantum number of the second outermost shell. For d-block atoms, the (DK)-TK/NOSeC-CV-nZP sets do not consider the correlation effects of the n′s and n′p subshells, which are important in the excitations of the n′d electrons. Although for s-block atoms the (DK)-TK/NOSeC-CV-nZP sets include the correlation effects among electrons in the n′s and n′p subshells, we decided to refine them by eliminating some similar primitive Gaussians in CGTFs, which may cause linear dependency problems.

The next section outlines our computational procedures and presents the new basis sets, named Sapporo-(DK)-nZP. Atomic tests and molecular applications to 24 diatomics are given in Sects. 3 and 4, respectively. The following symbols are used throughout this paper: [ ] for CGTFs, ( ) for primitive Gaussians, and { } for contraction patterns of CGTFs, where powers imply repetition of the same size.

2 Method

2.1 Standard accurate NO sets

In order to obtain accurate NOs, we performed CI calculations using the well-tempered primitive sets of Huzinaga et al. [29, 30], extended by adding functions of higher azimuthal quantum number (l) as follows: (12s8p8d8f8g) for the second period atoms, (23s16p9d9f9g) for the third period atoms, (26s20p16d9f9g9h) for the fourth period atoms, and (28s23p20d9f9g9h) for the fifth period atoms. For the p-block atoms, we performed separate CI calculations for core and valence electrons. In the core CI calculations, we considered the correlation of the 1s electrons for Li–Ne, of the 2s and 2p electrons for Na–Ar, and of the n′s, n′p, and n′d electrons for K–Kr (n′ = 3) and Rb–Xe (n′ = 4). In the valence CI calculations, we only correlated the (n′ + 1)s and (n′ + 1)p electrons. For the d-block atoms, on the other hand, we took into account the correlation among the n′s, n′p, n′d, and (n′ + 1)s electrons simultaneously. Non-relativistic CI calculations were performed for the second to fifth period atoms, and relativistic CI calculations, which consider scalar relativistic effects through the DK Hamiltonian [14] with a Gaussian nucleus model [31], were performed for the fourth and fifth period atoms. In all atomic calculations, we used the ATOMCI program package [32].
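As a reading aid for the primitive-set notation above, the following minimal sketch parses a label such as (26s20p16d9f9g9h) into the number of radial functions per l. The function names are hypothetical, and the count of 2l + 1 spherical components per shell is an assumption made here only to illustrate the overall size of each set.

```python
import re

# Angular momentum letters in order of increasing l (s = 0, p = 1, ...).
L_LETTERS = "spdfghi"

def parse_shell_counts(label):
    """Parse a label such as '(26s20p16d9f9g9h)' into a dict {l: count}."""
    body = label.strip().strip("()[]")
    counts = {}
    for number, letter in re.findall(r"(\d+)([a-z])", body):
        counts[L_LETTERS.index(letter)] = int(number)
    return counts

def n_spherical(label):
    """Total number of functions, assuming 2l+1 spherical components per shell."""
    return sum(n * (2 * l + 1) for l, n in parse_shell_counts(label).items())

# Primitive sets quoted in the text for the second to fifth period atoms.
PRIMITIVE_SETS = {
    2: "(12s8p8d8f8g)",
    3: "(23s16p9d9f9g)",
    4: "(26s20p16d9f9g9h)",
    5: "(28s23p20d9f9g9h)",
}

for period, label in PRIMITIVE_SETS.items():
    print(period, label, parse_shell_counts(label), n_spherical(label))
```

The same parser also accepts the square-bracket CGTF labels used later, for example [6s5p3d1f].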

Standard nZP NO sets were constructed by truncating the accurate NOs generated from the CI wave functions. For the size of nZP set, we follow the definition of standard cc-pVnZ basis sets for each subshell. In the Ga atom, for example, we consider 3s, 3p, and 3d as core subshells, and 4s and 4p as valence subshells. We first define a minimal set of [4s3p1d] for occupied atomic orbitals. In the DZP set, [1s1p1d1f] and [1s1p1d] are considered as correlating sets of core and valence shells, respectively. Then, the total size of DZP is [6s5p3d1f]. Similarly, we define the standard sizes of TZP and QZP to be [8s7p5d3f1g] and [10s9p7d5f3g1h], respectively.
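The size counting in the Ga example can be sketched in a few lines. The rule coded below is an assumption inferred from the sizes quoted above and from the usual cc-pVnZ pattern, not a statement of how the sets were actually generated: for a group of subshells whose highest occupied angular momentum is l_max, the correlating set has k functions for every l up to l_max + 1 and one fewer for each higher l, with k = 1, 2, 3 for n = D, T, Q. All function names are hypothetical.

```python
L_LETTERS = "spdfghi"
LEVEL = {"D": 1, "T": 2, "Q": 3}

def correlating_set(l_max_occ, n):
    """Assumed cc-pVnZ-style correlating set for subshells with highest occupied l = l_max_occ.

    Returns {l: count}: k functions for l <= l_max_occ + 1, then k-1, k-2, ... down to 1.
    """
    k = LEVEL[n]
    counts = {}
    for l in range(l_max_occ + 1 + k):
        counts[l] = k if l <= l_max_occ + 1 else k - (l - l_max_occ - 1)
    return counts

def total_size(minimal, l_core, l_val, n):
    """Add the core and valence correlating sets to the minimal set."""
    total = dict(minimal)
    for l_max in (l_core, l_val):
        for l, c in correlating_set(l_max, n).items():
            total[l] = total.get(l, 0) + c
    return total

def fmt(counts):
    """Format {l: count} in the bracket notation used for CGTFs."""
    return "[" + "".join(f"{counts[l]}{L_LETTERS[l]}" for l in sorted(counts)) + "]"

# Ga: minimal set [4s3p1d]; core 3s3p3d (l_max = 2), valence 4s4p (l_max = 1).
ga_minimal = {0: 4, 1: 3, 2: 1}
for n in "DTQ":
    print(n + "ZP", fmt(total_size(ga_minimal, 2, 1, n)))
```

With these assumptions the sketch reproduces the three standard sizes quoted for Ga, [6s5p3d1f], [8s7p5d3f1g], and [10s9p7d5f3g1h].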

In Fig. 1, we show the core correlation energies of the DZP, TZP, and QZP sets for the p-block atoms. In the second period atoms, the DZP sets yield about 75% of the total core correlation energies by the full NOs. In the third period atoms, we find the core correlation energy is almost saturated at the TZP set. Thus, we construct the DZP core correlating sets for the second period p-block atoms, and the DZP and TZP core correlating sets for the third period p-block atoms. For the fourth and fifth period atoms, on the other hand, the core correlation energies are considerably larger than those of the second and third period atoms, and the QZP set gives nontrivial improvement over the TZP set. Thus, we generate all the DZP, TZP, and QZP core correlating sets for the fourth and fifth period p-block atoms.

Fig. 1 Core correlation energies of p-block atoms

2.2 Determination of core correlating basis functions

In the previous papers [1–12], we reported general-purpose correlating basis sets, which are intended to be combined with an arbitrary HF basis set. In this work, however, we develop core correlating functions to complement particular segmented basis sets, which contain occupied and valence-correlating CGTFs. The procedure is summarized as follows:

  1. Start from the minimal-type HF sets of Tatewaki–Koga's non-relativistic sets [15–18] or the relativistic DK sets constructed in the previous work [19]. Decontract the occupied n′ and n′ + 1 subshell orbitals of the minimal-type HF functions. Add the valence-correlating basis sets NOSeC-V-nZP for the p- and d-block atoms, and the core–valence-correlating sets NOSeC-CV-nZP for the s-block atoms (n = D, T, Q).

  2. Optimize higher l primitive GTFs, which describe the correlation effects among core electrons, to represent the standard accurate NO sets mentioned in the previous subsection. If the numbers of decontracted s and p functions fall short of the standard size of the nZP set, extra s- or p-type primitive GTFs are supplemented and optimized.

  3. Check the combined basis set from steps 1 and 2 so that analogous primitives do not appear in different CGTFs.

The procedure is performed independently for each l in the determination of the present basis sets. For the required optimization, we use the same procedure as in our previous work [1], where the contraction coefficients and exponents were optimized as nonlinear parameters using the conjugate directions algorithm [33].
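The check in step 3 can be illustrated with a minimal sketch. The criterion used here, a relative difference between exponents below a tolerance rel_tol, is a hypothetical choice made for illustration; the text does not specify how "analogous" primitives were identified. The exponents in the example are invented and are not taken from any Sapporo set.

```python
def find_analogous_primitives(cgtfs, rel_tol=0.05):
    """Return pairs of nearly equal exponents that appear in different CGTFs.

    cgtfs:   list of exponent lists, one list per contracted function of a given l.
    rel_tol: hypothetical threshold on the relative difference of two exponents.
    Each hit is a tuple (i, j, a, b): exponent a of CGTF i is close to exponent b of CGTF j.
    """
    hits = []
    for i in range(len(cgtfs)):
        for j in range(i + 1, len(cgtfs)):
            for a in cgtfs[i]:
                for b in cgtfs[j]:
                    if abs(a - b) <= rel_tol * max(a, b):
                        hits.append((i, j, a, b))
    return hits

# Made-up s-type exponents for illustration only.
s_cgtfs = [[1200.0, 180.0, 41.0], [11.5, 3.6], [3.5], [0.45]]

for i, j, a, b in find_analogous_primitives(s_cgtfs):
    print(f"CGTF {i} and CGTF {j} share similar exponents: {a} vs {b}")
```

A hit flagged in this way would be a candidate for removal or re-optimization, since nearly identical primitives in different CGTFs may cause linear dependency problems.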

2.3 New DZP, TZP, and QZP basis sets

Following the procedure described in the previous subsection, we developed non-relativistic DZP, TZP, and QZP sets for Li–Xe and relativistic DZP, TZP, and QZP sets for K–Xe, which are referred to as Sapporo-(DK)-nZP sets (n = D, T, Q).

For the second and third period p-block atoms, the sizes of the TZP and QZP sets are smaller than the standard sizes, because small core correlating sets were sufficient for these atoms, as mentioned in Sect. 2.1. In the QZP sets of the fourth and fifth period atoms, we removed one CGTF each of s-, p-, and d-type, because the contributions from these functions were found to be small. In Ga, for example, the size of QZP reduces to [9s8p7d5f3g1h] after the elimination of [1s1p1d]. We summarize the sizes of the present CGTF sets in Table 1.

Table 1 Sizes of Sapporo CGTF sets

In Table 2, we compare the sizes of the Sapporo-TZP and available cc-CVTZ sets. The main difference between these sets is the description of occupied orbitals: segmented contractions are used in the present sets, whereas general contractions are used in cc-CVTZ. The numbers of CGTFs are similar in both sets, but the numbers of primitive GTFs are remarkably different. For the fourth period atoms, for example, the number of primitive GTFs in cc-CVTZ is four times that in the Sapporo sets. Thus, the present sets are considerably more compact than the cc-type basis sets.

Table 2 Comparison of sizes of CGTFs, contraction patterns, and total numbers N_GTF of primitive GTFs between Sapporo-TZP and cc-CVTZ

3 Atomic tests

The correlation energies calculated using the present non-relativistic basis sets are compiled in Table 3. For comparison, we show the reproduction percentage of the correlation energies relative to those given by accurate NOs of the standard size. For the p-block atoms, the correlation energies in the table are the sum of the core correlation energies among the n′s, n′p, and n′d electrons and the valence correlation energies among the (n′ + 1)s and (n′ + 1)p electrons obtained by separate CI calculations. The present basis sets reproduce more than 90% of the correlation energies among the n′s, n′p, and (n′ + 1)s electrons for s-block atoms, the n′s, n′p, n′d, (n′ + 1)s, and (n′ + 1)p electrons for p-block atoms, and the n′s, n′p, n′d, and (n′ + 1)s electrons for d-block atoms, with minor exceptions for the DZP and TZP sets of the early transition atoms Sc–V and Y–Nb.
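For reference, the reproduction percentage can be written compactly as below. The symbols P and E_corr are notation introduced here only for illustration and do not appear in the tables; the second relation restates the sum over separate core and valence CI calculations used for the p-block atoms.

```latex
P_{n\mathrm{ZP}} = 100 \times
  \frac{E_{\mathrm{corr}}(\text{Sapporo-}n\mathrm{ZP})}
       {E_{\mathrm{corr}}(\text{accurate NOs of the standard } n\mathrm{ZP} \text{ size})},
\qquad
E_{\mathrm{corr}}^{p\text{-block}} = E_{\mathrm{corr}}^{\mathrm{core}} + E_{\mathrm{corr}}^{\mathrm{valence}} .
```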

Table 3 Correlation energies in hartree

4 Molecular applications

To test the quality of the Sapporo-(DK)-nZP basis sets, we have carried out self-consistent field (SCF) and CCSD(T) calculations to obtain spectroscopic constants of 12 hydrides of the s- and d-block atoms (LiH, BeH, NaH, MgH, KH, CaH, RbH, SrH, CrH, CuH, MoH, and AgH) and 12 diatomic molecules of the p-block atoms (BF, CO, N2, AlCl, SiS, P2, GaBr, GeSe, As2, InI, SnTe, and Sb2). The non-relativistic Sapporo-nZP sets were used for H and Li–Cl, and the relativistic Sapporo-DK-nZP sets were used for Ca–I. In all calculations, the relativistic effects were considered through the third-order DK approximation. All molecular calculations were performed using the Molcas6.4, Molcas7.4 [34], and Molpro2010 [35] program systems.

4.1 Diatomic hydrides

The resultant spectroscopic constants of the 12 hydrides are shown in Table 4 along with available experimental data. For s-block hydrides, the deviation from the experimental value decreases monotonically as the quality of the basis set increases, and the calculated values reach satisfactory agreement with experimental values at the QZP set, where the maximum deviations are 0.01 Å in the bond length r_e, 35 cm−1 in the vibrational frequency ω_e, and 0.1 eV in the dissociation energy D_0. For d-block hydrides, on the other hand, a different tendency is observed. As the quality of the basis set increases, the deviation from the experiment decreases, but we do not reach satisfactory agreement even at the QZP set. For example, the deviations in CrH are 0.02 Å in r_e, 96 cm−1 in ω_e, and 0.17 eV in D_0. This is due not to the basis set quality but to the limitation of the CCSD(T) method, which is based on a single-reference theory. For CrH, we therefore carried out complete active space SCF calculations followed by multiconfigurational second-order perturbation theory (CASPT2), which adds dynamic correlation effects, with the QZP set; the 3d, 4s, and 4p orbitals of Cr and the 1s orbital of H are included in the complete active space, and the 3s and 3p electrons of Cr are correlated. The new results are 1.652 Å, 1678 cm−1, and 2.190 eV for r_e, ω_e, and D_0, respectively. The deviations of r_e and ω_e from the experiment reduce to 0.003 Å and 22 cm−1. Thus, the present basis sets have sufficient quality to obtain reliable spectroscopic constants.

Table 4 Spectroscopic constants of diatomic hydrides by CCSD(T)

In Table 5, we compare the timing data of CASPT2 calculations on CrH using the Sapporo-DK-QZP and cc-pwCVQZ-DK [22] basis sets. Both sets gave almost the same values of r_e, ω_e, and D_0, but we found a considerable difference in timings, especially in the integral generation and SCF steps. The large cost of integral generation with cc-pwCVQZ-DK is due to the long expansion of occupied orbitals, as shown in Table 2. The cc-pwCVQZ-DK basis set has six times as many primitive GTFs and needs five times as much CPU time in integral generation.

Table 5 Timing data in seconds for CASPT2 calculations of CrH by Molcas7.4 on 3.33 GHz Intel Core 2 Duo E8600 CPU

4.2 Diatomic molecules of p-block atoms

In Table 6, we show the calculated spectroscopic constants of the 12 diatomic molecules of the p-block atoms and compare them with available experimental data. In Fig. 2, the deviations from experimental values are exemplified for r_e. In order to show the effects of correlation among core electrons, we plot the results with and without the core correlation effects, denoted by nZP(+core) and nZP in the legend. For all molecules, the spectroscopic constants converge smoothly as the quality of the basis set increases when the core correlation is considered. At the QZP set, we reach reasonable agreement with the experiment, where the deviations are smaller than 0.01 Å in r_e, 10 cm−1 in ω_e, and 0.2 eV in D_0. The effects of core correlation are remarkable in r_e of the diatomics of the fourth and fifth period atoms. In GeSe, for example, the deviation of the calculated r_e without the core correlation is 0.02 Å, while the inclusion of the core correlation reduces it to 0.0004 Å. We found analogous effects of core correlation for other molecules. For ω_e, on the other hand, the inclusion of the core correlation increases the deviations from the experiment for some molecules, such as GaBr and Sb2, but the deviations at the QZP set are still reasonably small.

Table 6 Spectroscopic constants of p-block diatomic molecules by CCSD(T)

Fig. 2 Error in calculated r_e in Å

In this paper, we do not examine the basis set superposition error (BSSE), because we already reported that the effects of BSSE on the spectroscopic constants are quite small for the (DK)-TK/NOSeC-(C)V-nZP sets in the previous papers [11, 19].

5 Summary

We developed all-electron non-relativistic and relativistic segmented basis sets for the atoms H through Xe. To describe the core correlation, we constructed correlating CGTF sets for the second to fifth period p-block and the first and second d-block atoms. These correlating CGTF sets are optimized to represent accurate atomic NOs generated from large-scale CI calculations under the condition that the minimal-type HF sets (non-relativistic and relativistic TK HF CGTF sets) and the valence-correlating sets (NOSeC-V-nZP sets) exist. For s-block atoms, the (DK)-TK/NOSeC-CV-nZP sets were re-optimized to remove analogous primitives in CGTFs. The resultant basis sets are named Sapporo-(DK)-nZP (n = D, T, Q) and give more than 90% of the core and valence correlation energies produced by accurate NOs of the standard size.

Test calculations for 12 hydrides of s- and d-block atoms and 12 diatomic molecules of the p-block atoms were performed at the CCSD(T) level of theory. For all molecules, the calculated spectroscopic constants converge smoothly toward the experimental values as the quality of the basis set increases when the core correlation is considered. At the QZP set, we reach reasonable agreement with the experiment; for example, in the case of the p-block diatomic molecules, the deviations are smaller than 0.01 Å in r_e, 10 cm−1 in ω_e, and 0.2 eV in D_0.

The present sets are available at the web site http://setani.sci.hokudai.ac.jp/sapporo/, where basis sets are provided in appropriately formatted forms for popular electronic structure program packages such as Gaussian, Gamess, Molpro, Molcas, Turbomole, Dirac, Nwchem, and Alchemy2. The Sapporo-(DK)-nZP sets have also been implemented in the Gamess program package and can be used through a simple keyword in the input data.