Introduction

The investigation of electronic structure properties in order to interpret and justify molecular phenomena is traditionally one cornerstone of chemistry. In computational chemistry, the choice of an adequate ab initio basis set for studying a molecular system is still considered an open question among theoreticians. Although some works report that an appropriate basis set can describe traditional phenomena or propose new insights with plausible perspectives [1–3], there is no procedure for selecting the ideal basis set, which in some cases leads to an arbitrary choice of this parameter. Nevertheless, some statistical studies have revealed the remarkable influence of basis sets and quantum chemical methods on the analysis of molecular properties of several systems [4–10], including those formed by hydrogen bonds [11, 12]. It is well known that, in comparison with results obtained from Hartree-Fock (HF) [13] calculations, the molecular properties of hydrogen-bonded complexes [14] are described more efficiently by second-order Møller-Plesset perturbation theory (MP2) [15]. On the other hand, density functional theory (DFT) [16, 17] has also been documented as an efficient method, whose potential has been satisfactorily demonstrated in several published studies, for example in the analysis of properties of hydrogen-bonded complexes [18–33]. Although it is well established that MP2 and DFT are the most popular methods used to evaluate electronic correlation effects [34], DFT has emerged as an alternative computational scheme to the ab initio formalism [35] because the efficiency of exchange, correlation and hybrid functionals has been successfully demonstrated, in particular for studies of chemical, physical and biological systems [36–38].

In practice, the Hohenberg-Kohn-Sham protocol offers great advantages in comparison with ab initio methods, one of which is its low computational demand. In studies of molecular properties of hydrogen-bonded complexes [39, 40], for instance, it is vital to consider the effects of electronic correlation, although some theoreticians are still not convinced that DFT is a capable method because of its limitations in describing dispersion forces [41]. In spite of this, we address a traditional question in studies of hydrogen-bonded complexes: which is the most appropriate method for modeling these systems, MP2 or DFT? Furthermore, is the efficiency of these two methods influenced by the ab initio basis set? With these two questions in mind, the present study was designed to compare the efficiency of the MP2 and DFT methods through the analysis of the R(n⋅⋅⋅HX) hydrogen bond distances and υ(n⋅⋅⋅HX) infrared intermolecular frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes with X = F, CN, NC, and CCH (see Fig. 1). It is essential, however, not only to evaluate which method is the most efficient but also to recognize the importance of the basis sets, whose applicability is also discussed here. To this end, we use combinations of the 6-31ijGk split-valence basis set with i = triple-zeta valence functions (Tξ), j = diffuse functions (++), and k = polarization functions (d,p). This generates a large quantity of results for R(n⋅⋅⋅HX) and υ(n⋅⋅⋅HX), which are treated with chemometric techniques, namely HCA (Hierarchical Cluster Analysis) [42], TLFD (Two Levels Factorial Design) [43, 44] and PCA (Principal Component Analysis) [45].

Chemometric techniques are well known to have been applied successfully in a great variety of situations where questions of a chemical, physical and/or biological nature need to be solved [46]. Regarding the main purpose of this work, we think that a chemometric study of the intermolecular properties of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes is of interest for one reason: given the ability of three-membered rings to form intermolecular complexes [47–50] and related systems [51, 52], an indication of which level of theory is appropriate for examining rings such as C2H4O and C2H5N is a prerequisite for gaining useful information about these systems.

Fig. 1
Optimized geometries of the C2H4O⋅⋅⋅HF, C2H4O⋅⋅⋅HCN, C2H4O⋅⋅⋅HNC, C2H4O⋅⋅⋅HCCH, C2H5N⋅⋅⋅HF, C2H5N⋅⋅⋅HCN, C2H5N⋅⋅⋅HNC, and C2H5N⋅⋅⋅HCCH hydrogen-bonded complexes. In both the R(n⋅⋅⋅HX) and υ(n⋅⋅⋅HX) terms, n symbolizes the lone electron pairs of the oxygen or nitrogen atoms in the C2H4O and C2H5N heterocycles, respectively

Chemometric techniques

HCA

HCA is a statistical methodology that groups single samples into clusters according to their similarity. For the purposes of this work, the HCA similarity \(S_{ab}\) of the theoretical methods and basis sets is constructed from the Euclidean distance criterion according to:

$${\text{S}}_{ab} = 1 - \frac{{{\text{d}}_{ab} }}{{{\text{d}}_{\max } }}$$
(1)

where \(d_{ab}\) is the distance between samples a and b, and \(d_{\max}\) is the maximum distance for any pair of connected points. Although qualitative, HCA gives a description of the similarity among the theoretical levels used to study the R(n⋅⋅⋅HX) hydrogen bond distances and υ(n⋅⋅⋅HX) stretch frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes.
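As an illustration of Eq. (1), the following minimal sketch computes the similarity matrix from Euclidean distances between samples. It assumes that the maximum distance is the largest pairwise distance in the data set, and the three sample rows are hypothetical values chosen only for the example; they are not taken from Table 1.

```python
import numpy as np


def similarity_matrix(samples):
    """Return S_ab = 1 - d_ab / d_max for Euclidean distances between rows."""
    x = np.asarray(samples, dtype=float)
    # Pairwise differences between every pair of rows (samples).
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumption: d_max is the largest pairwise distance in the data set.
    d_max = d.max()
    if d_max == 0.0:
        return np.ones_like(d)
    return 1.0 - d / d_max


# Hypothetical distances (in angstrom) for three levels of theory (rows)
# and two complexes (columns); illustrative numbers only.
example = [[1.70, 2.13], [1.71, 2.14], [1.80, 2.30]]
S = similarity_matrix(example)
```

With this definition, identical samples have similarity 1 and the two most distant samples have similarity 0, which is the scale on which a cluster tree is usually read.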

TLFD

To obtain a chemometric model formed only by relevant effects, the TLFD technique performs a complete variation of the theoretical levels in order to quantify the contributions of DFT and MP2, as well as of the basis sets. We adopted Pople’s 6-3ijGk [53] split-valence basis set as our standard and, considering both DFT and MP2, the TLFD modifications are treated in the four steps summarized below:

(I) i :

Using triple-zeta (Tξ) instead of double (Dξ) valence functions [54];

(II) j :

Including or not diffuse functions (++) [55];

(III) k :

Adding or not polarization functions (d,p) [56];

(IV) l :

Performing DFT instead of MP2 calculations.

From these factors, a simplified mathematical model is built that includes only the most relevant effects Uk:

$$U^{\text{EST}} = \overline{U}_{X} + \sum_{k = 1}^{n} U_{k}$$
(2)

In Eq. (2), UEST and ŪX correspond to the estimated and mean values of the intermolecular properties R(n⋅⋅⋅HX) or υ(n⋅⋅⋅HX) of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes. Moreover, each effect carries a dummy variable, which during the analysis is equal to +1 or −1 when the corresponding factor (i, j, k or l) is present or absent, respectively.
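The way effects are extracted in a two-level factorial design can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in the FATORIAL program: the response vector is synthetic, the factor names are labels chosen for the example, and an interaction is assumed to be coded by the product of the ±1 codes of its factors, as is usual in factorial designs.

```python
from itertools import product

import numpy as np

# Factor order used for the columns of the design (labels are illustrative).
FACTORS = ("T_xi", "++", "(d,p)", "B3LYP")


def design_matrix(n_factors=4):
    """All 2**n runs of a two-level design, coded -1 (absent) / +1 (present)."""
    return np.array(list(product((-1, 1), repeat=n_factors)), dtype=float)


def effect(design, response, columns):
    """Effect of one factor (one column) or of an interaction (several columns).

    The contrast is the product of the coded columns; the effect is the mean
    response where the contrast is +1 minus the mean where it is -1.
    """
    contrast = np.prod(design[:, list(columns)], axis=1)
    return response[contrast > 0].mean() - response[contrast < 0].mean()


design = design_matrix()

# Synthetic response (NOT data from this work): mean value 2.00, a (d,p)
# effect of +0.040, a B3LYP effect of -0.020 and an interaction of +0.024.
response = (
    2.00
    + 0.020 * design[:, 2]
    - 0.010 * design[:, 3]
    + 0.012 * design[:, 2] * design[:, 3]
)

dp_effect = effect(design, response, [2])
b3lyp_effect = effect(design, response, [3])
interaction_effect = effect(design, response, [2, 3])
```

Because the codes run from −1 to +1, the coefficient that multiplies a coded variable in the model is half of the effect recovered by `effect`.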

PCA

The principal components are the eigenvectors obtained from the diagonalization of the \(X^{t}X\) covariance matrix, where X contains the original data represented in a multidimensional space [57, 58]. In the present case, the X matrix has 16 rows (theoretical levels, see the TLFD section) and 4 columns (C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes with X = F, CN, NC or CCH, see Fig. 1). The largest eigenvalue of \(X^{t}X\) is interpreted as the statistical information explained by the first principal component (Θ1), which, if possible, describes the maximum variance (MV) according to Eq. (3).

$$\text{MV} = v_{1}\,\Theta_{1} + e_{1}$$
(3)

If the explained variance is unsatisfactory, a second principal component (Θ2) is requested, which is computed automatically. This procedure is repeated until the desired maximum variance is obtained.

$$\text{MV} = v_{1}\,\Theta_{1} + v_{2}\,\Theta_{2} + e_{2}$$
(4)
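The diagonalization described above can be sketched with NumPy. This is a minimal illustration, not the algorithm of the UNSCRAMBLER program: the 16 × 4 matrix is filled with random placeholder numbers rather than the data of this work, and mean-centring of the columns before forming the cross-product matrix is an assumption made here.

```python
import numpy as np


def pca(X):
    """Principal components from the eigen-decomposition of X^t X.

    Columns are mean-centred first (an assumption of this sketch).
    Returns the explained-variance fractions, the loadings (eigenvectors
    as columns) and the scores, ordered from largest to smallest eigenvalue.
    """
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)  # symmetric matrix
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = Xc @ eigvecs
    return explained, eigvecs, scores


# Placeholder data: 16 theoretical levels (rows) x 4 complexes (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 4))
explained, loadings, scores = pca(X)

# Cumulative variance explained when components are added one at a time.
cumulative = np.cumsum(explained)
```

Reading `cumulative` from left to right corresponds to requesting further components until the explained variance is judged satisfactory.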

Computational scheme and statistical programs

All theoretical calculations with the MP2 and DFT methods and the 6-3ijGk basis sets, including the geometry optimizations of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, were performed with the GAUSSIAN 98W [59] software. For the DFT calculations, the B3LYP hybrid functional [60] was used, whose formalism combines Becke’s three-parameter functional [61] with the LYP non-local correlation term [62]. The TLFD, HCA and PCA results were obtained with the programs FATORIAL 1.0 [63, 64], STATISTICA 5.0 [65] and UNSCRAMBLER 7.5 [66], respectively.
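The 16 theoretical levels implied by the four two-level factors can be enumerated as in the sketch below. The helper name `basis_label` and the way the labels are assembled are illustrative choices made here; the sketch only builds level-of-theory labels and does not generate input for any program.

```python
from itertools import product


def basis_label(triple_zeta, diffuse, polarization):
    """Build a Pople-style basis set label from the three basis set factors."""
    label = "6-311" if triple_zeta else "6-31"
    if diffuse:
        label += "++"
    label += "G"
    if polarization:
        label += "(d,p)"
    return label


# Every combination of method and basis set factors: 2 x 2 x 2 x 2 = 16.
levels = [
    f"{method}/{basis_label(t, d, p)}"
    for method, t, d, p in product(
        ("MP2", "B3LYP"), (False, True), (False, True), (False, True)
    )
]
```

The list runs from the smallest combination, MP2/6-31G, to the largest, B3LYP/6-311++G(d,p).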

Results

Structural parameter: R(n⋅⋅⋅HX) hydrogen bond distance

HCA

Table 1 presents the theoretical values of the R(n⋅⋅⋅HX) hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes. Initially, the application of HCA revealed trends in the R(n⋅⋅⋅HX) values, as can be observed in Fig. 2. The cluster named G2 indicates a great similarity among the data obtained with (d,p) polarization functions. The G1 and G3 clusters are also formed, and some interesting information can be extracted from them. Note that G1 corresponds basically to (++) diffuse functions, whereas G3 corresponds to smaller basis sets at the MP2 level. For B3LYP, small basis sets without (++) diffuse functions lead to results similar to MP2, so that these results fall in G1 rather than G3. This is known: in comparison with post-HF methods [67], it is well established that B3LYP was implemented for calculations with moderate-size basis sets rather than larger ones.

Fig. 2
HCA graph of the R(n⋅⋅⋅HX) theoretical hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

Table 1 Theoretical values of the R(n⋅⋅⋅HX) hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

TLFD

The TLFD results are listed in Table 2. The (d,p) polarization functions and the B3LYP functional are the most significant effects: the former systematically increases the R(n⋅⋅⋅HX) distances by 0.040 Å and 0.093 Å, whereas the latter decreases them by 0.020 Å and 0.026 Å, for the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, respectively. The [(B3LYP)-(d,p)] interaction effect also provides a significant contribution to the TLFD analysis, increasing R(n⋅⋅⋅HX) by 0.024 Å and 0.025 Å. Thus, the TLFD models for the R(n⋅⋅⋅HX) hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes are represented by Eqs. (5) and (6), respectively, in which each coefficient is half of the corresponding effect.

$$R_{(n \cdots \text{HX})}^{\text{EST}} = \overline{R}_{X_{(n \cdots \text{HX})}} - 0.0095\,\Delta(++) + 0.0050\,\Delta[(T\xi) - (++)] + 0.0200\,\Delta(d,p) + 0.0080\,\Delta[(++) - (d,p)] - 0.0100\,\Delta(\text{B3LYP}) + 0.0055\,\Delta[(\text{B3LYP}) - (++)] + 0.0120\,\Delta[(\text{B3LYP}) - (d,p)] - 0.0078\,\Delta[(\text{B3LYP}) - (++) - (d,p)]$$
(5)
$$R_{(n \cdots \text{HX})}^{\text{EST}} = \overline{R}_{X_{(n \cdots \text{HX})}} - 0.0055\,\Delta(T\xi) - 0.01\,\Delta(++) + 0.0079\,\Delta[(T\xi) - (++)] + 0.0465\,\Delta(d,p) - 0.0131\,\Delta(\text{B3LYP}) + 0.0125\,\Delta[(\text{B3LYP}) - (d,p)]$$
(6)
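To show how such a model is evaluated, the sketch below computes the deviation of the estimated distance from the mean value predicted by Eq. (5) for each of the 16 theoretical levels. It assumes that each Δ equals +1 or −1 when the factor is present or absent, and that the Δ of an interaction term is the product of the Δ values of its factors, which is the usual convention in factorial designs but is an assumption here. The short factor keys (`T`, `pp`, `dp`, `B3LYP`) are labels chosen for the example.

```python
from itertools import product

# Coefficients of Eq. (5) in angstrom; each key lists the factors of a term.
# "pp" stands for (++), "dp" for (d,p) and "T" for the triple-zeta factor.
EQ5 = {
    ("pp",): -0.0095,
    ("T", "pp"): 0.0050,
    ("dp",): 0.0200,
    ("pp", "dp"): 0.0080,
    ("B3LYP",): -0.0100,
    ("B3LYP", "pp"): 0.0055,
    ("B3LYP", "dp"): 0.0120,
    ("B3LYP", "pp", "dp"): -0.0078,
}


def deviation(codes, model=EQ5):
    """Sum of the model terms, i.e. the estimated value minus the mean value."""
    total = 0.0
    for factors, coefficient in model.items():
        delta = 1
        for name in factors:
            delta *= codes[name]  # assumed: interaction code = product of codes
        total += coefficient * delta
    return total


names = ("T", "pp", "dp", "B3LYP")
all_levels = [dict(zip(names, codes)) for codes in product((-1, 1), repeat=4)]
deviations = [deviation(level) for level in all_levels]

# Two illustrative levels: every factor absent and every factor present.
smallest_level = deviation({"T": -1, "pp": -1, "dp": -1, "B3LYP": -1})
largest_level = deviation({"T": 1, "pp": 1, "dp": 1, "B3LYP": 1})
```

Under these assumptions the deviations sum to zero over the full design, so the mean value is recovered as the average of the 16 estimated distances.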

Note that some other effects were also included in the TLFD models, such as the (Tξ) valence and (++) diffuse functions. As can be seen in Fig. 3, the R2 squared linear correlation coefficient of 0.99 indicates good agreement between the \(R_{(n \cdots \text{HX})}^{\text{EST}}\) estimated results and the \(\overline{R}_{X_{(n \cdots \text{HX})}}\) mean values of the hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes.

Fig. 3
Plot of the \(R_{(n \cdots \text{HX})}^{\text{EST}}\) estimated hydrogen bond distances versus the corresponding R(n⋅⋅⋅HX) theoretical values of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

Table 2 Predominant effects of the TLFD analysis of the R(n⋅⋅⋅HX) hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

PCA

In terms of PCA, Figs. 4 and 5 show that the (d,p) polarization functions are described by the Θ1 axis, which accounts for 80% and 81% of the variance of the R(n⋅⋅⋅HX) hydrogen bond distances of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, respectively. Since Θ2 explains 14% and 16% of the variance, meaningful effects are also described by this component, for instance the (++) diffuse functions. Indeed, PCA yields a satisfactory description of the variance when the second component is included in the chemometric analysis: together, Θ1 and Θ2 account for 94% and 97% of the total variance.

Fig. 4
PCA scores of the R(n⋅⋅⋅HX) theoretical hydrogen bond distances of the C2H4O⋅⋅⋅HX hydrogen-bonded complexes

Fig. 5
PCA scores of the R(n⋅⋅⋅HX) theoretical hydrogen bond distances of the C2H5N⋅⋅⋅HX hydrogen-bonded complexes

The (d,p) polarization functions appropriately describe the polarizability and strain energies of the C2H4O and C2H5N heterorings. The great importance of the (++) diffuse functions, in turn, is related to their capacity to describe chemical systems that occupy a larger spatial region and tend to expand because their electrons are placed relatively far from the nuclei, i.e., molecules with lone electron pairs or even anions [68]. For the analysis of the R(n⋅⋅⋅HX) values, these concepts refer to the n lone electron pairs of the oxygen and nitrogen atoms in the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, respectively. With regard to the acids, the loading Eqs. (7), (8), (9), and (10) demonstrate that hydrofluoric acid is the most important proton donor because of its large contribution to the variance of the data along both the Θ1 and Θ2 axes.

$$\Theta_{1}\left(\text{C}_{2}\text{H}_{4}\text{O} \cdots \text{HX}\right) = 0.60\,R_{(n \cdots \text{HF})} + 0.53\,R_{(n \cdots \text{HCN})} + 0.22\,R_{(n \cdots \text{HNC})} + 0.55\,R_{(n \cdots \text{HCCH})}$$
(7)
$$\Theta_{1}\left(\text{C}_{2}\text{H}_{5}\text{N} \cdots \text{HX}\right) = 0.69\,R_{(n \cdots \text{HF})} + 0.45\,R_{(n \cdots \text{HCN})} + 0.37\,R_{(n \cdots \text{HNC})} + 0.41\,R_{(n \cdots \text{HCCH})}$$
(8)
$$\Theta_{2}\left(\text{C}_{2}\text{H}_{4}\text{O} \cdots \text{HX}\right) = 0.82\,R_{(n \cdots \text{HF})} - 0.44\,R_{(n \cdots \text{HCN})} - 0.002\,R_{(n \cdots \text{HNC})} - 0.36\,R_{(n \cdots \text{HCCH})}$$
(9)
$$\Theta_{2}\left(\text{C}_{2}\text{H}_{5}\text{N} \cdots \text{HX}\right) = 0.70\,R_{(n \cdots \text{HF})} - 0.56\,R_{(n \cdots \text{HCN})} - 0.17\,R_{(n \cdots \text{HNC})} - 0.40\,R_{(n \cdots \text{HCCH})}$$
(10)
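The reading of the loadings can be made explicit with a short sketch that takes the coefficients of Eqs. (7) to (10), identifies the acid with the largest absolute loading in each component, and computes the length of each loading vector. The dictionary keys and helper names are illustrative choices made here.

```python
import math

ACIDS = ("HF", "HCN", "HNC", "HCCH")

# Loadings copied from Eqs. (7)-(10), in the order of ACIDS.
LOADINGS = {
    "Theta1 C2H4O...HX": (0.60, 0.53, 0.22, 0.55),     # Eq. (7)
    "Theta1 C2H5N...HX": (0.69, 0.45, 0.37, 0.41),     # Eq. (8)
    "Theta2 C2H4O...HX": (0.82, -0.44, -0.002, -0.36),  # Eq. (9)
    "Theta2 C2H5N...HX": (0.70, -0.56, -0.17, -0.40),   # Eq. (10)
}


def dominant_acid(loadings):
    """Acid whose loading has the largest absolute value."""
    index = max(range(len(loadings)), key=lambda i: abs(loadings[i]))
    return ACIDS[index]


def vector_length(loadings):
    """Euclidean length of a loading vector."""
    return math.sqrt(sum(value * value for value in loadings))


summary = {
    name: (dominant_acid(values), vector_length(values))
    for name, values in LOADINGS.items()
}
```

In all four components the largest absolute loading belongs to HF, and each vector has a length close to one, as expected for normalized loadings reported to two decimal places.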

However, the validation of the chemometric model still requires a comparison with experimental data. Some time ago, the experimental geometries of the C2H4O⋅⋅⋅HF and C2H4O⋅⋅⋅C2H2 hydrogen-bonded complexes were determined by Legon et al. [69, 70] through Fourier transform microwave spectroscopy (FTMS) [71]. In those studies, values of 1.70 Å and 2.40 Å were obtained for the R(n⋅⋅⋅HX) hydrogen bond distances of the C2H4O⋅⋅⋅HF and C2H4O⋅⋅⋅C2H2 complexes, respectively. From the values listed in Table 2, it is clear that the (d,p) polarization functions provide the best agreement between the theoretical R(n⋅⋅⋅HX) hydrogen bond distances and the FTMS values. For instance, the MP2/6-31G(d,p) and B3LYP/6-31G(d,p) calculations yielded 1.7112 Å and 1.7010 Å for C2H4O⋅⋅⋅HF, as well as 2.1363 Å and 2.1275 Å for C2H4O⋅⋅⋅C2H2, and these theoretical results are in good agreement with the experimental data of 1.70 Å and 2.40 Å presented above. In this sense, MP2 and B3LYP can be considered methods of comparable accuracy. Since these are the available experimental data [69, 70], this direct comparison is the only test of theory.
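The comparison can be tabulated as signed deviations between the theoretical and FTMS distances quoted in this paragraph. The sketch below uses only those numbers; the dictionary layout and key names are illustrative.

```python
# Experimental FTMS distances (angstrom) quoted in the text.
EXPERIMENT = {"C2H4O...HF": 1.70, "C2H4O...C2H2": 2.40}

# Theoretical distances (angstrom) quoted in the text for two levels of theory.
THEORY = {
    "MP2/6-31G(d,p)": {"C2H4O...HF": 1.7112, "C2H4O...C2H2": 2.1363},
    "B3LYP/6-31G(d,p)": {"C2H4O...HF": 1.7010, "C2H4O...C2H2": 2.1275},
}

# Signed deviation (theory minus experiment), rounded to four decimals.
deviations = {
    level: {
        complex_name: round(value - EXPERIMENT[complex_name], 4)
        for complex_name, value in distances.items()
    }
    for level, distances in THEORY.items()
}

# Difference between the two methods for each complex (MP2 minus B3LYP).
method_gap = {
    complex_name: round(
        THEORY["MP2/6-31G(d,p)"][complex_name]
        - THEORY["B3LYP/6-31G(d,p)"][complex_name],
        4,
    )
    for complex_name in EXPERIMENT
}
```

The entries of `method_gap` show how closely the two methods follow each other for a given complex, which is the sense in which their accuracy is compared here.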

Vibrational harmonic spectrum: υ(n⋅⋅⋅HX) hydrogen bond stretch frequencies

HCA

Table 3 presents the theoretical values of the υ(n⋅⋅⋅HX) hydrogen bond stretch frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes. As in the structural analysis, the same procedure was employed to examine the theoretical values of the hydrogen bond frequencies, i.e., the HCA, TLFD and PCA techniques were applied in sequence. In the HCA evaluation, the cluster tree presented in Fig. 6 shows a G1 group formed basically by (d,p) polarization functions. For the remaining factors, no important similarity was observed.

Fig. 6
HCA graph of the theoretical υ(n⋅⋅⋅HX) stretch frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

Table 3 Theoretical values of the υ(n⋅⋅⋅HX) hydrogen bond stretch frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

TLFD

Table 4 lists the TLFD results for the υ(n⋅⋅⋅HX) hydrogen bond frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes. As can be seen, the main effects are the (d,p) polarization functions, which systematically decrease υ(n⋅⋅⋅HX) by 8.9 cm−1 and 29.3 cm−1, and the [(B3LYP)-(d,p)] interaction, which decreases it by 5.4 cm−1 and 7.2 cm−1, for the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, respectively. These and other effects were used to build the two TLFD models represented by Eqs. (11) and (12), where the R2 squared linear correlation coefficient of 0.98 indicates good agreement between the \(\upsilon_{(n \cdots \text{HX})}^{\text{EST}}\) and \(\overline{\upsilon}_{X_{(n \cdots \text{HX})}}\) values of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, as can be seen in Fig. 7.

$$\upsilon_{(n \cdots \text{HX})}^{\text{EST}} = \overline{\upsilon}_{X_{(n \cdots \text{HX})}} - 3.0\,\Delta(T\xi) - 0.7\,\Delta(++) - 0.7\,\Delta[(T\xi) - (++)] - 4.4\,\Delta(d,p) - 2.0\,\Delta[(++) - (d,p)] - 1.5\,\Delta(\text{B3LYP}) - 1.5\,\Delta[(\text{B3LYP}) - (++)] + 1.7\,\Delta[(\text{B3LYP}) - (T\xi) - (++)] - 2.7\,\Delta[(\text{B3LYP}) - (d,p)]$$
(11)
$$\upsilon_{(n \cdots \text{HX})}^{\text{EST}} = \overline{\upsilon}_{X_{(n \cdots \text{HX})}} - 1.9\,\Delta(T\xi) - 1.5\,\Delta(++) - 1.9\,\Delta[(T\xi) - (++)] - 14.6\,\Delta(d,p) - 1.3\,\Delta[(++) - (d,p)] - 0.9\,\Delta(\text{B3LYP}) - 1.2\,\Delta[(\text{B3LYP}) - (++)] + 0.2\,\Delta[(\text{B3LYP}) - (T\xi) - (++)] - 3.6\,\Delta[(\text{B3LYP}) - (d,p)]$$
(12)

Fig. 7

Plot of the \(\upsilon_{(n \cdots \text{HX})}^{\text{EST}}\) estimated hydrogen bond stretch frequencies versus the corresponding υ(n⋅⋅⋅HX) theoretical values of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

Table 4 Predominant effects of the TLFD analysis of the υ(n⋅⋅⋅HX) stretch frequencies of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes

PCA

As for PCA, the results show that the (d,p) polarization functions and the [(B3LYP)-(d,p)] interaction are the most prominent effects. They are described by the Θ1 axis, which accounts for 72% and 80% of the variance, as can be seen in Figs. 8 and 9, respectively. Both (d,p) and [(B3LYP)-(d,p)] are placed on the left side of Θ1 and Θ2, while the second component contains 24% and 19% of the variance of the υ(n⋅⋅⋅HX) values of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes, respectively. Corroborating the structural analysis, the loading Eqs. (13), (14), (15), and (16) indicate that hydrofluoric acid has the largest contribution to the variance of the data.

$$\Theta_{1}\left(\text{C}_{2}\text{H}_{4}\text{O} \cdots \text{HX}\right) = 0.75\,\upsilon_{(n \cdots \text{HF})} + 0.37\,\upsilon_{(n \cdots \text{HCN})} + 0.32\,\upsilon_{(n \cdots \text{HNC})} + 0.43\,\upsilon_{(n \cdots \text{HCCH})}$$
(13)
$$\Theta_{1}\left(\text{C}_{2}\text{H}_{5}\text{N} \cdots \text{HX}\right) = 0.90\,\upsilon_{(n \cdots \text{HF})} + 0.21\,\upsilon_{(n \cdots \text{HCN})} + 0.36\,\upsilon_{(n \cdots \text{HNC})} + 0.04\,\upsilon_{(n \cdots \text{HCCH})}$$
(14)
$$\Theta_{2}\left(\text{C}_{2}\text{H}_{4}\text{O} \cdots \text{HX}\right) = -0.70\,\upsilon_{(n \cdots \text{HF})} + 0.25\,\upsilon_{(n \cdots \text{HCN})} + 0.14\,\upsilon_{(n \cdots \text{HNC})} + 0.60\,\upsilon_{(n \cdots \text{HCCH})}$$
(15)
$$\Theta_{2}\left(\text{C}_{2}\text{H}_{5}\text{N} \cdots \text{HX}\right) = -0.90\,\upsilon_{(n \cdots \text{HF})} + 0.13\,\upsilon_{(n \cdots \text{HCN})} + 0.31\,\upsilon_{(n \cdots \text{HNC})} + 0.19\,\upsilon_{(n \cdots \text{HCCH})}$$
(16)

In contrast to the results found for the hydrogen bond distances, the (d,p) polarization functions are located on the left side of both the Θ1 and Θ2 axes. In other words, our model predicts that lower hydrogen bond stretch frequencies are obtained when (d,p) polarization functions are applied. Comparatively, MP2 was found to contribute only slightly to the statistical model. This is not, in fact, a surprise, because DFT simulates the molecular parameters of cyclic hydrogen-bonded complexes very satisfactorily when compared with MP2 [72], and still more efficiently than HF calculations [73].

Fig. 8
PCA scores of the υ(n⋅⋅⋅HX) theoretical stretch frequencies of the C2H4O⋅⋅⋅HX hydrogen-bonded complexes

Fig. 9
PCA scores of the υ(n⋅⋅⋅HX) theoretical stretch frequencies of the C2H5N⋅⋅⋅HX hydrogen-bonded complexes

Discussion and perspectives

It is a tradition in our research group to use DFT calculations in studies of hydrogen-bonded complexes [74, 75] and related systems [76]. As reviewed on a number of occasions, the efficiency of the B3LYP hybrid functional for describing intermolecular interactions is widely known [77–80]. Independently of that, it should be borne in mind that MP2 is considered an unquestioned approach, and it was also used in some investigations developed by our research group [14, 81]. We have no desire to argue whether DFT is better than MP2 or vice versa; we only want to discern the efficiency of these two methods for describing the intermolecular properties of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes. Some time ago, we published works in which the structural, electronic, and vibrational parameters of these complexes were discussed. In those works B3LYP was adopted, but with the 6-311++G(d,p) basis set, which is a complete set because it contains both (Tξ) valence and (++) diffuse functions. Considering the results of our chemometric analysis, from which it was concluded that the (d,p) functions are the most important parameter, was it redundant to include (Tξ) and (++) in the works cited above [72, 82]? In part, yes. On the other hand, there is the problem of the basis set superposition error (BSSE) [83]: if small basis sets such as 6-31G(d,p) are used, the BSSE amounts will certainly be overestimated [84]. For example, the dispersion effects caused by moderate basis sets lead to a correction error in the interaction energy of 8.26–1.55 kJ mol−1, as recently reported by Šponer and Hobza [85].

For this reason, we are currently developing a study that evaluates the dependence of the BSSE on the same ab initio basis sets used in this work. This study is not restricted to the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX systems but extends to other hydrogen-bonded complexes with significantly different intermolecular strengths, such as those formed by π and pseudo-π interactions [25, 83–94]. We therefore expect to obtain, as soon as possible, a measure of how the basis sets contribute to BSSE calculations. Most important is that the relationship between BSSE and basis sets be well understood. As in the procedure performed here, in which the DFT and MP2 methods were used, this perspective may reveal interesting results, although there are already some statements about a possible relationship between both DFT and MP2 and accurate BSSE results [95, 96].

Conclusions

The HCA, TLFD and PCA chemometric techniques were used to evaluate the effects of basis sets and quantum chemical methods on the structural parameters and infrared harmonic spectrum of the C2H4O⋅⋅⋅HX and C2H5N⋅⋅⋅HX hydrogen-bonded complexes. These three statistical methods were applied to ensure that an appropriate level of theory can be selected to describe a property of interest, in this case the intermolecular distance and its stretch frequency in the complexes mentioned above. In general, the results of our study show that the polarization functions are the most important effect, although other factors were also considered significant, for instance the diffuse functions in the analysis of the hydrogen bond distances. In the interpretation of the hydrogen bond frequencies, the polarization functions and the B3LYP functional contribute equally to the statistical analysis upon removal of the other effects. In comparison with HCN, HNC, and HCCH, the PCA results demonstrated that hydrofluoric acid is the most important variable. A decisive point is related to the quality of the theoretical method, DFT or MP2; in this sense, it was verified that the MP2 results alter the statistical analysis only insignificantly. Finally, we comment on the relation of our current results to recently documented theoretical works [11, 12], in which MP2 always furnished the major contribution to the chemometric analysis. Taking into account the results discussed in this work, DFT seems to be a suitable and efficient approach for studying the intermolecular properties of hydrogen-bonded complexes [97–100]. We would therefore like to emphasize the importance of DFT as a useful method for studying electronic structures, for two reasons: (i) the lower computational effort and (ii) the satisfactory reproduction of the available experimental data.