Following the principles of systems analysis involving determination of relationships between separate constituents of any system, we consider as such constituents the levels of knowledge of oil systems, corresponding to separate approaches in their evolution.

Analysis of the most important papers published worldwide (with 170 to 679 citations) and in Russia (where, for a number of reasons, citation counts are considerably lower, which is by no means associated with the scientific significance of the papers) in the field of oil systems throughout the research period shows that these papers deal mainly with asphaltene-containing dispersions [1–8] and with their behavior in various processes of petroleum technology [9–13]. Analysis of the American Chemical Society (ACS) data reveals a number of important details. There are more than 3400 papers associated with the keyword “asphaltenes”, starting from 1894 (it appears that there was an almost half-century gap after the publication of Boussingault’s well-known 1837 paper on the recovery of asphaltenes); as for the keyword “nanoasphaltenes”, the first paper was published in 1981, and since 1996 there have been 205 such papers. A search using the keyword combination “asphaltenes + nano + fractal” reveals only 37 papers, of which the papers by Hoepfner et al. [14] should be particularly noted. It is known that the concept of fractals as applied to asphaltene-containing systems, which are the most abundant type of oil disperse systems (PDS), allows estimation of the topological parameter of their structure, the fractal dimension [15]. Such is the preliminary statistical analysis of the research activity in the field of our interest.

Numerous papers published up to 2000, including papers by Russian scientists who laid the foundation of this field (D.I. Mendeleev, V.N. Ipat’ev, S.S. Nametkin, A.A. Petrov, et al.), are mainly studies performed within the framework of the analytical approach, aimed at obtaining primary data on the chemical composition of oil systems and on its variation under the action of external factors (temperature, pressure, catalyst). Paying historical respects to these basic papers, we must note their decisive role in the development of the analytical approach in petroleum chemistry and its practical applications in oil and gas exploration, the development of oil and gas fields, oil and gas production, oil refining, gas processing, and petroleum chemistry; this is noted in Prof. Ryabov’s textbook, which ran into numerous editions.

In this paper, we pay attention to the state of the art in the field of petroleomics, a modern analytical approach to studying oil systems, furnishing extra information on the chemical composition of oil systems, and to physical theories that influenced the development of views on the colloidal structure of PDS on the micro and nano levels.

It is precisely the initial steps of phase formation in oil systems, in which nanoparticles of a new phase are formed, that predetermine the controllable course of processes of petroleum technology and, in some cases, also possible undesirable concomitant processes (e.g., formation of undesirable deposits [16]).

The highly cited (562 citations) paper by Yen et al. [17] can be considered one of the first papers containing data on asphaltene nanostructure, although in 1961 the notion of nanostructure was still unknown and nanoasphaltenes were considered at that time as ultradispersed particles.

Gaining more detailed information and revealing relationships in the field of the PDS composition and behavior allow development of ways to control the PDS properties and behavior. Researchers’ attention is focused, as a rule, on studying the composition and properties of oil systems in various steps of the natural and technological carbon cycle: from crude oil ripening and location in a stratum (in this step, PDS are commonly denoted, in accordance with historical tradition, as reservoir fluid), to extraction and transportation of partially degassed oil, termed live oil, and refining of the already pretreated oil, known as dead oil, including the properties of petroleum products in the course of their use and utilization of spent petroleum products. Actually, reservoir fluid, when passing from a stratum to the diurnal surface and then to an oil refinery, gradually transforms from a polyheterogeneous composite disperse system (liquid emulsion containing in the dispersed state gas, asphaltenes, salts, and mechanical impurities) into a conventionally degassed and almost completely dehydrated oil. At the inlet of the AVT installation of an oil refinery, the oil is an asphaltene-containing dispersion of a definite dispersity level. The researchers’ doubts concerning the nanodispersed state of asphaltene-containing systems in catalytic oil refining processes are already in the past, and Russian researchers have achieved appreciable success in hydroconversion of petroleum residues using inorganic nanocatalysts [13]. It can be briefly said that, to control the properties of an oil system when it occurs in a stratum, in a pipe, or in a process apparatus, it is necessary to find efficient ways to control the metastable state of the system under the conditions close to those of possible phase transitions of the first [16] or second [18] kind.

The apparatus of petroleomics allows gaining more detailed information on the chemical composition of oil systems and revealing 2–3 orders of magnitude more compounds in oil systems, compared to the known analytical procedures of the past century. The use of other modern physicochemical methods of investigation opens new possibilities for characterizing the fractal structure parameters [14, 19] and molecular-mass distribution (MMD) of nanoasphaltenes [20] of oil systems.

The chemo-informational approach (petroinformatics) involves application of methods for multivariate data analysis (MDA) to the corresponding base of the initial digital data on the composition, structure, and properties of oil systems, which predetermines the development of applied directions of oil science; in the future, they may become the mainstream.

In this connection, it seems topical from the standpoint of systems analysis to consider the evolution of views on petroleum asphaltene-containing systems since the discovery of the natural disperse phase of crude oils, asphaltenes (Boussingault, 1837) until now.

EVOLUTION OF SCIENTIFIC KNOWLEDGE OF OIL SYSTEMS

The evolution of approaches to the development of views on oil systems is primarily associated with fundamental discoveries in the field of physics and chemistry; it includes key events and certain achievements.

After the discovery of petroleum asphaltenes and the criticism of dispersoidology as a pseudo-scientific direction of colloid chemistry, a long time passed before the colloid-chemical approach to studying oil systems was taken up by the scientific community along with the analytical approach.

From the viewpoint of modern physical concepts, the views on the structure of PDS are primarily associated with the initial presence of asphaltene nanostructures or their formation upon phase transitions and with their evolution in open systems. Nobel Prize winner P.-G. de Gennes termed multiphase heterogeneous media, including PDS, as soft materials. There is significant correlation between the properties of soft materials and their organization on the nano, micro, meso, and macro levels. The diversity of structural elements and the presence of the spectrum of intermolecular interaction (IMI) energies determine the morphological diversity of supramolecular structures on all scales. These self-assembling and self-organization processes determined the principles of nanochemistry as physical chemistry of IMI. The openness of systems suggests the possibility of formation of space–time structures within the framework of the synergistic approach developed by another Nobel Prize winner, I. Prigogine. The state of soft objects is determined by the tendency of systems to ordering under the action of IMI (attraction) forces and by the disordering factor, tendency of any system to disintegration and chaos. Specifically this competition of the order and chaos determines the state of soft objects. Intermolecular forces lead to diversity of supramolecular structures (complex structural units, CSUs; this term and the corresponding concept in petroleum science were suggested by Prof. Z.I. Syunyaev in the 1980s) in PDS. This concept, presented in highly cited (466 citations) book [21], consists in that the action of external factors in the course of oil and gas cycle processes accompanied by phase transitions in oil systems (Table 1) leads to changes in the disperse structure and to nonlinear nonmonotonic variation of physicochemical properties of PDS and of process parameters (Table 2). 
Nonlinear variations of virtually all the practically significant recorded macroscopic properties of PDS could be rationalized within the framework of the suggested hypothesis, revolutionary for that time, based on the dynamic model of variation of the ratio between the thicknesses of the CSU core and solvate shell. The size effects in PDS of various types and compositions were described by numerous authors.

Table 1. Classification of processes of oil extraction, transportation, and refining and of petroleum product properties with respect to phase transition types (abbreviations: g, gas; l, liquid; s, solid)
Table 2. Interconnection of PDS parameters

On the other hand, the modern apparatus of the fractal science uses its own methods for describing dynamic processes. These methods do not involve studying variations of the composition and structure of dispersed particles; they are considered as small geometric objects of preset size and unchanged composition in an n-dimensional space.

Indeed, from the standpoint of systems analysis, an oil system is complex, because it consists of a large number of interconnected elements interacting with each other, often under nonequilibrium conditions. The basic property of complex self-organizing PDS is continuous multilevel assembling and rearrangement of complex asphaltene nanoaggregates (CSUs) under the action of both technogenic and natural factors, including the oil ripening conditions. Fractal geometry is actively used for considering the growth of asphaltene nanoparticles, which can be simulated using the Monte Carlo method [14, 15, 19].

According to the results of computer simulation, fractal systems, being apparently chaotic and disordered, actually have internal order. The fractal dimension of a nanoaggregate is fractional, in contrast to the traditional integer dimension of dispersed particles, and reflects the degree of filling of the topological space. The lower is the fractal dimension, the softer (looser) is the nanodispersion, the more probable is rearrangement of its internal structure and structural elements themselves, and the weaker is the external action that can cause structural changes. The pulsed NMR relaxation method is the most informative for this purpose [22]. Using this method, Zlobin [19] made for the first time a nontrivial conclusion that the fractal dimension of the structure of asphaltene aggregates is an integral genetic criterion for differentiation of natural crude oils. This is a ripening indicator of a sort: The fractal dimension correlates with the oil stratification age and monotonically increases downwards along the section (groups 1–3 in Table 3). Comparison of the data obtained shows that the fractal parameters of crude oils with thermobaric ripening conditions in various stages of the Carboniferous (tens of Ma and more!) differ relatively insignificantly, but the fractal parameters of crude oils from Carboniferous deposits differ significantly (by 30%) from those of the lower occurring Devonian crudes. That is an example of the formulation of a hypothesis concerning the previously unknown mechanism of crude oil evolution as a particular case of self-organizing systems. It has been demonstrated by the example of a representative sample of crude oils that a nonlinear decrease in the fractal dimension with an increase in the radius of complex structural units characterizes the rearrangement of these units and influences the macroscopic properties of crude oils (Table 4).
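To make the notion of a fractional dimension concrete, the following minimal sketch estimates a fractal dimension as the slope of a log-log fit. It assumes the usual mass-fractal scaling N ∝ R^D (number of primary particles N inside an aggregate of radius R), which the text does not spell out; it is not the pulsed NMR relaxation procedure of [19, 22]. The function name and the synthetic data are hypothetical and serve only as an illustration.

```python
import math

def fractal_dimension(radii, counts):
    """Estimate D as the least-squares slope of log(N) versus log(R).

    Assumes the mass-fractal scaling N ~ R**D (an assumption of this sketch).
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(n) for n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic, made-up data generated exactly from N = R**1.8 (not measurements)
radii = [2.0, 4.0, 8.0, 16.0]
counts = [r ** 1.8 for r in radii]
d_f = fractal_dimension(radii, counts)
print(round(d_f, 3))
```

A compact, space-filling aggregate in three dimensions would give a slope close to 3 in such a fit, whereas looser structures give lower, noninteger values.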

Table 3. Fractal dimension of asphaltene-containing systems of native crude oils from Perm krai
Table 4. Correlation between the fractal dimension and asphaltene content of oils (with a sample of 353 crude oils from Perm krai as example)

As for processes of petroleum technology, occurring in other time intervals, studying changes in the fractal structure of asphaltene-containing systems, as a rule, is not topical, and researchers’ attention is focused on relationships between changes in the composition, dispersity, and macroscopic properties of the feed with the aim of controlling, by external actions, process parameters such as the yield and quality of petroleum products, the degree of deemulsification upon crude oil dehydration at an oilfield or oil refinery, well output in oil extraction, etc. Control of the process characteristics of oil systems in the course of their production becomes topical. The response to this challenge appeared with the development of PAT (Process Analytical Technology) [23] in the XXI century. This is a system of automatic sensors for monitoring the quality of feed and product streams to ensure operation of process facilities without interruptions. This technology is also gradually being brought into petroleum practice [24]. On-line monitoring is particularly important for preventing deposition of undesirable components (paraffins [25, 26], free stratal water [27], gas hydrates [28]) that give rise to problems in marine transportation and pretreatment of the multiphase hydrocarbon fluid in pipelines of underwater mining systems (UMSs).

It should be noted that the knowledge of the supramolecular structure of PDS by the moment of formulation of the colloid-chemical concept was insufficient for its complete acceptance and wide use in interpretation of research results and in practice. However, the development of this concept allowed many researchers to suggest a convincing interpretation of their experimental data on nonlinear variation of the PDS properties. Persistent interest in the field of the disperse state of asphaltene-containing dispersions arose simultaneously.

Such studies are constantly discussed within the framework of International Conference “Petroleum Phase Behavior and Fouling” (briefly PetroPhase) [29], held for the first time in 1999.

Today we can historically speak of studies of oil systems from the standpoints of analytical, colloid-chemical, model, and informational (chemometric) approaches (Table 5). Up to the beginning of the XXI century, the following approaches to studying oil systems were mainly considered: analytical (using methods of physical, organic, and analytical chemistry) and colloid-chemical (using mainly methods of physical and colloid chemistry). The model approach arose as an attempt to avoid problems associated with studying multicomponent asphaltene-containing systems [16, 30, 31]. The possibility of predicting the properties of oil systems based on digital data on their composition (petroleomics) became the matter of active discussions in the beginning of the XXI century [32, 33], although one of the first papers on digital images of crude oils based on chromatographic data was published as early as 1979 [34]. Let us consider the essence of the coexisting approaches and differences between them.

Table 5. Main approaches to studying PDS

ANALYTICAL APPROACH TO STUDYING PDS

Traditionally the problem of qualitative and quantitative composition of a multicomponent oil system is solved using the analytical approach involving separation of a mixture into components, their identification, and determination of the quantitative component content. Diverse methods for separating mixtures and identifying their components (chromatographic, spectroscopic, extraction, etc.) are used for this purpose and are being steadily improved. In this connection, it is appropriate to mention a comprehensive review of modern methods of molecular and ionic mass spectrometry as applied to analysis of multicomponent petroleum mixtures [35].

Speaking of the history of finding new compounds in crude oils, we should note certain important events. In 1933, previously unknown adamantane, the ancestor of a homologous series of polyhedral alicyclic hydrocarbons of diamond-like structure, was discovered; later it was synthesized [36]. The importance of discovering adamantane consists in that it stimulated the progress of studies in the field of crude oil genesis and in chemistry of organic polyhedranes as a separate field of synthetic organic chemistry. The discovery of alkenes in crude oils in 1952 is another example [37].

One of the striking examples of the analytical approach in the XXI century is the discovery of tetrameric naphthenic acids (TNAs) in crude oils; they differ from the monobasic naphthenic acids known for a long time [38]. Let us discuss the TNA discovery in more detail. In the course of marine transportation of crudes in the North Sea, problems arose with the formation of highly stable naphthenate deposits that contained complex salts consisting of calcium as the prevalent cation in the aqueous phase and naphthenic acids from the oil phase containing cross-linking agents; these naphthenic acids were found to contain four carboxyl groups. It has been proved that TNAs are the main component promoting the formation of asphaltene–resin–wax deposits in marine pipelines of UMS. Figure 1 shows the TNA structure determined by Norwegian researchers.

Fig. 1. Molecular structure of tetrameric naphthenic acids (C80) [38]. (Copyright © 2006, RSC).

Other striking examples of the success of the modern analytical approach are the results of studies in the field of petroleomics. Studies heralding a breakthrough in studying complex multicomponent hydrocarbon systems were published as early as the 1990s [39, 40]. Marshall, the founder of petroleomics, made the first report on this novel research field in 2003 at the most widely known international conference on analytical chemistry, Pittcon [41]. Analysis by Fourier transform ion cyclotron resonance mass spectrometry (FT ICR MS) revealed the presence of several tens of thousands of compounds in crude oils, as illustrated by the mass spectrum of a South American heavy crude, recorded for the first time at Marshall’s laboratory (Fig. 2) [42]. The data obtained were processed in the form of a two-dimensional diagram with the Kendrick mass defect, using the method suggested in [43] and based on the concept that an oil system is characterized by a continuous distribution of molecular masses of the initial components (according to Boduszynski) [44].
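The Kendrick mass defect mentioned above can be illustrated with a short calculation. The sketch below assumes the common CH2-based definition (Kendrick mass = IUPAC mass × 14.00000/14.01565, defect = nominal Kendrick mass minus Kendrick mass); sign and rounding conventions differ between authors, and the source does not state which one was used in [43]. The function names and the three example hydrocarbons are chosen here purely for illustration.

```python
# CH2-based Kendrick scale: the CH2 group is assigned a mass of exactly 14.
CH2_NOMINAL = 14.0
CH2_EXACT = 14.01565  # monoisotopic mass of CH2 (12C = 12, 1H = 1.007825)

def kendrick_mass(iupac_mass):
    """Rescale an IUPAC (monoisotopic) mass to the Kendrick scale."""
    return iupac_mass * CH2_NOMINAL / CH2_EXACT

def kendrick_mass_defect(iupac_mass):
    """Nominal Kendrick mass minus Kendrick mass (one common convention)."""
    km = kendrick_mass(iupac_mass)
    return round(km) - km

# Illustrative homologous series: each member differs from the previous by CH2
monoisotopic_masses = {
    "C7H8": 92.06260,
    "C8H10": 106.07825,
    "C9H12": 120.09390,
}
for formula, mass in monoisotopic_masses.items():
    print(formula, round(kendrick_mass_defect(mass), 4))
```

Because adding CH2 shifts the Kendrick mass by exactly 14, all members of a homologous series share one defect value and line up horizontally in a plot of defect versus nominal Kendrick mass, which is what makes the two-dimensional diagram useful for sorting tens of thousands of peaks.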

Fig. 2. Example of one of the first mass spectra, containing more than 17 thousand resolved negatively and positively charged components for a sample of South American high-viscosity crude (Venezuela) (the mass spectrum was recorded at the National Laboratory at the University of Florida) [42]. (Copyright © 2002, ACS).

Interpretation of the mass spectra of oil system molecules recorded using the FT ICR MS technique is a multiparameter problem, which is solved using statistical methods. Application of various FT ICR MS procedures using soft methods for ionization of organic molecules allowed characterization of tens and hundreds of thousands of individual molecular components of oil systems [45–49]. Various programs are also used for molecular identification of multicomponent systems (e.g., [50]).

In parallel, trying to determine the nature and structure of asphaltenes, analytical chemists used the mimicking technique and constructed hypothetical asphaltene molecules; the calculation principles were formulated in [30]. It has been found that differences in the chemical composition of such hypothetical asphaltene molecules are insufficient to account for changes in physicochemical (primarily rheological) properties of crude oils as asphaltene-containing systems. That was one of prerequisites for considering oil systems from the colloid-chemical standpoint. Following the principle of systems analysis, without dwelling on chronologically earlier papers that formulated other prerequisites for changes in approaches to interpretation of the results of analytical studies, let us pay attention in the next section to selected papers in which the results of studies are successfully interpreted specifically from the colloid-chemical standpoint for a large sample of PDS of different types.

COLLOID-CHEMICAL APPROACH TO STUDYING PDS

The following problems are considered within the framework of the colloid-chemical approach to studying PDS: classification of PDS with respect to the main characteristics of the dispersed state; colloid-chemical properties of PDS: stability, rheological and electrophysical properties; phase and structural transformations in oil systems under various conditions in a stratum, on the diurnal surface, in a process apparatus, and in the course of use as petroleum products meeting requirements of state standards and technical specifications.

The disperse state is characteristic of oil systems both in a stratum and in the course of extraction, transportation, refining, and other operations. The oil disperse systems include virtually all kinds of natural hydrocarbon resources (gas hydrates, gas condensates, crude oils, malthas, bitumens); various types of petroleum products, from motor fuels to cokes; hydrocarbon-based chemicals and process liquids used in oil extraction chemistry, etc. Figure 3 shows the dispersity “vector” characterizing oil systems; it is directed from nanosizes to the macrophase. The degree of dispersity is one of the most important quantitative characteristics of PDS, determining their physicochemical and process characteristics. However, experimental data on the true dispersity of PDS are contradictory because of the lack of nondestructive analytical methods. The majority of procedures involve the use of solvents, which significantly alters the true size of asphaltene particles as the PDS disperse phase.

Fig. 3. Dispersity “vector” of oil disperse systems.

Because determination of the particle size (the main characteristic of disperse systems) in such multicomponent systems as native crudes is difficult, studies are often performed with their solutions. The tendency of asphaltenes to association, well studied by the group of R&D coauthors and managers of research centers of world’s leading oil and gas companies [5], is manifested in the form of one of two possible scenarios of aggregation of dispersed asphaltene particles in a toluene solution with an increase in their concentration. The first scenario leads to the formation of a bonded disperse gel structure, and the second scenario, to the formation of a free unstable disperse system tending to phase separation. A study using a set of physicochemical methods, including FT ICR MS [52], has shown that approximately 90% of asphaltenes in Heptole (a mixture of heptane and toluene in a definite ratio) are associated: at the mean molecular mass of nonassociated asphaltenes of 850 g mol⁻¹, the curve of the molecular mass distribution of asphaltene nanoaggregates extends to 30000 g mol⁻¹ with the mean values from 10000 to 20000 g mol⁻¹. The diameter of these nanoaggregates, determined by different methods, varies from 5–9 to 20 nm. The authors concluded that asphaltenes in Heptole are weakly structured highly polydispersed nanoaggregates with low fractal dimension, which agrees with the results of other studies [14, 19].

Whereas differences between asphaltenes from the same crude oil were revealed previously using a combination of different solvents and precipitants [53, 54], an alternative asphaltene differentiation method based on the ability of asphaltenes to be adsorbed on different oil–water and oil–solid interfaces, termed extended group analysis method [55], revealed a subfraction of surface-active components in the initial asphaltene fraction. The extended group analysis method accounts for the results of a previous study [56] on separation of a large sample of 390 dead oils into two groups without frequently cited strong correlation of the properties (density, viscosity, congealing point) with the asphaltene content. The assumption that such subdivision of crude oils into two types is most probably associated with deasphalting via natural geochromatography of crude oil in the course of migration and leads to selective removal of the surface-active asphaltene subfraction is a brilliant example of adequate colloid-chemical interpretation of the results obtained in geochemical studies of asphaltene-containing systems.

Many researchers studied in detail the asphaltene dispersity by dynamic light scattering, but in all the cases the investigation objects were model solutions of asphaltenes preliminarily recovered from crude oils [5, 6]. To evaluate the dispersity of real PDS, it is preferable to use methods that do not involve alteration of the native state of asphaltenes by precipitation/aggregation. Such studies were not performed until Prof. Giddings suggested a method for particle size fractionation by separation in a transverse force field (field flow fractionation, FFF) [57]. This method allows separation of particles of colloidal size (1–1000 nm) with high resolution.

Novikov et al. [20] have shown that crude oils strongly differing in the initial asphaltene content (from 0.5 to 14 wt %) can be well distinguished with respect to the molecular-mass distribution of asphaltenes determined by FFF; the studies were performed for toluene solutions of crude oils.

MODEL APPROACH TO STUDYING PDS

Recognizing the complexity of the composition and structure of asphaltene-containing systems, researchers in parallel actively developed an approach to their study that we conventionally named the model approach.

The model approach is based on two main methods. The first method consists in constructing a hypothetical model, a structure of an averaged asphaltene molecule, according to the data of elemental analysis, molecular mass determination, and ¹H and ¹³C NMR and IR spectroscopy. Such a molecule does not actually exist in the oil but corresponds to the results of measurements by the above methods [30]. This method is actively used in [58, 59].

The second method consists in preparing simulated oils or model synthetic oils using specially synthesized compounds containing functional groups and aromatic rings that are commonly present in natural asphaltenes. This method eliminates risks associated with the indefinite multicomponent composition of natural asphaltenes in real crude oils. Figure 4 shows an example of the synthesized model asphaltene molecules.

Fig. 4. Examples of asphaltene models synthesized at the Ugelstad Laboratory, Norwegian University of Science and Technology. The ratio of aromatic and aliphatic moieties is fixed, and the radicals R include surface-active polar components (C5Pe, TP, PAP) and a nonpolar component (BisA) [16, 31]. (Copyright © 2015 Elsevier B.V.).

Both of these demonstrate a search for adequate methods for constructing models that would be “convenient” to researchers, would correspond to a maximal extent to real crude oil samples, but would be free of their “drawbacks” associated with the lack of the required data on the structure and composition of multicomponent oil systems. The model approach was an inevitable step reflecting the difficulties in accounting for the differences between the results of studying real and model systems. This step lasted for almost 50 years, coexisted with the colloid-chemical approach, and historically preceded the development of the chemo-informational approach.

A US patent of the year 2019 [60] clearly shows that the model approach is still topical. To find a catalyst for degradation of oil macromolecules and reduction of its viscosity, the authors described a general scheme of using analytical and quantum-mechanical methods for determining the most probable molecular structure of asphaltenes.

CHEMO-INFORMATIONAL APPROACH TO STUDYING PDS

The modern step of PDS studies is associated with the development of petroleomics and petroinformatics. Owing to pioneering studies by Marshall (Fig. 2) and other researchers in the field of analysis of the chemical composition of oil systems by FT ICR MS methods, representative samples of crude oils and heavy oil fractions have been characterized in detail [33, 35, 39–43, 45–51, 61–65]. Asphaltene-containing fractions, whose detailed analysis on the molecular level was previously impossible, became the objects of close attention. Chacón-Patiño et al. [46] suggested finishing a many-year discussion [7, 66] concerning the molecular masses of asphaltenes and their assignment to one of the known structures (“island” or “archipelago”); they showed that the molecular mass of asphaltenes varies from 250 to 1200 amu and that asphaltenes include molecules of both structures, with the prevalence of one or another structure being associated with the asphaltene origin.

A comprehensive review [51] describes the development of petroleomics as a chronological sequence of events since the moment of application of the KMD (Kendrick mass defect) concept to crude oil analysis in 1992 and up to the present time, when we can speak of identification of hundreds of thousands of components in heavy petroleum residues. In this connection, the following two papers dealing with the analysis of highly concentrated asphaltene-containing dispersions using FT ICR MS should be mentioned. Krajewski et al. [62] reached a record for an asphalt sample: 170000 identified peaks and empirical formulas, which allowed determination of the molecular masses in the range from 200 to 1000 Da for 126264 compounds taking into account the elemental analysis data. The next record belongs to Palacio Lozano et al. [63], who identified 244779 unique molecules in heavy petroleum residues.

The possibility of relatively complete characterization of high-molecular-mass crude oil components allowed finding correlations and reliably predicting their properties and behavior under the conditions of chemical processes and transformations. This is already the field of petroinformatics, which can be characterized as application of mathematical processing methods to arrays of multivariate data for oil systems. In so doing, FT ICR MS allows solution of such a complex problem as detailed characterization of one of the arrays: chemical composition of multicomponent oil systems.

It is known that principal component analysis (PCA), used for reducing the dimension of multivariate data, and hierarchical cluster analysis (HCA) are the main methods among statistical MDA methods. Inclusion of such reference parameters of oil systems as, e.g., definite threshold values characterizing undesirable phenomena in the initial steps of phase transitions (onset points) into a digital database opens prospects for classification and discrimination of the initial samples of oil systems with the compilation of a library of calibration models. Calibration models allow early diagnostics concerning, e.g., the possible development of undesirable phenomena. The term petroinformatics appeared for the first time in 2017 in a paper by Japanese authors [67] in connection with describing the results of hydrodesulfurization of light thermal cracking gasoil, based on the detailed data on the chemical composition of the feed. Specifically, this paper reports on the petroinformatics platform (PIP) at the Japan Petroleum Energy Center (JPEC), which includes large databases on the elemental composition, physical properties, and reactivity characteristics of various oil fractions. Today, according to [68], the number of petroleum molecules recorded in JPEC exceeds 25 million; their physicochemical properties are calculated using the method of partial contributions (e.g., temperatures of phase transitions such as boiling and melting) and computational methods of quantum chemistry (e.g., energy of formation, heat capacity, etc.). The solution of two applied problems using this database has been demonstrated: simulation of the hydrodesulfurization of petroleum feedstock and of the formation of asphaltene deposits from petroleum feedstock at elevated temperatures.
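As an illustration of how PCA reduces the dimension of multivariate data, the sketch below extracts the first principal component of a small sample-by-descriptor table by power iteration on the covariance matrix. It uses only the Python standard library; the function name, the three descriptors, and the numbers are hypothetical and do not come from the JPEC database or from any paper cited here. Production work would normally rely on an established numerical library and on autoscaled data.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loading vector, sample scores) for the leading component.

    Power iteration on the sample covariance matrix; a teaching sketch only.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # arbitrary start; assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores

# Made-up table: 4 samples x 3 descriptors; columns 1 and 2 vary together
samples = [
    [1.0, 2.0, 1.0],
    [2.0, 4.0, 1.1],
    [3.0, 6.0, 0.9],
    [4.0, 8.0, 1.0],
]
loading, scores = first_principal_component(samples)
print([round(x, 3) for x in loading])
print([round(s, 3) for s in scores])
```

The two correlated descriptors collapse into a single score per sample, so the four samples can be ordered or clustered along one axis instead of three; HCA would then typically be run on such scores.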

Figure 5 illustrates how databases (big data) containing detailed data on the chemical composition of oil components allow estimation of the reactivity of these components in certain chemical reactions and simulation of the reaction kinetics. These reactions determine the essence and efficiency of industrial processes such as hydrodesulfurization of oils, including heavy crudes.

Fig. 5. Scheme of possible use of petroinformatics for estimating the reactivity of components in hydrodesulfurization [67, 68]. (Copyright © 2017, ACS).

There are good grounds to believe that petroinformatics based on detailed digital petroleomics data on the chemical composition of oil systems and on reference properties will allow prediction not only of the properties of new samples of known composition but also of the phenomena associated with phase transformations of oil systems. Analysis of the UMS operation shows that up to half of the savings in investment and operating expenditures in offshore production and transportation of crude oils is associated specifically with the prevention of undesirable phase formation phenomena [25, 26]. These directions in the field of PDS studies, in the authors’ opinion, can be considered mainstream.

Attempts, successful to different extents, to analyze the properties of crude oils by chemometric methods have been made recently. These studies are aimed at developing express methods of analysis and at predicting oil properties from spectroscopic data obtained by FT ICR MS, near-IR, IR, NMR spectroscopy, etc. [65, 69–82]. One of the first successful attempts to classify crudes and accurately assign them to geological occurrence zones, based on an FT ICR MS study of a small set of crudes (14 samples), was made in [71]. Korean researchers demonstrated, using a set of 20 crude oil samples, that good correlations (in terms of the Spearman coefficient) can be obtained between the FT ICR MS data and reference data on the content of heteroatoms (total sulfur, nitrogen, oxygen), acid number, density, and distillate yield at atmospheric pressure. A study by Brazilian authors [81] on the application of the chemometric approach to a representative sample of more than 100 crude oils characterized by ¹H NMR should also be noted; the authors obtained valid models allowing prediction of practically important properties (density, wax appearance temperature (WAT), viscosity, etc.). A study of a representative sample of Russian crudes led to the construction of reliable models for predicting a number of properties [73, 74]. Here we compare, as an example, the results of using various mathematical methods (partial least squares (PLS), linear regression, k-nearest neighbors (kNN), support vector machine (SVM), and least absolute shrinkage and selection operator (LASSO) [83]) to process multivariate spectral data obtained by near-IR spectroscopy for a limited set of Russian dead oils (12 samples), together with the calibration models built on these data for determining the reference values of the kinematic viscosity of oils under normal conditions (Figs. 6, 7).
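The Spearman coefficient mentioned above for the 20-sample study measures how well two quantities agree in rank order rather than in absolute value. A minimal sketch follows; the numbers are invented for illustration only and are not data from any cited paper, and the names average_ranks, spearman, ms_descriptor, and total_sulfur are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented values (assumption): a descriptor derived from mass spectra
# and a reference total sulfur content for six samples.
ms_descriptor = [0.12, 0.30, 0.25, 0.48, 0.70, 0.55]
total_sulfur = [0.4, 1.1, 0.9, 1.9, 3.5, 2.2]

# The relationship is monotone but not linear, so the rank correlation
# is at its maximum even though a straight line would fit imperfectly.
rho = spearman(ms_descriptor, total_sulfur)
print("Spearman coefficient:", rho)
```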

Fig. 6. Comparison of the root-mean-square errors (RMSE) of different models for reference data on the kinematic viscosity of crude oils (mm² s⁻¹).

Fig. 7. The best calibration model (SVM method) for the reference data on the kinematic viscosity (mm² s⁻¹ along both axes).
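The kind of comparison summarized in Figs. 6 and 7 can be sketched as follows. This is not the authors' procedure: the validation scheme used in the original comparison is not stated here, so leave-one-out cross-validation is an assumption, the data are synthetic, and only two of the five listed methods (linear regression and kNN) are implemented, since PLS, SVM, and LASSO would normally come from a dedicated library. All function names are hypothetical.

```python
import numpy as np

def fit_predict_linear(X_train, y_train, X_test):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def fit_predict_knn(X_train, y_train, X_test, k=3):
    """kNN regression: mean target value of the k closest training samples."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def loo_rmse(fit_predict, X, y):
    """Leave-one-out RMSE: each sample is predicted by a model fitted without it."""
    errors = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        pred = fit_predict(X[mask], y[mask], X[i:i + 1])
        errors.append(pred[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic stand-in (assumption): 12 samples x 3 spectral features and a
# target that depends linearly on them plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(12, 3))
y = 5.0 + X @ np.array([20.0, -8.0, 3.0]) + rng.normal(0.0, 0.5, size=12)

results = {
    "linear regression": loo_rmse(fit_predict_linear, X, y),
    "kNN (k=3)": loo_rmse(fit_predict_knn, X, y),
}
best = min(results, key=results.get)
for name, rmse in results.items():
    print(f"{name}: RMSE = {rmse:.3f}")
print("lowest RMSE:", best)
```

With only 12 samples, holding each sample out in turn uses the data more fully than a single train/test split. Real near-IR spectra contain far more wavelengths than samples, which is one reason regularized or latent-variable methods such as LASSO and PLS are included in comparisons of this kind.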

The potential of such an approach can be revealed by applying MDA methods to data on the chemical composition of crude oils at the molecular level, which should stimulate the future development of a digital classification of crude oils allowing prediction of reference properties of new oil samples. The existing classifications are based on large reference data banks; some of them have only historical significance or are purely descriptive, whereas others are intended for process optimization based on reference data measured by standard methods (Table 6). Analysis of the world literature [84] demonstrates the existence of crude oil databases such as the Crude Oil Analysis Database and the databases of the Statoil and Total companies, which can be integrated into the HYSYS and AspenPlus systems for automated design of oil refineries and which provide various tools for comparative analysis and optimization of process operations. Russian colleagues from the Institute of Petroleum Chemistry, Siberian Branch, Russian Academy of Sciences characterized crude oil samples from various depths and fields, mainly from the Siberian region of Russia and also from Kazakhstan and other CIS countries [85]. This collection includes more than 2000 crude oil samples of known chemical composition and physicochemical properties; it can be in high demand for the development of the as yet nonexistent digital classification of crude oils with predictive potential with respect to the reference properties of new samples of oils and their fractions.

Table 6. Classification of crude oils

CONCLUSIONS

The systems analysis of the development of views on PDS shows that both the analytical approach (including actually, but not chronologically, the model approach, in particular, petroleomics) and the colloid-chemical approach developing the concept of the fractal nature of asphaltene-containing systems preceded the development of petroinformatics, a chemo-informational approach. Modern researchers in the field of oil systems are on the doorstep of the latter approach. Petroleomics is presented as the most informative possible source of multivariate digital data on the detailed composition of oil systems; such data can be obtained by the FT ICR MS method, which, however, is used today for oil composition analysis only on a limited scale. The efficiency of various mathematical methods in predicting reference properties from multivariate spectral and reference data for a limited set of Russian crude oils is compared.