INTRODUCTION

The rapid development of computing hardware of varying performance, coupled with the widespread penetration of high-speed data networks, has led to the emergence of new service industries such as augmented and virtual reality [1], the Industrial Internet [2], the Internet of Things [3], the Internet of Military Things [4], the Internet of Vehicles [5], etc. The widespread deployment of the 5G/6G mobile communication standards in the near future will, on the one hand, give users access to computing power of any level, regardless of their geographical location and without the need to own it, and, on the other hand, allow computer network owners to quickly distribute the load across geographically dispersed computing nodes. Such computer networks operate under the following conditions:

—high cost of creation and operation;

—strict demands of network users on the quality of services provided;

—strong competition between service provider networks.

These stringent requirements lead to the need to optimize the use of computer network resources.

The object of the study is the problem of distributing the computing load across nodes of different performance levels and allocating computing resources to different users. The ideal result from the provider’s point of view is to provide each user with the minimum computing resources sufficient to solve their tasks, while strictly complying with the terms of the service level agreement (SLA). However, a number of fundamental problems prevent the achievement of this goal.

First, there is uncertainty regarding user requests to perform calculations. Although the SLA concluded between the user and the provider specifies the list of tasks whose solution the computer network provides, the arrival intensity of user tasks is unknown in advance. This means that the amount of dedicated resources required to meet the SLA may need to be revised upward. In addition, the complexity of individual tasks also contains uncertainty, and not only because of differences in task volumes. Sometimes the execution time of the same task on the same hardware and software platform can differ significantly for different input data. Examples include the numerical solution of optimization problems [6, 7] under different initial conditions, the use of machine learning algorithms [8], the compilation of video landscapes of computer games [9], etc.

Second, most of the software used is proprietary, which rules out a detailed algorithmic analysis aimed at determining the complexity of performing a particular task. Only the most general conclusions can be drawn about how tasks are executed, based on knowledge of how similar open source programs function and of the algorithms for solving a particular class of problems. In some cases, direct and indirect statistical information about task execution times is available in the form of stress testing results.

Third, there is a known gap between the user requirements for service quality recorded in SLAs and the control objects available to providers. Users define the quality of their service through the maximum time it takes to complete tasks of a particular class. At the same time, to ensure the execution of user tasks, providers allocate a certain amount of computing resources: RAM and disk memory, processor cores, etc. The amount of allocated resources affects the speed of task execution but does not determine the completion time, because the time complexity is unknown.

The aim of this study is to propose stochastic models of the execution time of user tasks on computing nodes and methods for identifying the parameters of these models.

The article is organized as follows. Section 1 contains an informal description of the problem of estimating the probability of an SLA violation on a particular virtual computing node, i.e., of exceeding the permissible execution time of a job. In this case, either the volume of the user task is known or the user profile is known, i.e., the probability distribution of tasks of various sizes characteristic of the given user.

Section 2 presents the basic principles for selecting task execution time models. An argument is given in favor of a probabilistic description, and useful properties that the selected models should have are proposed. In Section 3, the task completion time is described by a family of random variables whose mean \(\mathcal{M}\) and variance \(\mathcal{D}\) are functions of node resources, characteristics of the task being performed, and hardware and software environment variables. By analogy with production functions [10], this section discusses the properties of the functions \(\mathcal{M}\) and \(\mathcal{D}\) in their different variables and proposes some of their special cases. It also contains mathematically correct formulations and solutions of the problems of estimating the probability of an SLA violation.

In Section 4, the weighted M-estimation methodology [11] is used to pose the problem of identifying the parameters of the proposed model of the complexity of user tasks. One of the following convex functions is proposed as the loss function: quadratic, absolute value, or convex piecewise linear.

Solving the identification problem presupposes the availability of the necessary statistical information, and Section 5 contains recommendations for its collection and processing. This information is accumulated during stress testing carried out on the hardware and software platform (hardware plus general system software (GSS)) on which the virtual computing node will subsequently operate. The real special software (SS) can be replaced at this stage by test software, a brief classification of which is given in the section. In addition to the recorded execution times of tasks with various parameters, a list of recorded data is proposed that can be useful for determining the variables of the hardware and software environment. The Conclusions section summarizes the model and outlines prospects for further research.

1. DESCRIPTION OF THE PROBLEM

An informal description of the problem of characterizing the execution time of user tasks on computing nodes is presented. It is assumed that the node is virtual and is physically located on some computer hardware (CH): a workstation, server, mainframe, etc. To ensure the functioning of the node, a certain number of processor cores, an amount of RAM, disk space, and other resources of the CH are allocated for its exclusive use. GSS and SS are deployed on the node. It is the SS that ensures the execution of user tasks arriving at the node.

All tasks that can be executed on the studied node are divided into types. Tasks are of the same type if they are performed by the same software using the same algorithms. For example, one type includes tasks for compressing video data carried out by the same archiver program. If the same software is used to archive text data, then this task belongs to a different type, since video and text information are compressed using different algorithms. Likewise, tasks for compressing video information using different software tools also belong to different types, since they use different software.

Another example of tasks of different types is loading information into a certain database (DB), on the one hand, and unloading the data obtained as a result of a certain query, on the other hand. In what follows, it is assumed that tasks of one fixed type are being studied.

Tasks belonging to the same type differ from each other. These differences are described by some vector of parameters. The simplest characteristic of a task is the amount of data processed during its execution: for example, the volume of video files to be compressed, the number of images to be recognized, the amount of information to be loaded into the database, etc. However, task characteristics are not limited to the amount of data. For example, in a neural network training task, the following parameters can be used:

—number of layers in the network;

—maximum number of nodes in a layer;

—volume of the training bank;

—a parameter that determines the condition for stopping the learning process.

As noted earlier, the time required to complete individual tasks of one type or another contains uncertainty due to various reasons. The first one is the “closed” nature of the software’s functioning. This type of uncertainty manifests itself in time fluctuations even when performing tasks with the same parameters.

The second reason lies in the uncertainty of the user’s choice of a particular vector of task parameters and input data. This uncertainty can be partially reduced by forming a so-called user profile based on historical information about the requests of the given user to complete tasks of the type being studied. For example, a task to download films from a certain storage relates in 80% of cases to full-length films with a duration of 90 to 240 minutes; in 15% of cases, to short films with a duration of 10 to 30 minutes; and in 5% of cases, to documentaries lasting from 40 to 90 minutes. In this way, the user profile defines the conditional shares of tasks with various parameters.
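
To make the notion concrete, a user profile of this kind can be stored as a discrete distribution over task classes and their parameter ranges. The sketch below uses the illustrative figures from the example above; the field names are assumptions for the illustration and are not prescribed by the model:

```python
import random

# Hypothetical user profile from the example above: shares of task classes
# together with the range (in minutes) of the film-duration parameter.
USER_PROFILE = [
    {"share": 0.80, "kind": "full-length", "duration_min": (90, 240)},
    {"share": 0.15, "kind": "short",       "duration_min": (10, 30)},
    {"share": 0.05, "kind": "documentary", "duration_min": (40, 90)},
]

# The shares must form a probability distribution.
assert abs(sum(c["share"] for c in USER_PROFILE) - 1.0) < 1e-9


def sample_task(rng: random.Random) -> dict:
    """Draw one task according to the profile: first the class, then its parameter."""
    cls = rng.choices(USER_PROFILE, weights=[c["share"] for c in USER_PROFILE])[0]
    lo, hi = cls["duration_min"]
    return {"kind": cls["kind"], "duration_min": rng.uniform(lo, hi)}


if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_task(rng) for _ in range(3)])
```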

To ensure optimization of the use of CH resources in the presence of user SLAs, the following practical problems need to be solved.

Problem 1. For the fixed hardware resources of a node, the state of the GSS and SS, and a user task of a known type and volume, estimate the probability of violating the SLA—exceeding the maximum execution time of a task of this type.

Problem 2. For a known user profile, fixed hardware resources of the node, and a given state of the GSS and SS, estimate the probability of violating the SLA, i.e., of exceeding the maximum execution time for a task of this type.

A formal description of the proposed models, the formulation of the problem of identifying their parameters, and mathematically correct formulations and solutions of the problems presented above are given in the following sections.

2. BASIC PRINCIPLES FOR CHOOSING MODELS

The choice of mathematical models for the execution time of tasks on computing nodes is fundamentally influenced by the presence of uncertainty in this time: repeating the same job on the same computing node will require different amounts of time. The mathematical apparatus that describes this phenomenon and allows the corresponding system analysis problems to be solved could be the theory of fuzzy sets [12], the game-theoretic/guaranteed-result approach [13, 14], etc. However, the apparatus of probability theory seems the most promising for this purpose. Thus, the task completion time will be considered a random variable \(\tau \left( \omega \right)\), which means that the model describing it belongs to the stochastic class.

A stochastic task execution time model should be selected based on the following principles.

1. Adequacy: the model specifies all those features of the actual execution time that are necessary to solve the full range of subsequent analysis problems, identify model parameters, and optimize them.

2. Versatility: The model provides the ability to adequately describe execution times for a wide range of computing nodes and user jobs.

3. Simplicity: the number of model parameters should be as small as possible while maintaining the required degree of adequacy and universality.

4. Flexibility with respect to a priori information: the model can be adjusted in accordance with additional available information about the characteristics of the CH, GSS, and SS and with the appearance of new, or the absence of previously assumed, factors affecting the execution time of tasks of a given type.

5. Development: for the selected models, there is an effective mathematical apparatus and algorithmic support for solving the entire range of applied problems related to the analysis of the studied time, identification of model parameters, and the possible subsequent optimization of the configuration of virtual computing nodes.

6. Possibility of using simulators: at the preliminary stage of identifying the model parameters, when collecting statistical data on the task execution time, it is possible to use test software (TS), i.e., benchmarks: user load simulators and SS simulators.

3. STOCHASTIC MODELS FOR TIME COMPLEXITY: FORMAL DESCRIPTION, FORMULATION, AND SOLUTION OF THE ANALYSIS PROBLEM

According to the chosen stochastic approach, the execution time of a user task of some fixed type is proposed to be described by a random variable \(\tau \left( \omega \right)\), having a finite mathematical expectation \(\mathcal{M} \triangleq {\text{E}}\left\{ {{\tau }} \right\}\), and variance \(\mathcal{D} \triangleq {\text{E}}\{ {{\left( {\tau \left( \omega \right) - \mathcal{M}} \right)}^{2}}\} \). Both of these characteristics are unknown functions

$$\mathcal{M} = {{\mathcal{M}}_{z}}\left( {x,y} \right),\quad \mathcal{D} = {{\mathcal{D}}_{z}}\left( {x,y} \right),$$

where \(x\) is the vector of node resources, \(y\) is the vector of task parameters, and z is the vector of parameters of the hardware and software environment of the node. A detailed description of these vectors is given below.

The vector of variables \(x = \left( {{{x}_{1}}, \ldots ,{{x}_{N}}} \right)\) specifies the hardware and software composition of the virtual computing node, for example, \({{x}_{1}}\) is the number of processor cores, \({{x}_{2}}\) is the amount of available RAM, and \({{x}_{3}}\) is the amount of available disk space.

Note 1. The proposed model allows us to vary the set \(x\) depending on the composition and ability to configure the virtual node, adding some components to it, or excluding irrelevant ones. For example, the following ones can be added: \({{x}_{4}}\), the amount of disk space reserved for paging; \({{x}_{5}}\), the cache size; etc.

The vector \(y = \left( {{{y}_{1}}, \ldots ,{{y}_{M}}} \right)\) characterizes user jobs of a fixed type arriving at the given computing node. The components \({{y}_{m}}\) can have completely different meanings for different types of tasks, but they all affect the execution time. For example, if the node is configured to perform scientific calculations, e.g., to numerically solve the equations of mathematical physics, then the following variant is possible: \(M = 2\), \({{y}_{1}}\) is the number of time layers in the numerical solution, and \({{y}_{2}}\) is the number of spatial grid nodes in the numerical solution.

Another example: a node hosts a database, and the user task running on it involves the input/output of data and the execution of queries to solve some information problem. In this case, \(M = 3\), \({{y}_{1}}\) is the volume of the input data in the task, \({{y}_{2}}\) is the volume of the output data in the task, and \({{y}_{3}}\) is the number of queries made to the database within the task.

Vector \(z = \left( {{{z}_{1}}, \ldots ,{{z}_{K}}} \right)\) contains the current parameters of the hardware and software environment of the computing node. For example, if a database is hosted on the given computing node, then the following characteristics can be the environment’s parameters: \({{z}_{1}}\) is the maximum amount of RAM available on the node, \({{z}_{2}}\) is the current amount of useful data stored in the database, \({{z}_{3}}\) is the total current volume of service information stored in the database (transaction log, etc.), \({{z}_{4}}\) is the average amount of RAM of a computing node occupied by the GSS, etc.

Note 2. The following semantic differences exist between vectors x, y, and z. Vector x represents the control available to the provider, and vector y represents the control available to the user. Both x and y can be varied independently of each other during stress testing and the collection of statistical information for the subsequent identification of the model parameters. Note also that, since tasks of different types can be executed on the same node, the dimension of vector x and the set of its possible values for the given node remain unchanged, in contrast to vector y, whose characteristics differ for tasks of different types.

Unlike x and y, the parameters z of the hardware and software environment are not directly controlled by either the provider or the user. However, their components are available for direct or indirect observation during stress testing: information about them can be obtained either by executing service commands/queries or by analyzing system logs.
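
In a stress-testing harness it is convenient to keep these three groups of variables explicitly separated, together with the measured time. A minimal sketch of such a record is given below; the field names and units are illustrative assumptions, not part of the model:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class StressTestRecord:
    """One observation collected during stress testing of a virtual node."""
    x: Dict[str, float]          # node resources chosen by the provider (cores, RAM, disk, ...)
    y: Dict[str, float]          # parameters ("volume") of the user task
    z: Dict[str, float] = field(default_factory=dict)  # observed environment parameters (logs, service queries)
    tau_seconds: float = 0.0     # measured execution time of the task


# Illustrative record for a database-hosting node.
record = StressTestRecord(
    x={"cores": 4, "ram_gb": 16, "disk_gb": 200},
    y={"input_gb": 2.5, "queries": 1_000},
    z={"db_size_gb": 120.0, "gss_ram_gb": 1.2},
    tau_seconds=37.4,
)
```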

The set \(\mathfrak{A}\) of valid values of the arguments (x, y) is bounded and, moreover, finite: the components \({{x}_{n}}\), \(n = \overline {1,N} \), and \({{y}_{m}}\), \(m = \overline {1,M} \), can take values from some finite sets. Without loss of generality, it is assumed that the set of admissible values of the vector (x, y) is contained in the parallelepiped \(\mathcal{U} = \left[ {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} ,\bar {x}} \right] \times [\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} ,\bar {y}]\), on which the functions \({{\mathcal{M}}_{z}}\left( {x,y} \right)\) and \({{\mathcal{D}}_{z}}\left( {x,y} \right)\) have the following properties.

1. Nonnegativity: for any \(\left( {x,y,z} \right)\), the following inequalities are valid:

$${{\mathcal{M}}_{z}}\left( {x,y} \right) > 0,\quad {{\mathcal{D}}_{z}}\left( {x,y} \right) > 0.$$

These inequalities are obvious: the mathematical expectation and variance of any positive random variable with the first two moments finite are always positive.

2. Local nonincrease of \({{\mathcal{M}}_{z}}\left( \cdot \right)\) in x: for any fixed z, there is a subset \(\mathcal{U}' \subseteq \mathcal{U}\) such that for any \(\left( {x',y{\text{*}}} \right),\left( {x'',y{\text{*}}} \right) \in \mathcal{U}'\) with \(x' \leqslant x''\) component-wise, the following inequality is fulfilled:

$${{\mathcal{M}}_{z}}\left( {x',y{\text{*}}} \right) \geqslant {{\mathcal{M}}_{z}}\left( {x'',y{\text{*}}} \right).$$

This inequality means that in some subdomain \(\mathcal{U}'\) an increase in the resources used leads to a decrease in the mean time for completing tasks; i.e., increasing the resources of a computing node makes sense.

3. Nondecrease of \({{\mathcal{M}}_{z}}\left( \cdot \right)\) in y: for any fixed z, the components of vector y can be defined so that for any \(\left( {x{\text{*}},y'} \right),\;\left( {x{\text{*}},y''} \right) \in \mathfrak{A}\) with \(y' \leqslant y''\) component-wise, the following inequality is valid:

$${{\mathcal{M}}_{z}}\left( {x{\text{*}},y'} \right) \leqslant {{\mathcal{M}}_{z}}\left( {x{\text{*}},y''} \right).$$

According to this property, vector y can be defined in such a way that all its components will have the meaning of the task volume, and with an increase in their values, the task completion time will increase on average.

4. Continuity in y: for any fixed x and z, the functions \({{\mathcal{M}}_{z}}\left( {x,y} \right)\) and \({{\mathcal{D}}_{z}}\left( {x,y} \right)\) are continuous with respect to the variable y. This property means that, with fixed resources, small variations in the size of a user task lead to small variations in the characteristics of its execution time.

5. Convexity of \({{\mathcal{M}}_{z}}\left( \cdot \right)\) in y: for any fixed z and for any \(\left( {x{\text{*}},y'} \right),\left( {x{\text{*}},y''} \right) \in \mathcal{U}\) and \(\lambda \in \left[ {0,1} \right]\), the following inequality is valid:

$${{\mathcal{M}}_{z}}\left( {x{\text{*}},\lambda y' + \left( {1 - \lambda } \right)y''} \right) \leqslant \lambda {{\mathcal{M}}_{z}}\left( {x{\text{*}},y'} \right) + \left( {1 - \lambda } \right){{\mathcal{M}}_{z}}\left( {x{\text{*}},y''} \right).$$

This inequality means that, with fixed resources, the mean job completion time, as a function of the size of the user job, grows linearly or faster.

Note 3. The local nature of the nonincrease in the mean task execution time with respect to the resource variables (Property 2) appears exotic only at first glance. First, analogous behavior of the mean execution time is observed when analyzing the functioning of real applications, such as databases; a specific example of this phenomenon will be given in the second part of this study. Second, the fact that the task completion time does not always decrease with an increase in the amount of resources provided can be partly explained by indirect restrictions and by differences in the intensity of use of various types of resources. If we imagine the operation of the entire application as a collection of data processing pipelines of varying intensity exploiting common node resources, then the following situation is quite possible. Among these pipelines, the least productive one, the bottleneck, ultimately determines the intensity of the work of the entire node. Increasing some resources above a certain threshold can lead to a situation where the bottleneck no longer copes with processing the increased data flow, and these data are placed in a queue or simply lost, requiring their recreation. Moreover, due to the sharing of resources between pipelines, some of these resources will be taken away from the bottleneck by faster pipelines, further reducing its productivity.

Note 4. To illustrate Property 3, consider again the user task that consists of numerically solving a certain equation of mathematical physics. Simply put, the task execution time is an increasing function of the number of grid nodes at which the solution of the equation needs to be calculated. The task parameters could be described as follows: \({{y}_{1}}\) is the grid step in the spatial variable, \({{y}_{2}}\) is the grid step in time, \({{y}_{3}}\) is the integration domain in the spatial variable, and \({{y}_{4}}\) is the integration interval in time.

However, such a parameterization does not satisfy Property 3: the mean task execution time does not increase with increasing grid steps (parameters \({{y}_{1}}\) and \({{y}_{2}}\)); on the contrary, it decreases. The parameterization can be modified so that Property 3 is satisfied: a suitable version is given earlier in this section, immediately after Note 1.

The listed properties of functions \({{\mathcal{M}}_{z}}\left( {x,y} \right)\) and \({{\mathcal{D}}_{z}}\left( {x,y} \right)\) impose certain restrictions on them. It is proposed to use the following dependencies in this study.

The first model assumes an exponential dependence of \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) on the resources–task volume pair (x, y) and independence from the parameters z of the hardware and software environment:

$$f_{\varrho }^{1}\left( {x,y} \right) = \alpha + {\text{exp}}\left[ {\beta + \mathop \sum \limits_{n = 1}^N {{\gamma }_{n}}{{x}_{n}} + \mathop \sum \limits_{m = 1}^M {{\varepsilon }_{m}}{{y}_{m}}} \right],$$
(3.1)

where \(\varrho \triangleq {\text{vec}}\left( {\alpha ,\beta ,\{ {{\gamma }_{n}}\} _{1}^{N},\{ {{\varepsilon }_{m}}\} _{1}^{M}} \right)\) is a vector of unknown parameters to be subsequently identified separately for functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\). Properties 1–5 of functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) impose the following restrictions on the parameter vectors:

(1) \(\mathop {\min }\limits_{\left\{ {\left( {{{x}_{n}},{{y}_{m}}} \right)} \right\}} \left( {\mathop \sum \limits_{n = 1}^N {{\gamma }_{n}}{{x}_{n}} + \mathop \sum \limits_{m = 1}^M {{\varepsilon }_{m}}{{y}_{m}}} \right) + \beta - {\text{ln}}\;\alpha \geqslant 0\) for functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) is the nonnegativity condition;

(2) \({{\gamma }_{n}} \leqslant 0\), \(n = \overline {1,N} \), for function \({{\mathcal{M}}_{z}}\) is the condition of the local nonincrease in the variables \({{x}_{n}}\), \(n = \overline {1,N} \);

(3) \({{\varepsilon }_{m}} \geqslant 0\), \(m = \overline {1,M} \), for function \({{\mathcal{M}}_{z}}\) is the condition of the local nondecrease in the variables \({{y}_{m}}\), \(m = \overline {1,M} \).

The exponential model can be used for the initial, so-called exploratory statistical analysis [15], necessary, for example, to determine the effective operation zone of a virtual node.

The second model assumes a power-law dependence on the components of (x, y) and independence from the parameters z of the hardware and software environment. Thus, functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) of the second type have the form

$$f_{\varrho }^{2}\left( {x,y} \right) = \alpha + \beta \mathop \prod \limits_{n = 1}^N x_{n}^{{ - {{\gamma }_{n}}}}\mathop \prod \limits_{m = 1}^M y_{m}^{{{{\varepsilon }_{m}}}},$$
(3.2)

where \(\varrho \triangleq {\text{vec}}(\alpha ,\beta ,\{ {{\gamma }_{n}}\} _{1}^{N},\{ {{\varepsilon }_{m}}\} _{1}^{M})\) is the vector of unknown parameters to be subsequently identified separately for functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\). Properties 1–5 of functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) impose the following restrictions on the parameter vectors, some of which can be set explicitly, while others depend on the specific identification task:

(4) \(\alpha \geqslant 0\) and \(\beta \geqslant 0\) for functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) is the nonnegativity condition;

(5) \({{\gamma }_{n}} \geqslant 0,\;n = \overline {1,N} ,\) for function \({{\mathcal{M}}_{z}}\) is the condition of the nonincrease in the variables \({{x}_{n}}\), \(n = \overline {1,N} \);

(6) \({{\varepsilon }_{m}} \geqslant 0,\;m = \overline {1,M} ,\) for function \({{\mathcal{M}}_{z}}\) is the condition of the nondecrease in the variables \({{y}_{m}}\), \(m = \overline {1,M} \).

The third model assumes a piecewise linear dependence on the variable y. The subsets \({{\{ {{U}_{j}}\} }_{{j = \overline {1,J} }}}\) of the set \(\mathcal{U}\), on which the functions \({{\mathcal{M}}_{z}}\left( {x,y} \right)\) and \({{\mathcal{D}}_{z}}\left( {x,y} \right)\) are linear in the variables y, form a partition of \(\mathcal{U}\): \({{U}_{i}} \cap {{U}_{j}} = \varnothing \) for \(\forall \;i \ne j\) and \(\bigcup\nolimits_{j = 1}^{J} {{{U}_{j}}} = \mathcal{U}\). Let us introduce the indicator functions \({{{\mathbf{I}}}_{{{{U}_{j}}}}}\left( {x,y} \right)\) of the sets \({{\{ {{U}_{j}}\} }_{{j = \overline {1,J} }}}\):

$${{{\mathbf{I}}}_{{{{U}_{j}}}}}\left( {x,y} \right) \triangleq \left\{ {\begin{array}{*{20}{l}} {1,}&{{\text{if}}\quad \left( {x,y} \right) \in {{U}_{j}},} \\ {0,}&{{\text{if}}\quad \left( {x,y} \right) \notin {{U}_{j}}.} \end{array}} \right.$$

To perform tasks of different sizes, a virtual computing node uses different amounts of resources. When tasks reach certain volumes, the node abruptly changes the procedure for processing them, and these changes are not directly observable: they can only be judged indirectly by analyzing the information stored in system logs. This jump is the reason for the change in the parameters of the linear dependence of the processing time on the job size. The phenomenon is most clearly observed when the volume of the tasks is varied and the use of RAM changes abruptly. With small volumes, the entire task fits in the cache, the fastest area of memory. Then, as the size increases, the task begins to fill the free part of RAM, and the processing speed decreases slightly. As soon as the amount of RAM is insufficient to fully accommodate the job, the paging mechanism is turned on, which dramatically increases the processing time.

Thus, functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) of the third type have the form

$$f_{\varrho }^{3}\left( {x,y} \right) = \alpha + \mathop \prod \limits_{n = 1}^N x_{n}^{{ - {{\beta }_{n}}}}\mathop \sum \limits_{j = 1}^J {{{\mathbf{I}}}_{{{{U}_{j}}}}}\left( {x,y} \right)\left( {\mathop \sum \limits_{m = 1}^M {{\gamma }_{{jm}}}{{y}_{m}} + {{\varepsilon }_{j}}} \right),$$
(3.3)

where \(\varrho \triangleq {\text{vec}}\left( {\alpha ,\{ {{\beta }_{n}}\} _{1}^{N},\{ {{\gamma }_{{jm}}}\} _{{1,1}}^{{J,M}},\{ {{\varepsilon }_{j}}\} _{1}^{J}} \right)\) is the vector of unknown parameters to be subsequently identified separately for functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\). It should be clarified that \({{\gamma }_{{jm}}}\) are the coefficients of linear dependence on the variable \({{y}_{m}},\;m = \overline {1,M} \), in area \({{U}_{j}},\;j = \overline {1,J} \).

Properties 1–5 of functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) define the following restrictions on the estimated parameters:

(1) \(\alpha \geqslant 0\) for functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) is the nonnegativity condition;

(2) \({{\beta }_{n}} \geqslant 0,\;n = \overline {1,N} ,\) for function \({{\mathcal{M}}_{z}}\) is the condition of the local nonincrease in the variables \({{x}_{n}}\);

(3) \(\gamma _{{jm}}^{\mathcal{M}} \geqslant 0,\;m = \overline {1,M} ,\;j = \overline {1,J} ,\) is the condition of the nondecrease of \({{\mathcal{M}}_{z}}\) in the variables \({{y}_{m}}\);

(4) joint restrictions on \(\{ {{\gamma }_{{jm}}}\} _{{1,1}}^{{J,M}}\) and \(\{ {{\varepsilon }_{j}}\} _{1}^{J}\), ensuring the continuity of functions \({{\mathcal{M}}_{z}}\) and \({{\mathcal{D}}_{z}}\) in the variable y;

(5) joint restrictions on \(\{ {{\gamma }_{{jm}}}\} _{{1,1}}^{{J,M}}\) and \(\{ {{\varepsilon }_{j}}\} _{1}^{J}\), providing the convexity of \({{\mathcal{M}}_{z}}\) in the variable y.

Note 5. The models (3.2) and (3.3) of the mean task completion time allow the following economic interpretation. Assume that the algorithm implementing the solution of the user task belongs to the complexity class P. This means that the number of operations required to execute the task is bounded from above by some power function (in a particular case, a linear function) of the components of vector y. Because of the paging mechanism, when RAM is insufficient, the number of operations can increase, which is described by a power-law or convex piecewise linear function of the task’s parameters. The task’s execution intensity, which determines the number of algorithm operations performed on a computing node per unit of time, is close in meaning to a production function [10], in which the components of vector x act as resources. One of the popular production functions, the Cobb-Douglas function, is the product of the components of x raised to some positive exponents. Knowing the total number of operations and the intensity of their execution, the task completion time is given by their quotient. Thus, models (3.2) and (3.3) represent the ratio of upper estimates of the number of required operations to the intensity of their execution, described by the Cobb-Douglas function.
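
For illustration, the three candidate families (3.1)–(3.3) can be coded directly as functions of (x, y) parameterized by \(\varrho \). The sketch below assumes, purely for the example, N = 2 resources, M = 1 task parameter, and J = 2 regions split by whether the task fits into RAM; it is not part of the identification machinery and only shows how a parameter vector turns into a concrete model:

```python
import math
from typing import Callable, Sequence


def f1_exponential(alpha: float, beta: float,
                   gamma: Sequence[float], eps: Sequence[float]
                   ) -> Callable[[Sequence[float], Sequence[float]], float]:
    """Model (3.1): alpha + exp(beta + sum(gamma_n * x_n) + sum(eps_m * y_m))."""
    def f(x, y):
        return alpha + math.exp(beta + sum(g * xn for g, xn in zip(gamma, x))
                                + sum(e * ym for e, ym in zip(eps, y)))
    return f


def f2_power(alpha: float, beta: float,
             gamma: Sequence[float], eps: Sequence[float]
             ) -> Callable[[Sequence[float], Sequence[float]], float]:
    """Model (3.2): alpha + beta * prod(x_n ** (-gamma_n)) * prod(y_m ** eps_m)."""
    def f(x, y):
        num = math.prod(ym ** e for e, ym in zip(eps, y))
        den = math.prod(xn ** g for g, xn in zip(gamma, x))
        return alpha + beta * num / den
    return f


def f3_piecewise(alpha: float, betas: Sequence[float],
                 gammas: Sequence[Sequence[float]], eps: Sequence[float],
                 region_of: Callable[[Sequence[float], Sequence[float]], int]
                 ) -> Callable[[Sequence[float], Sequence[float]], float]:
    """Model (3.3): power-law factor in x times a piecewise linear function of y;
    region_of(x, y) returns the index j and plays the role of the indicators I_{U_j}."""
    def f(x, y):
        j = region_of(x, y)
        linear = sum(g * ym for g, ym in zip(gammas[j], y)) + eps[j]
        return alpha + linear / math.prod(xn ** b for b, xn in zip(betas, x))
    return f


# Illustrative use of (3.3): x = (cores, RAM in GB), y = (task volume in GB),
# two regions split by whether the task fits into 8 GB of RAM; the slopes are
# chosen so that the piecewise linear part is continuous and convex in y.
mean_model = f3_piecewise(
    alpha=1.0, betas=[0.5, 0.3],
    gammas=[[2.0], [12.0]], eps=[0.0, -80.0],
    region_of=lambda x, y: 0 if y[0] <= 8.0 else 1,
)
print(mean_model(x=[4, 16], y=[2.0]), mean_model(x=[4, 16], y=[12.0]))
```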

The proposed stochastic runtime models allow us to correctly formulate Problems 1 and 2 from the previous section as analysis problems and solve them.

Analysis task 1. Tasks of a fixed type are executed on the virtual node, and the functions of the mean \({{\mathcal{M}}_{z}}\left( {x,y} \right)\) and variance \({{\mathcal{D}}_{z}}\left( {x,y} \right)\) of the task completion time \(\tau \left( \omega \right)\) are known. In addition, the triple of the node configuration, task parameters, and hardware and software environment parameters \(\left( {x,y,z} \right)\) is fixed and known. Let \(\bar {T}\) be the SLA parameter that determines the maximum allowable time for completing the user task y: estimate from above the probability that the time \(\tau \left( \omega \right)\) exceeds the threshold \(\bar {T}\).

The solution to the problem can be obtained using the Chebyshev inequality [16]:

$$P\left\{ {\tau \left( \omega \right) > \bar {T}} \right\} = P\left\{ {\tau \left( \omega \right) - {{\mathcal{M}}_{z}}\left( {x,y} \right) > \bar {T} - {{\mathcal{M}}_{z}}\left( {x,y} \right)} \right\}$$
$$ \leqslant \frac{{{{\mathcal{D}}_{z}}\left( {x,y} \right)}}{{{{\mathcal{D}}_{z}}\left( {x,y} \right) + {{{(\bar {T} - {{\mathcal{M}}_{z}}\left( {x,y} \right))}}^{2}}}}.$$
(3.4)
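
Numerically, bound (3.4) is a one-line computation once the fitted mean and variance models have been evaluated at the given (x, y, z). A minimal sketch with placeholder numbers:

```python
def sla_violation_bound(mean_time: float, var_time: float, t_max: float) -> float:
    """Upper bound (3.4) on P{tau > t_max} via the Chebyshev-type inequality.
    Informative only when the SLA deadline exceeds the mean execution time."""
    if t_max <= mean_time:
        return 1.0  # below the mean the bound carries no information
    gap = t_max - mean_time
    return var_time / (var_time + gap ** 2)


# Example: M_z(x, y) = 40 s, D_z(x, y) = 100 s^2, SLA deadline T = 70 s.
print(sla_violation_bound(mean_time=40.0, var_time=100.0, t_max=70.0))  # 0.1
```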

Analysis task 2. The parameters of the configuration x of the node are fixed and known, while the parameters of the user task \(Y\left( \omega \right)\) are random. However, the user task profile is known and is specified in the form of a probability distribution \(P\left( {dy} \right)\) of the vector \(Y\left( \omega \right)\). In addition, there is a known dependence connecting the value of parameter z with the pair \(\left( {x,y} \right)\): \(z = z\left( {x,y} \right)\). Let \(\bar {T}\) be the SLA parameter that determines the maximum execution time of a user task with parameters Y: estimate from above the probability that the time \(\tau \left( \omega \right)\) exceeds the threshold \(\bar {T}\).

The mean time to complete a user job \(\bar {\mathcal{M}}\) is calculated using the total probability formula [14]:

$$\bar {\mathcal{M}} = E\left\{ {{{\tau }}\left( {{\omega }} \right)} \right\} = E\left\{ {E\left\{ {{{\tau }}\left( {{\omega }} \right)|{\text{Y}}} \right\}} \right\} = E\left\{ {{{\mathcal{M}}_{{z\left( {x,Y} \right)}}}\left( {x,Y} \right)} \right\} = \int {{{\mathcal{M}}_{{z\left( {x,y} \right)}}}\left( {x,y} \right)P\left( {dy} \right).} $$
(3.5)

In a similar way, we can calculate the variance \(\bar {\mathcal{D}}\):

$$\begin{gathered}
\bar {\mathcal{D}} = E\left\{ \left( \tau \left( \omega \right) - \bar {\mathcal{M}} \right)^{2} \right\} \\
= E\left\{ E\left\{ \left( \left( \tau \left( \omega \right) - {{\mathcal{M}}_{z}}\left( {x,Y} \right) \right) + \left( {{\mathcal{M}}_{z}}\left( {x,Y} \right) - \bar {\mathcal{M}} \right) \right)^{2}|Y \right\} \right\} \\
= E\left\{ E\left\{ \left( \tau \left( \omega \right) - {{\mathcal{M}}_{z}}\left( {x,Y} \right) \right)^{2}|Y \right\} \right\} + E\left\{ E\left\{ \left( {{\mathcal{M}}_{z}}\left( {x,Y} \right) - \bar {\mathcal{M}} \right)^{2}|Y \right\} \right\} \\
+ \,\,2E\left\{ E\left\{ \left( \tau \left( \omega \right) - {{\mathcal{M}}_{z}}\left( {x,Y} \right) \right)\left( {{\mathcal{M}}_{z}}\left( {x,Y} \right) - \bar {\mathcal{M}} \right)|Y \right\} \right\} \\
= \int {{{\mathcal{D}}_{z}}\left( {x,y} \right)P\left( {dy} \right)} + \int {\left( {{\mathcal{M}}_{z}}\left( {x,y} \right) - \bar {\mathcal{M}} \right)^{2}P\left( {dy} \right)} \\
= \int {{{\mathcal{D}}_{z}}\left( {x,y} \right)P\left( {dy} \right)} + \int {\mathcal{M}_{z}^{2}\left( {x,y} \right)P\left( {dy} \right)} - \left( \bar {\mathcal{M}} \right)^{2}. \\
\end{gathered} $$
(3.6)

The solution to the problem can again be obtained using Chebyshev’s inequality:

$$P\left\{ {\tau \left( \omega \right) > \bar {T}} \right\} \leqslant \frac{{\bar {\mathcal{D}}}}{{\bar {\mathcal{D}} + {{{\left( {\bar {T} - \bar {\mathcal{M}}} \right)}}^{2}}}}.$$
(3.7)
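
When the user profile \(P\left( {dy} \right)\) is a discrete distribution over finitely many parameter vectors, the integrals in (3.5) and (3.6) become weighted sums and bound (3.7) follows immediately. A sketch under that assumption, with a hypothetical two-point profile and toy moment models:

```python
from typing import Callable, Sequence, Tuple


def profile_bound(
    profile: Sequence[Tuple[float, Sequence[float]]],   # pairs (probability, y)
    mean_model: Callable[[Sequence[float]], float],     # y -> M_{z(x,y)}(x, y), x fixed
    var_model: Callable[[Sequence[float]], float],      # y -> D_{z(x,y)}(x, y), x fixed
    t_max: float,
) -> float:
    """Bound (3.7) on P{tau > t_max} for a discrete user profile, via (3.5) and (3.6)."""
    m_bar = sum(p * mean_model(y) for p, y in profile)                     # (3.5)
    d_bar = (sum(p * var_model(y) for p, y in profile)                     # (3.6)
             + sum(p * mean_model(y) ** 2 for p, y in profile) - m_bar ** 2)
    if t_max <= m_bar:
        return 1.0
    return d_bar / (d_bar + (t_max - m_bar) ** 2)                          # (3.7)


# Hypothetical profile: 80% small tasks (y = 2 GB), 20% large tasks (y = 12 GB).
profile = [(0.8, [2.0]), (0.2, [12.0])]
print(profile_bound(profile,
                    mean_model=lambda y: 5.0 * y[0] + 10.0,   # toy mean model, seconds
                    var_model=lambda y: (2.0 * y[0]) ** 2,    # toy variance model, s^2
                    t_max=120.0))
```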

Thus, the proposed stochastic model of the execution time of user tasks allows us to effectively solve an important analysis problem: estimating from above the probability of the SLA being violated.

Estimates (3.4) and (3.7), based on Chebyshev’s inequality, are very conservative. If additional information about the distribution of the time \(\tau \left( \omega \right)\) is available, the proposed estimates can be significantly refined (see [17, 18] and the references therein). Note that these assumptions are not overly restrictive and are almost always met in practice.

The first assumption is that the distribution function of \(\tau \left( \omega \right)\) is concave on the positive semiaxis. In this case, the solution of Analysis task 1 can be obtained from Gauss’s inequality:

$$P\left\{ {\tau \left( \omega \right) > \bar {T}} \right\} \leqslant \left\{ {\begin{array}{*{20}{l}} {1 - \frac{{\bar {T}}}{{\sqrt {3(\mathcal{M}_{z}^{2}\left( {x,y} \right) + {{\mathcal{D}}_{z}}\left( {x,y} \right))} }},\quad {\text{if}}\quad 0 \leqslant \bar {T} \leqslant \frac{2}{{\sqrt 3 }}\sqrt {\mathcal{M}_{z}^{2}\left( {x,y} \right) + {{\mathcal{D}}_{z}}\left( {x,y} \right)} ,} \\ {\frac{{4(\mathcal{M}_{z}^{2}\left( {x,y} \right) + {{\mathcal{D}}_{z}}\left( {x,y} \right))}}{{9{{{\bar {T}}}^{2}}}},\quad {\text{if}}\quad \frac{{2\sqrt {\mathcal{M}_{z}^{2}\left( {x,y} \right) + {{\mathcal{D}}_{z}}\left( {x,y} \right)} }}{{\sqrt 3 }} < \bar {T}.} \end{array}} \right.$$
(3.8)

The second assumption is that \(\tau \left( \omega \right)\) has a distribution density that is unimodal on the positive semiaxis. In this case, the solution of Analysis task 1 can be obtained from Cantelli’s inequality:

$$\begin{array}{*{20}{c}} {P\left\{ {\tau \left( \omega \right) > \bar {T}} \right\}} \\ { \leqslant \left\{ {\begin{array}{*{20}{l}} {\frac{{3{{\mathcal{D}}_{z}}\left( {x,y} \right) - {{{\left( {\bar {T} - {{\mathcal{M}}_{z}}\left( {x,y} \right)} \right)}}^{2}}}}{{3{{\mathcal{D}}_{z}}\left( {x,y} \right) + 3{{{\left( {\bar {T} - {{\mathcal{M}}_{z}}\left( {x,y} \right)} \right)}}^{2}}}},\quad {\text{if}}\quad {{\mathcal{M}}_{z}}\left( {x,y} \right) \leqslant \bar {T} \leqslant {{\mathcal{M}}_{z}}\left( {x,y} \right) + \sqrt {\frac{5}{3}{{\mathcal{D}}_{z}}\left( {x,y} \right)} ,} \\ {\frac{4}{9}\frac{{{{\mathcal{D}}_{z}}\left( {x,y} \right)}}{{{{\mathcal{D}}_{z}}\left( {x,y} \right) + {{{\left( {\bar {T} - {{\mathcal{M}}_{z}}\left( {x,y} \right)} \right)}}^{2}}}},\quad {\text{if}}\quad {{\mathcal{M}}_{z}}\left( {x,y} \right) + \sqrt {\frac{5}{3}{{\mathcal{D}}_{z}}\left( {x,y} \right)} < \bar {T}.} \end{array}} \right.} \end{array}$$
(3.9)
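
Both refinements are as easy to compute as (3.4) once the moments are known; the choice between them depends on whether the concavity or the unimodality assumption can be justified for the node at hand. A minimal sketch of (3.8) and (3.9) with the same placeholder numbers as above:

```python
import math


def gauss_bound(mean: float, var: float, t_max: float) -> float:
    """Bound (3.8): the distribution function of tau is concave on the positive semiaxis."""
    s = math.sqrt(mean ** 2 + var)          # sqrt of the second raw moment E{tau^2}
    if t_max <= 2.0 * s / math.sqrt(3.0):
        return 1.0 - t_max / (math.sqrt(3.0) * s)
    return 4.0 * s ** 2 / (9.0 * t_max ** 2)


def unimodal_bound(mean: float, var: float, t_max: float) -> float:
    """Bound (3.9): tau has a unimodal density on the positive semiaxis; stated for t_max >= mean."""
    gap = t_max - mean
    if gap < 0.0:
        return 1.0
    if gap <= math.sqrt(5.0 * var / 3.0):
        return (3.0 * var - gap ** 2) / (3.0 * var + 3.0 * gap ** 2)
    return 4.0 * var / (9.0 * (var + gap ** 2))


# Mean 40 s, variance 100 s^2, deadline 70 s.
print(gauss_bound(40.0, 100.0, 70.0), unimodal_bound(40.0, 100.0, 70.0))
```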

Note 6. The class of functions \({{f}_{\varrho }}\left( \cdot \right)\) used to describe the moment characteristics of the task execution time \(\tau \left( \omega \right)\) is not limited to functions (3.1)–(3.3). The reasons for choosing these models in this study are as follows. The exponential dependence (3.1) can be used for the exploratory analysis of the behavior of the described moment characteristics and for determining the region of (x, y) in which the node efficiently performs jobs of the specified type. The power-law model (3.2) has a convenient economic interpretation. Model (3.3), which represents a power-law dependence on the resource parameters and a piecewise linear dependence on the volume of the task being performed, also has its strengths. First, it is simple and economical, as it includes a small number of parameters. Second, it allows us to describe the abrupt change in the tasks’ processing discipline depending on their volume.

4. PARAMETER IDENTIFICATION PROBLEMS

Let some fixed set \({{\{ ({{x}^{r}},{{y}^{r}})\} }_{{r = \overline {1,R} }}}\) of the variables (x, y) be given, for which a series of stress tests of the studied computational node has been carried out. The result of the testing is the set of vectors \({{\{ {{\tilde {\mathcal{M}}}^{r}},{{\tilde {\mathcal{D}}}^{r}},{{\tilde {\mathcal{Z}}}^{r}}\} }_{{r = \overline {1,R} }}}\) obtained by processing the statistical information about the operation of the node for fixed values of the pairs \(({{x}^{r}},{{y}^{r}})\). The first component, \({{\tilde {\mathcal{M}}}^{r}}\), is the sample mean of the time \(\tau \left( \omega \right)\); the second is its sample variance; and the third, possibly a block component, consists of some known sample characteristics of the hardware and software environment corresponding to the pair \(({{x}^{r}},{{y}^{r}})\). The set of positive weights \({{\{ {{w}_{r}}\} }_{{r = \overline {1,R} }}}\), determining the individual significance of the results of each experiment, is also given.

The selection of the models that describe the mathematical expectation and variance of the task completion time among the possible functions (3.1)–(3.3), as well as the identification of their parameters, are carried out independently for the two moments. Therefore, we will describe in detail the formulation of the identification problem only for the mathematical expectation function: the problem of identifying the parameters of the variance function looks completely similar. The value

$$\Delta _{r}^{{\mathcal{M},i}}\left( \varrho \right) \triangleq {{\tilde {\mathcal{M}}}^{r}} - f_{\varrho }^{i}\left( {{{x}^{r}},{{y}^{r}}} \right),\quad i = 1,2,3,$$

represents the error in estimating the mean job execution time when the configuration \(({{x}^{r}},{{y}^{r}})\) is described by the function \(f_{\varrho }^{i}\left( \cdot \right)\) calculated with the parameter values \(\varrho \).

To compare the quality of models with different parameters, we use a loss function \(\pi \left( u \right):\mathbb{R} \to \mathbb{R}\) satisfying the following properties [11]: \(\pi \left( 0 \right) = 0\), \(\pi \left( u \right) \geqslant 0\) for all \(u \in \mathbb{R}\), and \(\pi \left( u \right)\) is nonincreasing for \(u < 0\) and nondecreasing for \(u > 0\).

The problem of the optimal identification of a model for the mean task execution time lies in finding

$$\left( {i{\text{*}},\varrho {\text{*}}} \right) \in {\text{Argmi}}{{{\text{n}}}_{{i,\varrho }}}\mathop \sum \limits_{r = 1}^R {{w}_{r}}\pi (\Delta _{r}^{{\mathcal{M},i}}\left( \varrho \right)).$$
(4.1)

The problem of the optimal identification of a model for the task execution time variance consists of defining

$$\left( {i{\text{*}},\varrho {\text{*}}} \right) \in {\text{Argmi}}{{{\text{n}}}_{{i,\varrho }}}\mathop \sum \limits_{r = 1}^R {{w}_{r}}\pi (\Delta _{r}^{{\mathcal{D},i}}\left( \varrho \right)).$$
(4.2)

If the loss function is quadratic, i.e., \(\pi \left( u \right) = {{u}^{2}}\), then \(\left( {i{\text{*}},\varrho {\text{*}}} \right)\) is called the least squares estimate; if the absolute value is used, i.e., \(\pi \left( u \right) = \left| u \right|\), then \(\left( {i{\text{*}},\varrho {\text{*}}} \right)\) is the least modules (least absolute deviations) estimate [19, 20]. In the more general case, when \(\pi(u)\) is a piecewise linear function satisfying the properties presented above, \(\left( {i{\text{*}},\varrho {\text{*}}} \right)\) is called a quantile estimate [21]. Obviously, (4.1) and (4.2) represent a weighted version of the problem of constructing M-estimates [19].
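
A minimal sketch of the weighted identification problem (4.1) for one fixed family, here the power-law model (3.2) with N = M = 1. It assumes scipy is available and uses a generic derivative-free minimizer; the synthetic data, the starting point, and the fact that the sign constraints from Section 3 are not enforced are illustrative simplifications rather than a recommended algorithm:

```python
import numpy as np
from scipy.optimize import minimize


def pinball(u: np.ndarray, q: float = 0.5) -> np.ndarray:
    """Convex piecewise linear (quantile) loss; q = 0.5 reproduces the absolute-value loss up to a factor."""
    return np.where(u >= 0.0, q * u, (q - 1.0) * u)


def fit_mean_model(x_obs, y_obs, m_obs, weights, loss=lambda u: u ** 2):
    """Weighted M-estimation (4.1) for the family (3.2) with N = M = 1:
    f(x, y) = alpha + beta * x**(-gamma) * y**eps, rho = (alpha, beta, gamma, eps)."""
    x_obs, y_obs, m_obs, weights = map(np.asarray, (x_obs, y_obs, m_obs, weights))

    def criterion(rho):
        alpha, beta, gamma, eps = rho
        pred = alpha + beta * x_obs ** (-gamma) * y_obs ** eps
        return np.sum(weights * loss(m_obs - pred))

    # Nelder-Mead tolerates the nonsmooth losses; the nonnegativity constraints on rho
    # are not enforced here and would require a constrained or penalized solver in practice.
    res = minimize(criterion, x0=np.array([1.0, 1.0, 0.5, 1.0]), method="Nelder-Mead")
    return res.x


# Synthetic illustration: sample means generated from a known power law plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(1, 8, size=40)            # e.g. number of cores
y = rng.uniform(1, 20, size=40)           # e.g. task volume
m = 2.0 + 3.0 * x ** (-0.7) * y ** 1.2 + rng.normal(0, 0.1, size=40)
w = np.ones_like(m)                       # equal weights w_r
print(fit_mean_model(x, y, m, w))                                  # least squares
print(fit_mean_model(x, y, m, w, loss=lambda u: pinball(u, 0.8)))  # quantile loss
```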

Note 7. Using different weights \({{\{ {{w}_{r}}\} }_{{r = \overline {1,R} }}}\) makes practical sense. First, the accuracy of the individual values of the sample means and variances \(({{\tilde {\mathcal{M}}}^{r}},{{\tilde {\mathcal{D}}}^{r}})\) may differ for different parameters \(({{x}^{r}},{{y}^{r}})\), and this must be taken into account during the subsequent identification. This situation arises, for example, when, for a certain hardware and software composition \({{x}^{r}}\) of the node, the completion time of task \({{y}^{r}}\) is quite large and there is no way to obtain a test sample of sufficient size for averaging and computing \(({{\tilde {\mathcal{M}}}^{r}},{{\tilde {\mathcal{D}}}^{r}})\) with acceptable accuracy. Second, the following phenomenon is observed in practice: when the resources x are scarce, there is a large variation in the time \(\tau \left( \omega \right)\), which leads to low accuracy of the sample moments \({{\tilde {\mathcal{M}}}^{r}}\) and \({{\tilde {\mathcal{D}}}^{r}}\). Since the models \(\mathcal{M}\left( \cdot \right)\) and \(\mathcal{D}\left( \cdot \right)\) are constructed uniformly for all values of x, the unequal accuracy of the observations \(({{\tilde {\mathcal{M}}}^{r}},{{\tilde {\mathcal{D}}}^{r}})\) can be taken into account by choosing the appropriate weights \({{w}_{r}}\). Third, the owner of a computing node may have additional a priori information indicating that the user prefers to perform tasks with certain parameters more often than others. In this case, the aim may be to identify the parameters in such a way that the model describes the execution time of the tasks preferred by the user more accurately, possibly to the detriment of the description of the others.

5. STATISTICAL INFORMATION FOR IDENTIFYING MODELS: PRINCIPLES OF DATA MINING AND PROCESSING

The problem of identifying the parameters of the time complexity model for tasks executed on a certain virtual node is a practical one, and therefore the statistical data for its solution should be collected on the same hardware and software platform that will then be used in actual operation. A difficulty can arise only when equipping the node with the planned SS: for various reasons it may not be available until the moment of actual operation. In this case, the SS should be replaced by test software that is close in purpose to the planned SS.

Currently, there is a fairly large selection of available TS, which differ from each other in the following respects:

—subject of testing (general computer performance, processor performance, graphics coprocessors, disk subsystem, etc.);

—set of simulated applications (office programs, databases, scientific computing, etc.);

—level of cross-platform support and readiness for stress testing (readiness out of the box/the need for preliminary compilation of code for the hardware and software platform being tested);

—level of service for collecting statistical information during stress testing (presence/absence of a built-in ability to log the resources used);

—type of license (paid/free).

In the studied identification problem, the first two features are key when choosing a particular TS. The following types of TS allow us to simulate the load generated by:

—office applications [22, 23],

—compilers, interpreters [24],

—data compression, including video [24],

—scientific computing (hydrodynamics, atmospheric modeling, molecular dynamics, etc.) [24],

—3D rendering [24],

—database transactions [25].

The use of TS instead of SS is acceptable at the initial stage of identification; however, the final refinement of the model parameters is possible only on the basis of the statistical information obtained during the operation of the real SS. Further, regardless of whether SS or TS is used, it should be taken into account that, to solve the problems of the optimal identification of the mean (4.1) and variance (4.2) of the task completion time, it is necessary to repeat tests multiple times with the same values of the pairs \(({{x}^{r}},{{y}^{r}})\) and subsequently calculate the corresponding sample moments \(({{\tilde {\mathcal{M}}}^{r}},{{\tilde {\mathcal{D}}}^{r}})\).
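
The reduction of repeated runs to the pair \(({{\tilde {\mathcal{M}}}^{r}},{{\tilde {\mathcal{D}}}^{r}})\) for one fixed \(({{x}^{r}},{{y}^{r}})\) can be as simple as the following sketch; the dummy workload stands in for a real SS/TS invocation:

```python
import statistics
import time
from typing import Callable, Tuple


def sample_moments(run_task: Callable[[], None], repetitions: int = 30) -> Tuple[float, float]:
    """Run the same task `repetitions` times and return the sample mean and sample
    variance of its wall-clock execution time, i.e. the pair (M~^r, D~^r)."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_task()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.variance(timings)


# Stand-in for a real SS/TS invocation with fixed (x^r, y^r): here, a dummy workload.
mean_r, var_r = sample_moments(lambda: sum(i * i for i in range(100_000)), repetitions=10)
print(mean_r, var_r)
```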

In addition to measuring the time \(\tau \) during stress testing, data on the state of the hardware and software environment should be recorded at certain intervals, i.e., the information whose processing will allow us to determine the components of vector z. Typically, the following information is recorded:

—percentage of CPU time used;

—amount of RAM;

—volume of data written;

—volume of data read;

—amount of page memory;

—amount of data written to the secondary page memory storage (swap);

—amount of data read from the swap, etc.

Typically, the data are recorded at 1 s intervals. For further processing, standard operations are applied to the obtained raw data: averaging, finding the minimum/maximum/median values, etc. The additional information in the form of vector z allows us to build a partition of \(\mathcal{U}\) for the subsequent application of the piecewise linear model (3.3). For example, by processing such data, we can determine the minimum size of a job starting from which the memory paging mechanism is used, which radically slows down the execution of jobs.
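
For instance, a candidate boundary between the regions \({{U}_{j}}\) of model (3.3) can be read off the averaged swap counters as the smallest task volume at which swap traffic becomes nonzero. The sketch below assumes the log has already been reduced to (task volume, bytes written to swap) pairs; this layout is an assumption, not a fixed format:

```python
from typing import Optional, Sequence, Tuple


def paging_threshold(records: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Given (task_volume, bytes_written_to_swap) pairs averaged per stress test,
    return the smallest task volume at which paging was observed, i.e. a candidate
    boundary between two regions U_j of model (3.3)."""
    paged = sorted(volume for volume, swap_bytes in records if swap_bytes > 0)
    return paged[0] if paged else None


# Hypothetical averaged log data: swap stays at zero until the task stops fitting in RAM.
log = [(1.0, 0.0), (4.0, 0.0), (8.0, 0.0), (12.0, 3.1e8), (16.0, 9.7e8)]
print(paging_threshold(log))  # 12.0
```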

In conclusion of this section, it should be noted that the practical implementation of the identification of the parameters of the described models involves solving the following problems. The first is more theoretical in nature and is related to the correct planning of the stress-testing experiment [26, 27], ensuring the identifiability of the parameters [28], constructing consistent estimates for them, and characterizing the rate of convergence to the true values [29]. The second problem is largely algorithmic: the identification problems (4.1) and (4.2) are optimization problems in which the criteria can be not only nonconvex in the variables being optimized but also nonsmooth. The set of rigorous statements and theoretically justified algorithms for solving such optimization problems is quite scarce, which leads to the need to use heuristic and neural network algorithms. The problem lies in the rational choice of an optimization algorithm in accordance with the specifics of the criterion and the type of constraints, or in the development of one’s own version of an algorithm. A fairly complete review and description of such optimization algorithms is given in the monograph [30].

CONCLUSIONS

The first part of this paper studies the theoretical aspects of constructing a mathematical model of the execution time of user tasks on a virtual computing node. An argument is given in favor of using a probabilistic apparatus to describe the variability inherent in this time. Given the need for the subsequent identification of the parameters of the probabilistic model of the time distribution, it is proposed to limit ourselves to its first two moments. The arguments of the functions that describe the selected moments include the characteristics of the resources, of the user tasks, and of the hardware and software environment. The paper presents the properties of these functions, as well as some of their variants. These functions contain unknown parameters that must be identified for each computing node and type of task based on the statistical data obtained as a result of stress tests. Among the various types of parameter estimates, it is proposed to use weighted M-estimates. The principles of collecting and processing the statistical information obtained from stress testing and used to identify the parameters of the proposed model are presented.

Research in this area cannot be considered complete, and the following directions seem promising. First, parametric model identification is a fairly resource-intensive task that requires lengthy and varied stress testing. In this regard, the issues of optimizing the test plan and of adaptively refining the model parameters during the operation of the node are highly relevant. This direction also includes the possible expansion and refinement of the class of functions describing the moment characteristics of the task execution time. Second, it is important to detail the model so that it takes into account additional features of the functioning of virtual nodes: the sharing of hardware resources by different virtual nodes, the simultaneous access of different users to one virtual node, etc.

However, it is of paramount importance to check whether the proposed methodology can be used to construct models of the complexity of computational tasks of various types: information processing in databases, scientific computing, image processing, data compression, etc. This is the goal of the subsequent parts of this study.