Scientific interdisciplinary research in various fields involves the cooperation of several geographically separated research groups and access to scientific data collections and high-performance computing resources, which requires an information, computing, and communication infrastructure that accounts for the specifics of each project. Building such an infrastructure in the traditional way (a local server cluster, or a data processing center (DPC) with computer virtualization tools and a data transmission network) is associated with a number of problems:

• It requires significant financial and material investments and highly qualified IT specialists. The problem becomes even more complicated if experiments in the project are conducted by independent research teams that use different working methods, often have different internal business processes and hardware and software preferences, and may be located far from each other. In addition, experiments in different subject areas can take different amounts of time.

• At the initial stage, the requirements for the project's infrastructure are known only approximately and, as a rule, are overestimated, which leads to a loss of investment efficiency.

• Work becomes more difficult when scientific data are distributed and are used by different teams at the same time.

• Data needed by one team may belong to another; without a specialized infrastructure management system, such issues are difficult to regulate.

• Different research teams within the same project may use different tools and software for processing, collecting, and storing data. Creating or mastering new tools is usually unacceptable to them; therefore, it must be possible to bring already implemented software into the information and computing environment of the project.

High-Performance Computing (HPC) has always been the primary tool for numerical experiments and modeling. Computing resources for them are provided by supercomputers and specialized server clusters. However, an analysis of the data published by the TOP500.org list of supercomputing systems [1] suggests that the number of scientific applications is growing faster than the number of supercomputers and high-performance computing units. At the same time, we see a rapid rise in the popularity of cloud computing, with Data Center Networks (DCNs) used to increase the computing power of cloud computing platforms. The EGI association [2] is a good example of this.

Supercomputers (HPC-S) and DPC cloud environments built on server clusters (HPC-C) differ in computational capability and utilization. Most applications run faster on a supercomputer than on a server cluster in a data center. However, the total time to obtain the result of a computational experiment, that is, the waiting time in the queue plus the execution time, may be longer on the HPC-S supercomputer than on the HPC-C server cluster in the cloud. We will call this total delay the program time in the system (the PTS criterion).
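To make the PTS criterion concrete, the sketch below compares the two platforms for a single program; the queue and execution estimates are invented numbers used only for illustration, not measurements from the project.

```python
# Illustrative only: the queue/execution estimates below are invented numbers.

def pts(wait_in_queue_s: float, execution_s: float) -> float:
    """Program Time in System (PTS) = waiting time in the queue + execution time."""
    return wait_in_queue_s + execution_s

# Hypothetical estimates for one MPI program:
hpc_s = pts(wait_in_queue_s=7200.0, execution_s=600.0)   # supercomputer: long queue, fast run
hpc_c = pts(wait_in_queue_s=300.0, execution_s=1500.0)   # cloud cluster: short queue, slower run

target = "HPC-S" if hpc_s <= hpc_c else "HPC-C"
print(f"PTS(HPC-S) = {hpc_s:.0f} s, PTS(HPC-C) = {hpc_c:.0f} s -> run on {target}")
```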

The mission of the MC2E project was to investigate how an environment should be organized to make it possible to create an effective information and computing infrastructure meeting the listed features of a specific interdisciplinary project. One of the still little-studied problems of this mission is the integration of two rather different high-performance computing environments, supercomputers (HPC-S) and cloud environments (HPC-C), which differ in many respects: the level and methods of resource management, virtualization techniques, the composition and form of the parameters specifying a program execution request, scheduling, and resource allocation policy. Resource allocation in an HPC-S environment is characterized by reservation and isolation methods, meaning that different computations cannot share the same physical resource. HPC-C environments, by contrast, rely on provisioning on demand. A cloud environment offers more flexibility and convenience for working with resources than HPC-S, providing virtualized resources customized for specific purposes. These differences between the HPC-S and HPC-C platforms make it difficult to switch tasks between them automatically if either becomes heavily loaded; to change the target platform, researchers need to spend time and resources retuning their software.

Another challenge on the path to integrating HPC-S and HPC-C is how to choose, in a heterogeneous integrated environment of the two platforms, the right computer for executing an MPI program, that is, a program that uses the MPI (Message Passing Interface) library, a tool for interaction between the parallel branches of the same program. In other words, for each MPI program in the queue, it is necessary to decide where its execution will be more efficient from the point of view of the PTS criterion. Dealing with the problem of integration, it is also necessary to substantiate the hypothesis that the joint use of physical resources in an HPC-C environment by several MPI programs running simultaneously will reduce the total execution time; that is, this time will be less than the sum of the sequential execution times of each.

Another difficulty lies in the ability of the environment to aggregate the resources of the data transmission network between DPCs in the DCN (Data Communication Network). The key problem here is the implementation of the Bandwidth on Demand (BoD) service. This service must allocate, upon request, channels between two or more DPCs with the appropriate quality, that is, capable of transmitting a certain amount of data in a certain time interval through a data transmission network, for example, the Internet. Note that such a service does not imply a dedicated channel between interacting computers. Moreover, it should be created dynamically by combining existing network resources (transmission lines, switches, etc.).

After the introductory theses, let us present the main results of the MC2E project: principles of building the structure of the MC2E environment; an experimental study of the influence of a network in a DPC on the sharing of processors in the clouds and the possibility of joint use of these resources by MPI programs; new approaches to predicting the execution time of programs on a certain set of computers to determine the most efficient place for program execution; approaches to the implementation of the BoD service. For a detailed report on the project, see [3].

THE MC2E STRUCTURE

Let us list the basic principles of organizing the MC2E environment:

• The infrastructure is a combination of locations equipped with computers, storage, and network resources, called federates.

• The federation controls all resources (central processing unit, memory, data network, software) provided by the federates.

• The resources of one federate can simultaneously be used by different projects.

• Federated resources can be virtualized.

• Resources at the user level are characterized by a high level of abstraction, and using them should not require high user qualifications.

• The results of experiments should always be stored and can be used by others to reproduce or continue the experiment.

• The federation provides data processing services; in other words, a federation is a computer.

Figure 1 shows the organization of the MC2E environment with the distribution of responsibilities among the international project participants.

Fig. 1. MC2E structure.

An example of a federated computing resource can be a high-performance computer or a DPC. Each federation has its own policy that governs the allocation of resources among project participants. The idea of building an information and computing infrastructure as a federation of heterogeneous computing facilities has already been used in many existing projects. Some of them were intended for experiments in the field of computer networks. For example, the GENI project (Global Environment for Network Innovation) [4], initiated by the US National Science Foundation (NSF), is a virtual laboratory for network experiments on an international scale. Today, over 200 US universities are connected to the GENI environment. Another NSF-supported project, FABRIC [5], offers an adaptive programmable infrastructure for computer science research. Similar but less-well-known projects include Ophelia [6] and Fed4Fire [7]. The former was implemented with the support of the EU, while the latter was supported by 17 companies from eight European countries. There are also others that provide an environment for computational experiments regardless of the applied field [8–10]. However, all these projects have several significant disadvantages:

• poor integration between HPC-S and HPC-C, which does not allow automatic selection of an effective computing resource;

• the inability to build a chain of services for conducting experiments;

• the impossibility of automatically transferring existing software to a new environment;

• resource scheduling that does not account for the scaling of service performance;

• lack of DCN control and management services such as monitoring and BoD services.

The basic technologies for the MC2E project in the development of a virtual infrastructure for interdisciplinary research were software-defined networking (SDN) and network function virtualization (NFV) [11]. These technologies make it possible to increase the level of abstraction of resources and ensure their consistent optimization and automation of infrastructure management. Instead of separate resources, users get virtual infrastructures fully equipped with the necessary resources—computing power, communication channels, and storage systems—with guaranteed performance and quality of service (QoS) based on a service level agreement (SLA).

In fact, the MC2E project used two cloud environments: Docklet [12] and Cloud Conductor (C2) [13]. The goal of Docklet is to provide a personalized workspace in the cloud [14]. Docklet is based on the LXC virtual cluster [15]. Docklet users interact directly in their workspace, using the browser to develop, debug, and test their software with the help of the tools that the environment provides. Docklet makes it possible to build customized small virtual datacenters by creating virtualized clusters and then providing experimenters with a customized workspace in the cloud. Users only need a modern browser to access their federated workspace from anywhere on the Internet, anytime.

The architecture of the C2 platform is based on the reference implementation of the ETSI NFV MANO model [13]. C2 provides full support for the life cycle of virtualized services (VSs): initialization, configuration, execution, and deinitialization, which is carried out using so-called templates in the TOSCA language [13]. All that is needed to install and start a VS is a TOSCA template, which includes a description of the structure of the cloud application; the application management policy; and the image of the operating system and scripts for starting, stopping, and configuring the application that implements the VS. The TOSCA template is packaged by the service administrator as a zip or tar archive.
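For illustration only, here is a schematic Python rendering of the kind of information such a template carries; the field names are invented and do not reproduce the actual TOSCA schema or the C2 template format, where the template is written in TOSCA and packed into a zip or tar archive together with the image and scripts.

```python
# Schematic only: field names are illustrative, not the real TOSCA/C2 schema.
import json

vs_template = {
    "description": "structure of the cloud application (virtualized service)",
    "topology": {
        "node": "mpi-worker",
        "flavor": {"vcpu": 1, "ram_mb": 1024},
        "image": "ubuntu-16.04-mpi.qcow2",            # operating system image
    },
    "management_policy": {"scaling": "manual", "restart_on_failure": True},
    "lifecycle_scripts": {                             # starting, stopping, configuring the VS
        "configure": "scripts/configure.sh",
        "start": "scripts/start.sh",
        "stop": "scripts/stop.sh",
    },
}

print(json.dumps(vs_template, indent=2))
```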

Thus, a federated environment has the following advantages:

• the simplicity and speed of allocation, customization, and scaling of resources;

• application development as a chain of services based on the NFV technology;

• the unification of infrastructures of different research groups and the coordination of their access policies;

• automated resource scheduling to fulfill user requests based on access policy and SLA requirements;

• a templating language for describing applications that allows one to abstract away from low-level system details;

• a decentralized resource accounting system for mutual settlements between project participants;

• ample opportunities for tracking and monitoring experiments;

• a high efficiency of network virtualization thanks to the technology of software-defined networks (SDNs), which makes it possible to configure virtualized network channels for each specific experiment;

• a common specification language required for porting existing research software to the MC2E environment.

Let us list the main components or subsystems of the MC2E environment:

• meta-cloud, which orchestrates custom applications and handles their distribution and scheduling among the federates;

• interface, which provides users with a unified API for submitting their applications and provides the federation administrator with means of managing and controlling federation resources;

• network, which regulates the use of network resources and provides the BoD service;

• monitor, which monitors and records the resource consumption of all federates in the MC2E;

• quality of service and administrative control and management, which ensure compliance with the resource use policy based on user requirements (SLA) and guarantee the reliability of the resources.

These components are described in detail in [16]. The general process for executing a user request in MC2E is as follows (a schematic sketch of these steps is given after the list):

(1) Using a single MC2E interface, the user sends the application and data to the front-end server.

(2) The front-end server calls the meta-cloud scheduler and monitor to select a federate to run the application.

(3) The meta-cloud analyzes the queue and predicts application execution times and data transfer times for all available federates.

(4) Based on the forecast, the meta-cloud chooses a federate that minimizes the total time the application spends in the system (the PTS criterion is the waiting time in the queue + the execution time).

(5) The meta-cloud contacts the network control loop to establish a channel to the federate selected in step 4 and sends the application and its data there.

(6) The federate launches the application and returns the results to the user.

(7) In the event of a failure, the QoS monitoring and management subsystem redirects the application, via the meta-cloud, to another federate.
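A minimal sketch of steps 2–4, the part of the flow where the federate is chosen by the PTS criterion; the federate names, time estimates, and helper functions are assumptions introduced for illustration and are not the actual MC2E interfaces.

```python
# Illustrative sketch of steps 2-4; names and numbers are invented.

def predict_pts(app: str, federate: dict) -> float:
    """Step 3: forecast waiting, execution, and data-transfer times for a federate."""
    return federate["queue_s"] + federate["exec_s"][app] + federate["xfer_s"]

def choose_federate(app: str, federates: list) -> dict:
    """Step 4: pick the federate minimizing the predicted time in the system."""
    return min(federates, key=lambda f: predict_pts(app, f))

federates = [
    {"name": "HPC-S federate", "queue_s": 7200, "xfer_s": 120, "exec_s": {"cg.C.64": 600}},
    {"name": "HPC-C federate", "queue_s": 300, "xfer_s": 480, "exec_s": {"cg.C.64": 1500}},
]

best = choose_federate("cg.C.64", federates)
print("submit to", best["name"])   # steps 5-6: lay the channel, transfer the application, run it
```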

THE CLOUD AS A HIGH-PERFORMANCE COMPUTING ENVIRONMENT

Even though the computing speed in cloud environments is lower than in specialized server clusters or supercomputers [17], this technology is becoming increasingly popular and is being chosen as a platform for high-performance computing due to its low cost and ease of access. Several articles [18, 19] show that one of the main bottlenecks in HPC-C performance is the delay in the data transmission network (DTN) of the data center. Supercomputers use fast DTNs based on specialized interconnects [20, 21], whereas HPC-Cs mainly rely on conventional Ethernet. DTN delays can lead to underutilization of processors (CPUs) by applications with a high data exchange rate, since such applications can experience long “pauses” in computation while waiting for data transfers over the network between application branches.

One of the important results of the MC2E project is the experimental substantiation of the hypothesis that the performance of HPC applications may degrade only slightly when CPU cores are shared. This hypothesis was tested within the MC2E project using the HPC benchmark suite NAS Parallel Benchmarks (NPB) [22] in a cloud environment of a mini data center with a QEMU/KVM hypervisor and 64 virtual machines (VMs) (Ubuntu 16.04, 1 vCPU, 1024 MB RAM) running MPI version 3.2. The master server hosted 16 VMs, and the remaining servers hosted eight VMs each. The average latency between different virtual machines was about 400 μs. The bandwidth was 18.2 GB/s within one server and 5.86 GB/s between different servers.

A series of experiments was conducted to investigate the effect of the network bandwidth on CPU utilization. Five NPB MPI programs with 2, 4, 8, 16, 32, and 64 processes (one process per virtual machine) were launched in turn in a network with three channel bandwidths: 100 MB/s, 1 GB/s, and 10 GB/s. The experiments showed that CPU utilization degrades almost linearly as the number of MPI processes grows. This happened because different MPI processes were executed on different VMs, and data were transmitted between different servers. CPU utilization also dropped when the program was running on the same physical server (2, 4, and 8 CPUs). This drop in CPU utilization is exactly what makes it possible to share the cores of the same CPU between different programs. This series of experiments, with graphs, is presented in detail in [16].

Another series of experiments examined the ability of different HPC applications to share cores of the same CPU. The experimental methodology was as follows: five pairs of identical NPB MPI programs with the same number (2, 4, 8, 16, 32, and 64) of processes in each were studied sequentially. To assess the ability of a program to share CPU resources, the following metric was used:

$$\text{Queue metric} = \frac{T_{\text{pure}}^{1} + T_{\text{pure}}^{2}}{\max\!\left(T_{\text{sharing}}^{1}, T_{\text{sharing}}^{2}\right)},$$

where \(T_{\text{pure}}^{i}\) (i = 1, 2) is the runtime without resource sharing and \(T_{\text{sharing}}^{i}\) (i = 1, 2) is the runtime when the two programs were using the same cores of the same CPU.

Obviously, if the metric value is greater than 1, the execution of two programs launched at the same time will take less time than when they are launched sequentially. The experiments showed that even in an environment with a slow network (100 MB/s), one can get up to 20% faster execution. However, not all programs can efficiently share physical resources. This series of experiments is described in detail in [16].
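As a minimal illustration of how the metric is applied (the timings below are placeholders, not measured values):

```python
# Queue metric from the formula above; the timings are placeholders.

def queue_metric(t_pure_1: float, t_pure_2: float,
                 t_sharing_1: float, t_sharing_2: float) -> float:
    """> 1 means co-scheduling the two programs beats running them one after another."""
    return (t_pure_1 + t_pure_2) / max(t_sharing_1, t_sharing_2)

m = queue_metric(t_pure_1=100.0, t_pure_2=110.0, t_sharing_1=160.0, t_sharing_2=175.0)
print(f"queue metric = {m:.2f} -> {'co-schedule' if m > 1.0 else 'run sequentially'}")
```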

PREDICTING PROGRAM EXECUTION TIME

Recall that one of the main criteria for the efficiency of a cloud computing environment for HPC applications is the time spent by the program in this environment (PTS criterion). This value depends on the algorithms for resource allocation in the cloud environment (mapping of virtual computers to physical ones) and the discipline of servicing the program queue, considering the heterogeneity of physical computers.

From the analysis of the user request execution process presented at the end of the “MC2E Structure” section, we can see that stages 3 (metaplanning) and 4 (local planning) are the optimization points for the PTS criterion. An important part of the MC2E project was the research and development of an algorithm for selecting the most efficient computer in the federation for a specific program by the criterion of minimum execution time, an important component of the PTS criterion. To this end, several algorithms were studied and developed for predicting the execution time of a program on a certain set of computers, taking into account the history of program executions on different computers (for a detailed description of the algorithms, see [23]).

The problem of predicting the execution time of a program on a certain computer is well known and classical. For example, the execution time of a program on a particular computer, as well as its waiting time in a queue, can be predicted from the history of its runs on this computer [24–27], for which one can use many extrapolation algorithms, e.g., [28, 29], regression [30], or more complex algorithms, in particular, an ensemble of decision trees (random forest) [31]. The main disadvantage of these algorithms is that they can only be applied to the same computer. Yet the essence of this task in MC2E was to predict the execution time of a program on a certain set of computers. Of course, the above-mentioned algorithms can be used to estimate the execution time of programs on several computers. However, this requires the history of launches of each program on each computer in the set, and this information is unavailable.

The choice of the computer is carried out as follows. One of the well-known algorithms [24–27] estimates the execution time of a program from the histories of its execution on each computer from a certain set. An example of such a history can be the program execution trace [32]. Based on the data obtained, one can either build a schedule for a group of programs or, following a greedy strategy, send each program to a computer where it has the minimum execution time. However, this requires all the execution histories of all programs on each of the computers in the set under consideration. As a rule, there is no such information.

One of the important and new results of the MC2E project is the development of a method for predicting the program execution time, which makes it possible to weaken the requirement for data on all execution histories of all programs on each of the computers in the set under consideration: to predict the program execution time on computers from a certain set, only the execution histories of this program on some of them are necessary (for a detailed description of the method, see [23]). In other words, there is no need to run every program on every computer.

The main idea of the new approach to predicting the program execution time is that the problem under consideration is similar to the one solved by recommendation systems, or recommender systems (RSs) [33]. An RS is a subclass of information filtering systems that seek to predict the rating or preference a user would give to some object [34]. Examples of such objects are films, books, or other goods. Thus, a recommender system restores and predicts the relationships between users and products based on individual user preference ratings.

An RS operates on a rating matrix whose rows (or columns) correspond to movies, books, or goods and whose columns (or rows) correspond to users. This matrix is usually sparse because there are so many users and items that no user can physically evaluate all of the items in question. The system tries to predict the preferences of each user for each item based on the individual user ratings for some of the items. In other words, the RS must fill in the empty cells of the rating matrix. In these terms, consider the following analogy: users are computers, items are programs, and user ratings are program execution times. Thus, the computers “evaluate” the programs, and the lower the rating (execution time), the better.

As a result, the problem of predicting the program execution time was reduced to the problem of filling the empty cells of the “Program–Computer” matrix built for a given set of programs and a given set of computers. Each cell of this matrix holds the execution time of a specific program with a specific data set on a specific computer.

Clearly, the prediction accuracy depends on the number of known execution histories of programs on the computers from the set. We investigated two approaches to this problem. The first was based on grouping computers by the Pearson correlation [35]; it was shown in [23] that this approach is expedient in the case of a densely filled (at least 95% of cells) “Program–Computer” matrix.
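The following is a minimal sketch of the grouping idea for a dense matrix: compute pairwise Pearson correlations between the rows of execution times and group strongly correlated computers. The numbers and threshold are invented, and the actual algorithm in [23] is more elaborate.

```python
import numpy as np

# Dense "Program-Computer" matrix: rows are computers, columns are programs,
# entries are execution times (invented numbers).
times = np.array([
    [120.0,  80.0, 300.0,  45.0],
    [130.0,  85.0, 310.0,  50.0],
    [ 60.0,  40.0, 150.0,  25.0],
])

corr = np.corrcoef(times)          # pairwise Pearson correlation between computers (rows)

THRESHOLD = 0.99                   # invented grouping threshold
pairs = [(i, j) for i in range(len(times)) for j in range(i + 1, len(times))
         if corr[i, j] >= THRESHOLD]
print("strongly correlated computer pairs:", pairs)
```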

The second approach, developed for a sparse “Program–Computer” matrix, relies on decomposing it into vector representations of computers and programs, so-called embeddings. Embeddings are elements of a relatively small space into which larger vectors can be transformed. The embedding technique simplifies machine learning for big-data cases such as sparse vectors representing words. Ideally, it partially reflects the semantics of the data by placing semantically similar objects close to each other in the embedding space [36]. In [23], it is shown in detail how program and computer embeddings are used to predict the execution time of a program on a particular computer. The technique of decomposing the “Program–Computer” matrix to compute the embeddings is given in [37].
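Below is a minimal sketch of the decomposition idea under simplified assumptions: learn dimension-1 embeddings of computers and programs (plus biases) from the observed cells only, by plain stochastic gradient descent, and use their product to fill the empty cells. The data, loss, and hyperparameters are invented; the actual decomposition technique is described in [23, 37].

```python
import numpy as np

rng = np.random.default_rng(0)

n_computers, n_programs, dim = 4, 6, 1       # dimension-1 embeddings, as in the text
# Observed cells of the "Program-Computer" matrix: (computer, program, log execution time).
observed = [(0, 0, 4.8), (0, 2, 5.7), (1, 0, 4.9), (1, 3, 3.9),
            (2, 1, 4.1), (2, 4, 6.0), (3, 2, 5.2), (3, 5, 4.4)]

C = rng.normal(scale=0.1, size=(n_computers, dim))   # computer embeddings
P = rng.normal(scale=0.1, size=(n_programs, dim))    # program embeddings
bc = np.zeros(n_computers)                           # computer biases
bp = np.zeros(n_programs)                            # program biases
mu = np.mean([t for _, _, t in observed])            # global mean

lr, reg = 0.05, 0.01
for _ in range(2000):                                # plain SGD over the observed cells
    for i, j, t in observed:
        err = t - (mu + bc[i] + bp[j] + C[i] @ P[j])
        bc[i] += lr * (err - reg * bc[i])
        bp[j] += lr * (err - reg * bp[j])
        C[i], P[j] = (C[i] + lr * (err * P[j] - reg * C[i]),
                      P[j] + lr * (err * C[i] - reg * P[j]))

# Fill an empty cell: predicted (log) execution time of program 5 on computer 0.
print(float(mu + bc[0] + bp[5] + C[0] @ P[5]))
```

With dim = 1, each computer and each program receives a single scalar embedding, which is what induces the total order discussed below.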

The proposed approach to predicting the program execution time requires only minimal knowledge about the program, which is routinely collected on all modern computers. Another important advantage of the approach is the ability to decompose the “Program–Computer” matrix so as to obtain embeddings of dimension 1 for both programs and computers. This makes it possible to impose a total order on both computers and programs, which greatly simplifies the selection of a computer with an effective execution time. For more details on testing the prediction methods, see [23].

“BANDWIDTH ON DEMAND” (BoD) SERVICE

As was noted at the beginning of this article, one idea investigated in the MC2E project was the problem of dynamically creating a channel between data centers on demand with a given communication quality. An environment designed for interdisciplinary research should have flexible and powerful mechanisms for allocating, scheduling, and administering network resources. Otherwise, the overhead costs of network resources in the data center network will be very high since the load on this resource is sporadic. Naturally, the question arose about the possibility of creating a BoD service capable of forming a channel of proper quality between data centers on demand, that is, transmitting a certain amount of data over a certain time interval through the TCP/IP transport network. It should be emphasized that such a service does not imply a dedicated channel between the interacting parties. Moreover, it should be created dynamically by aggregating existing network resources.

Article [38] details the protocols for aggregating network resources and algorithms for creating a BoD service. Only a logical diagram of the approach proposed in the article cited is presented here. The creation of a BoD service can be divided into two main parts: the identification and aggregation of routes in the data center network and the distribution of data flows between them according to the BoD quality-of-service requirements for each flow.

The words building and aggregating routes mean that, to transfer the data flow between data centers, different routes are used at the same time and several physical lines are aggregated into a logical one. There are many protocols and technologies that make this possible [38]. However, they all leave the problem of quality of service unsolved.

To build routes, it is necessary to solve several problems.

First, how many and what routes are needed to meet the BoD quality-of-service requirements? The routes should have minimal overlap in links and switches/routers. This limitation stems from the way congestion control algorithms work at the transport layer of the network. The maximum number of edge-disjoint paths between two vertices of a graph is given by Menger’s theorem [39], which states that the largest number of edge-disjoint routes from a vertex u to a vertex v equals the smallest number of edges whose removal separates u from v (a minimum u–v edge cut).
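As a minimal sketch of how this number can be computed, the snippet below counts edge-disjoint routes between two data centers as a maximum flow with unit edge capacities, which is exactly the quantity given by Menger's theorem; the toy topology is invented, and this is not the route-selection algorithm used in the project.

```python
from collections import defaultdict, deque

def edge_disjoint_route_count(links, s, t):
    """Number of edge-disjoint s-t routes = max flow with unit edge capacities (Menger)."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v in links:                       # undirected links, capacity 1 in each direction
        cap[u][v] += 1
        cap[v][u] += 1
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:     # BFS for an augmenting path
            u = queue.popleft()
            for v in list(cap[u]):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:         # push one unit of flow along the path found
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1

# Toy DCN topology: two edge-disjoint routes exist between data centers A and D.
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("B", "C")]
print(edge_disjoint_route_count(links, "A", "D"))   # -> 2
```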

Route intersections are not always critical. For example, if a physical line at an intersection has sufficient capacity to meet the quality-of-service requirements of all flows passing through it, then this line, although shared by several routes, is not a critical point. It may also happen that the network topology offers no disjoint alternative routes between the source and the destination. However, if there are high-bandwidth physical links, the network topology graph can be transformed so that the edge corresponding to such a line is replaced by several edges of lower bandwidth. An alternative is to search for routes with the fewest intersections, as is done by the Min Cost Max Flow (MCMF) algorithm [40].

Upon finding the value k, that is, the number of nonoverlapping routes, the network topology graph is processed by a special algorithm to determine k routes between the source and the destination. It turned out that not every algorithm is suitable for this; for example, the greedy algorithm [41] does not solve this problem. The MCMF algorithm [42] turned out to be the right choice: it reduces the original problem to finding the maximum network flow. As a result, a set of routes with sufficient resources to provide the BoD service is identified.

The second question is how to distribute the resources of these routes when transmitting an application flow, that is, how to solve the AFLD (Application Flow Load Distribution) problem. It was divided into three parts: resource estimation, resource allocation, and the implementation of the computed allocation over the routes found. The solution of the first part answers the question of whether the available bandwidth of the identified routes is sufficient to provide the required quality of the BoD service. If it is, then the solution of the second part answers the question of how to distribute the load of the application data flow among these routes. The third part deals with multipath protocols, that is, protocols that use several routes at the same time. These protocols split the application flow into multiple transport subflows, each of which uses its own route.

There are two main approaches to multipath routing at the transport level: static and dynamic. MPTCP [42] is a static approach, which assumes an a priori allocation of a certain number of transport subflows (that is, routes) among which segments of the application flow are distributed. A dynamic approach, such as FDMP [43], allocates a route for a subflow dynamically at the request of the transport agent, depending on how well the total bandwidth of the current set of subflows matches the quality-of-service requirement.

To solve the AFLD problem, a mathematical model was developed for multipath data transfer on demand between data centers under the following assumptions:

• to use the BoD service, a pair of data centers must conclude a contract, which stipulates the maximum allowable volumes of transmitted data and the maximum allowable time for this, the quality of communication, etc.;

• under the same contract, it is impossible for two or more requests for the BoD service to appear simultaneously;

• the probability distribution of the flow of service requests under each contract is known.

Based on this model, the AFLD problem was formulated as a discrete-time integer linear programming (ILP) problem [16]. Its solution provided answers to the following questions: Is the available bandwidth on the identified routes sufficient to satisfy all request flows for a given set of contracts over a certain time interval? How should the bandwidth of each route be allocated among the on-demand flows?
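To convey the flavor of the formulation, here is a schematic single-interval version of such an ILP; the notation is illustrative, and the full discrete-time model with contracts and request flows is given in [16]. Let R be the set of identified routes with capacities \(c_r\), F the set of on-demand flows with required rates \(d_f\), \(x_{f,r} \ge 0\) the integer amount of bandwidth of route r allocated to flow f, and \(y_f \in \{0,1\}\) an indicator that request f is admitted:

$$\max \sum_{f \in F} y_f \quad \text{subject to} \quad \sum_{f \in F} x_{f,r} \le c_r \;\; (r \in R), \qquad \sum_{r \in R} x_{f,r} \ge d_f\, y_f \;\; (f \in F).$$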

A prototype of the above approaches to the BoD service was built on Juniper VMX routers using VPN technology and was tested between the data centers of the Faculty of Computational Mathematics and Cybernetics of Moscow State University and the data centers of Peking University [3]. In April 2021, a pilot project implementing the BoD service between data centers in Moscow and Novosibirsk was successfully completed [44].

* * *

The international project MC2E is aimed at investigating methods for constructing an environment for academic interdisciplinary research. The MC2E environment was built as a federation of local computing environments called federates. Each federate is a high-performance cluster—either a data center or a supercomputer. The advantages of the proposed method include the following:

• a high level of resource management and flexible options for defining virtual environments;

• the opportunity to use the existing software of the researchers;

• high-quality scheduling and efficient resource use according to the PTS criterion;

• freeing the user from routine system administration tasks, as well as defining a unified way to describe the life cycle of a virtualized service in a data center (or in an HPC cluster).

In the course of the project, the hypothesis was experimentally substantiated that it is possible to reduce the average execution time of programs in the cloud. Experiments have shown that, for certain classes of applications, one can reduce the time by up to 20%.

A new solution to the problem of choosing a suitable computer for an MPI program in a heterogeneous environment based on the PTS criterion has been developed. To this end, a new approach to predicting the execution time of an MPI program on a computer has been proposed, even if the program has never been executed on that computer. Two algorithms have been constructed and analyzed: one based on grouping computers using the Pearson correlation (for a dense “Program–Computer” matrix) and one based on the decomposition of such a matrix, which makes it possible to obtain vector representations (embeddings) of programs and computers (for a sparse “Program–Computer” matrix).

Note that the proposed approach to predicting the program execution time requires only a minimal, generally available set of data on the program execution history. Its other important advantage is that, as a result of the matrix decomposition, embeddings of dimension 1 are obtained both for programs and for computers. This makes it possible to establish a total order both on a given set of computers and on a given set of programs, which greatly simplifies the choice of a computer with an effective execution time. As a hypothesis, the project formulates the assumption that this method of predicting the execution time can be applied not only to MPI programs.

In addition, in the course of the MC2E project, a solution for building the BoD service was proposed and investigated. This approach is divided into the problems of route aggregation and application flow distribution. Various options for solving the route aggregation problem were studied, depending on the network topology and the capabilities of the network equipment. The problem of distributing application flows among the aggregated routes was formulated and solved as an ILP problem.

It would be naive to believe that the results presented fully cover all the problems arising in the creation of information and computing environments for interdisciplinary research. Let us list just a few of those that remained outside the scope of the MC2E project: automation of launching programs on various computers in an environment that contains both HPC-C and HPC-S resources; coordinated management of data and network resources in real time; monitoring and analytics, administration, and security in such environments [45]; accounting of consumed resources; settlements between the members of the federation; and the use of edge computing technologies [46].