
1 Introduction

The field of supercomputing has expanded as systems must handle workloads with different characteristics, such as traditional scientific computing and data-intensive computing. Data-intensive computing uses a data-parallel approach to process huge data sets of great diversity, and therefore requires different data-processing approaches. NURION at KISTI, one of the largest supercomputers, runs both a batch queue-based Portable Batch System (PBS) scheduler that manages high-performance computing (HPC) tasks for traditional scientific computing and a Kubernetes platform that manages data-intensive applications, especially big data analysis [1].

Computing nodes are currently divided into static partitions that process HPC workloads and big data analytics workloads separately. However, when one partition does not have enough jobs, its computing nodes become idle, wasting resources even while jobs wait for computing resources in the other partition. Dynamically reallocating computing nodes to partitions according to the number of waiting jobs, as shown in Fig. 1, therefore provides better overall resource utilization of the supercomputer.

Fig. 1
Two diagrams of dynamic redistribution: jobs from the queue are sent to the resource manager, which allocates additional computing nodes to the HPC or DIC partition via the PBS and K8s schedulers.

Dynamically redistributing computing nodes to the HPC and DIC partitions according to tasks waiting in the job queue

A previous study applied dynamic resource partitioning to the Athena development system using Mesos [2] and reported experiences with mixed workloads [3]. In our case, problems arise when frameworks with different software stacks are installed and used on the local disk of the computing nodes. The machine-oriented mini-server (MOM) daemon of the PBS scheduler not only runs and manages jobs and monitors resource usage on each computing node, but also prohibits user daemons from listening on ports. However, a user daemon is unavoidable if the principle of least privilege is to be enforced in a Jupyter notebook service that leverages big data tools [4].

In this paper, we propose an approach to dynamic resource reallocation of computing nodes. Our approach differs from previous solutions in that it isolates the software stacks and reallocates resources in a bare-metal environment, which prevents the problems mentioned above. The remainder of this paper is organized as follows: Sect. 2 introduces the proposed approach, Sect. 3 presents the experimental results and analysis, and Sect. 4 concludes the paper.

2 Methodology

In this section, we describe the key components and the reallocation process for computing nodes. Since HPC and DIC workloads have different characteristics, we first adopted node-level reallocation, meaning that whole computing nodes are allocated to a partition rather than sharing resource components (e.g., memory and network), so that each application can run according to its own characteristics without interference. Second, the matching between tasks requesting resources and the available resources is determined by the number of CPUs. Last but not least, a minimum amount of resources for the partition managed by each scheduler is reserved by default (i.e., non-swappable and non-reallocatable) to prevent starvation of tasks in either partition. The main components of dynamic resource reallocation are briefly described as follows:

  • Task monitoring to gather information about waiting tasks periodically

  • A computing node manager to add and delete computing nodes to and from partitions managed by the scheduler

  • A database (DB) that stores information and status about computing nodes

  • A dynamic node manager that decides whether reallocation is needed by combining the overall system state with information about waiting tasks and available resources (a minimal sketch of this decision step follows the list)
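The decision step can be illustrated with a minimal sketch, assuming a simplified model in which every node contributes a fixed number of cores, each partition reports its idle nodes and the CPU requests of its waiting tasks, and one node per partition is reserved as the non-reallocatable minimum. All names and numbers below are illustrative, not the actual implementation.

```python
# Illustrative sketch of the per-cycle decision step (names are hypothetical).
from dataclasses import dataclass

CORES_PER_NODE = 2                 # assumption: homogeneous nodes, as in the testbed
MIN_NODES = {"hpc": 1, "dic": 1}   # non-reallocatable minimum per partition


@dataclass
class PartitionState:
    allocated_nodes: int   # nodes currently attached to this partition's scheduler
    idle_nodes: int        # attached nodes with no running task
    waiting_cpus: int      # sum of CPU requests of tasks waiting in the queue


def nodes_needed(p: PartitionState) -> int:
    """Extra nodes required to run all waiting tasks, matched by CPU count."""
    required = -(-p.waiting_cpus // CORES_PER_NODE)   # ceiling division
    return max(required - p.idle_nodes, 0)


def plan_reallocation(hpc: PartitionState, dic: PartitionState) -> dict:
    """Decide how many idle nodes to move between partitions in this cycle,
    never dropping a partition below its non-reallocatable minimum."""
    plan = {"dic_to_hpc": 0, "hpc_to_dic": 0}

    movable_dic = min(dic.idle_nodes, dic.allocated_nodes - MIN_NODES["dic"])
    plan["dic_to_hpc"] = min(nodes_needed(hpc), max(movable_dic, 0))

    movable_hpc = min(hpc.idle_nodes, hpc.allocated_nodes - MIN_NODES["hpc"])
    plan["hpc_to_dic"] = min(nodes_needed(dic), max(movable_hpc, 0))
    return plan


# Example mirroring the scenario in Sect. 3: an HPC partition with one idle
# 2-core node and a waiting 4-core HPL job pulls one idle node from DIC:
# plan_reallocation(PartitionState(3, 1, 4), PartitionState(3, 1, 0))
# -> {"dic_to_hpc": 1, "hpc_to_dic": 0}
```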

Figure 2 shows the relationship among the main components and the sequence of the reallocation process. First, the dynamic node manager, the component that performs decision making, periodically collects information on waiting tasks and available resources through each local scheduler. It then calculates the number of nodes required in each partition based on the collected information and decides whether to reallocate currently allocated computing nodes.
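The periodic collection step could query both local schedulers from the controller roughly as in the sketch below; this assumes PBS Pro's `qstat -f -F json` output format and `kubectl`'s JSON output, and the parsing is deliberately simplified.

```python
# Hedged sketch of the periodic collection step: summing CPU requests of
# waiting PBS jobs and pending Kubernetes pods.  Assumes PBS Pro's
# `qstat -f -F json` and `kubectl` are on the controller's PATH.
import json
import subprocess
import time


def _cpu_to_cores(value: str) -> float:
    """Convert a Kubernetes CPU quantity such as '2' or '500m' to cores."""
    return float(value[:-1]) / 1000 if value.endswith("m") else float(value)


def waiting_hpc_cpus() -> float:
    """Sum of requested CPUs over queued ('Q') PBS jobs."""
    out = subprocess.run(["qstat", "-f", "-F", "json"],
                         capture_output=True, text=True, check=True).stdout
    jobs = json.loads(out).get("Jobs", {})
    return sum(float(j.get("Resource_List", {}).get("ncpus", 0))
               for j in jobs.values() if j.get("job_state") == "Q")


def pending_dic_cpus() -> float:
    """Sum of CPU requests over pending Kubernetes pods."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces",
         "--field-selector=status.phase=Pending", "-o", "json"],
        capture_output=True, text=True, check=True).stdout
    return sum(
        _cpu_to_cores(c.get("resources", {}).get("requests", {}).get("cpu", "0"))
        for pod in json.loads(out)["items"]
        for c in pod["spec"]["containers"])


if __name__ == "__main__":
    while True:   # the dynamic node manager repeats this cycle periodically
        print("waiting HPC CPUs:", waiting_hpc_cpus(),
              "| pending DIC CPUs:", pending_dic_cpus())
        time.sleep(60)
```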

Fig. 2
A process flow for dynamic resource reallocation. The 5 stages are gathering resource usage, determining nodes for reallocation, assigning nodes to a new partition, attaching nodes to the scheduler of the new partition, and updating node information.

The main components for dynamic resource reallocation and a sequence in the process

When the resource reallocation policy is met, suitable nodes are selected from the database and rebooted into the disk image of the target partition, so that the selected computing nodes are allocated to the new partition. After the nodes have been attached to the scheduler of the new partition, the database information is updated.
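This execution path can be sketched as follows for moving one node from the DIC partition to the HPC partition. The sketch assumes `kubectl` and PBS Pro's `qmgr` are available on the controller, and the reboot and database helpers are placeholders for site-specific mechanisms.

```python
# Hedged sketch of moving one node from the DIC (Kubernetes) partition to the
# HPC (PBS) partition.  Assumes `kubectl` and PBS Pro's `qmgr` are on the
# controller's PATH; reboot_to_image and update_node_db are placeholders.
import subprocess


def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


def reboot_to_image(node, image):
    """Placeholder: reboot the node into the disk image of the target
    partition (VirtualBox in the testbed, MaaS on real hardware)."""


def update_node_db(node, partition):
    """Placeholder: record the node's new partition in the node database."""


def move_dic_node_to_hpc(node):
    # 1. Detach the node from the Kubernetes scheduler of the DIC partition.
    run(["kubectl", "drain", node, "--ignore-daemonsets", "--force"])
    run(["kubectl", "delete", "node", node])
    # 2. Reboot the node into the HPC disk image (image name is illustrative).
    reboot_to_image(node, image="hpc-pbs-mom")
    # 3. Attach the node to the PBS scheduler of the HPC partition.
    run(["qmgr", "-c", f"create node {node}"])
    # 4. Update the node information in the database.
    update_node_db(node, partition="hpc")
```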

3 Results and Discussion

Figure 3 shows the testbed used to demonstrate the feasibility of our approach. A bare-metal environment is advantageous for providing stable services when managing task execution environments and services based on heterogeneous platforms. Although the testbed was configured with virtual machines, the environment can easily be applied to the real computing nodes of the supercomputer by changing the computing node manager from the VirtualBox hypervisor to a MaaS (Metal as a Service) management tool.
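A minimal sketch of that swap, assuming the computing node manager exposes a single reboot-to-image operation; the VBoxManage and MAAS CLI invocations, VM names, storage controller names, and image identifiers below are assumptions rather than the actual configuration.

```python
# Hedged sketch of two computing node manager backends: VirtualBox for the
# testbed and MAAS for real hardware.  CLI arguments are assumptions.
import subprocess


class VirtualBoxNodeManager:
    """Testbed backend: each 'computing node' is a VirtualBox VM."""

    def reboot_to_image(self, vm: str, image_vdi: str) -> None:
        subprocess.run(["VBoxManage", "controlvm", vm, "poweroff"], check=False)
        # Attach the disk image of the target partition (controller and port
        # names depend on how the VM was created).
        subprocess.run(["VBoxManage", "storageattach", vm,
                        "--storagectl", "SATA", "--port", "0", "--device", "0",
                        "--type", "hdd", "--medium", image_vdi], check=True)
        subprocess.run(["VBoxManage", "startvm", vm, "--type", "headless"],
                       check=True)


class MaasNodeManager:
    """Production backend: redeploy a bare-metal node through the MAAS CLI."""

    def __init__(self, profile: str):
        self.profile = profile

    def reboot_to_image(self, system_id: str, distro_series: str) -> None:
        subprocess.run(["maas", self.profile, "machine", "release", system_id],
                       check=True)
        subprocess.run(["maas", self.profile, "machine", "deploy", system_id,
                        f"distro_series={distro_series}"], check=True)
```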

Fig. 3
An architecture diagram of the testbed environment. The computing node manager is at the bottom, with six CPUs hosting pairs of K8s and PBS workers above it. The controller, the K8s master, and the PBS master are on the left.

A testbed environment to demonstrate the feasibility of dynamic resource reallocation

Figure 4 confirms that overall throughput is improved by dynamically reallocating resources in the test scenario. In the initial environment, the PBS scheduler and Kubernetes each manage 3 nodes, with 2 CPU cores per node. The High Performance LINPACK (HPL) benchmark requests 4 CPU cores per job, and the pbs-worker2 and pbs-worker3 nodes are assigned to the first job, as shown in Fig. 4a. Although idle resources exist elsewhere, the resources allocated to the HPC partition are insufficient to process another HPL job requiring 4 cores, so the jobs wait in the queue. At this point, based on the resource reallocation policy, a node of the DIC partition is reallocated to the HPC partition (as pbs-worker4). As shown in Fig. 4b, the waiting job is then allocated resources and executed, changing to the running state.
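For reference, a job like the ones in this scenario could be submitted to the PBS-managed partition roughly as follows; the select statement mirrors the 4-core (2 nodes x 2 cores) request, while the script body and the location of the HPL binary are assumptions.

```python
# Hedged sketch: submitting an HPL job that requests 4 CPU cores as
# 2 nodes x 2 cores, matching the test scenario.
import subprocess

JOB_SCRIPT = """#!/bin/bash
#PBS -N hpl-4cpu
#PBS -l select=2:ncpus=2:mpiprocs=2
cd $PBS_O_WORKDIR
mpirun ./xhpl   # assumes an MPI-built HPL binary in the submission directory
"""

# qsub reads the job script from standard input when no file is given.
subprocess.run(["qsub"], input=JOB_SCRIPT, text=True, check=True)
```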

Fig. 4
Four tables of scheduler output. The tables showing jobs waiting because of insufficient resources and jobs running after resource reallocation in the PBS and K8s schedulers have the relevant rows highlighted. The columns include job name, session ID, and elapsed time for PBS, and name, status, and restarts for K8s.

A sample scenario showing the improved throughput achieved by dynamically reallocating computing nodes between the partitions

In Fig. 4c, two jobs using the Spark framework are submitted, requesting resources of the DIC partition through Kubernetes. Each job creates one driver and one worker process, and each process requires 2 cores. Since 4 CPU cores are available (2 nodes with 2 cores each), one job could be executed; however, the driver process of each job preempts 2 cores, and because the Kubernetes scheduler is non-preemptive, the worker processes cannot be allocated resources and the tasks remain pending. After the resource reallocation decision is made, the pbs-worker3 node of the HPC partition is reassigned to the DIC partition as k8s-worker3. The resources are then allocated to the worker of the previously pending task in the DIC partition, and all tasks change to the running state (Fig. 4d).
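A hedged sketch of submitting one of these Spark-on-Kubernetes jobs, with a 2-core driver and a single 2-core executor; the API server address, container image, application class, and jar path are placeholders.

```python
# Hedged sketch of one Spark-on-Kubernetes submission from the test scenario:
# a 2-core driver plus one 2-core executor.  Bracketed values are placeholders.
import subprocess

subprocess.run([
    "spark-submit",
    "--master", "k8s://https://<k8s-apiserver>:6443",
    "--deploy-mode", "cluster",
    "--name", "spark-dic-job",
    "--class", "org.apache.spark.examples.SparkPi",
    "--conf", "spark.driver.cores=2",       # the driver preempts 2 cores first
    "--conf", "spark.executor.cores=2",     # the executor also needs a 2-core node
    "--conf", "spark.executor.instances=1",
    "--conf", "spark.kubernetes.container.image=<spark-image>",
    "local:///path/to/app.jar",             # placeholder application jar
], check=True)
```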

4 Conclusion

This work enhances the overall system utilization of a supercomputer by dynamically reallocating resources, and it solves the conflict problems between heterogeneous platforms executed on the same computing node by applying node-level reallocation in a bare-metal environment. We implemented a testbed and simulated a test scenario to demonstrate the feasibility of dynamic resource reallocation, and the results show that our approach can improve system utilization.