Abstract
With the growing complexity of workflows brought by the recent integration of machine learning, deep learning and big data analytics techniques, there is an ever-increasing demand for compute, network and storage resources, which require innovative approaches to their management as well as easy access and use (including the cloud model). Although resources are abundant in today’s HPC infrastructures, they remain shared among users, and for certain use cases (e.g., urgent computing applications) they may still not be enough to fulfil the workflow requirements. Also, specific computing resources (e.g., hardware accelerators) may be accessible only within certain datacentres. To cope with these challenges, a secure interconnection among multiple HPC datacentres that allows mutual access to their resources (federation) is considered. This paper focuses on the extension of the SimGrid software library, a C++ based simulation framework, for evaluating the job allocation strategies that lie at the core of a federated execution platform. A greedy allocation strategy has been evaluated against random and round-robin approaches; this greedy strategy has then been integrated within the main orchestration service developed in the context of the LEXIS federated execution platform. Tests with real workflows showed the capability of this greedy allocation strategy to dynamically select the most suitable execution cluster for different jobs.
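To make the comparison between the three strategies concrete, the following is a minimal sketch and not the authors' SimGrid extension or the LEXIS orchestrator. It assumes a toy model in which each cluster has a single queue and a fixed processing speed, and in which "greedy" means picking the cluster with the earliest estimated finish time; the abstract does not state the actual greedy criterion. All names (`Cluster`, `allocate`, `greedy`, and so on) are hypothetical.

```python
import random
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Cluster:
    """Toy cluster model (an assumption, not taken from the paper)."""
    name: str
    speed: float              # hypothetical: work units processed per second
    busy_until: float = 0.0   # time at which the cluster's queue drains


def finish_time(cluster, work):
    """Estimated completion time if `work` is appended to this cluster's queue."""
    return cluster.busy_until + work / cluster.speed


def allocate(jobs, clusters, pick):
    """Assign every job with pick(work, clusters) and return the makespan."""
    for work in jobs:
        chosen = pick(work, clusters)
        chosen.busy_until = finish_time(chosen, work)
    return max(c.busy_until for c in clusters)


def greedy(work, clusters):
    """Assumed greedy rule: choose the cluster with the earliest finish time."""
    return min(clusters, key=lambda c: finish_time(c, work))


def make_round_robin(clusters):
    """Cycle through the clusters in order, ignoring their load and speed."""
    order = cycle(clusters)
    return lambda work, _clusters: next(order)


def make_random(seed):
    """Pick a cluster uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return lambda work, clusters: rng.choice(clusters)


def fresh_clusters():
    """Three heterogeneous clusters with made-up speeds."""
    return [Cluster("A", 4.0), Cluster("B", 2.0), Cluster("C", 1.0)]


if __name__ == "__main__":
    jobs = [8.0] * 6  # six identical jobs of 8 work units each (illustrative)

    cs = fresh_clusters()
    print("greedy     :", allocate(jobs, cs, greedy))

    cs = fresh_clusters()
    print("round-robin:", allocate(jobs, cs, make_round_robin(cs)))

    cs = fresh_clusters()
    print("random     :", allocate(jobs, cs, make_random(seed=0)))
```

In this toy setting the greedy rule accounts for both cluster speed and current load, whereas round-robin and random selection ignore them, which is why they can leave the slowest cluster on the critical path.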
Acknowledgements
This work was supported by the LEXIS project, funded by the EU’s Horizon 2020 research and innovation programme (2014–2020) under grant agreement No. 825532.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vitali, G., Scionti, A., Viviani, P., Vercellino, C., Terzo, O. (2022). Dynamic Job Allocation on Federated Cloud-HPC Environments. In: Barolli, L. (eds) Complex, Intelligent and Software Intensive Systems. CISIS 2022. Lecture Notes in Networks and Systems, vol 497. Springer, Cham. https://doi.org/10.1007/978-3-031-08812-4_8
DOI: https://doi.org/10.1007/978-3-031-08812-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08811-7
Online ISBN: 978-3-031-08812-4
eBook Packages: Intelligent Technologies and Robotics (R0)