Abstract
The self-tuning dynP scheduler for modern cluster resource management systems switches dynamically between different basic scheduling policies at run time. This allows it to react to changing characteristics of the waiting jobs. In this paper we present enhancements to the decision process of the self-tuning dynP scheduler and evaluate their impact on performance: (i) During a self-tuning step, a performance metric is needed to rank the schedules generated by the different basic scheduling policies. The choice of metric allows different objectives for the self-tuning process, e.g. more user-centric by improving the response time, or more owner-centric by improving the makespan. (ii) Furthermore, the self-tuning process can be invoked at different points of the scheduling process: only when the characteristics of the waiting jobs change, i.e. when new jobs are submitted (half self-tuning); or whenever the schedule changes, i.e. when jobs are submitted or running jobs terminate (full self-tuning).
We use discrete event simulations to evaluate the achieved performance. As job input for driving the simulations we use original traces from real supercomputer installations. The evaluation of the two enhancements to the decision process of the self-tuning dynP scheduler shows that good performance is achieved if the self-tuning metric is the same as the metric used to measure the overall performance at the end of the simulation. Additionally, calling the self-tuning process only when new jobs are submitted is sufficient in most scenarios, and the performance difference to full self-tuning is small.
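The decision process described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the policy set (FCFS, SJF, LJF), the simple non-backfilling planner, the tie-breaking rule (first policy in dictionary order wins), and all names such as `Job`, `plan`, `self_tuning_step`, and `should_tune` are assumptions made for illustration only. The sketch shows how a self-tuning step ranks the schedule of each basic policy with a chosen metric, and how half and full self-tuning differ only in which events trigger that step.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    name: str
    submit: float   # submission time
    runtime: float  # estimated run time
    width: int      # number of nodes requested

# Assumed set of basic policies, expressed as sort keys over the waiting jobs.
POLICIES = {
    "FCFS": lambda j: j.submit,
    "SJF": lambda j: j.runtime,
    "LJF": lambda j: -j.runtime,
}

def plan(jobs, policy, now, node_free):
    """Place jobs in policy order on the earliest-free nodes (no backfilling)."""
    free = sorted(max(now, t) for t in node_free)
    schedule = {}
    for job in sorted(jobs, key=POLICIES[policy]):
        start = free[job.width - 1]  # earliest time at which `width` nodes are free
        end = start + job.runtime
        free = sorted([end] * job.width + free[job.width:])
        schedule[job.name] = (start, end)
    return schedule

def avg_response_time(jobs, schedule):
    """User-centric metric: mean of (completion time - submission time)."""
    return sum(schedule[j.name][1] - j.submit for j in jobs) / len(jobs)

def makespan(jobs, schedule):
    """Owner-centric metric: completion time of the last job."""
    return max(end for _, end in schedule.values())

def self_tuning_step(jobs, now, node_free, metric):
    """Build one schedule per policy and return the policy with the lowest metric."""
    best = None
    for policy in POLICIES:
        value = metric(jobs, plan(jobs, policy, now, node_free))
        if best is None or value < best[1]:  # on ties, the earlier policy is kept
            best = (policy, value)
    return best[0]

def should_tune(event, mode):
    """Half self-tuning reacts to submissions only; full also to job terminations."""
    if mode == "half":
        return event == "submit"
    return event in ("submit", "finish")

# Hypothetical workload: four single-node jobs waiting on a two-node machine.
jobs = [Job("J1", 0, 4, 1), Job("J2", 1, 1, 1), Job("J3", 2, 2, 1), Job("J4", 3, 3, 1)]
print(self_tuning_step(jobs, now=3, node_free=[0, 0], metric=avg_response_time))  # SJF
print(self_tuning_step(jobs, now=3, node_free=[0, 0], metric=makespan))           # LJF
```

In this toy workload the two metrics select different policies, which illustrates why the choice of self-tuning metric matters. The real decider may use more elaborate rules than the lowest-value-wins comparison assumed here.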
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Streit, A. (2005). Enhancements to the Decision Process of the Self-Tuning dynP Scheduler. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2004. Lecture Notes in Computer Science, vol 3277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11407522_4
DOI: https://doi.org/10.1007/11407522_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25330-3
Online ISBN: 978-3-540-31795-1
eBook Packages: Computer Science (R0)