Result: Adaptive and virtual reconfigurations for effective dynamic job scheduling in cluster systems

Title:
Adaptive and virtual reconfigurations for effective dynamic job scheduling in cluster systems
Source:
Distributed computing systems (Vienna, 2-5 July 2002). Proceedings of the ... International Conference on Distributed Computing Systems: 35-42
Publisher Information:
Los Alamitos CA: IEEE, 2002.
Publication Year:
2002
Physical Description:
print, 14 ref
Original Material:
INIST-CNRS
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Department of Computer Science, College of William and Mary, Williamsburg, VA 23187-8795, United States
ISSN:
1063-6927
Rights:
Copyright 2004 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.15759706
Database:
PASCAL Archive

Further Information

In a cluster system with dynamic load sharing support, a job submission or migration to a workstation is determined by the availability of CPU and memory resources of the workstation at the time [3]. In such a system, a small number of running jobs with unexpectedly large memory allocation requirements may significantly increase the queuing delays of the remaining jobs with normal memory requirements, slowing down the execution of individual jobs and decreasing the system throughput. We call this phenomenon the job blocking problem because the big jobs block the execution pace of the majority of jobs in the cluster. Since the memory demand of jobs may not be known in advance and may change dynamically, the possibility that unsuitable job submissions/migrations cause the blocking problem is high, and existing load sharing schemes are unable to handle this problem effectively. We propose a software method, incorporated with dynamic load sharing, which adaptively reserves a small set of workstations through virtual cluster reconfiguration to provide special services to the jobs demanding large memory allocations. This policy follows the principle of the shortest-remaining-processing-time policy. As soon as the blocking problem is resolved by the reconfiguration, the system adaptively switches back to the normal load sharing state. We present three contributions in this study. (1) We quantitatively present the conditions that cause the job blocking problem. (2) We present the adaptive software method in a dynamic load sharing system and show that the adaptive process causes little additional overhead. (3) Using trace-driven simulations, we show that our method can effectively improve cluster computing performance by quickly resolving the job blocking problem. The effectiveness and performance insights are also analytically verified.
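
The switching behaviour described in the abstract (reserve a workstation for large-memory jobs when blocking appears, then return to normal load sharing once it is resolved) can be illustrated with a toy sketch. This is not the authors' algorithm: the names `Job`, `Workstation`, `eligible`, `place`, and `adapt`, the memory cutoff `LARGE_MB`, and the queue-length trigger `QUEUE_LIMIT` are all assumptions made here for illustration. The record does not state the actual blocking conditions, and job migration is omitted.

```python
from dataclasses import dataclass, field

LARGE_MB = 512    # hypothetical cutoff for a "large-memory" job
QUEUE_LIMIT = 2   # hypothetical: this many waiting normal jobs signals blocking


@dataclass
class Job:
    name: str
    mem_mb: int

    @property
    def large(self) -> bool:
        return self.mem_mb >= LARGE_MB


@dataclass
class Workstation:
    name: str
    mem_mb: int
    reserved: bool = False
    jobs: list = field(default_factory=list)

    def free_mb(self) -> int:
        return self.mem_mb - sum(j.mem_mb for j in self.jobs)


def eligible(cluster, job):
    """Workstations this job may use under the current configuration."""
    if any(w.reserved for w in cluster):
        # Reconfigured state: large jobs use only reserved nodes,
        # normal jobs use only unreserved nodes.
        return [w for w in cluster if w.reserved == job.large]
    return list(cluster)  # normal load sharing: every node is a candidate


def place(cluster, job):
    """Put the job on the eligible node with the most free memory, if it fits."""
    fits = [w for w in eligible(cluster, job) if w.free_mb() >= job.mem_mb]
    if not fits:
        return None  # the job has to wait in the queue
    best = max(fits, key=Workstation.free_mb)
    best.jobs.append(job)
    return best


def adapt(cluster, queue):
    """Switch between normal load sharing and the reserved configuration."""
    running_large = [j for w in cluster for j in w.jobs if j.large]
    waiting_normal = [j for j in queue if not j.large]
    reserved = [w for w in cluster if w.reserved]
    if not reserved and running_large and len(waiting_normal) >= QUEUE_LIMIT:
        # Assumed trigger: reserve the node holding the most large-job memory.
        target = max(
            cluster, key=lambda w: sum(j.mem_mb for j in w.jobs if j.large)
        )
        target.reserved = True
    elif reserved and not running_large and not any(j.large for j in queue):
        for w in reserved:
            w.reserved = False  # blocking resolved: back to normal load sharing
```

In this sketch a scheduler would call `adapt` periodically, or on each submission, and then call `place` for each queued job. A real system would also need to migrate running jobs and to estimate memory demand that changes over time, neither of which is modelled here.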