Result: Using model-based clustering to improve predictions for queueing delay on parallel machines

Title:
Using model-based clustering to improve predictions for queueing delay on parallel machines
Source:
Clusters and computational grids for scientific computing. Parallel processing letters. 17(1):21-46
Publisher Information:
Singapore: World Scientific Publishing, 2007.
Publication Year:
2007
Physical Description:
print, 29 ref
Original Material:
INIST-CNRS
Subject Terms:
Control theory, operational research, Electronics, Computer science, Exact sciences and technology, Applied sciences, Computer science; control theory; systems, Software, Computer systems and distributed systems. User interface, Memory organisation. Data processing, Data processing. List processing. Character string processing, Cluster analysis, Behavioral analysis, Distributed computing, Classification, Completeness, Queue, Grouping, High performance, Parallel machines, Parallelism, Batch process, Batch production, Delay, Distributed system, Time series, Real time, Model-based clustering, batch-queuing, parallel systems, queueing delay, trace-based simulation, wait time prediction
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Dept. of Computer Science, University of California Santa Barbara, Santa Barbara, California, United States
ISSN:
0129-6264
Rights:
Copyright 2007 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.18702950
Database:
PASCAL Archive

Further Information

Most space-sharing parallel computers presently operated by production high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources may choose among different queues (charging different amounts) potentially on a number of machines to which they have access. In such a situation, the amount of time a user's job will wait in any one batch queue can be a significant portion of the overall time from job submission to job completion. It thus becomes desirable to provide a prediction for the amount of time a given job can expect to wait in the queue. Further, it is natural to expect that attributes of an incoming job, specifically the number of processors requested and the amount of time requested, might impact that job's wait time. In this work, we explore the possibility of generating accurate predictions by automatically grouping jobs having similar attributes using model-based clustering. Moreover, we implement this clustering technique for a time series of jobs so that predictions of future wait times can be generated in real time. Using trace-based simulation on data from 7 machines over a 9-year period from across the country, comprising over one million job records, we show that clustering either by requested time, requested number of processors, or the product of the two generally produces more accurate predictions than earlier, more naive, approaches and that automatic clustering outperforms administrator-determined clustering.
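
The idea of grouping jobs by a requested attribute and predicting wait time from the matching group can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes a two-component 1-D Gaussian mixture fitted by EM on the log of requested run time (the abstract does not say how the number of clusters is chosen), and it assumes an empirical upper quantile of past waits as the per-cluster prediction. The names `fit_gmm_1d`, `assign`, `predict_wait`, `HISTORY`, and `MODEL`, and all job records, are hypothetical.

```python
import math
from statistics import pvariance


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Assumption: k is fixed by the caller; no model selection is done here.
    Returns (weights, means, variances).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    variances = [max(pvariance(xs), 1e-6)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a degenerate spike
    return weights, means, variances


def assign(x, model):
    """Index of the most probable mixture component for value x."""
    weights, means, variances = model
    scores = [w * normal_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)


def predict_wait(requested_seconds, history, model, q=0.95):
    """Empirical q-quantile of past waits among jobs in the same cluster."""
    target = assign(math.log10(requested_seconds), model)
    waits = sorted(w for r, w in history
                   if assign(math.log10(r), model) == target)
    if not waits:  # fall back to all jobs if the cluster has no history
        waits = sorted(w for _, w in history)
    idx = min(len(waits) - 1, math.ceil(q * len(waits)) - 1)
    return waits[idx]


# Made-up job records: (requested run time in seconds, observed wait in seconds).
HISTORY = [
    (600, 60), (900, 90), (1200, 75), (700, 120), (1000, 80),
    (43200, 5400), (86400, 9000), (36000, 3600), (72000, 7200), (50000, 6000),
]
MODEL = fit_gmm_1d([math.log10(r) for r, _ in HISTORY], k=2)

if __name__ == "__main__":
    for req in (800, 60000):
        print(req, "s requested ->", predict_wait(req, HISTORY, MODEL), "s predicted wait")
```

In this toy data, short and long requests fall into separate clusters, so a short job's prediction is not inflated by the long waits of large jobs. Clustering by requested processor count, or by the product of processors and time, would only change the value passed to `fit_gmm_1d` and `assign`.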