Treffer: Adaptive execution techniques for SMT multiprocessor architectures

Title:
Adaptive execution techniques for SMT multiprocessor architectures
Source:
PPoPP'05 (Proceedings of the 2005 ACM SIGPLAN symposium on principles and practice of parallel programming). :236-246
Publisher Information:
New York NY: ACM Press, 2005.
Publication Year:
2005
Physical Description:
print, 27 ref 1
Original Material:
INIST-CNRS
Document Type:
Konferenz Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Electronics and Telecommunication Research Institute, Yuseong-Gu, Daejeon, 305-530, Korea, Republic of
Dept. of Computer Science and Engineering University of California, San Diego 9500 Gilman Drive, La Jolla, CA 92093-0114, United States
School of Computer Science and Engineering Seoul National University, Seoul 151-744, Korea, Republic of
Rights:
Copyright 2006 INIST-CNRS
CC BY 4.0
Sauf mention contraire ci-dessus, le contenu de cette notice bibliographique peut être utilisé dans le cadre d’une licence CC BY 4.0 Inist-CNRS / Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS / A menos que se haya señalado antes, el contenido de este registro bibliográfico puede ser utilizado al amparo de una licencia CC BY 4.0 Inist-CNRS
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.18182687
Database:
PASCAL Archive

Weitere Informationen

In simultaneous multithreading (SMT) multiprocessors, using all the available threads (logical processors) to run a parallel loop is not always beneficial due to the interference between threads and parallel execution overhead. To maximize performance in an SMT multiprocessor, finding the optimal number of threads is important. This paper presents adaptive execution techniques to find the optimal execution mode for SMT multiprocessor architectures. A compiler preprocessor generates code that, based on dynamic feedback, automatically determines at run time the optimal number of threads for each parallel loop in the application. Using 10 standard numerical applications and running them with our techniques on an Intel 4-processor Hyper-Threading Xeon SMP with 8 logical processors, our code is, on average, about 2 and 18 times faster than the original code executed on 4 and 8 logical processors, respectively.