Title:
Extracting SMP parallelism for dense linear algebra algorithms from high-level specifications
Source:
PPoPP'05 (Proceedings of the 2005 ACM SIGPLAN symposium on principles and practice of parallel programming). :153-163
Publisher Information:
New York NY: ACM Press, 2005.
Publication Year:
2005
Physical Description:
print, 14 ref 1
Original Material:
INIST-CNRS
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Department of Computer Sciences The University of Texas at Austin, Austin, TX 78712, United States
Rights:
Copyright 2006 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.18182679
Database:
PASCAL Archive

Further Information

We show how to exploit high-level information, available as part of the derivation of provably correct algorithms, so that SMP parallelism can be systematically identified. Recent research has shown that loop-based dense linear algebra algorithms can be systematically derived from the mathematical specification of the operation. Fundamental to the methodology is the determination of loop-invariants (in the sense of Dijkstra and Hoare) from which correct loops can be systematically derived. We show how the high-level specification of the operation, together with these loop-invariants, can be exploited to detect the independence of loop iterations. This in turn allows a Workqueuing Model to be used to implement and parallelize the algorithms using task queues, a feature proposed for OpenMP 3.0. Although performance is not the main focus of this paper, results are reported on a 4-CPU Itanium2 server for a concrete example, the symmetric rank-k update operation.
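
The abstract's central idea, that loop iterations shown to be independent can each be handed to a work queue, can be illustrated with a small sketch. This is not the paper's code: it assumes a plain row-block partitioning of the lower triangle of C for the update C := C + A A^T, and it uses Python's `concurrent.futures.ThreadPoolExecutor` as a stand-in for OpenMP task queues. The names `syrk_block` and `syrk_lower_tasks` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def syrk_block(C, A, r0, r1):
    """Update rows r0..r1-1 of the lower triangle of C with A A^T.

    Each call writes only its own rows of C and only reads A, so calls
    on disjoint row ranges do not interfere with one another.
    """
    k = len(A[0])
    for i in range(r0, r1):
        for j in range(i + 1):  # lower triangle only (j <= i)
            C[i][j] += sum(A[i][p] * A[j][p] for p in range(k))


def syrk_lower_tasks(C, A, block=2, workers=4):
    """Compute C := C + A A^T (lower triangle), one queued task per row block.

    Assumption: the thread pool plays the role of the work queue; the
    block size and worker count are illustrative defaults.
    """
    n = len(C)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(syrk_block, C, A, r0, min(r0 + block, n))
            for r0 in range(0, n, block)
        ]
        for f in futures:
            f.result()  # wait for the task and re-raise any exception
    return C


if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6]]
    C = [[0] * 3 for _ in range(3)]
    for row in syrk_lower_tasks(C, A):
        print(row)
```

Because every task touches a disjoint set of rows of C, no locking is needed; that disjointness is the kind of independence the abstract says can be read off the loop-invariants. In CPython the global interpreter lock will generally prevent a real speedup for this pure-Python arithmetic, so the sketch shows the task structure only, not performance.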