Treffer: A technique for overlapping computation and communication for block recursive algorithms

Title:
A technique for overlapping computation and communication for block recursive algorithms
Contributors:
The Pennsylvania State University CiteSeerX Archives
Publication Year:
1998
Collection:
CiteSeerX
Document Type:
Fachzeitschrift text
File Description:
application/postscript
Language:
English
Rights:
Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession Number:
edsbas.7E047CDD
Database:
BASE

Weitere Informationen

This paper presents a design methodology for developing efficient distributed-memory parallel programs for block recursive algorithms such as the fast Fourier transform (FFT) and bitonic sort. This design methodology is specifically suited for most modern supercomputers having a distributed-memory architecture with circuit-switched or wormhole routed mesh or hypercube interconnection network. A mathematical framework based on the tensor product and other matrix operations is used for representing algorithms. Communication-efficient implementations with effectively overlapped computation and communication are achieved by manipulating the mathematical representation using the tensor product algebra. Performance results for FFT programs on the