Result: Coprocessor design to support MPI primitives in configurable multiprocessors

Title:
Coprocessor design to support MPI primitives in configurable multiprocessors
Source:
Integration (Amsterdam). 40(3):235-252
Publisher Information:
Amsterdam: Elsevier Science, 2007.
Publication Year:
2007
Physical Description:
print, 45 ref
Original Material:
INIST-CNRS
Subject Terms:
Electronics, Exact sciences and technology, Applied sciences, Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices, Integrated circuits, Design. Technologies. Operation analysis. Testing, Integrated circuits by function (including memories and processors), Electric, optical and optoelectronic circuits, Circuit properties, Electronic circuits, Digital circuits, Hardware, Input-output equipment, Remote access, Reconfigurable architectures, Integrated circuit, Coding, Coprocessor, Comparative study, Performance evaluation, Clock, Implementation, Interconnection, Software, Multiprocessor, Multiprocessing, Parallel computer, 32 bit Processor, Parallel programming, Router, Field programmable gate array, Parallel system, System on a chip, Parallel processing, Configurable system, FPGA, MPI
Document Type:
Journal Article
File Description:
text
Language:
English
Author Affiliations:
Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, United States
Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, United States
ISSN:
0167-9260
Rights:
Copyright 2007 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Electronics
Accession Number:
edscal.18661404
Database:
PASCAL Archive

Further Information

The Message Passing Interface (MPI) is a widely used standard for interprocessor communication in parallel computers and PC clusters. Its functions are normally implemented in software because of their size and complexity, which results in large communication latencies. Limited hardware support for MPI is sometimes available in expensive systems. Reconfigurable computing has recently matured to the point where programmable parallel systems of respectable size can be embedded inside one or more Field-Programmable Gate Arrays (FPGAs). Nevertheless, specialized components must be built to support interprocessor communication in these FPGA-based designs, and the resulting code may be difficult to port to other reconfigurable platforms. In addition, performance comparison with conventional parallel computers and PC clusters is very cumbersome or impossible, since the latter often employ MPI or similar communication libraries. A hardware design that directly implements MPI primitives in configurable multiprocessor computing creates a framework for efficient development of parallel code involving data exchanges, independently of the underlying hardware implementation. It also supports the portability of MPI-based code developed for more conventional platforms. This paper takes advantage of the effectiveness and efficiency of one-sided Remote Memory Access (RMA) communications, and presents the design and evaluation of a coprocessor that implements a set of MPI primitives for RMA. These primitives form a universal and orthogonal set that can be used to implement any other MPI function. To evaluate the coprocessor, a low-latency router was also designed to enable the direct interconnection of several coprocessors in cluster-on-a-chip systems. Experimental results justify the implementation of the MPI primitives in hardware to support parallel programming in reconfigurable computing. Under continuous traffic, results for a Xilinx XC2V6000 FPGA show that the average transmission time per 32-bit word is about 1.35 clock cycles. Although other computing platforms, such as PC clusters, could also benefit from our design methodology, our focus is exclusively on reconfigurable multiprocessing, which has recently received tremendous attention in academia and industry.
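
To put the reported figure of about 1.35 clock cycles per 32-bit word in perspective, the following minimal sketch converts it into cycles per transfer and bits per cycle. It uses only the two numbers stated in the abstract. The function names and the 100 MHz clock are hypothetical, chosen for illustration; the abstract does not give a clock frequency, and the calculation ignores any start-up latency because the figure is a steady-state average under continuous traffic.

```python
# Figures taken from the abstract (average under continuous traffic).
CYCLES_PER_WORD = 1.35
WORD_BITS = 32


def transfer_cycles(num_words: int) -> float:
    """Estimated clock cycles to move num_words 32-bit words (steady state)."""
    return num_words * CYCLES_PER_WORD


def bits_per_cycle() -> float:
    """Average payload bits delivered per clock cycle."""
    return WORD_BITS / CYCLES_PER_WORD


def throughput_mbit_per_s(clock_mhz: float) -> float:
    """Throughput in Mbit/s for a given clock; clock_mhz is an assumption."""
    return bits_per_cycle() * clock_mhz


if __name__ == "__main__":
    ASSUMED_CLOCK_MHZ = 100.0  # hypothetical value, not from the abstract
    print(f"1024 words: {transfer_cycles(1024):.1f} cycles")
    print(f"payload rate: {bits_per_cycle():.2f} bits/cycle")
    print(f"at {ASSUMED_CLOCK_MHZ:.0f} MHz: "
          f"{throughput_mbit_per_s(ASSUMED_CLOCK_MHZ):.1f} Mbit/s")
```

Under these assumptions, a 1024-word message takes roughly 1382 cycles, which is about 35% above the one-word-per-cycle ideal for a 32-bit link.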