Scalable subgraph mapping for acyclic computation accelerators

Title:
Scalable subgraph mapping for acyclic computation accelerators
Source:
CASES 2006 (International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, October 22-25, 2006, Seoul, Korea; Embedded Systems Week 2006), pp. 147-157
Publisher Information:
New York, NY: ACM Press, 2006.
Publication Year:
2006
Physical Description:
print; 31 references
Original Material:
INIST-CNRS
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Advanced Computer Architecture Lab University of Michigan, Ann Arbor, MI, United States
ARM Ltd, Cambridge, United Kingdom
Rights:
Copyright 2007 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.19105491
Database:
PASCAL Archive

Abstract:

Computer architects are constantly faced with the need to improve performance and increase the efficiency of computation in their designs. To this end, it is increasingly common to see acyclic computation accelerators appear in embedded processor designs. One major problem with adding accelerators to a design is that it is difficult to generate high-quality code that utilizes them. Hand-written assembly code is typical, and where compiler support does exist, it is implemented using only greedy algorithms. In this work, we investigate more thorough techniques for compiling to processors with acyclic accelerators. Whereas greedy solutions explore only one possible solution, the techniques presented in this paper explore the entire design space, when possible. Intelligent pruning methods are employed to ensure compilation is both tractable and scalable. Overall, our new compilation algorithms produce code that performs on average 10%, and up to 32%, better than standard greedy methods. These algorithms also run in less than one second for more than 98% of basic blocks tested.
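
The abstract contrasts greedy subgraph selection with a more exhaustive, pruned exploration of the mapping space. The following is a minimal, hypothetical Python sketch of that contrast, assuming a toy dataflow graph, simple accelerator constraints (operation count and live-in limit), and an operations-covered objective; these details are illustrative assumptions, not the algorithms or cost model from the paper.

```python
"""Sketch only: greedy vs. pruned exhaustive selection of accelerator
subgraphs within one basic block. The DFG, MAX_OPS, MAX_INPUTS, and the
coverage objective are assumptions made for this example."""
from functools import lru_cache
from itertools import combinations

# Toy dataflow graph of one basic block: operation -> its operand producers.
DFG = {
    "a": [], "b": [], "c": ["a", "b"], "d": ["c"],
    "e": ["c"], "f": ["d", "e"], "g": ["f"],
}

MAX_OPS = 4     # assumed limit on operations per accelerator subgraph
MAX_INPUTS = 2  # assumed limit on live-in operands per subgraph


@lru_cache(maxsize=None)
def ancestors(op):
    """Transitive predecessors of an operation."""
    result = set()
    for p in DFG[op]:
        result |= {p} | ancestors(p)
    return result


def external_inputs(group):
    """Operands produced outside the group (live-ins the accelerator reads)."""
    return {p for op in group for p in DFG[op] if p not in group}


def is_convex(group):
    """Convex: no dependence path leaves the group and re-enters it, so the
    group can execute atomically on the accelerator."""
    group = set(group)
    for x in set(DFG) - group:
        fed_by_group = bool(ancestors(x) & group)
        feeds_group = any(x in ancestors(g) for g in group)
        if fed_by_group and feeds_group:
            return False
    return True


def fits(group):
    """Check the (assumed) accelerator constraints."""
    return (len(group) <= MAX_OPS
            and len(external_inputs(group)) <= MAX_INPUTS
            and is_convex(group))


def candidates():
    """Enumerate every feasible subgraph of two or more operations."""
    ops = list(DFG)
    return [frozenset(c)
            for k in range(2, MAX_OPS + 1)
            for c in combinations(ops, k)
            if fits(c)]


def greedy_cover(cands):
    """Greedy baseline: repeatedly take the largest still-disjoint subgraph."""
    covered, chosen = set(), []
    for c in sorted(cands, key=len, reverse=True):
        if not (c & covered):
            chosen.append(c)
            covered |= c
    return chosen


def best_cover(cands):
    """Exhaustive search over disjoint subgraph selections, pruned with an
    optimistic bound on how many operations could still be covered."""
    cands = sorted(cands, key=len, reverse=True)
    best_chosen, best_count = [], 0

    def search(idx, chosen, covered):
        nonlocal best_chosen, best_count
        if len(covered) > best_count:
            best_chosen, best_count = list(chosen), len(covered)
        if idx == len(cands):
            return
        still_reachable = set().union(*cands[idx:]) - covered
        if len(covered) + len(still_reachable) <= best_count:
            return  # prune: this branch cannot beat the incumbent solution
        if not (cands[idx] & covered):                       # take cands[idx]
            search(idx + 1, chosen + [cands[idx]], covered | cands[idx])
        search(idx + 1, chosen, covered)                     # skip cands[idx]

    search(0, [], set())
    return best_chosen


if __name__ == "__main__":
    cands = candidates()
    print("feasible subgraphs:", len(cands))
    print("greedy :", [sorted(c) for c in greedy_cover(cands)])
    print("pruned :", [sorted(c) for c in best_cover(cands)])
```

The pruning bound here is deliberately simple (it only counts operations that any remaining candidate could still cover), and the paper's own pruning methods are not reproduced; the sketch is meant to show the overall control flow the abstract describes: enumerate feasible convex subgraphs under the accelerator's constraints, then choose among them either greedily or by bounded exhaustive search.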