Result: An energy efficient processor array and memory controller for accurate processing of convolutional neural network-based inference engines.

Title:
An energy efficient processor array and memory controller for accurate processing of convolutional neural network-based inference engines.
Source:
Scientific Reports; 11/12/2025, Vol. 15 Issue 1, p1-19, 19p
Database:
Complementary Index

More Information

Exploiting unstructured sparsity in the hardware accelerator of a Convolutional Neural Network (CNN)-based inference engine can improve energy efficiency. However, it requires a complex controller for indexing and load balancing. A controller for managing unstructured sparsity in Fully Connected (FC) layers is designed. Approximately 20% sparsity is introduced into a pre-trained Visual Geometry Group-16 (VGG-16) model using an induced sparsity mechanism. Evaluated on the ImageNet dataset, this model achieves 95% classification accuracy and a harmonic mean of precision and recall of 0.96. Each Input Feature Map (IFM) and its corresponding weight vector of an FC layer are arranged in one row of memory. A Combined IFM & Weights - Zero Valued Compression (CIW-ZVC) controller passes only the valid data from off-chip to on-chip memory, which improves the data-movement rate with minimal hardware overhead. A processor array of 256 Convolution Operators (COs), computing in parallel with zero-gating on weights, processes 16 tiles per on-chip memory cycle. The IFM is stationary across all tiles, which simplifies load balancing. Implemented in 14 nm technology, the design achieves a peak performance of 256 × 10^9 Operations/Second (OPS) and an energy efficiency of 15 × 10^12 OPS/Watt per FC (VGG-16) layer. It also improves energy efficiency by up to 6.08 times and area efficiency by 7.6 times compared to existing processors. [ABSTRACT FROM AUTHOR]
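The row-wise compression and weight zero-gating described in the abstract can be illustrated with a minimal software sketch. This is not the authors' controller: it assumes that "valid data" means memory rows whose IFM value is nonzero, and that zero-gating skips multiply-accumulates on zero weights. The function names `ciw_zvc_rows` and `fc_zero_gated` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def ciw_zvc_rows(ifm, weights):
    """Keep only the memory rows whose IFM value is nonzero (assumed meaning of 'valid').

    ifm:     shape (n_in,), one IFM value per row
    weights: shape (n_in, n_out), row i is the weight vector paired with ifm[i]
    """
    valid = np.flatnonzero(ifm)          # indices of rows that would be transferred
    return valid, ifm[valid], weights[valid]

def fc_zero_gated(ifm_c, w_c, n_out):
    """Accumulate the FC output, skipping (gating) multiplications by zero weights."""
    acc = np.zeros(n_out)
    macs = 0                             # multiply-accumulates actually performed
    for x, w_row in zip(ifm_c, w_c):
        gate = w_row != 0                # True where the weight is nonzero
        acc[gate] += x * w_row[gate]
        macs += int(gate.sum())
    return acc, macs

# Toy FC layer: 4 inputs, 2 outputs, with zeros in both the IFM and the weights.
ifm = np.array([0.0, 2.0, 0.0, 3.0])
weights = np.array([[1.0, 0.0],
                    [0.0, 4.0],
                    [5.0, 6.0],
                    [7.0, 0.0]])

valid, ifm_c, w_c = ciw_zvc_rows(ifm, weights)
out, macs = fc_zero_gated(ifm_c, w_c, weights.shape[1])
dense = ifm @ weights                    # reference result without any skipping
```

In this toy case only rows 1 and 3 are transferred, and 2 of the 8 dense multiply-accumulates are performed, while `out` matches the dense result. Because every output shares the same compressed IFM values, the sketch also hints at why a stationary IFM keeps the work per tile easy to balance.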
