Sense: Model-hardware codesign for accelerating sparse CNNs on systolic arrays

W Sun, D Liu, Z Zou, W Sun, S Chen, Y Kang
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2023. ieeexplore.ieee.org
Sparsity is an intrinsic property of convolutional neural networks (CNNs) and is worth exploiting in CNN accelerators. However, the extra processing it requires comes with hardware overhead, yielding only marginal gains for most architectures. Meanwhile, systolic arrays have become increasingly competitive for CNN acceleration thanks to their high spatiotemporal locality and low hardware overhead. However, the irregularity of sparsity induces imbalanced workloads under the rigid systolic dataflow, degrading performance. This article therefore proposes Sense, a systolic-array-based architecture for sparse CNN acceleration through model-hardware codesign, enabling large performance gains. To balance input feature map (IFM) and weight loads across the processing element (PE) array, channel clustering gathers IFMs with approximately equal sparsity for array computation, and a codesigned load-balancing weight pruning method keeps the sparsity ratio of each kernel at a fixed value with little accuracy loss, improving PE utilization and overall performance. In addition, adaptive dataflow configuration selects the computing strategy based on the storage ratio of IFMs and weights, lowering dynamic random access memory (DRAM) access compared with Swallow and further reducing system energy consumption. The whole design was implemented on a Xilinx Zynq ZCU102 at 200 MHz and runs at 471, 34, 53, and 191 images/s for AlexNet, VGG-16, ResNet-50, and GoogleNet, respectively. Compared with the sparse systolic-array-based accelerators Swallow, fusion-enabled systolic architecture (FESA), and SPOTS, Sense achieves higher energy efficiency (image/J) on these CNNs.
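The two codesign steps described in the abstract lend themselves to a short illustration. Below is a minimal NumPy sketch (not the paper's implementation; the function names, the sparsity-sorting heuristic for clustering, and the magnitude criterion for pruning are assumptions) of how IFM channels could be grouped by similar sparsity and how every weight kernel could be pruned to an identical sparsity ratio so that the PE array sees a balanced non-zero workload.

```python
import numpy as np

def cluster_channels_by_sparsity(ifm: np.ndarray, num_clusters: int):
    """Group IFM channels with approximately equal sparsity.

    ifm: activation tensor of shape (C, H, W).
    Returns a list of channel-index arrays, one per cluster; channels in
    the same cluster carry similar non-zero workloads when fed to the array.
    (Illustrative heuristic, not the paper's exact clustering rule.)
    """
    per_channel_sparsity = (ifm == 0).mean(axis=(1, 2))  # zero fraction per channel
    order = np.argsort(per_channel_sparsity)             # sort channels by sparsity
    return np.array_split(order, num_clusters)           # adjacent channels are similar

def load_balanced_prune(weights: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Magnitude-prune every kernel to the same fixed sparsity ratio.

    weights: (out_channels, in_channels, kH, kW) convolution tensor.
    Each output-channel kernel keeps the same count of non-zeros, so no
    PE is stuck with more work than its neighbors.
    """
    pruned = weights.copy()
    for oc in range(weights.shape[0]):
        kernel = pruned[oc].ravel()                # view into the copied tensor
        k = int(round(sparsity * kernel.size))     # weights to zero in this kernel
        if k > 0:
            drop = np.argpartition(np.abs(kernel), k - 1)[:k]
            kernel[drop] = 0.0                     # zero the k smallest magnitudes
    return pruned
```

In this sketch the load balancing comes purely from the invariant that every kernel ends up with the same non-zero count; the paper's actual pruning method additionally constrains accuracy loss, which a one-shot magnitude criterion like this does not capture.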