
Authors:

Sun, Mengshu | Xu, Kaidi | Lin, Xue | Hu, Yongli | Yin, Baocai (Scholar: 尹宝才)

Indexed in:

EI, Scopus, SCIE

Abstract:

Being capable of extracting more information than 2-D convolutional neural networks (CNNs), 3-D CNNs have been playing a vital role in video analysis tasks like human action recognition, but their massive operations hinder real-time execution on edge devices with constrained computation and memory resources. Although various model compression techniques have been applied to accelerate 2-D CNNs, few efforts have investigated hardware-friendly pruning of 3-D CNNs and their acceleration on customizable edge platforms like FPGAs. This work first proposes a kernel group row-column (KGRC) weight sparsity pattern, which is fine-grained to achieve high pruning ratios with negligible accuracy loss, and balanced across kernel groups to achieve high computation parallelism on hardware. The reweighted pruning algorithm for this sparsity is then presented and performed on 3-D CNNs, followed by quantization under different precisions. Along with model compression, FPGA-based accelerators with four modes are designed in support of the kernel group sparsity in multiple dimensions. The co-design framework of the pruning algorithm and the accelerator is tested on two representative 3-D CNNs, namely C3D and R(2+1)D, with the Xilinx ZCU102 FPGA platform for action recognition. The experimental results indicate that the accelerator implementation with the KGRC sparsity and 8-bit quantization achieves a good balance between the speedup and model accuracy, leading to acceleration ratios of 4.12x for C3D and 3.85x for R(2+1)D compared with the 16-bit baseline designs supporting only dense models.
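The abstract names two compression steps, sparsity that is balanced across kernel groups and 8-bit quantization. The sketch below only illustrates those two general ideas and is not the paper's KGRC pattern or its reweighted pruning algorithm. It assumes plain magnitude pruning that keeps the same number of weights in every group, followed by symmetric per-tensor 8-bit quantization. The function names `prune_balanced` and `quantize_int8` and the toy weights are hypothetical.

```python
def prune_balanced(groups, keep):
    """Keep the `keep` largest-magnitude weights in every group, zero the rest.

    Assumption: simple magnitude pruning, used here only to show what
    "balanced across groups" means (equal nonzero count per group).
    """
    pruned = []
    for g in groups:
        # Indices of the `keep` largest |w| within this group.
        top = sorted(range(len(g)), key=lambda i: abs(g[i]), reverse=True)[:keep]
        kept = set(top)
        pruned.append([w if i in kept else 0.0 for i, w in enumerate(g)])
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor 8-bit quantization; returns (integers, scale)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to scale 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


# Toy example: two groups of four weights, keeping two weights per group.
groups = [[0.9, -0.1, 0.05, -0.7], [0.2, 0.3, -0.8, 0.01]]
pruned = prune_balanced(groups, keep=2)
flat = [w for g in pruned for w in g]
q, scale = quantize_int8(flat)
print(pruned)
print(q, scale)
```

Because every group ends up with the same number of nonzero weights, parallel processing units that each handle one group would receive equal workloads, which is the kind of balance the abstract associates with high computation parallelism.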

Keywords:

Quantization (signal); edge device inference; Parallel processing; model compression; Convolutional neural networks; Kernel; Computational modeling; Field programmable gate arrays; Three-dimensional displays; FPGA; 3-D convolutional neural network (CNN); weight pruning

Author affiliations:

  • [ 1 ] [Sun, Mengshu]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 2 ] [Hu, Yongli]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 3 ] [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 4 ] [Sun, Mengshu]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 5 ] [Hu, Yongli]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 6 ] [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 7 ] [Xu, Kaidi]Drexel Univ, Dept Comp Sci, Philadelphia, PA 19104 USA
  • [ 8 ] [Lin, Xue]Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA

Corresponding author:

  • [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China; [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS

ISSN: 0278-0070

Year: 2024

Issue: 10

Volume: 43

Pages: 3027-3040

2.900 (JCR@2022)
