• 综合
  • 标题
  • 关键词
  • 摘要
  • 学者
  • 期刊-刊名
  • 期刊-ISSN
  • 会议名称
搜索

作者:

Huang, Zhangqin (Huang, Zhangqin.) (学者:黄樟钦) | Zhang, Shuo (Zhang, Shuo.) | Gao, Han (Gao, Han.) | Zhang, Xiaobo (Zhang, Xiaobo.) | Yang, Shengqi (Yang, Shengqi.)

收录:

EI SCIE

摘要:

To reduce DMA utilization for multiple algorithm IPs on FPGA, a channel configurable and multiplex DMA device (CMDMA) is proposed for asynchronous and heterogeneous algorithm IPs. Firstly, we abstract the entities and data-flow in CMDMA system with a formal description for function definition and work-flow analysis. Then based on the functions and work-flow, we design and implement a prototype of CMDMA, which includes CMDMA software driver (SW) and hardware circuits (HW) of one DMA IP, a configurable input switch (CISwitch), algorithm IPs, and an asynchronous output switch (AOSwitch). The configurable function of CMDMA is implemented by CISwitch through a configuration port in HW-level, and a configurable Round-Robin (CRR) algorithm is proposed to implement channel and input data schedule in SW-level. For output, a channel distinguishable output buffer (ChnDistBuf) is proposed, which is able to deliver channel ID and data size to SW earlier than the end time of an algorithm IP. With a double interrupt coordination method of both ChnDistBuf and algorithm IPs, CMDMA is able to successively store complete output data from different algorithm IPs. With a double interrupt coordination method of both ChnDistBuf and algorithm IPs, CMDMA is able to successively store complete output data from different algorithm IPs. The experiments based on 4 heterogeneous matrix multiplication algorithm IPs on Xilinx Zynq platform show that CMDMA is able to improve about 8% 29% average algorithm acceleration rates on single algorithm IP compared to the exclusive method that one DMA works for one algorithm IP only, and it is able to increase about 10-40 MB/s and 5-15 MB/s of DMA input and output data throughput with multiple algorithm IPs running in parallel. Moreover, the extended LUT and FF resources in CMDMA are 756 and 1219 , both of which are about 1% of Zynq platform. Besides, in a double CNN algorithm IPs test on Mnist application, an enhanced function of data broadcasting in CMDMA is able to improve 4 s than the system with 4 exclusive DMA running in parallel, meanwhile reduce 3 DMA utilization and 0. 03 W power consumption. (c) 2020 Elsevier B.V. All rights reserved.

关键词:

DMA FPGA Multiplex Switch System architecture

作者机构:

  • [ 1 ] [Huang, Zhangqin]Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
  • [ 2 ] [Zhang, Shuo]Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
  • [ 3 ] [Gao, Han]Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
  • [ 4 ] [Zhang, Xiaobo]Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
  • [ 5 ] [Yang, Shengqi]Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China

通讯作者信息:

  • [Zhang, Shuo]Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China

查看成果更多字段

相关关键词:

相关文章:

来源 :

MICROPROCESSORS AND MICROSYSTEMS

ISSN: 0141-9331

年份: 2020

卷: 77

2 . 6 0 0

JCR@2022

ESI学科: COMPUTER SCIENCE;

ESI高被引阀值:34

JCR分区:3

被引次数:

WoS核心集被引频次: 0

SCOPUS被引频次:

ESI高被引论文在榜: 0 展开所有

万方被引频次:

中文被引频次:

近30日浏览量: 4

在线人数/总访问数:6405/2951138
地址:北京工业大学图书馆(北京市朝阳区平乐园100号 邮编:100124) 联系我们:010-67392185
版权所有:北京工业大学图书馆 站点建设与维护:北京爱琴海乐之技术有限公司