Indexed in:
Abstract:
Over the past decade, deep learning has developed rapidly and is now widely used in applications across many fields. Model training is generally deployed in the cloud; training on small embedded devices is not recommended because of their lower-end hardware configurations, so these devices are usually designed for inference. In this paper, a new image recognition framework based on a heterogeneous multi-core accelerator is established to carry out the deep learning prediction process and improve the image recognition performance of embedded devices. First, the fundamental principles of deep-learning-based image recognition are reviewed as the basis of the study. Second, several key designs of the parallel image recognition framework based on a CPU-accelerator heterogeneous architecture are proposed, including the data splitting strategy, framework architecture, data structure design, and data parallelism, to improve recognition speed and the efficiency of computational resource use. Third, the combined Xilinx Zynq and Adapteva Epiphany hardware platform and the Rockchip RK3288 hardware platform are described in detail. Finally, a handwritten digit recognition experiment is conducted to evaluate the accuracy and performance of the framework. The experimental results show that the proposed image recognition system achieves a speedup of nearly 8 times when recognizing 28x28 images of the ten handwritten digits, and nearly 60 times when classifying 32x32 images into ten object classes, relative to the RK3288 board used as the control, which includes 4 Arm A17 cores from the newest series of high-performance Arm CPU cores. © 2018 ACM.
Keywords:
Corresponding author:
Email address: