Indexed in:
Abstract:
Integrating the GPU with the CPU on the same chip is increasingly common in current high-performance processor architectures. The CPU and GPU share the on-chip network, the last-level cache, and memory, so they do not need the back-and-forth data copies that a discrete GPU requires. Shared virtual memory, memory coherence, and system-wide atomics have been introduced to heterogeneous architectures and programming models to enable fine-grained CPU and GPU collaboration. Programming models such as OpenCL 2.0, CUDA 8.0, and C++ AMP support these heterogeneous architecture features. Data partitioning is one of the collaboration patterns. Balancing the data processed by the CPU and the GPU is essential for improving performance and energy efficiency. In this paper, we first demonstrate that, for one application, the optimal allocation of data to the CPU and GPU can provide 20% higher performance than a fixed ratio of 20%. Second, we evaluate another 5 heterogeneous applications covering the latest architecture features and find the relation between data partitioning and performance. © Springer Nature Singapore Pte Ltd. 2018.
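The data partitioning pattern mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the function names, the throughput parameters `cpu_rate` and `gpu_rate`, and the simple "slower device sets the total time" cost model are all assumptions made for illustration only.

```python
def partition(n_items, gpu_fraction):
    """Split indices [0, n_items) into a CPU range and a GPU range.

    gpu_fraction is the (hypothetical) share of items given to the GPU.
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be in [0, 1]")
    split = n_items - round(n_items * gpu_fraction)
    return range(0, split), range(split, n_items)


def estimated_time(n_items, gpu_fraction, cpu_rate, gpu_rate):
    """Assumed cost model: both devices run concurrently on their own
    partition, so the slower device determines the total time.
    Rates are in items per unit time and are illustrative values."""
    cpu_part, gpu_part = partition(n_items, gpu_fraction)
    return max(len(cpu_part) / cpu_rate, len(gpu_part) / gpu_rate)


def best_fraction(n_items, cpu_rate, gpu_rate, steps=100):
    """Sweep candidate GPU fractions and return the one with the
    lowest estimated time under the assumed cost model."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda f: estimated_time(n_items, f, cpu_rate, gpu_rate),
    )
```

Under this model, a fixed ratio is optimal only if it happens to match the relative throughput of the two devices. A sweep such as `best_fraction` is one simple way to locate a balanced split; real applications would measure times rather than assume constant rates.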
Keywords:
Corresponding author information:
Email address: