Indexed in:
Abstract:
Convolutional Neural Networks (CNNs) have shown great potential across application domains, including object detection, image classification, natural language processing, and speech recognition. Because neural network architectures keep growing deeper and require large-scale datasets, designing high-performance computing hardware for training CNNs is essential. In this paper, we measure the performance of different GPU configurations and study the resulting patterns by training two CNN architectures, LeNet and MiniNet, both of which perform image classification. From the measurement results, we identify a correlation between the L1D cache and GPU performance during training. We also demonstrate that the L2D cache has only a slight influence on performance. The network traffic intensity for both CNN models shows that each layer has a distinct traffic-intensity pattern. © 2019 IEEE.
Keywords:
Corresponding author information:
Email address: