Abstract:
The Internet of Things (IoT) has developed rapidly in recent years. Blockchain has attracted considerable attention in both academia and industry as a way to address security problems in some IoT applications. In this paper, we consider a mobile blockchain supporting IoT applications, in which mobile edge computing (MEC) is deployed at the Small-cell Base Station (SBS) to supplement the computing capability of IoT devices. To encourage the SBS to participate in mobile blockchain networks, we consider its long-term revenue. The task scheduling problem, which maximizes the long-term mining reward while minimizing the resource cost of the SBS, is formulated as a Markov Decision Process (MDP). To obtain an efficient, intelligent strategy, we propose a deep reinforcement learning (DRL) based solution, the policy gradient based computing tasks scheduling (PG-CTS) algorithm. The policy, which maps the system state to the task scheduling decision, is represented by a deep neural network. Episodic simulations are built, and the REINFORCE algorithm with baseline is used to train the policy network. In training, PG-CTS performs about 10% better than the second-best method, greedy. The generalization ability of PG-CTS is proved theoretically, and the testing results show that PG-CTS outperforms the other three strategies (greedy, first-in-first-out (FIFO), and random) in different environments. © 2020 IEEE.
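The training step named in the abstract, REINFORCE with baseline, can be illustrated with a minimal sketch. This is not the PG-CTS implementation: it assumes a linear softmax policy in place of the deep neural network, equal-length episodes, and a baseline equal to the mean return at each time step across the sampled episodes. The function names `discounted_returns`, `softmax`, and `reinforce_gradient` are hypothetical and chosen only for illustration.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = sum_k gamma^k * r_{t+k} for every step of one episode."""
    returns = np.zeros(len(rewards), dtype=float)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def reinforce_gradient(theta, episodes, gamma=0.99):
    """Estimate the policy gradient for a linear softmax policy.

    theta:    array of shape (n_features, n_actions), the policy parameters.
    episodes: list of (states, actions, rewards) tuples; all episodes are
              assumed to have the same length T, with states of shape
              (T, n_features).
    The baseline (an assumption of this sketch) is the mean return at each
    time step across the sampled episodes.
    """
    all_returns = np.array(
        [discounted_returns(rewards, gamma) for _, _, rewards in episodes]
    )
    baseline = all_returns.mean(axis=0)  # one baseline value per time step

    grad = np.zeros_like(theta, dtype=float)
    for (states, actions, _), returns in zip(episodes, all_returns):
        for t, (state, action) in enumerate(zip(states, actions)):
            state = np.asarray(state, dtype=float)
            probs = softmax(state @ theta)
            one_hot = np.zeros(theta.shape[1])
            one_hot[action] = 1.0
            # Gradient of log pi(a|s) for a linear softmax policy.
            grad_log_pi = np.outer(state, one_hot - probs)
            grad += grad_log_pi * (returns[t] - baseline[t])
    return grad / len(episodes)
```

A gradient-ascent update would then be `theta += learning_rate * reinforce_gradient(theta, episodes)`, repeated over freshly sampled episodes. Subtracting the baseline does not change the expected gradient but typically reduces its variance.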