Indexed in:
Abstract:
Multi-agent decision making is a fundamental problem in edge intelligence. In this paper, we study this problem for IoT networks under the distributed Multi-Armed Bandits (MAB) model. Most existing work on distributed MAB requires long-lived, stable networks of powerful devices and hence may not suit mobile IoT networks with harsh resource constraints. To meet the challenge of resource constraints in mobile IoT environments, we propose a lightweight and robust learning algorithm for dynamic networks in which the topology may change. In our model, each agent is assumed to have only limited memory and to communicate with other agents asynchronously. Moreover, we assume that the bandwidth for exchanging information is limited: each agent can transmit O(log₂ K) bits per communication, where K denotes the number of arms. Rigorous analysis shows that, despite these harsh constraints, the agents can collaboratively identify the best arm (option) and the algorithm converges efficiently. Extensive experiments show that the proposed algorithm exhibits good efficiency and stability in mobile settings. © 2020
Keywords:
Corresponding author information:
Email address:
Source:
Journal of Systems Architecture
ISSN: 1383-7621
Year: 2021
Volume: 114
4.500
JCR@2022
ESI subject: COMPUTER SCIENCE
ESI highly cited threshold: 87
JCR quartile: 1
Affiliated department: