Indexed in:
Abstract:
The next-generation wireless network is expected to use low-earth orbit (LEO) satellite networks to deliver seamless, high-capacity global communication services. Because LEO satellites move at high speed, massive and frequent handovers inevitably occur. Moreover, handover becomes more complicated as the constellation scale, the number of mobile terminals (MTs), and the demands of emerging delay-sensitive applications keep growing. In this paper, a decentralized Markov decision process (DEC-MDP) is adopted to formulate the handover problem in the LEO satellite network with finite bursty traffic. The objective is to maximize the total reward, which combines the service revenue with the costs of handover and packet loss. To deal with the high computational complexity caused by the large state and action spaces, the solution is designed using a multi-agent double deep Q-network (MADDQN) within a fully decentralized framework, which also allows each MT to train and use an individual local DDQN to avoid load imbalance between satellites. Further, to alleviate the non-stationarity of the environment under parallel learning, multi-agent fingerprints are applied in MADDQN, and the resulting algorithm is called the multi-agent fingerprints-enhanced double deep Q-network-based distributed intelligent handover (MAF-DDQN-DIH) mechanism. The implementation of MAF-DDQN-DIH in practical communication systems is discussed, and the corresponding communication overhead and computational complexity are analyzed. Simulation results demonstrate that the designed multi-agent fingerprints are effective and that the proposed MAF-DDQN-DIH algorithm outperforms the comparison handover algorithms in terms of total reward.
Keywords:
Corresponding author information:
Email address:
Source:
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
ISSN: 0018-9545
Year: 2024
Issue: 10
Volume: 73
Pages: 15255-15269
6.800 (JCR@2022)
Affiliated department: