Abstract:
Coherent beam combination (CBC) is an effective way to overcome the power limit of a single fiber laser. Q-learning is a reinforcement learning algorithm. We apply Q-learning to phase compensation in CBC. By simulating time-domain coherent combining, we compare its performance with that of the stochastic parallel gradient descent (SPGD) optimization algorithm. The results show that the Q-learning algorithm is easier to debug and more stable. © 2021 Elsevier B.V.
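To make the comparison in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-beam toy model with a static phase error, a control phase discretized into 16 levels, and the normalized combined intensity as the reward and SPGD metric. All names and parameter values (N_LEVELS, train_q_table, run_greedy, spgd, the learning rates and gains) are hypothetical choices for illustration.

```python
import math
import random

N_LEVELS = 16            # assumed number of discrete control-phase levels
ACTIONS = (-1, 0, 1)     # step the control phase down, hold, or step up


def combined_intensity(phase_error, control_phase):
    """Normalized intensity of two equal-amplitude beams (toy model)."""
    return 0.5 * (1.0 + math.cos(phase_error - control_phase))


def level_to_phase(level):
    """Map a discrete level index to a control phase in radians."""
    return 2.0 * math.pi * level / N_LEVELS


def greedy_action(q_row):
    """Index of the action with the largest Q-value in one table row."""
    return max(range(len(ACTIONS)), key=lambda i: q_row[i])


def train_q_table(phase_error, episodes=2000, steps=30,
                  alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning: state = control level, reward = intensity."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_LEVELS)]
    for _ in range(episodes):
        state = rng.randrange(N_LEVELS)
        for _ in range(steps):
            # Epsilon-greedy exploration over the three phase moves.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q[state])
            nxt = (state + ACTIONS[a]) % N_LEVELS
            reward = combined_intensity(phase_error, level_to_phase(nxt))
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def run_greedy(q, phase_error, start_level, steps=20):
    """Follow the learned greedy policy and return the final intensity."""
    state = start_level
    for _ in range(steps):
        state = (state + ACTIONS[greedy_action(q[state])]) % N_LEVELS
    return combined_intensity(phase_error, level_to_phase(state))


def spgd(phase_error, iterations=100, gain=50.0, delta=0.1, seed=0):
    """Two-sided SPGD on one continuous control phase (toy model)."""
    rng = random.Random(seed)
    u = 0.0
    for _ in range(iterations):
        d = delta if rng.random() < 0.5 else -delta   # random +/- dither
        dj = (combined_intensity(phase_error, u + d)
              - combined_intensity(phase_error, u - d))
        u += gain * dj * d                            # ascend the metric
    return combined_intensity(phase_error, u)


if __name__ == "__main__":
    q_table = train_q_table(phase_error=1.0)
    print("Q-learning final intensity:", run_greedy(q_table, 1.0, 0))
    print("SPGD final intensity:", spgd(1.0))
```

In this sketch the Q-learning controller is limited by the phase grid (the best reachable level leaves a small residual error), while SPGD works on a continuous phase but depends on the gain and dither amplitude being tuned together. A realistic comparison would also need time-varying phase noise and more than two beams, which this toy model omits.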