Abstract:
A reinforcement learning algorithm based on linear averaging is proposed to address the non-convergence of function approximation in reinforcement learning over continuous state spaces. The algorithm builds on the gradient descent method and, following contraction theory, uses a linear average as the performance measure for the value function, so that the iteration of the value function converges to a fixed point. A standard reinforcement learning benchmark, the Mountain Car problem, is used to verify the algorithm's performance. The results indicate that the algorithm is effective and feasible and that it converges quickly.
Source:
Journal of Jilin University (Engineering and Technology Edition)
ISSN: 1671-5497
Year: 2008
Issue: 6
Volume: 38
Pages: 1407-1411