An optimal strategy learning for RoboCup in continuous state space

Author：

Tao, Junyuan | Li, Desheng (李德胜)

Indexed by：

EI Scopus

Abstract：

RoboCup offers a set of challenges for machine learning researchers because it is a dynamic, nondeterministic problem with delayed goals and a continuous state space. Reinforcement learning (RL), a method for learning an optimal control policy for sequential decision-making problems, is often used for strategy learning in RoboCup. It is difficult, however, to apply RL to continuous state space problems because the number of states grows exponentially with the number of state variables. An effective remedy is to combine RL with function approximation, but this combination sometimes diverges. In this paper, we analyze the main reason why current approximate RL algorithms fail to converge and propose an optimal strategy learning method. The two RL processes, value evaluation and policy improvement, are separated. The policy search is strictly constrained to move in the direction that improves performance, according to the evaluation provided by the value function. We successfully apply this algorithm to Keepaway, a standard RoboCup sub-problem. Experimental results verify the effectiveness of the method and show that the algorithm converges to a locally optimal policy. ©2006 IEEE.
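
The separation of value evaluation from policy improvement described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. Everything below is an assumption made for illustration: a one-dimensional toy task stands in for Keepaway, the policy is a single gain, the value function is linear in the features [1, x^2] and fitted to Monte Carlo returns, and the policy search is a random perturbation that is accepted only when the evaluated performance increases.

```python
import random

GAMMA = 0.9      # discount factor (assumed)
HORIZON = 30     # rollout length (assumed)
STARTS = [i / 10.0 for i in range(-10, 11)]  # fixed grid of start states in [-1, 1]


def features(x):
    """Features of the linear value approximator: [1, x^2]."""
    return (1.0, x * x)


def rollout_return(gain, x):
    """Discounted return of the policy a = -gain * x on the toy task."""
    total, discount = 0.0, 1.0
    for _ in range(HORIZON):
        a = -gain * x
        x = x + a                                   # toy dynamics
        total += discount * -(x * x + 0.1 * a * a)  # penalise distance and effort
        discount *= GAMMA
    return total


def evaluate(gain, sweeps=200, alpha=0.1):
    """Value evaluation: fit V(x) ~ w . features(x) while the policy is held fixed."""
    targets = [rollout_return(gain, x) for x in STARTS]
    w = [0.0, 0.0]
    for _ in range(sweeps):
        for x, g in zip(STARTS, targets):
            phi = features(x)
            err = g - (w[0] * phi[0] + w[1] * phi[1])
            w[0] += alpha * err * phi[0]
            w[1] += alpha * err * phi[1]
    return w


def score(w):
    """Performance estimate: mean approximate value over the start states."""
    values = [w[0] * f[0] + w[1] * f[1] for f in map(features, STARTS)]
    return sum(values) / len(values)


def improve(gain=0.1, iterations=30, seed=0):
    """Policy improvement: keep a candidate only if its evaluated score is higher."""
    rng = random.Random(seed)
    best = score(evaluate(gain))
    history = [best]
    for _ in range(iterations):
        candidate = min(1.0, max(0.0, gain + rng.uniform(-0.2, 0.2)))
        cand_score = score(evaluate(candidate))
        if cand_score > best:  # never step in a direction that lowers performance
            gain, best = candidate, cand_score
            history.append(best)
    return gain, history


if __name__ == "__main__":
    final_gain, scores = improve()
    print(final_gain, scores[-1])
```

Because the value function is refitted from scratch for each fixed policy and a candidate is kept only when its score rises, the sequence of accepted scores cannot decrease. That is the property the abstract attributes to the proposed method, although only a local optimum is reached.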

Keyword：

Learning algorithms; Problem solving; State space methods; Reinforcement learning; Robotics; Function evaluation; Decision making

Author Affiliations：

[ 1 ] [Tao, Junyuan] Department of Automatic Measurement and Control, Harbin Institute of Technology, Harbin, Heilongjiang Province, China
[ 2 ] [Li, Desheng] Department of Mechanical and Electronic Engineering, Beijing University of Technology, Beijing, China

Reprint Author's Address：

Email：

tjy1975@hit.edu.cn
