An optimal strategy learning for RoboCup in continuous state space

Author：

Tao Junyuan | Li Desheng (李德胜)

Indexed by：

CPCI-S

Abstract：

RoboCup offers a set of challenges for machine learning researchers because it is a dynamic, nondeterministic, delayed-goal problem with a continuous state space. Reinforcement learning (RL), a method for learning an optimal control policy in sequential decision-making problems, is often used for strategy learning in RoboCup. However, RL is difficult to apply to continuous state space problems because the number of states grows exponentially with the number of state variables. An effective remedy is to combine RL with function approximation, but this combination sometimes diverges. In this paper, we analyze the main cause of the non-convergence of current approximate RL algorithms and propose an optimal strategy learning method. The two processes in RL, value evaluation and policy improvement, are separated. The policy search is strictly constrained to move in the direction of improving performance, according to the evaluation provided by the value function. We successfully apply this algorithm to Keepaway, a standard RoboCup sub-problem. Experimental results verify the effectiveness of the method and show that the algorithm can converge to a locally optimal policy.
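
The separation of evaluation and improvement described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or the Keepaway task: the one-dimensional environment, the linear policy `a = -theta * x`, the rollout-based `evaluate` function (standing in for a learned value function), and all constants are assumptions chosen only for illustration. The sketch shows one idea from the abstract: a candidate policy is accepted only when the evaluation step says it performs better, so the evaluated performance never decreases.

```python
import random


def rollout_return(theta, start, seed, steps=30):
    """Return of one episode in a hypothetical 1-D continuous-state task."""
    rng = random.Random(seed)  # fixed seed: same noise for every policy
    x, total = start, 0.0
    for _ in range(steps):
        action = max(-1.0, min(1.0, -theta * x))  # clipped linear policy
        x = x + 0.2 * action + rng.gauss(0.0, 0.01)
        total += -x * x  # reward: stay close to the origin
    return total


def evaluate(theta):
    """Evaluation step: average return over fixed start states.

    Stand-in for the value function; it never changes the policy.
    """
    starts = [-1.0, -0.5, 0.5, 1.0]
    returns = [rollout_return(theta, s, seed=i) for i, s in enumerate(starts)]
    return sum(returns) / len(returns)


def improve(theta, iterations=200, step=0.2, seed=0):
    """Improvement step: accept a candidate only if its evaluation is better."""
    rng = random.Random(seed)
    best_value = evaluate(theta)
    history = [best_value]
    for _ in range(iterations):
        candidate = theta + rng.uniform(-step, step)
        value = evaluate(candidate)
        if value > best_value:  # move only in an improving direction
            theta, best_value = candidate, value
        history.append(best_value)
    return theta, history


if __name__ == "__main__":
    theta, history = improve(0.0)
    print(f"final theta={theta:.3f}, value {history[0]:.3f} -> {history[-1]:.3f}")
```

Because a candidate is rejected unless it improves the evaluated return, the search can stall at a point where no nearby candidate is better, which matches the abstract's statement that convergence is to a locally optimal policy.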

Keyword：

optimal control policy; reinforcement learning; RoboCup; function approximation

Author Affiliations：

[ 1 ] [Tao Junyuan]Harbin Inst Technol, Dept Automat Measurement & Control, Harbin 150006, Heilongjiang, Peoples R China
[ 2 ] [Li Desheng]Beijing Univ Technol, Dept Elect Mech Engn, Beijing, Peoples R China

Reprint Author's Address：

[Tao Junyuan]Harbin Inst Technol, Dept Automat Measurement & Control, Harbin 150006, Heilongjiang, Peoples R China

Email：

tjy1975@hit.edu.cn


Related Articles：

An optimal strategy learning for RoboCup in continuous state space
2006, 2006 IEEE International Conference on Mechatronics and Automation, ICMA 2006
Cooperative strategy learning in multi-agent environment with continuous state space
2006, 5th International Conference on Machine Learning and Cybernetics
Cooperative strategy learning in multi-agent environment with continuous state space
2006, 2006 International Conference on Machine Learning and Cybernetics
Reinforcement learning function approximation algorithm based on linear average
2008, Journal of Jilin University (Engineering and Technology Edition)

Source ：

IEEE ICMA 2006: PROCEEDING OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS 1-3, PROCEEDINGS

Year： 2006

Page： 301-,

Language： English

Cited Count：

WoS CC Cited Count： 1

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

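Faculty of Information Technology, School of Electronic Science and Technology (School of Microelectronics)

Faculty of Materials and Manufacturing, data not explicitly attributed to a school or department

Faculty of Materials and Manufacturing, College of Mechanical Engineering and Applied Electronics Technology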
