Indexed by:
Abstract:
Combining policy-based reinforcement learning (RL) with value-based RL, the actor-critic (AC) learning structure is an effective framework. However, the cost function in this AC framework suffers from large variance, which makes it difficult to achieve the optimization objective. Based on the discounted generalized value iteration method with ℓ1-regularization, a regularized AC (RAC) framework is developed to solve optimal regulation problems and make the cost function converge faster. Two neural networks are constructed to update the cost function and the policy gradient, respectively. The ℓ1-regularization is applied to both the policy gradient and the cost function during value iteration. The cost function is proven to converge to the optimal cost function in a monotonically decreasing manner. Finally, the effectiveness of RAC is demonstrated through two experiments. © 2023 IEEE.
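The abstract describes a critic that approximates the cost function and an actor that approximates the policy, with ℓ1-regularization applied in the value-iteration update and in the policy-gradient step. The following Python sketch illustrates that general idea on a toy 1-D linear regulation task; it is not the authors' implementation, and the linear feature models, penalty weight LAMBDA, step sizes, and finite-difference policy gradient are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
a, b, gamma = 0.9, 0.5, 0.95        # toy dynamics x_{k+1} = a*x + b*u, discount factor
LAMBDA = 1e-3                       # l1 penalty weight (assumed value)
alpha_c, alpha_a = 0.05, 0.01       # critic / actor step sizes (assumed values)

w_c = np.zeros(2)                   # critic weights: V(x) ~ w_c . [x^2, |x|]
w_a = np.zeros(2)                   # actor weights:  u(x) = w_a . [x, x^3]

def feats_c(x): return np.array([x * x, abs(x)])
def feats_a(x): return np.array([x, x ** 3])

def q_estimate(w_pol, x):
    # One-step cost plus discounted critic value under policy weights w_pol.
    u = float(w_pol @ feats_a(x))
    x_next = a * x + b * u
    return x * x + u * u + gamma * float(w_c @ feats_c(x_next))

for _ in range(5000):
    x = float(rng.uniform(-2.0, 2.0))
    u = float(w_a @ feats_a(x)) + 0.1 * rng.standard_normal()   # exploratory action
    cost = x * x + u * u
    x_next = a * x + b * u

    # Critic: value-iteration-style TD step plus an l1 subgradient term.
    target = cost + gamma * float(w_c @ feats_c(x_next))
    delta = target - float(w_c @ feats_c(x))
    w_c += alpha_c * delta * feats_c(x) - alpha_c * LAMBDA * np.sign(w_c)

    # Actor: finite-difference gradient of the estimated cost-to-go,
    # again with an l1 penalty on the policy weights.
    eps = 1e-4
    q_now = q_estimate(w_a, x)
    grad = np.zeros_like(w_a)
    for i in range(len(w_a)):
        w_try = w_a.copy()
        w_try[i] += eps
        grad[i] = (q_estimate(w_try, x) - q_now) / eps
    w_a -= alpha_a * grad + alpha_a * LAMBDA * np.sign(w_a)

print("critic weights:", w_c)
print("actor weights:", w_a)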
Keywords:
Corresponding author:
Email address:
Source:
Year: 2023
Pages: 105-110
Language: English
Affiliated department: