Neuro-Optimal Control for Discrete Stochastic Processes via a Novel Policy Iteration Algorithm - Details

Author：

Liang, Mingming (Liang, Mingming.) | Wang, Ding (Wang, Ding.) (Scholars：王鼎) | Liu, Derong (Liu, Derong.)

Indexed by：

EI Scopus SCIE

Abstract：

In　this　paper,　a　novel　policy　iteration　adaptive　dynamic　programming　(ADP)　algorithm　is　presented　which　is　called　＂local　policy　iteration　ADP　algorithm＂　to　obtain　the　optimal　control　for　discrete　stochastic　processes.　In　the　proposed　local　policy　iteration　ADP　algorithm,　the　iterative　decision　rules　are　updated　in　a　local　space　of　the　whole　state　space.　Hence,　we　can　significantly　reduce　the　computational　burden　for　the　CPU　in　comparison　with　the　conventional　policy　iteration　algorithm.　By　analyzing　the　convergence　properties　of　the　proposed　algorithm,　it　is　shown　that　the　iterative　value　functions　are　monotonically　nonincreasing.　Besides,　the　iterative　value　functions　can　converge　to　the　optimum　in　a　local　policy　space.　In　addition,　this　local　policy　space　will　be　described　in　detail　for　the　first　time.　Under　a　few　weak　constraints,　it　is　also　shown　that　the　iterative　value　function　will　converge　to　the　optimal　performance　index　function　of　the　global　policy　space.　Finally,　a　simulation　example　is　presented　to　validate　the　effectiveness　of　the　developed　method.

Keyword：

Aerospace electronics Iterative algorithms Optimal control optimal control Stochastic processes Dynamical systems adaptive dynamic programming (ADP) local policy iteration Adaptive critic designs Heuristic algorithms stochastic processes Performance analysis neuro-dynamic programming

Author Community：

[ 1 ] [Liang, Mingming]Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[ 2 ] [Wang, Ding]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[ 3 ] [Wang, Ding]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[ 4 ] [Liu, Derong]Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China

Reprint Author's Address：

王鼎
[Wang, Ding]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Email：

liangmingming2015@ia.ac.cn |
dingwang@bjut.edu.cn |
derongliu@foxmail.com

Show more details

Related Keywords：

Improved value iteration for neural-network-based stochastic optimal control design
2020，NEURAL NETWORKS
A Novel Value Iteration Scheme With Adjustable Convergence Rate
2022，IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Adaptive critic design for nonlinear multi-player zero-sum games with unknown dynamics and control constraints
2023，NONLINEAR DYNAMICS
Asymmetric Constrained Optimal Tracking Control With Critic Learning of Nonlinear Multiplayer Zero-Sum Games
2022，IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Source ：

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS

ISSN： 2168-2216

Year： 2020

Issue： 11

Volume： 50

Page： 3972-3985

8 . 7 0 0

JCR@2022

ESI Discipline： ENGINEERING;

ESI HC Threshold：115

Cited Count：

WoS CC Cited Count： 13

SCOPUS Cited Count： 17

ESI Highly Cited Papers on the List： 0 Unfold All

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 2

Affiliated Colleges：

信息学部

信息学部人工智能与自动化学院

Get Fulltext

DOI Library Discovery Baidu Scholar Search Web of Science

Type
Departments

All Years Choose Year From to