Indexed in:
Abstract:
This article develops a novel data-driven safe Q-learning method to design a safe optimal controller that guarantees the constrained states of nonlinear systems always stay in the safe region while providing optimal performance. First, we design an augmented utility function consisting of an adjustable positive definite control barrier function and a quadratic form of the next state to ensure safety and optimality. Second, by exploiting a pre-designed admissible policy for initialization, an off-policy stabilizing value iteration Q-learning (SVIQL) algorithm is presented to seek the safe optimal policy using offline data from the safe region rather than a mathematical model. Third, the monotonicity, safety, and optimality of the SVIQL algorithm are theoretically proven. To obtain the initial admissible policy for SVIQL, an offline VIQL algorithm with zero initialization is constructed, and a new admissibility criterion is established for immature iterative policies. Moreover, critic and action networks with accurate approximation ability are constructed to implement the VIQL and SVIQL algorithms. Finally, three simulation experiments are conducted to demonstrate the effectiveness and superiority of the developed safe Q-learning method.
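The abstract describes a barrier-augmented utility and a value-iteration Q-learning recursion. As a rough illustration only, below is a minimal tabular sketch of that idea: the utility combines a positive definite barrier term with a quadratic form of the next state, and the Q-function is updated by value iteration. The dynamics `step`, the grids, the safe bound `c`, and the weights `lam`, `Qw`, `Rw` are all hypothetical stand-ins; the paper itself is data-driven and uses critic/action networks with offline data, not a known model or a tabular Q-function.

```python
import numpy as np

# Minimal tabular sketch of value-iteration Q-learning with a
# barrier-augmented utility. All names below are hypothetical
# stand-ins, not taken from the paper.
n_x, n_u = 101, 21
xs = np.linspace(-1.0, 1.0, n_x)          # state grid
us = np.linspace(-1.0, 1.0, n_u)          # action grid
dx = xs[1] - xs[0]
c = 0.8                                    # hypothetical safe bound: |x| < c
lam, Qw, Rw = 0.1, 1.0, 0.5                # adjustable barrier / cost weights

def step(x, u):
    # Hypothetical stand-in dynamics; the paper is model-free and
    # would use recorded transitions (x, u, x') instead.
    return 0.9 * x + 0.1 * u

def barrier(x):
    # Positive definite control-barrier-style term: zero at x = 0,
    # unbounded as |x| approaches the safe boundary c.
    if abs(x) >= c:
        return np.inf                      # outside the safe region
    return lam * (1.0 / (c**2 - x**2) - 1.0 / c**2)

def utility(x, u):
    # Augmented utility: barrier term plus a quadratic form of the
    # next state (a control penalty keeps stage costs positive).
    xn = step(x, u)
    return barrier(xn) + Qw * xn**2 + Rw * u**2

def nearest(x):
    # Project a continuous next state onto the state grid.
    return int(np.clip(round((x - xs[0]) / dx), 0, n_x - 1))

Q = np.zeros((n_x, n_u))                   # zero initialization, as in VIQL
for _ in range(200):                       # value-iteration sweeps
    V = Q.min(axis=1)                      # greedy value: min over actions
    Qn = np.empty_like(Q)
    for i, x in enumerate(xs):
        for j, u in enumerate(us):
            # Q_{k+1}(x,u) = U(x,u) + min_{u'} Q_k(x',u')
            Qn[i, j] = utility(x, u) + V[nearest(step(x, u))]
    done = np.nanmax(np.abs(Qn - Q)) < 1e-6   # inf-inf cells give NaN; ignored
    Q = Qn
    if done:
        break

policy = us[Q.argmin(axis=1)]              # greedy policy on the grid
```

States from which every action leads outside the barrier get an infinite Q-value, so the greedy policy never selects transitions that leave the safe region; this mirrors, very loosely, how the barrier term enforces safety in the iteration.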
Keywords:
Corresponding author information:
Email address:
Source:
IEEE-CAA JOURNAL OF AUTOMATICA SINICA
ISSN: 2329-9266
Year: 2024
Issue: 12
Volume: 11
Pages: 2408-2422
Impact factor: 11.800 (JCR@2022)
Affiliated department: