Q-learning environment recognition method based on odor-reward shaping

Author：

Ruan, Xiaogang | Liu, Pengfei | Zhu, Xiaoqing

Abstract：

Q-learning is a model-free, iterative reinforcement learning algorithm that is widely used for mobile robot navigation in unstructured environments. However, the balance between exploring the environment and exploiting the data already gathered limits the convergence speed of Q-learning in this setting. Drawing on the fact that rodents use olfactory cues for spatial orientation and navigation, this study develops a Q-learning environmental cognition strategy based on odor-reward shaping. The algorithm reduces useless exploration of the environment by improving the Q-learning action selection strategy: environmental odor information is integrated into the algorithm, and an olfactory factor weights the Q-learning values against the odor-reward shaping term during action selection. The algorithm's effectiveness is evaluated in a simulation of movement through the maze environment used in the Tolman mouse experiment. The results show that Q-learning with odor-reward shaping reduces useless exploration of the environment, enhances cognitive learning of the environment, and improves the convergence speed of the algorithm. © 2021, Tsinghua University Press. All rights reserved.
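The abstract does not give the paper's formulas, so the following is only a minimal Python sketch of the general idea: an olfactory factor blends the learned Q-value with an odor term when choosing an action. The grid world, the `odor` function (intensity decaying with Manhattan distance to the source), the linear blend in `select_action`, and all parameter names and values are illustrative assumptions, not the authors' method.

```python
import random

# Grid moves as (row, col) offsets: right, left, down, up.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action, size, goal):
    """Move on a size x size grid; walking into a wall leaves the state unchanged."""
    r = min(max(state[0] + action[0], 0), size - 1)
    c = min(max(state[1] + action[1], 0), size - 1)
    nxt = (r, c)
    reward = 1.0 if nxt == goal else 0.0
    return nxt, reward


def odor(state, goal):
    """Hypothetical odor intensity: decays with Manhattan distance to the source."""
    d = abs(state[0] - goal[0]) + abs(state[1] - goal[1])
    return 1.0 / (1.0 + d)


def select_action(q, state, size, goal, olfactory, epsilon, rng):
    """Epsilon-greedy over an assumed blend of Q-value and odor at the next cell."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))

    def score(a):
        nxt, _ = step(state, ACTIONS[a], size, goal)
        # olfactory = 0 gives plain greedy Q-learning; olfactory = 1 follows odor only.
        return (1.0 - olfactory) * q[state][a] + olfactory * odor(nxt, goal)

    return max(range(len(ACTIONS)), key=score)


def train(size=5, goal=(4, 4), episodes=50, olfactory=0.5,
          alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, max_steps=200):
    """Run tabular Q-learning and return the Q-table and steps taken per episode."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(size) for c in range(size)}
    steps_per_episode = []
    for _ in range(episodes):
        state, steps = (0, 0), 0
        while state != goal and steps < max_steps:
            a = select_action(q, state, size, goal, olfactory, epsilon, rng)
            nxt, reward = step(state, ACTIONS[a], size, goal)
            # Standard Q-learning update; the odor term only affects action choice.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


if __name__ == "__main__":
    for factor in (0.0, 0.5):
        _, steps = train(olfactory=factor)
        print(f"olfactory={factor}: mean steps per episode = {sum(steps) / len(steps):.1f}")
```

In this sketch the odor term changes only which action is selected and leaves the Q-learning update untouched. Comparing the mean steps per episode for `olfactory=0.0` and a nonzero value gives a rough sense of how much exploration the odor cue saves in this toy setting; it says nothing about the results reported in the paper.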

Keyword：

Electronic nose; Learning systems; Mammals; Computer aided instruction; Mobile robots; Iterative methods; Reinforcement learning; Learning algorithms

Author Affiliations：

[ 1 ] [Ruan, Xiaogang]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 2 ] [Ruan, Xiaogang]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing; 100124, China
[ 3 ] [Liu, Pengfei]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 4 ] [Liu, Pengfei]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing; 100124, China
[ 5 ] [Zhu, Xiaoqing]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 6 ] [Zhu, Xiaoqing]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing; 100124, China

Reprint Author's Address：

[Zhu, Xiaoqing]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing; 100124, China; [Zhu, Xiaoqing]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China

Email：

alex.zhuxq@bjut.edu.cn

Source ：

Journal of Tsinghua University

ISSN： 1000-0054

Year： 2021

Issue： 3

Volume： 61

Page： 254-260

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 3

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

Faculty of Information Technology
