Indexed:
Abstract:
An agent cannot reach a goal efficiently until it has sufficiently explored the environment or constructed a cognitive model of the world; the essential question is how to generate goal-driven behavior. Organisms spontaneously explore environments with sparse or deceptive rewards and build map-like representations that support subsequent actions such as finding food, shelter, or mates. We ask whether a robot can imitate this cognitive mechanism to complete navigation tasks. Because relying on high-precision sensors to recover the structure of the environment is impractical in the real world, we perceive the state space and learn a control policy from visual inputs, and we use deep learning to cope with the curse of dimensionality. Navigation systems in robotics typically fall into two classes: map-based approaches, which reach the goal by encoding the structure of the environment and can fuse information from multiple sensors into high-quality maps, and map-less approaches, which learn a control policy and use it to complete goal-reaching tasks; each has its pros and cons. In this paper, we propose a visual navigation method that learns goal-driven behavior and encodes spatial structure simultaneously. First, to learn a control policy from raw visual information, we adopt deep reinforcement learning as the basic navigation framework; it provides an end-to-end pipeline that predicts control signals directly from high-dimensional sensory inputs. Because the environment offers a much wider variety of possible training signals, an auxiliary collision-prediction task is added to the model. Second, during exploration the agent traverses the environment many times and observes a large number of states, most of which are repetitive; a temporal correlation network removes these redundant observations and searches for waypoints. Because the agent's viewpoint varies, we compute the similarity between states with temporal distance, which depends only on environment steps, rather than with hand-designed features. Inspired by research on the cognitive mechanisms of animals, which shows that many mammals can use a single observation, especially one containing landmarks, to represent a neighboring region of state space, we use waypoints, discovered in exploration sequences and each representing the adjacent states within a certain temporal distance, to describe the structure of the environment incrementally and efficiently. Finally, the resulting topological map is integrated into the model as a path-planning module and combined with the locomotion network to obtain a more general navigation method. Experiments were conducted in the 3D simulation environment DMLab. The results show that this method learns goal-driven behavior from visual inputs, yields more efficient learning and navigation policies in all test environments, and reduces the amount of data required to build the map. Furthermore, when placed in a dynamically blocked environment, the agent can exploit the topological map to produce detour behavior and complete navigation tasks, demonstrating better environmental adaptability. © 2021, Science Press. All rights reserved.
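The abstract does not give implementation details for the locomotion network or the auxiliary task. As a minimal sketch, assuming a PyTorch actor-critic agent trained on RGB frames, the collision-prediction head might share the visual encoder with the policy and value heads as follows (all class names, layer sizes, and the loss weighting are hypothetical):

```python
import torch
import torch.nn as nn

class NavNet(nn.Module):
    """Shared visual encoder with policy, value, and an auxiliary
    collision-prediction head, as one plausible reading of the model."""
    def __init__(self, num_actions: int):
        super().__init__()
        # Encoder for 84x84 RGB frames (sizes are illustrative only).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
        )
        self.policy = nn.Linear(256, num_actions)   # action logits
        self.value = nn.Linear(256, 1)              # state-value estimate
        self.collision = nn.Linear(256, 1)          # auxiliary: collision logit

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)
        return self.policy(h), self.value(h), self.collision(h)

def auxiliary_loss(collision_logit, collided):
    # Binary cross-entropy against collision labels gathered during
    # exploration; it would be added to the RL loss with a small weight.
    return nn.functional.binary_cross_entropy_with_logits(
        collision_logit.squeeze(-1), collided.float())
```

The auxiliary head only shapes the shared representation during training; at test time the agent acts from the policy head alone.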
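Temporal distance, as described in the abstract, labels a pair of observations by how many environment steps separate them, so training data for the temporal correlation network can be generated without any hand-designed features. A minimal sketch, assuming a learned predicate `is_near(a, b)` produced by that network and a hypothetical threshold `k` (both names are illustrative):

```python
import random

def make_training_pairs(trajectory, k, n_pairs):
    """Self-supervised labels from time alone: a pair is 'near' (label 1)
    if the two frames are at most k environment steps apart, else 'far'."""
    pairs, T = [], len(trajectory)
    for _ in range(n_pairs):
        i = random.randrange(T)
        if random.random() < 0.5:                  # sample a close frame
            j = min(T - 1, i + random.randint(0, k))
        else:                                      # sample an arbitrary frame
            j = random.randrange(T)
        pairs.append((trajectory[i], trajectory[j], int(abs(i - j) <= k)))
    return pairs

def extract_waypoints(trajectory, is_near):
    """Keep an observation as a new waypoint only if no existing waypoint
    already covers it; 'near' observations are redundant and dropped."""
    waypoints = []
    for obs in trajectory:
        if all(not is_near(obs, w) for w in waypoints):
            waypoints.append(obs)
    return waypoints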
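Once waypoints and their temporal-distance links form a topological map, path planning reduces to graph search. A minimal sketch under that assumption, with `edges` a hypothetical adjacency mapping over waypoint identifiers:

```python
from collections import deque

def plan_path(edges, start, goal):
    """Breadth-first search over the waypoint graph; edges[w] lists the
    waypoints reachable from w within the temporal-distance threshold."""
    queue, parent = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:                 # reconstruct path back to start
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in edges.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # goal not reachable in the current map
```

This also suggests how the reported detour behavior could arise: when a dynamically blocked passage invalidates an edge, removing that edge and replanning yields an alternative route, after which the locomotion network drives the agent between consecutive waypoints on the new path.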
Keywords:
Corresponding author information:
Email address: