Authors:

Ruan, Xiao-Gang | Li, Peng | Zhu, Xiao-Qing | Liu, Peng-Fei

Indexed in:

EI, CSCD

Abstract:

An agent cannot reach a goal efficiently until it has sufficiently explored the environment or constructed a cognitive model of the world, so the essential question is how to generate goal-driven behavior. Organisms can spontaneously explore environments with rare or deceptive rewards and build map-like representations to support subsequent actions, such as finding food, shelter, or mates. Can a robot imitate this cognitive mechanism to complete navigation tasks? Relying on high-precision sensors to recall the structure of the environment is not practical in the real world, so we perceive the state space and learn the control policy from visual inputs. To deal with the problems that stem from the curse of dimensionality, our method also uses deep learning. Navigation systems developed in robotics can typically be divided into two classes. The first reaches the goal by encoding the structure of the environment; it can take information from multiple sensors as input and provide high-quality environment maps. The second is the map-less approach, which maintains a control policy during learning and uses it to complete goal-reaching tasks. Each has its pros and cons. In this paper, we propose a visual navigation method that learns goal-driven behavior and encodes spatial structure at the same time. First, to learn a control policy from raw visual information, we take deep reinforcement learning as the basic navigation framework; it provides an end-to-end framework that allows our approach to predict control signals directly from high-dimensional sensory inputs. Meanwhile, because the environment contains a much wider variety of possible training signals, an auxiliary task named collision prediction is added to the model.
Then, during exploration, the agent traverses the environment many times and observes many states, many of which are repetitive; a temporal correlation network is used to remove these redundant observations and search for waypoints. Because the agent's perspective varies, we do not use hand-designed features; instead we use temporal distance, which depends only on the number of environment steps, to compute the similarity between states. Research on the cognitive mechanisms of animals shows that many mammals can use an observation, especially one that includes landmarks, to represent a neighboring state space, thus encoding the environment in a simpler and more efficient way. We therefore use waypoints, which are discovered in exploration sequences and each represent the adjacent state space within a certain temporal distance, to describe the structure of the environment incrementally. Finally, the topological map of the space is integrated into the model as a path planning module and combined with the locomotion network to obtain a more general navigation method. The experiments were conducted in the 3D simulation environment DMlab. The results show that this navigation method can learn goal-driven behavior from visual inputs, that it learns more efficiently and produces a more efficient navigation policy in all test environments, and that it reduces the amount of data required to build the map. Furthermore, when the agent is placed in a dynamically blocked environment, the model can use the topological map to guide detour behavior and complete navigation tasks, showing better environmental adaptability. © 2021, Science Press. All rights reserved.
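The waypoint and topological-map idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`build_topological_map`, `plan_path`, `ring_distance`), the threshold value, and the toy ring-shaped environment are all assumptions made for illustration, and the hand-written `ring_distance` merely stands in for the learned temporal correlation network. Blocking a waypoint is likewise only one assumed way to model a dynamically blocked environment.

```python
from collections import deque


def build_topological_map(trajectory, temporal_distance, threshold):
    """Compress an exploration sequence into waypoints plus adjacency edges.

    An observation within `threshold` steps of an existing waypoint is
    treated as redundant and mapped onto that waypoint; otherwise it
    becomes a new waypoint. Consecutive waypoints are linked by an edge.
    """
    waypoints = []   # representative observations
    edges = {}       # waypoint index -> set of neighbouring waypoint indices
    previous = None
    for obs in trajectory:
        current = None
        for idx, wp in enumerate(waypoints):
            if temporal_distance(obs, wp) <= threshold:
                current = idx
                break
        if current is None:
            waypoints.append(obs)
            current = len(waypoints) - 1
            edges[current] = set()
        if previous is not None and previous != current:
            edges[previous].add(current)
            edges[current].add(previous)
        previous = current
    return waypoints, edges


def plan_path(edges, start, goal, blocked=frozenset()):
    """Breadth-first search over waypoint indices, skipping blocked ones."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(edges[node]):
            if nxt not in parents and nxt not in blocked:
                parents[nxt] = node
                queue.append(nxt)
    return None  # goal unreachable


def ring_distance(a, b, size=12):
    """Toy stand-in for a learned temporal distance: steps apart on a ring."""
    d = abs(a - b) % size
    return min(d, size - d)


# One lap around a 12-state ring corridor, ending back at the start.
trajectory = list(range(12)) + [0]
waypoints, edges = build_topological_map(trajectory, ring_distance, threshold=1)
direct = plan_path(edges, 0, 2)
detour = plan_path(edges, 0, 2, blocked={1})
```

In this toy run the 13 observations collapse to six waypoints. When waypoint 1 is blocked, the planner routes the other way around the ring, which mirrors in a very simplified form the detour behavior the abstract reports.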

Keywords:

Air navigation; Deep learning; Disaster prevention; Encoding (symbols); Learning systems; Mammals; Navigation systems; Quality control; Reinforcement learning; Robots; Signal encoding; Topology; Vision; Visual servoing

Author affiliations:

  • [ 1 ] [Ruan, Xiao-Gang] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 2 ] [Ruan, Xiao-Gang] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
  • [ 3 ] [Li, Peng] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 4 ] [Li, Peng] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
  • [ 5 ] [Zhu, Xiao-Qing] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 6 ] [Zhu, Xiao-Qing] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
  • [ 7 ] [Liu, Peng-Fei] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 8 ] [Liu, Peng-Fei] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China

Corresponding author:

  • [Zhu, Xiao-Qing] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [Zhu, Xiao-Qing] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China

Source:

Chinese Journal of Computers

ISSN: 0254-4164

Year: 2021

Issue: 3

Volume: 44

Pages: 594-608

Citation counts:

WoS Core Collection citations: 0

Scopus citations: 3

ESI Highly Cited Papers listings: 0
