Abstract:
Building on trajectory imitation learning, this paper addresses the poor learning performance caused by fluctuations in single-demonstration data and presents an imitation learning method based on multiple demonstrations. A multi-constraint optimization approach is introduced for trajectory optimization. Each demonstration is encoded separately by a Gaussian mixture model (GMM), and the resulting probability interval parameters are set as the optimization constraints. Intersecting these constraints yields the overlap probability of the intervals, so that superior data receive a higher probability and inferior data a lower one. The parameter set of the multi-constraint optimal trajectory is solved from the product of Gaussian distributions, and the multi-constraint optimal trajectory is then obtained through Gaussian mixture regression (GMR). The method reduces the probability assigned to bad data in the demonstrations and effectively avoids fluctuations in the reproduced trajectory caused by inferior demonstration data. Simulation results show the effectiveness of the method. © 2017 IEEE.
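As a rough illustration of why a product of Gaussians favors consistent demonstrations, the sketch below fuses one-dimensional Gaussian estimates at a single time step using the standard precision-weighted product rule. It is a minimal sketch, not the paper's implementation: the function name `fuse_gaussians` and the example means and variances are hypothetical, and the GMM encoding and GMR steps are not shown.

```python
def fuse_gaussians(means, variances):
    """Mean and variance of the renormalized product of 1-D Gaussians.

    Standard identity: precisions (1 / variance) add, and the fused mean
    is the precision-weighted average of the individual means.
    """
    precisions = [1.0 / v for v in variances]
    fused_var = 1.0 / sum(precisions)
    fused_mean = fused_var * sum(p * m for p, m in zip(precisions, means))
    return fused_mean, fused_var


# Hypothetical estimates at one time step from three demonstrations.
# The third has a large variance, standing in for an inferior demonstration.
demo_means = [1.0, 1.2, 3.0]
demo_vars = [0.1, 0.1, 10.0]

fused_mean, fused_var = fuse_gaussians(demo_means, demo_vars)
print(fused_mean, fused_var)
```

With these assumed values, the fused mean stays between the two consistent demonstrations, and the high-variance one contributes very little. This matches the abstract's point that inferior data receive a lower probability.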