Abstract:
Estimating depth/disparity information from stereo pairs via stereo matching is a classical research topic in computer vision. With the recent development of deep learning, many end-to-end deep networks have been proposed for stereo matching. These networks generally extract features with convolutional neural network (CNN) structures originally designed for other tasks, and such structures are largely redundant for stereo matching. In addition, the 3D convolutions in these networks are too costly to extend to the large receptive fields that benefit disparity estimation. To overcome these problems, we propose a deep network structure based on the properties of stereo matching. The proposed network includes a concise and effective feature extraction module. Moreover, a separated 3D convolution is introduced to avoid the parameter explosion caused by increasing the size of the convolution kernels. We validate our network on the SceneFlow dataset in terms of both accuracy and computational cost. Results show that the proposed network achieves state-of-the-art performance. Compared with other structures, our feature extraction module reduces the number of parameters by 90% and the time cost by 25% while achieving comparable accuracy. In addition, our separated 3D convolution, combined with group normalization (GN), achieves a lower end-point error (EPE) than the baseline methods. © 2020, Science Press. All rights reserved.
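The parameter argument for separating a 3D convolution can be illustrated with a short calculation. The abstract does not say how the kernel is separated, so the sketch below assumes one common factorization: a k×k×k kernel replaced by three consecutive 1D kernels (k×1×1, 1×k×1, 1×1×k) with the channel count held constant. The function names and the channel count of 32 are hypothetical and chosen only for illustration.

```python
def conv3d_params(c_in, c_out, kernel):
    """Number of weights in a 3D convolution (bias omitted).

    kernel is a (depth, height, width) tuple.
    """
    kd, kh, kw = kernel
    return c_in * c_out * kd * kh * kw


def full_params(channels, k):
    """Weights of a single dense k x k x k convolution."""
    return conv3d_params(channels, channels, (k, k, k))


def separated_params(channels, k):
    """Weights of three stacked 1D convolutions, one per axis.

    Assumption: this factorization is illustrative and may differ
    from the separation actually used in the paper.
    """
    return (
        conv3d_params(channels, channels, (k, 1, 1))
        + conv3d_params(channels, channels, (1, k, 1))
        + conv3d_params(channels, channels, (1, 1, k))
    )


if __name__ == "__main__":
    channels = 32  # hypothetical channel count
    for k in (3, 5, 7):
        print(k, full_params(channels, k), separated_params(channels, k))
```

Under this assumption the dense kernel grows cubically with k while the separated form grows linearly, which is why enlarging the receptive field stays affordable in the separated case.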