Indexed in:
Abstract:
Depth maps are used in many vision tasks because consumer depth cameras are inexpensive and acquire depth in real time. However, they still suffer from low precision and severe sensor noise, despite significant research in depth enhancement. We propose MFFNet, a novel multi-level feature fusion convolutional neural network (CNN) for facial depth map refinement. It is a multi-stage network in which each stage is a local multi-level feature fusion (LMLF) block. To smooth noise while enhancing detailed facial structure, a hierarchical fusion strategy is adopted to fully fuse multi-level features: an LMLF block fuses multi-level features locally in each stage, while inter-stage skip connections achieve global multi-level feature fusion. The inter-stage skip connections also ease training by shortening the information propagation paths. We introduce an effective data augmentation method that synthesizes noisy facial depth maps in various poses; training with these synthetic data improves the robustness of the proposed method to face pose. The proposed method is evaluated on a synthetic facial depth map dataset, a real Kinect V2 facial depth map dataset, and the Middlebury Stereo Dataset. Experimental results show that our method produces high-quality refined depth maps and outperforms several state-of-the-art methods.
Keywords:
Corresponding author information:
Email address:
Source:
SIGNAL PROCESSING-IMAGE COMMUNICATION
ISSN: 0923-5965
Year: 2022
Volume: 103
3.5 (JCR@2022)
3.500 (JCR@2022)
ESI subject: ENGINEERING
ESI highly cited threshold: 49
JCR quartile: 2
CAS (Chinese Academy of Sciences) partition: 3
Affiliated department: