
Authors:

Liu, Caixia | Kong, Dehui | Wang, Shaofan | Li, Jinghua | Yin, Baocai

Indexed in:

EI, Scopus, SCIE

Abstract:

Although deep network-based methods outperform traditional 3D reconstruction methods, which require multi-ocular images or class labels to recover the full 3D geometry, they may produce incomplete recovery and unfaithful reconstructions of occluded parts of 3D objects. To address these issues, we propose the Depth-preserving Latent Generative Adversarial Network (DLGAN), which consists of a 3D Encoder-Decoder based GAN (EDGAN, serving as generator and discriminator) and an Extreme Learning Machine (ELM, serving as a classifier), for 3D reconstruction from a monocular depth image of an object. First, EDGAN decodes a latent vector from the 2.5D voxel grid representation of the input image and generates an initial 3D occupancy grid under common GAN losses, a latent vector loss, and a depth loss. For the latent vector loss, we design a 3D deep AutoEncoder (AE) to learn a target latent vector from the ground-truth 3D voxel grid and use that vector to penalize the latent vector encoded from the input 2.5D data. For the depth loss, we use the input 2.5D data to penalize the initial 3D voxel grid from 2.5D views. Afterwards, ELM transforms the float values of the initial 3D voxel grid to binary values under a binary reconstruction loss. Experimental results show that DLGAN not only outperforms several state-of-the-art methods by a large margin on both a synthetic dataset and a real-world dataset, but also predicts occluded parts of 3D objects more accurately without class labels.
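The idea of the depth loss described in the abstract can be sketched in a few lines. This is a minimal illustration under assumed conventions, not the authors' implementation: occupancy grids are NumPy arrays, the input viewpoint is aligned with one grid axis, and a max-projection stands in for the paper's rendering of 2.5D views; the names `depth_projection` and `depth_loss` are hypothetical.

```python
import numpy as np

def depth_projection(voxels, axis=2):
    # Collapse a 3D occupancy grid to a 2.5D "visible surface" map by
    # taking the maximum occupancy along the viewing axis (a simplified
    # stand-in for projecting the grid back to the input viewpoint).
    return voxels.max(axis=axis)

def depth_loss(pred_voxels, input_25d, axis=2):
    # Penalize the predicted 3D voxel grid wherever its projection
    # disagrees with the 2.5D grid derived from the input depth image.
    pred_view = depth_projection(pred_voxels, axis=axis)
    input_view = depth_projection(input_25d, axis=axis)
    return float(np.mean((pred_view - input_view) ** 2))
```

A prediction whose visible surface matches the input incurs zero loss, while missing or spurious occupancy in the projected view is penalized, which is how the loss discourages incomplete recovery of the observed side of the object.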

Keywords:

depth loss; monocular depth image; three-dimensional displays; 3D reconstruction; transforms; ELM; image reconstruction; shape; generative adversarial networks; latent vector; two-dimensional displays; gallium nitride

Author affiliations:

  • [ 1 ] [Liu, Caixia]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 2 ] [Kong, Dehui]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 3 ] [Wang, Shaofan]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 4 ] [Li, Jinghua]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 5 ] [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Corresponding author:

  • [Wang, Shaofan]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China


Source:

IEEE TRANSACTIONS ON MULTIMEDIA

ISSN: 1520-9210

Year: 2021

Volume: 23

Pages: 2843-2856

Impact factor: 7.300 (JCR@2022)

ESI discipline: COMPUTER SCIENCE

ESI highly-cited threshold: 87

JCR quartile: Q1

Citations:

Web of Science Core Collection: 10

Scopus: 12

ESI highly-cited papers listed: 0

