
Your search:

Scholar name: 孔德慧 (Kong, Dehui)

Joint multi-scale transformers and pose equivalence constraints for 3D human pose estimation SCIE
Journal article | 2024, 103 | JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION

Abstract:

Different from image-based 3D pose estimation, video-based 3D pose estimation gains performance improvements from temporal information. However, these methods still face the challenge of insufficient generalization across human motion speed, body shape, and camera distance. To address these problems, we propose a novel approach, referred to as joint Spatial-temporal Multi-scale Transformers and Pose Transformation Equivalence Constraints (SMT-PTEC), for 3D human pose estimation from videos. We design a more general spatial-temporal multi-scale feature extraction strategy and introduce optimization constraints that adapt to the diversity of data to improve the accuracy of pose estimation. Specifically, we first introduce a spatial multi-scale transformer to extract multi-scale features of pose and establish a cross-scale information transfer mechanism, which effectively explores the underlying knowledge of human motion. Then, we present a temporal multi-scale transformer to explore multi-scale dependencies between frames, enhance the adaptability of the network to human motion speed, and improve the estimation accuracy through a context-aware fusion of multi-scale predictions. Moreover, we add pose transformation equivalence constraints by changing the training samples with horizontal flipping, scaling, and body shape transformation, which effectively overcomes the influence of camera distance and body shape on prediction accuracy. Extensive experimental results demonstrate that our approach achieves superior performance with less computational complexity than previous state-of-the-art methods. Code is available at https://github.com/JNGao123/SMT-PTEC.
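
The pose transformation equivalence constraints described above can be expressed as consistency losses: the prediction for a transformed input clip should equal the correspondingly transformed prediction for the original clip. Below is a minimal PyTorch sketch of such a constraint for horizontal flipping and scaling; the model interface, joint-pair list, and equal loss weighting are illustrative assumptions, not the released SMT-PTEC code.

import torch
import torch.nn.functional as F

def flip_pose(pose, left_right_pairs):
    # Mirror a (..., J, 2-or-3) pose about the x-axis and swap left/right joints.
    sign = pose.new_ones(pose.shape[-1])
    sign[0] = -1.0
    perm = list(range(pose.shape[-2]))
    for l, r in left_right_pairs:          # assumed joint index pairs, e.g. [(1, 4), (2, 5), ...]
        perm[l], perm[r] = perm[r], perm[l]
    return (pose * sign)[..., perm, :]

def equivalence_loss(model, pose_2d_seq, left_right_pairs, scale=1.2):
    pred = model(pose_2d_seq)                                   # (B, J, 3) root-relative 3D pose
    # Flip equivalence: predicting from a mirrored clip should mirror the prediction.
    pred_flip = model(flip_pose(pose_2d_seq, left_right_pairs))
    loss_flip = F.mse_loss(pred_flip, flip_pose(pred, left_right_pairs))
    # Scale equivalence: rescaling the 2D input (a camera-distance change) should not
    # change the recovered root-relative 3D pose.
    loss_scale = F.mse_loss(model(pose_2d_seq * scale), pred)
    return loss_flip + loss_scale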

Keywords:

Spatial-temporal multi-scale; Transformer; Pose transformation equivalence; Pose estimation

Citations:

GB/T 7714 Wu, Yongpeng , Kong, Dehui , Gao, Junna et al. Joint multi-scale transformers and pose equivalence constraints for 3D human pose estimation [J]. | JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION , 2024 , 103 .
MLA Wu, Yongpeng et al. "Joint multi-scale transformers and pose equivalence constraints for 3D human pose estimation" . | JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION 103 (2024) .
APA Wu, Yongpeng , Kong, Dehui , Gao, Junna , Li, Jinghua , Yin, Baocai . Joint multi-scale transformers and pose equivalence constraints for 3D human pose estimation . | JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION , 2024 , 103 .
OASNet: Object Affordance State Recognition Network With Joint Visual Features and Relational Semantic Embeddings SCIE
Journal article | 2024, 34 (5), 3368-3382 | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

Abstract:

Traditional affordance learning tasks aim to understand an object's interactive functions in an image, such as affordance recognition and affordance detection. However, these tasks cannot determine whether the object is currently interacting, which is crucial for many follow-up tasks, including robotic manipulation and planning. To fill this gap, this paper proposes a novel object affordance state (OAS) recognition task, i.e., simultaneously recognizing an object's affordances and the partner objects that are interacting with it. Accordingly, to facilitate the application of deep learning technology, an OAS recognition dataset, OAS10k, is constructed by collecting and labeling over 10k images. In the dataset, a sample is defined as an image together with its OAS labels, and each label is represented as a triplet ⟨subject, subject's affordance, interacted object⟩. These triplet labels carry rich relational semantic information, which can improve OAS recognition performance. We hence construct a directed OAS knowledge graph of affordance states and extract an OAS matrix from it to model the semantic relationships of the triplets. Based on this matrix, we propose an OAS recognition network (OASNet), which utilizes a GCN to capture the relational semantic embeddings and a transformer to fuse them with the visual features of an image so as to recognize the affordance states of objects in the image. Experimental results on the OAS10k dataset and other triplet-label recognition datasets demonstrate that the proposed OASNet achieves the best performance compared with state-of-the-art methods. The dataset and code will be released at https://github.com/mxmdpc/OAS.
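
The two ingredients named in this abstract, a GCN over the OAS relation matrix that produces relational semantic embeddings and a transformer that fuses them with image features, can be sketched in a few lines of PyTorch. The feature sizes, the toy relation matrix, and the single cross-attention fusion layer below are illustrative assumptions, not the OASNet release.

import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Symmetrically normalised propagation: D^-1/2 (A + I) D^-1/2 X W
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)
        d = a_hat.sum(-1).clamp(min=1e-6).pow(-0.5)
        return torch.relu(self.proj((d[:, None] * a_hat * d[None, :]) @ x))

class OASHead(nn.Module):
    def __init__(self, label_dim=300, vis_dim=512):
        super().__init__()
        self.gcn1 = GCNLayer(label_dim, vis_dim)
        self.gcn2 = GCNLayer(vis_dim, vis_dim)
        self.fuse = nn.MultiheadAttention(vis_dim, num_heads=8, batch_first=True)
        self.cls = nn.Linear(vis_dim, 1)

    def forward(self, vis_feats, label_embed, oas_matrix):
        # vis_feats: (B, N, vis_dim) patch features; label_embed: (L, label_dim); oas_matrix: (L, L)
        sem = self.gcn2(self.gcn1(label_embed, oas_matrix), oas_matrix)      # relational embeddings
        sem = sem.unsqueeze(0).expand(vis_feats.size(0), -1, -1)
        fused, _ = self.fuse(query=sem, key=vis_feats, value=vis_feats)      # labels attend to the image
        return self.cls(fused).squeeze(-1)                                   # (B, L) multi-label logits

# Toy usage: 16 OAS labels with 300-d initial embeddings, 49 image patches.
head = OASHead()
logits = head(torch.randn(2, 49, 512), torch.randn(16, 300), torch.rand(16, 16))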

Keywords:

Object affordance state recognition; transformer; multi-label image classification; relational semantic embeddings; graph convolution networks

Citations:

GB/T 7714 Chen, Dongpan , Kong, Dehui , Li, Jinghua et al. OASNet: Object Affordance State Recognition Network With Joint Visual Features and Relational Semantic Embeddings [J]. | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY , 2024 , 34 (5) : 3368-3382 .
MLA Chen, Dongpan et al. "OASNet: Object Affordance State Recognition Network With Joint Visual Features and Relational Semantic Embeddings" . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 34 . 5 (2024) : 3368-3382 .
APA Chen, Dongpan , Kong, Dehui , Li, Jinghua , Wang, Lichun , Gao, Junna , Yin, Baocai . OASNet: Object Affordance State Recognition Network With Joint Visual Features and Relational Semantic Embeddings . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY , 2024 , 34 (5) , 3368-3382 .
A multi-view weighted-aggregation 3D point cloud reconstruction method incoPat
Patent | 2023-02-24 | CN202310195559.3

Abstract:

This invention discloses a multi-view weighted-aggregation 3D point cloud reconstruction method. A non-local feature extractor processes the input images to obtain feature maps; homography transformations warp the feature maps to generate multiple cost volumes; a lightweight weighted aggregation module encodes the 3D relations among the cost volumes into a single 3D cost volume; an edge-semantics-guided pseudo-3D convolutional regression network performs depth regression on the 3D cost volume to obtain multi-view depth maps; and finally the point cloud is computed by back-projection using the camera matrix parameters. The dilated-convolution-based non-local feature extractor improves the completeness of the point cloud; the lightweight weighted aggregation module improves the accuracy of the point cloud while reducing the network's computational cost; and the edge-semantics-guided pseudo-3D convolutional network improves the accuracy of multi-view 3D reconstruction and lowers the hardware requirements.
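
The weighted aggregation step described here, reducing per-view cost volumes to a single 3D cost volume with learned per-voxel view weights, can be sketched as follows. The similarity measure, the tiny weight network, and all tensor shapes are illustrative assumptions; the non-local feature extraction, homography warping, and depth-regression stages of the patent are omitted.

import torch
import torch.nn as nn

class WeightedCostAggregation(nn.Module):
    def __init__(self, feat_ch=32):
        super().__init__()
        # A small network scores how reliable each source view is at each voxel.
        self.weight_net = nn.Sequential(
            nn.Conv3d(feat_ch, 8, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv3d(8, 1, kernel_size=1))

    def forward(self, ref_volume, src_volumes):
        # ref_volume: (B, C, D, H, W); src_volumes: warped source-view volumes of the same shape.
        sims = [ref_volume * src for src in src_volumes]            # per-view similarity volumes
        scores = torch.stack([self.weight_net(s) for s in sims], dim=1)
        weights = torch.softmax(scores, dim=1)                      # normalise weights over the views
        return (torch.stack(sims, dim=1) * weights).sum(dim=1)      # aggregated (B, C, D, H, W) cost

agg = WeightedCostAggregation()
ref = torch.randn(1, 32, 8, 16, 16)
cost = agg(ref, [torch.randn_like(ref) for _ in range(3)])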

Citations:

GB/T 7714 孔德慧 , 张少杰 , 李敬华 et al. 一种多视角加权聚合的三维点云重建方法 : CN202310195559.3[P]. | 2023-02-24 .
MLA 孔德慧 et al. "一种多视角加权聚合的三维点云重建方法" : CN202310195559.3. | 2023-02-24 .
APA 孔德慧 , 张少杰 , 李敬华 , 尹宝才 . 一种多视角加权聚合的三维点云重建方法 : CN202310195559.3. | 2023-02-24 .
A hypergraph-attention-based human mesh reconstruction method incoPat
Patent | 2023-05-25 | CN202310600839.8

Abstract:

This invention discloses a hypergraph-attention-based human mesh reconstruction method. A hypergraph-based hierarchical representation of the human mesh is proposed to form a human mesh representation model with part semantics, and this new representation provides the structural basis for human mesh reconstruction. A Body2Parts feature transfer module aggregates features across body parts and fuses them with image information, performing interaction and fusion at the part level to support high-quality part-level human reconstruction. A Part2Vertices feature transfer module transfers part features to vertex features and uses hypergraph attention to refine vertex-level features; with vertices as the representation unit, features are propagated within each part to support fine-grained, mesh-vertex-level human reconstruction. Through this hierarchical reconstruction method built on the hierarchical human mesh representation model, the invention achieves a high-performance trade-off between 3D human mesh reconstruction accuracy and computational cost.
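
The Part2Vertices transfer described above, pushing part-level features down to the vertices each part covers and then refining vertex features with attention, can be sketched with an ordinary part-vertex incidence matrix. The incidence construction, feature sizes, and the masked multi-head attention used in place of a true hypergraph attention operator are illustrative assumptions, not the patented formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Part2Vertices(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, part_feats, vert_feats, incidence):
        # part_feats: (B, P, C); vert_feats: (B, V, C)
        # incidence: (V, P) with incidence[v, p] = 1 if vertex v belongs to part p
        vert_feats = vert_feats + incidence @ part_feats        # transfer part features to their vertices
        same_part = (incidence @ incidence.T) > 0               # (V, V): vertices sharing a part
        refined, _ = self.attn(vert_feats, vert_feats, vert_feats,
                               attn_mask=~same_part)            # attend only within a part
        return self.norm(vert_feats + refined)

part_ids = torch.randint(0, 24, (431,))                         # toy: 431 mesh vertices, 24 parts
incidence = F.one_hot(part_ids, 24).float()
module = Part2Vertices()
out = module(torch.randn(2, 24, 256), torch.randn(2, 431, 256), incidence)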

Citations:

GB/T 7714 孔德慧 , 郝晨辉 , 李敬华 et al. 一种基于超图注意力的人体网格重建方法 : CN202310600839.8[P]. | 2023-05-25 .
MLA 孔德慧 et al. "一种基于超图注意力的人体网格重建方法" : CN202310600839.8. | 2023-05-25 .
APA 孔德慧 , 郝晨辉 , 李敬华 , 尹宝才 . 一种基于超图注意力的人体网格重建方法 : CN202310600839.8. | 2023-05-25 .
A hierarchical human-object interaction detection method based on human interaction intention information incoPat
Patent | 2023-03-20 | CN202310266335.7

Abstract:

This invention discloses a hierarchical human-object interaction detection method based on human interaction intention information, comprising 1) object detection, which detects all object instances in the input image, and 2) human-object interaction detection, which performs interaction detection over all instance pairs in the image. Human gaze information is abstracted through the design of visual features to model the context regions attended to by the interaction participants; a human pose graph oriented toward interaction intention is constructed to exploit the discriminative information that body motion provides for interaction detection; and the distance between the human and the object is used as a feature to guide the refinement of the visual distance features, improving the performance of the human-object interaction detection algorithm.
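
The distance-guided step mentioned above, using the human-object distance as a feature that steers the pairwise visual features before interaction classification, can be sketched as a simple gating head. The gating design, the box-centre distance encoding, and the feature sizes are illustrative assumptions, not the patented pipeline (the gaze and pose-graph branches are omitted).

import torch
import torch.nn as nn

class DistanceGuidedFusion(nn.Module):
    def __init__(self, vis_dim=512, num_actions=29):
        super().__init__()
        self.dist_enc = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, vis_dim))
        self.cls = nn.Linear(vis_dim, num_actions)

    def forward(self, pair_feats, human_boxes, object_boxes):
        # pair_feats: (N, vis_dim) appearance feature of each human-object pair
        # *_boxes: (N, 4) boxes in (x1, y1, x2, y2); the centre offset guides the gate
        h_ctr = (human_boxes[:, :2] + human_boxes[:, 2:]) / 2
        o_ctr = (object_boxes[:, :2] + object_boxes[:, 2:]) / 2
        gate = torch.sigmoid(self.dist_enc(h_ctr - o_ctr))       # (N, vis_dim) values in (0, 1)
        return self.cls(pair_feats * gate)                       # interaction logits per pair

fusion = DistanceGuidedFusion()
logits = fusion(torch.randn(5, 512), torch.rand(5, 4) * 100, torch.rand(5, 4) * 100)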

Citations:

GB/T 7714 孔德慧 , 王帅 , 李敬华 et al. 一种基于人体交互意图信息的层级人物交互检测方法 : CN202310266335.7[P]. | 2023-03-20 .
MLA 孔德慧 et al. "一种基于人体交互意图信息的层级人物交互检测方法" : CN202310266335.7. | 2023-03-20 .
APA 孔德慧 , 王帅 , 李敬华 , 尹宝才 . 一种基于人体交互意图信息的层级人物交互检测方法 : CN202310266335.7. | 2023-03-20 .
A hand-drawn sketch 3D model reconstruction method based on spatial skeleton information incoPat
Patent | 2023-02-24 | CN202310163381.4

Abstract:

This invention discloses a 3D model reconstruction method for hand-drawn sketches based on spatial skeleton information, comprising a spatial-skeleton-guided encoder, a domain-adaptive encoder, and a self-attention decoder. The spatial skeleton encoder extracts skeleton features from the sketch, and the skeleton information serves as prior knowledge that supplies the auxiliary information needed to reconstruct a complete 3D model; the domain-adaptive encoder transfers knowledge learned from synthetic sketches to hand-drawn sketches; and the attention-based decoder resolves ambiguity. The method improves the accuracy of 3D reconstruction from a single hand-drawn sketch. The self-attention mechanism allows the model to distinguish sketch inputs with highly similar contours. In contrast to techniques that perform domain adaptation with a discriminator and a gradient reversal layer, whose training objective amounts to minimizing the Jensen-Shannon divergence between two distributions and may therefore be discontinuous with respect to the generator parameters, the domain adaptation constraint function of this invention can be regarded as differentiable everywhere, so training is more stable.
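
The abstract contrasts adversarial domain adaptation (whose objective corresponds to a Jensen-Shannon divergence) with an everywhere-differentiable constraint, without spelling the latter out here. A common example of such a smooth alignment statistic is maximum mean discrepancy (MMD); the sketch below shows an MMD loss between synthetic-sketch and hand-drawn-sketch encoder features purely as an illustration of the idea, not as the patented constraint.

import torch

def mmd_loss(feat_src, feat_tgt, bandwidths=(1.0, 2.0, 4.0)):
    # Squared MMD between two feature batches under a sum of RBF kernels.
    def rbf(a, b):
        d2 = torch.cdist(a, b).pow(2)                 # pairwise squared distances
        return sum(torch.exp(-d2 / (2.0 * s ** 2)) for s in bandwidths)
    return rbf(feat_src, feat_src).mean() + rbf(feat_tgt, feat_tgt).mean() \
        - 2.0 * rbf(feat_src, feat_tgt).mean()

# Toy usage: align encoder features of synthetic sketches with those of hand-drawn sketches.
loss = mmd_loss(torch.randn(32, 256), torch.randn(32, 256))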

Citations:

GB/T 7714 孔德慧 , 马杨 , 李敬华 et al. 一种基于空间骨架信息的手绘草图三维模型重建方法 : CN202310163381.4[P]. | 2023-02-24 .
MLA 孔德慧 et al. "一种基于空间骨架信息的手绘草图三维模型重建方法" : CN202310163381.4. | 2023-02-24 .
APA 孔德慧 , 马杨 , 李敬华 , 尹宝才 . 一种基于空间骨架信息的手绘草图三维模型重建方法 : CN202310163381.4. | 2023-02-24 .
CIGNet: Category-and-Intrinsic-Geometry Guided Network for 3D coarse-to-fine reconstruction SCIE
Journal article | 2023, 554 | NEUROCOMPUTING
WoS Core Collection citations: 4

Abstract:

3D object reconstruction from arbitrary view intensity images is a challenging but meaningful research topic in computer vision. The main limitations of existing approaches are that they lack complete and efficient prior information and might not be able to deal with serious occlusion or partial observation of 3D objects, which may produce incomplete and unreliable reconstructions. To reconstruct structure and recover missing or unseen parts of objects, category prior and intrinsic geometry relation are particularly useful and necessary during the 3D reconstruction process. In this paper, we propose Category-and-Intrinsic-Geometry Guided Network (CIGNet) for 3D coarse-to-fine reconstruction from arbitrary view intensity images by leveraging category prior and intrinsic geometry relation. CIGNet combines a category prior guided reconstruction module with an intrinsic geometry relation guided refinement module. In the first reconstruction module, we leverage semantic class context by adding a supervision term over object categories to output coarse reconstructed results. In the second refinement module, we model the coarse 3D volumetric data as 2D slices and consider intrinsic geometry relations between them to design graph structures of coarse 3D volumes to finish the graph based refinement. CIGNet can accomplish high-quality 3D reconstruction tasks by exploring the intra-category characteristics of objects as well as the intrinsic geometry relations of each object, both of which serve as useful complements to the visual information of images, in a coarse-to-fine fashion. Extensive quantitative and qualitative experiments on a synthetic dataset ShapeNet and real-world datasets Pix3D, Statue Model Repository, and BlendedMVS indicate that CIGNet outperforms several state-of-the-art methods in terms of accuracy and detail recovery.
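
The refinement module described here, reading the coarse voxel volume as a stack of 2D slices, connecting the slices in a graph, and refining them with graph convolution, can be sketched as follows. The chain-shaped slice graph, the feature sizes, and the single propagation layer are illustrative assumptions, not the CIGNet implementation.

import torch
import torch.nn as nn

class SliceGraphRefiner(nn.Module):
    def __init__(self, res=32, hidden=128):
        super().__init__()
        self.encode = nn.Linear(res * res, hidden)   # per-slice feature
        self.gcn = nn.Linear(hidden, hidden)
        self.decode = nn.Linear(hidden, res * res)

    def forward(self, coarse_volume):
        # coarse_volume: (B, D, H, W) occupancy probabilities; slices are taken along depth D.
        b, d, h, w = coarse_volume.shape
        x = self.encode(coarse_volume.reshape(b, d, h * w))                   # (B, D, hidden)
        # Chain adjacency: each slice talks to its depth neighbours (and itself).
        adj = torch.eye(d) + torch.diag(torch.ones(d - 1), 1) + torch.diag(torch.ones(d - 1), -1)
        adj = (adj / adj.sum(-1, keepdim=True)).to(coarse_volume.device)      # row-normalised
        x = torch.relu(self.gcn(adj @ x))                                     # propagate between slices
        residual = self.decode(x).reshape(b, d, h, w)
        return torch.sigmoid(coarse_volume.logit(eps=1e-6) + residual)        # refined occupancy

refiner = SliceGraphRefiner()
refined = refiner(torch.rand(2, 32, 32, 32))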

Keywords:

Graph convolutional network; 3D reconstruction; Category prior; Geometry perception; 3D refinement

Citations:

GB/T 7714 Gao, Junna , Kong, Dehui , Wang, Shaofan et al. CIGNet: Category-and-Intrinsic-Geometry Guided Network for 3D coarse-to-fine reconstruction [J]. | NEUROCOMPUTING , 2023 , 554 .
MLA Gao, Junna et al. "CIGNet: Category-and-Intrinsic-Geometry Guided Network for 3D coarse-to-fine reconstruction" . | NEUROCOMPUTING 554 (2023) .
APA Gao, Junna , Kong, Dehui , Wang, Shaofan , Li, Jinghua , Yin, Baocai . CIGNet: Category-and-Intrinsic-Geometry Guided Network for 3D coarse-to-fine reconstruction . | NEUROCOMPUTING , 2023 , 554 .
ADOSMNet: a novel visual affordance detection network with object shape mask guided feature encoders SCIE
Journal article | 2023, 83 (11), 31629-31653 | MULTIMEDIA TOOLS AND APPLICATIONS

Abstract:

Visual affordance detection aims to understand the functional attributes of objects, which is crucial for robots performing interactive tasks. Most existing affordance detection methods mainly utilize global image features while failing to fully exploit the features of locally relevant objects in the image, which often leads to suboptimal detection accuracy under the interference of cluttered backgrounds and neighbouring objects. Numerous studies have shown that the accuracy of affordance detection largely depends on the quality of the extracted image features. In this paper, we propose a novel affordance detection network with object shape mask guided feature encoders. The masks act as an attention mechanism that forces the network to focus on the shape regions of target objects in the image, which facilitates obtaining high-quality features. Specifically, we first propose a shape mask guided encoder, which uses masks to effectively locate all target objects so as to extract more expressive features. Based on this encoder, we then propose a dual enhance feature aggregation module consisting of two branches: the first branch encodes the global features of the original image, while the second branch locates each locally relevant object and encodes its precise features. Aggregating these features enhances the feature representation of each object, further improving feature quality and suppressing interference. Quantitative and qualitative evaluations against state-of-the-art methods demonstrate that the proposed method achieves superior performance on the two commonly used affordance detection datasets.
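
The shape-mask guidance described above, using an object mask as spatial attention so the encoder concentrates on the object region, can be sketched in a few lines. The tiny convolutional backbone and the soft residual form of the gating are illustrative assumptions standing in for the paper's encoder.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskGuidedEncoder(nn.Module):
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))

    def forward(self, image, shape_mask):
        # image: (B, 3, H, W); shape_mask: (B, 1, H, W) with 1 inside the target object
        feats = self.backbone(image)                                       # (B, C, H/4, W/4)
        mask = F.interpolate(shape_mask, size=feats.shape[-2:], mode="nearest")
        return feats * (0.5 + 0.5 * mask)   # soft gating: keep some context, emphasise the object region

enc = MaskGuidedEncoder()
feats = enc(torch.randn(2, 3, 128, 128), (torch.rand(2, 1, 128, 128) > 0.5).float())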

Keywords:

Feature enhancement; Visual affordance detection; Feature representation; Object shape mask; Image segmentation

Citations:

GB/T 7714 Chen, Dongpan , Kong, Dehui , Li, Jinghua et al. ADOSMNet: a novel visual affordance detection network with object shape mask guided feature encoders [J]. | MULTIMEDIA TOOLS AND APPLICATIONS , 2023 , 83 (11) : 31629-31653 .
MLA Chen, Dongpan et al. "ADOSMNet: a novel visual affordance detection network with object shape mask guided feature encoders" . | MULTIMEDIA TOOLS AND APPLICATIONS 83 . 11 (2023) : 31629-31653 .
APA Chen, Dongpan , Kong, Dehui , Li, Jinghua , Wang, Shaofan , Yin, Baocai . ADOSMNet: a novel visual affordance detection network with object shape mask guided feature encoders . | MULTIMEDIA TOOLS AND APPLICATIONS , 2023 , 83 (11) , 31629-31653 .
A Survey of Visual Affordance Recognition Based on Deep Learning SCIE
Journal article | 2023, 9 (6), 1458-1476 | IEEE TRANSACTIONS ON BIG DATA
WoS Core Collection citations: 2

Abstract:

Visual affordance recognition is an important research topic in robotics, human-computer interaction, and other computer vision tasks. In recent years, deep learning-based affordance recognition methods have achieved remarkable performance. However, there is no unified and intensive survey of these methods up to now. Therefore, this article reviews and investigates existing deep learning-based affordance recognition methods from a comprehensive perspective, hoping to pursue greater acceleration in this research domain. Specifically, this article first classifies affordance recognition into five tasks, delves into the methodologies of each task, and explores their rationales and essential relations. Second, several representative affordance recognition datasets are investigated carefully. Third, based on these datasets, this article provides a comprehensive performance comparison and analysis of the current affordance recognition methods, reporting the results of different methods on the same datasets and the results of each method on different datasets. Finally, this article summarizes the progress of affordance recognition, outlines the existing difficulties and provides corresponding solutions, and discusses its future application trends.

Keywords:

function understanding; Big Data; Image segmentation; computer vision; deep learning models; Task analysis; Surveys; Visual affordance recognition; Feature extraction; robotics; convolutional neural network; Humanities; Affordances

Citations:

GB/T 7714 Chen, Dongpan , Kong, Dehui , Li, Jinghua et al. A Survey of Visual Affordance Recognition Based on Deep Learning [J]. | IEEE TRANSACTIONS ON BIG DATA , 2023 , 9 (6) : 1458-1476 .
MLA Chen, Dongpan et al. "A Survey of Visual Affordance Recognition Based on Deep Learning" . | IEEE TRANSACTIONS ON BIG DATA 9 . 6 (2023) : 1458-1476 .
APA Chen, Dongpan , Kong, Dehui , Li, Jinghua , Wang, Shaofan , Yin, Baocai . A Survey of Visual Affordance Recognition Based on Deep Learning . | IEEE TRANSACTIONS ON BIG DATA , 2023 , 9 (6) , 1458-1476 .
DASI: Learning Domain Adaptive Shape Impression for 3D Object Reconstruction SCIE
Journal article | 2023, 25, 5248-5262 | IEEE TRANSACTIONS ON MULTIMEDIA

Abstract:

Previous methods for 3D object reconstruction from 2D images suffer from two issues: a lack of in-depth exploration of the prior knowledge of 3D shapes, and difficulty in dealing with seriously occluded parts. Inspired by humans' perception of real-world objects, which consists of an overall impression (known as a shape impression) followed by enhanced cognition, we propose a deep network (denoted DASI) that learns a Domain Adaptive Shape Impression for 3D reconstruction from arbitrary-view images. DASI consists of two modules: a shape reconstruction module and a shape refinement module. The former reconstructs a coarse volume by learning a domain adaptive shape impression as an embedding for image-based reconstruction. We first leverage 3D objects to learn a shape impression associated with the prior knowledge of 3D objects. To obtain a consistent shape impression from 2D images, we regard the 3D shape and the 2D image as two different domains; by adapting the two domains, the shape impression learned from 3D objects is transferred to 2D images and guides image-based reconstruction. The latter module refines the objects by decomposing the whole 3D volume into local 3D patches and exploring their intrinsic geometry relationships. Quantitative and qualitative experimental results on two benchmark datasets demonstrate that DASI outperforms several state-of-the-art methods for 3D reconstruction from single-view and multi-view 2D images.
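
The shape-impression idea in this abstract, learning a latent impression of 3D shapes and pulling image embeddings toward it so image-based reconstruction can reuse the 3D prior, can be sketched with a voxel autoencoder plus an embedding-alignment term. The fully connected networks and the plain L2 alignment loss (used here in place of DASI's domain adaptation) are illustrative assumptions, not the published model.

import torch
import torch.nn as nn
import torch.nn.functional as F

voxel_enc = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 32, 512), nn.ReLU(), nn.Linear(512, 128))
voxel_dec = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 32 * 32 * 32))
image_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512), nn.ReLU(), nn.Linear(512, 128))

def training_losses(voxels, images):
    # voxels: (B, 32, 32, 32) ground-truth shapes; images: (B, 3, 64, 64) rendered views of them
    z_shape = voxel_enc(voxels)                                     # shape impression from the 3D domain
    recon = voxel_dec(z_shape).view_as(voxels)
    loss_recon = F.binary_cross_entropy_with_logits(recon, voxels)  # learn the 3D shape prior
    z_image = image_enc(images)                                     # embedding from the 2D domain
    loss_align = F.mse_loss(z_image, z_shape.detach())              # pull image embeddings to the prior
    return loss_recon, loss_align

losses = training_losses((torch.rand(4, 32, 32, 32) > 0.5).float(), torch.randn(4, 3, 64, 64))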

Keywords:

deep learning; transformer; 3D refinement; domain adaptation; 3D reconstruction

Citations:

GB/T 7714 Gao, Junna , Kong, Dehui , Wang, Shaofan et al. DASI: Learning Domain Adaptive Shape Impression for 3D Object Reconstruction [J]. | IEEE TRANSACTIONS ON MULTIMEDIA , 2023 , 25 : 5248-5262 .
MLA Gao, Junna et al. "DASI: Learning Domain Adaptive Shape Impression for 3D Object Reconstruction" . | IEEE TRANSACTIONS ON MULTIMEDIA 25 (2023) : 5248-5262 .
APA Gao, Junna , Kong, Dehui , Wang, Shaofan , Li, Jinghua , Yin, Baocai . DASI: Learning Domain Adaptive Shape Impression for 3D Object Reconstruction . | IEEE TRANSACTIONS ON MULTIMEDIA , 2023 , 25 , 5248-5262 .