
Authors:

Cui, Zheng | Hu, Yongli (Scholar: 胡永利) | Sun, Yanfeng | Yin, Baocai

Indexed in:

EI; Scopus; SCIE

Abstract:

Image-text retrieval is a fundamental yet challenging task that aims to bridge the semantic gap between heterogeneous data and achieve precise measurement of semantic similarity. Fine-grained alignment between cross-modal features plays a key role in many successful methods. Nevertheless, existing methods cannot effectively utilise intra-modal information to enhance feature representation, and they lack powerful similarity reasoning to obtain a precise similarity score. To tackle these issues, a context-aware Relation Enhancement and Similarity Reasoning model, called RESR, is proposed, which conducts both intra-modal relation enhancement and inter-modal similarity reasoning while considering global-context information. For intra-modal relation enhancement, a novel context-aware graph convolutional network is introduced to enhance local feature representations by utilising relation and global-context information. For inter-modal similarity reasoning, local and global similarity features are exploited through bidirectional alignment of image and text, and similarity reasoning is performed among multi-granularity similarity features. Finally, the refined local and global similarity features are adaptively fused to obtain a precise similarity score. Experimental results show that the model outperforms state-of-the-art approaches, achieving average improvements of 2.5% and 6.3% in R@sum on the Flickr30K and MS-COCO datasets.
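
Note: as a rough illustration of the two components named in the abstract (a context-aware graph convolutional layer for intra-modal relation enhancement, and adaptive fusion of local and global similarity features), the sketch below shows one plausible way such modules could look in PyTorch. All class names, tensor shapes, and the gating formulation are assumptions made for illustration only; this is not the authors' released RESR implementation.

```python
# Minimal sketch, assuming region/word features of shape (batch, n, dim)
# and similarity features of shape (batch, dim). Hypothetical modules.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextAwareGCNLayer(nn.Module):
    """One graph-convolution step whose relation graph is conditioned on a
    global-context vector (here, the mean of all local features)."""

    def __init__(self, dim: int):
        super().__init__()
        self.relation = nn.Linear(2 * dim, dim)  # context-conditioned relation projection
        self.update = nn.Linear(2 * dim, dim)    # fuse node with aggregated message

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (batch, n, dim) local features (image regions or words)
        ctx = nodes.mean(dim=1, keepdim=True)                  # global-context vector
        ctx_nodes = torch.cat([nodes, ctx.expand_as(nodes)], dim=-1)
        affinity = torch.bmm(self.relation(ctx_nodes), nodes.transpose(1, 2))
        adj = F.softmax(affinity, dim=-1)                      # soft relation graph
        message = torch.bmm(adj, nodes)                        # relation-aware aggregation
        return nodes + self.update(torch.cat([nodes, message], dim=-1))


class AdaptiveSimilarityFusion(nn.Module):
    """Gated fusion of a local (fine-grained) and a global similarity feature
    into a single scalar similarity score."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, 1)
        self.score = nn.Linear(dim, 1)

    def forward(self, sim_local: torch.Tensor, sim_global: torch.Tensor) -> torch.Tensor:
        # sim_local, sim_global: (batch, dim) similarity features
        g = torch.sigmoid(self.gate(torch.cat([sim_local, sim_global], dim=-1)))
        fused = g * sim_local + (1 - g) * sim_global           # adaptive weighting
        return self.score(fused).squeeze(-1)                   # (batch,) similarity scores
```

In this reading, the gate decides, per image-text pair, how much the fine-grained (local) evidence versus the holistic (global) evidence contributes to the final score; the paper's actual reasoning module may differ.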

Keywords:

image retrieval; multimedia systems

Author Affiliations:

  • [ 1 ] [Cui, Zheng]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing, Peoples R China
  • [ 2 ] [Hu, Yongli]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing, Peoples R China
  • [ 3 ] [Sun, Yanfeng]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing, Peoples R China
  • [ 4 ] [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing, Peoples R China
  • [ 5 ] [Hu, Yongli]Beijing Univ Technol, 100 Pingleyuan, Beijing, Peoples R China

Corresponding Author:

  • Hu, Yongli (胡永利)

    [Hu, Yongli]Beijing Univ Technol, 100 Pingleyuan, Beijing, Peoples R China

Source:

IET COMPUTER VISION

ISSN: 1751-9632

Year: 2024

Volume: 18

Issue: 5

Pages: 652-665

Impact Factor: 1.700 (JCR@2022)
