Authors:

Hu, Yongli | Feng, Lincong | Jiang, Huajie | Liu, Mengting | Yin, Baocai (Scholar: 尹宝才)

Indexed in:

EI, Scopus, SCIE

Abstract:

Generalized zero-shot learning (GZSL) aims to recognize images from both seen and unseen classes using side information, such as manually annotated attribute vectors. Traditional methods map images and semantics into a common latent space to achieve visual-semantic alignment. Because unseen classes are unavailable during training, these methods suffer from a serious recognition bias: unseen-class samples tend to be recognized as seen classes. To solve this problem, we propose a Domain-aware Prototype Network (DPN), which splits the GZSL problem into seen-class recognition and unseen-class recognition. For the seen classes, we design a domain-aware prototype learning branch with a dual attention feature encoder that captures the essential visual information; this branch recognizes the seen classes and discriminates novel categories from them. To further recognize the fine-grained unseen classes, we design a visual-semantic embedding branch that aligns visual and semantic information for unseen-class recognition. Through multi-task learning of the prototype learning branch and the visual-semantic embedding branch, our model achieves excellent performance on three popular GZSL datasets.
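
The two-stage idea in the abstract (first decide whether a sample belongs to a seen class, then hand novel samples to a visual-semantic matcher) can be illustrated with a minimal sketch. This is not the authors' DPN implementation: the function names, the Euclidean distance threshold used for domain detection, and the cosine-similarity matching against attribute vectors are all assumptions made for illustration, and the learned encoders are replaced by precomputed vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(visual_feat, semantic_pred, seen_prototypes, unseen_attributes, threshold):
    """Route a sample to the seen or unseen branch (hypothetical helper).

    visual_feat       -- feature in the prototype space (assumed precomputed)
    semantic_pred     -- the same sample embedded in attribute space (assumed)
    seen_prototypes   -- {class name: prototype vector} for seen classes
    unseen_attributes -- {class name: attribute vector} for unseen classes
    threshold         -- assumed distance cutoff for "close enough to a seen class"
    """
    # Domain detection: find the nearest seen-class prototype.
    nearest, dist = min(
        ((name, euclidean(visual_feat, proto)) for name, proto in seen_prototypes.items()),
        key=lambda item: item[1],
    )
    if dist <= threshold:
        return "seen", nearest

    # Novel sample: match its semantic embedding to unseen-class attributes.
    best = max(
        unseen_attributes,
        key=lambda name: cosine(semantic_pred, unseen_attributes[name]),
    )
    return "unseen", best
```

With toy prototypes for two seen classes and attribute vectors for two unseen classes, a feature close to a prototype is labeled by the seen branch, while a distant feature falls through to attribute matching. In this sketch the threshold controls the trade-off between the two domains: a larger value sends more samples to the seen branch.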

Keywords:

transformer-based dual attention; Semantics; domain detection; Generalized zero-shot learning; Visualization; Task analysis; Prototypes; Feature extraction; Image recognition; Transformers

Author affiliations:

  • [ 1 ] [Hu, Yongli]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 2 ] [Feng, Lincong]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 3 ] [Jiang, Huajie]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 4 ] [Liu, Mengting]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 5 ] [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Fac Informat Technol, Beijing 100124, Peoples R China

Corresponding author:

  • [Jiang, Huajie]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Fac Informat Technol, Beijing 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

ISSN: 1051-8215

Year: 2024

Issue: 5

Volume: 34

Pages: 3180-3191

8.400 (JCR@2022)

Citation counts:

WoS Core Collection citations:

Scopus citations: 5

ESI Highly Cited Papers listed: 0

Wanfang citations:

Chinese-language citations:
