• 综合
  • 标题
  • 关键词
  • 摘要
  • 学者
  • 期刊-刊名
  • 期刊-ISSN
  • 会议名称
搜索

作者:

Zhang, Yuqing (Zhang, Yuqing.) | Zhang, Yong (Zhang, Yong.) (学者:张勇) | Wang, Shaofan (Wang, Shaofan.) | Liang, Yun (Liang, Yun.) | Yin, Baocai (Yin, Baocai.)

收录:

EI Scopus SCIE

摘要:

Video object segmentation (VOS) exhibits heavy occlusions, large deformation, and severe motion blur. While many remarkable convolutional neural networks are devoted to the VOS task, they often mis-identify background noise as the target or output coarse object boundaries, due to the failure of mining detail information and high-order correlations of pixels within the whole video. In this work, we propose an edge attention gated graph convolutional network (GCN) for VOS. The seed point initialization and graph construction stages construct a spatio-temporal graph of the video by exploring the spatial intra-frame correlation and the temporal inter-frame correlation of superpixels. The node classification stage identifies foreground superpixels by using an edge attention gated GCN which mines higher-order correlations between superpixels and propagates features among different nodes. The segmentation optimization stage optimizes the classification of foreground superpixels and reduces segmentation errors by using a global appearance model which captures the long-term stable feature of objects. In summary, the key contribution of our framework is twofold: (a) the spatio-temporal graph representation can propagate the seed points of the first frame to subsequent frames and facilitate our framework for the semi-supervised VOS task; and (b) the edge attention gated GCN can learn the importance of each node with respect to both the neighboring nodes and the whole task with a small number of layers. Experiments on Davis 2016 and Davis 2017 datasets show that our framework achieves the excellent performance with only small training samples (45 video sequences).

关键词:

semi-supervised video object segmentation graph convolutional network superpixel spatio-temporal graph model

作者机构:

  • [ 1 ] [Zhang, Yuqing]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 2 ] [Zhang, Yong]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 3 ] [Wang, Shaofan]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 4 ] [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 5 ] [Liang, Yun]South China Agr Univ, Guangzhou Key Lab Intelligent Agr, Coll Math & Informat, Guangzhou, Peoples R China

通讯作者信息:

  • [Zhang, Yong]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, 100 Pingleyuan, Beijing 100124, Peoples R China;;

查看成果更多字段

相关关键词:

来源 :

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS

ISSN: 1551-6857

年份: 2024

期: 1

卷: 20

5 . 1 0 0

JCR@2022

ESI学科: COMPUTER SCIENCE;

ESI高被引阀值:4

被引次数:

WoS核心集被引频次:

SCOPUS被引频次:

ESI高被引论文在榜: 0 展开所有

万方被引频次:

中文被引频次:

近30日浏览量: 1

归属院系:

在线人数/总访问数:549/4964175
地址:北京工业大学图书馆(北京市朝阳区平乐园100号 邮编:100124) 联系我们:010-67392185
版权所有:北京工业大学图书馆 站点建设与维护:北京爱琴海乐之技术有限公司