MPLA-Net: Multiple Pseudo Label Aggregation Network for Weakly Supervised Video Salient Object Detection

Author：

Ma, Chunjie | Du, Lina | Zhuo, Li | Li, Jiafeng

Indexed by：

EI Scopus SCIE

Abstract：

Weakly Supervised Video Salient Object Detection (WSVSOD) requires only coarse-grained manual annotations and can therefore achieve a good trade-off between labeling efficiency and detection performance. In this paper, a Multiple Pseudo Label Aggregation Network (MPLA-Net) is proposed for WSVSOD. First, the video frames from which high-quality pseudo labels can be obtained are selected to generate multiple pseudo labels, so as to avoid the bias of a single label. Moreover, the pseudo label with fine edge information is used to generate the Edge Information Map (EIM). Second, MPLA-Net is designed to fully mine and exploit the comprehensive saliency cues in the multiple pseudo labels to improve detection accuracy, with ResNet-50 adopted as the backbone network. Edge loss, pseudo label loss, self-supervised loss, and fusion loss jointly supervise and optimize network training to obtain a robust detection model. Experimental results on five benchmark datasets demonstrate that, compared with existing weakly supervised methods, the proposed method achieves state-of-the-art detection accuracy with fewer model parameters and higher detection speed, and that the detected salient objects have fine boundaries.

Keyword：

Annotations | Task analysis | pseudo label consistency evaluation | Object detection | Optical flow | multiple pseudo label aggregation | Weakly supervised video salient object detection | Training | Image edge detection | video frame quality evaluation | Feature extraction

Author Community：

[ 1 ] [Ma, Chunjie]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
[ 2 ] [Zhuo, Li]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
[ 3 ] [Li, Jiafeng]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
[ 4 ] [Ma, Chunjie]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[ 5 ] [Zhuo, Li]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[ 6 ] [Li, Jiafeng]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[ 7 ] [Du, Lina]Shandong Jianzhu Univ, Sch Comp Sci & Technol, Jinan 250101, Peoples R China

Reprint Author's Address：

[Zhuo, Li]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
[Zhuo, Li]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Email：

mcj@machunjie.com |
dulina22@sdjzu.edu.cn |
zhuoli@bjut.edu.cn |
lijiafeng@bjut.edu.cn

Related Keywords：

A Multi-Task CNN for Maritime Target Detection
2021，IEEE SIGNAL PROCESSING LETTERS
Joint Multisource Saliency and Exemplar Mechanism for Weakly Supervised Video Object Segmentation
2021，IEEE TRANSACTIONS ON IMAGE PROCESSING
Object Detection Algorithm of Aircrafts in Airport Scene Based on Improved YOLOv5
2024，43rd Chinese Control Conference, CCC 2024
Disentangled Cross-Modal Transformer for RGB-D Salient Object Detection and Beyond
2024，IEEE TRANSACTIONS ON IMAGE PROCESSING

Source ：

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

ISSN： 1051-8215

Year： 2024

Issue： 5

Volume： 34

Page： 3905-3918

8.400 (JCR@2022)

Cited Count：

WoS CC Cited Count：

SCOPUS Cited Count： 8

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：
