ASGSA: global semantic-aware network for action segmentation

Author：

Bian, Qingyun | Zhang, Chun | Ren, Keyan | Yue, Tianyi | Zhang, Yunlu

Indexed by：

EI Scopus

Abstract：

Action segmentation is vital for video understanding because it heuristically divides complex untrimmed videos into short semantic clips. Real-world human actions exhibit complex temporal dynamics, encompassing variations in duration, rhythm, and range of motion. While deep networks have been successfully applied to these tasks, they struggle to adapt to these variations because of the inherent difficulty of capturing semantic information from a global perspective. Relying merely on distinguishing visual representations in local regions leads to over-segmentation. To address this practical issue, we propose a novel approach named ASGSA, which aims to obtain smoother segmentation results by extracting instructive semantic information. Our core component, the Global Semantic-Aware module, provides an effective way to encode long-range temporal relations in long untrimmed videos. Specifically, we exploit hierarchical temporal context aggregation, in which a gated selection mechanism controls the passage of information at different scales. In addition, an adaptive fusion strategy is designed to guide the segmentation with the extracted semantic information. Simultaneously, to obtain higher-quality video representations without extra annotations, we adopt a self-supervised training strategy and propose the Video Speed Prediction module. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on all three challenging benchmark datasets (Breakfast, 50Salads, GTEA) and significantly improves the F1@50 score, which reflects reduced over-segmentation. The code is available at https://github.com/ten000/ASGSA. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
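
The abstract ties the F1@50 score to over-segmentation. A minimal sketch of the segmental F1 metric as it is commonly defined in the action segmentation literature may make that link concrete. It is not taken from the authors' repository. The function names `segments` and `segmental_f1` are hypothetical, and unlike some evaluation scripts this sketch does not exclude a background class.

```python
from itertools import groupby


def segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    out, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        out.append((label, start, start + length))
        start += length
    return out


def segmental_f1(pred, truth, threshold=0.5):
    """Segmental F1 at an IoU threshold (0.5 corresponds to F1@50)."""
    pred_segs, true_segs = segments(pred), segments(truth)
    matched = [False] * len(true_segs)
    tp = fp = 0
    for label, ps, pe in pred_segs:
        # Find the same-label ground-truth segment with the highest temporal IoU.
        best_iou, best_idx = 0.0, -1
        for i, (tl, ts, te) in enumerate(true_segs):
            if tl != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, i
        # Each ground-truth segment can be claimed by one prediction only.
        if best_idx >= 0 and best_iou >= threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: both predictions mislabel exactly one of 20 frames.
truth = ["a"] * 10 + ["b"] * 10
smooth = ["a"] * 9 + ["b"] * 11                      # boundary shifted by one frame
fragmented = ["a"] * 4 + ["b"] + ["a"] * 5 + ["b"] * 10  # spurious one-frame segment

print(segmental_f1(smooth, truth))
print(segmental_f1(fragmented, truth))
```

In this toy case both predictions have the same frame-wise accuracy, yet the fragmented one scores lower because its spurious segments count as false positives. This is why a higher F1@50 is read as less over-segmentation.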

Keyword：

Benchmarking; Semantic Segmentation; Semantic Web; Semantics; Complex networks

Author Community：

[ 1 ] [Bian, Qingyun] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 2 ] [Zhang, Chun] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 3 ] [Ren, Keyan] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 4 ] [Yue, Tianyi] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 5 ] [Zhang, Yunlu] China Mobile Research Institute, Beijing, China



Source ：

Neural Computing and Applications

ISSN： 0941-0643

Year： 2024

Issue： 22

Volume： 36

Page： 13629-13645

Impact Factor： 6.000 (JCR@2022)


ESI Highly Cited Papers on the List： 0
