Prompt-supervised dynamic attention graph convolutional network for skeleton-based action recognition

Author：

Zhu, Shasha | Sun, Lu | Ma, Zeyuan | Li, Chenxi | He, Dongzhi

Indexed by：

EI Scopus SCIE

Abstract：

Skeleton-based action recognition is a core task in the field of video understanding. Skeleton sequences are characterized by high information density, low redundancy, and clear structural information, thereby facilitating the analysis of complex relationships among human behaviors more readily than other modalities. Although existing studies have encoded skeleton data and achieved positive outcomes, they have often overlooked the precise high-level semantic information inherent in the action descriptions. To address this issue, this paper proposes a prompt-supervised dynamic attention graph convolutional network (PDA-GCN). Specifically, the PDA-GCN incorporates a prompt supervision (PS) module that leverages a pre-trained large-scale language model (LLM) as a knowledge engine and retains the generated text features as prompts to provide additional supervision during model training, enhancing the model's ability to discern analogous actions with negligible computational cost. In addition, to bolster the learning of discriminative features, a dynamic attention graph convolution (DA-GC) module is presented. This module utilizes a self-attention mechanism to adaptively infer intrinsic relationships between joints and integrates dynamic convolution to strengthen the emphasis on local information. This dual focus on both global context and local details further amplifies the efficiency and effectiveness of the model. Extensive experiments, conducted on the widely used skeleton-based action recognition datasets NTU RGB+D 60 and NTU RGB+D 120, demonstrate that the PDA-GCN surpasses known state-of-the-art methods, achieving accuracies of 93.4% on the NTU RGB+D 60 cross-subject split and 90.7% on the NTU RGB+D 120 cross-subject split.
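
The abstract says the DA-GC module uses self-attention to infer relationships between joints. The sketch below is only a rough illustration of that general idea, not the authors' implementation: it computes a joint-to-joint attention matrix from per-joint features and uses it as a learned adjacency for one graph-convolution step. The function names, the single attention head, the identity projection weights, and the toy three-joint input are all assumptions made for the example; the dynamic convolution branch, the temporal dimension, and the prompt supervision module are omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_adjacency(x, w_q, w_k):
    """Infer a V x V joint-relation matrix from joint features x (V x C).

    Each row is a softmax over scaled dot-product scores, so it sums to 1.
    w_q and w_k are hypothetical query/key projection weights.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return [softmax(row) for row in scores]

def graph_conv(x, adj, w_out):
    """One graph-convolution step: aggregate over adj, then project channels."""
    return matmul(matmul(adj, x), w_out)

# Toy input (assumed): 3 joints with 2 feature channels each.
joints = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights, not trained values

adj = attention_adjacency(joints, identity, identity)
out = graph_conv(joints, adj, identity)
```

Here `adj` plays the role that a fixed skeleton adjacency matrix plays in a conventional GCN, except that it is recomputed from the input features, which is what lets the relations adapt per sample.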

Keyword：

Attention mechanism Prompt learning Dynamic convolution Graph convolutional network Skeleton-based action recognition

Author Affiliations：

[ 1 ] [Zhu, Shasha]Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China
[ 2 ] [Sun, Lu]Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China
[ 3 ] [Ma, Zeyuan]Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China
[ 4 ] [Li, Chenxi]Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China
[ 5 ] [He, Dongzhi]Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China

Reprint Author's Address：

[Zhu, Shasha]Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China

Email：

victor@bjut.edu.cn

Source ：

NEUROCOMPUTING

ISSN： 0925-2312

Year： 2024

Volume： 611

Impact Factor： 6.000 (JCR@2022)

ESI Highly Cited Papers on the List： 0
