OASNet: Object Affordance State Recognition Network With Joint Visual Features and Relational Semantic Embeddings

Author： Chen, Dongpan; Kong, Dehui; Li, Jinghua; Wang, Lichun; Gao, Junna; Yin, Baocai

Indexed by：

EI Scopus SCIE

Abstract：

Traditional affordance learning tasks, such as affordance recognition and affordance detection, aim to understand the interactive functions of objects in an image. However, these tasks cannot determine whether an object is currently being interacted with, which is crucial for many follow-up tasks, including robotic manipulation and planning tasks. To fill this gap, this paper proposes a novel object affordance state (OAS) recognition task, i.e., simultaneously recognizing an object's affordances and the partner objects that are interacting with it. Accordingly, to facilitate the application of deep learning, an OAS recognition dataset, OAS10k, is constructed by collecting and labeling over 10k images. In the dataset, a sample is defined as an image together with its OAS labels, and each label is represented as a triplet $\langle \textit{subject, subject's affordance, interacted object} \rangle$. These triplet labels carry rich relational semantic information, which can improve OAS recognition performance. We hence construct a directed OAS knowledge graph of affordance states and extract an OAS matrix from it to model the semantic relationships among the triplets. Based on the matrix, we propose an OAS recognition network (OASNet), which uses a GCN to capture the relational semantic embeddings and a transformer to fuse them with the visual features of an image in order to recognize the affordance states of objects in the image. Experimental results on the OAS10k dataset and other triplet label recognition datasets demonstrate that the proposed OASNet achieves the best performance compared with state-of-the-art methods. The dataset and code will be released at https://github.com/mxmdpc/OAS.
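
The abstract does not say how the OAS matrix is built from the knowledge graph. As a rough illustration only, the sketch below assumes a directed graph with edges subject -> affordance -> interacted object, a count-based adjacency matrix, and one unweighted graph-convolution step. The example triplets and all function names are hypothetical and are not taken from the OAS10k dataset or the released code.

```python
# Hypothetical triplet labels in <subject, affordance, interacted object> form.
triplets = [
    ("cup", "contain", "water"),
    ("cup", "grasp", "hand"),
    ("knife", "cut", "apple"),
    ("knife", "grasp", "hand"),
]


def build_graph(triplets):
    """Return (nodes, adjacency) for directed edges subject -> affordance -> object."""
    nodes = sorted({item for t in triplets for item in t})
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for subj, aff, obj in triplets:
        adj[index[subj]][index[aff]] += 1.0  # count subject -> affordance
        adj[index[aff]][index[obj]] += 1.0   # count affordance -> object
    return nodes, adj


def row_normalize_with_self_loops(adj):
    """Add self-loops and scale each row to sum to 1 (assumed preprocessing)."""
    n = len(adj)
    out = []
    for i in range(n):
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        out.append([v / total for v in row])
    return out


def propagate(norm_adj, features):
    """One linear propagation step H' = A_hat H, without weights or nonlinearity."""
    n, d = len(features), len(features[0])
    return [
        [sum(norm_adj[i][k] * features[k][j] for k in range(n)) for j in range(d)]
        for i in range(n)
    ]


nodes, adj = build_graph(triplets)
norm = row_normalize_with_self_loops(adj)
# One-hot node features stand in for learned label embeddings.
features = [[1.0 if i == j else 0.0 for j in range(len(nodes))] for i in range(len(nodes))]
embeddings = propagate(norm, features)
```

In a full model, each propagation step would also apply a learned weight matrix and a nonlinearity, and the resulting node embeddings would be fused with image features by a transformer, as the abstract describes at a high level.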

Keyword：

Object affordance state recognition; transformer; multi-label image classification; relational semantic embeddings; graph convolution networks

Author Community：

[ 1 ] [Chen, Dongpan]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 2 ] [Kong, Dehui]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 3 ] [Li, Jinghua]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 4 ] [Wang, Lichun]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 5 ] [Gao, Junna]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 6 ] [Yin, Baocai]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Reprint Author's Address：

[Kong, Dehui]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Email：

kdh@bjut.edu.cn

Related Keywords：

GCN-ICF: Graph Convolution Networks for Item-based Recommendation
2020, IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)
Predicting Disease-related RNA Associations based on Graph Convolutional Attention Network
2019, IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Hierarchical Graph Convolution Networks for Traffic Forecasting
2021, 35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence
Towards City-Scale Traffic Prediction: Cluster Graph Wavelet Networks
2022, 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)

Source ：

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

ISSN： 1051-8215

Year： 2024

Issue： 5

Volume： 34

Page： 3368-3382

8.400

JCR@2022

Cited Count：

WoS CC Cited Count：

SCOPUS Cited Count： 1

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：
