Abstract:
Visual affordance research studies, from an image or video, which interactions are possible and whether an interaction is reasonable in the current environment. Inferring the affordances of an object requires considering the semantics and relations of the other objects in the environment, and a graph is commonly used to model this environmental context. Because the weight of an edge in the graph describes how much information one object contributes to another during affordance reasoning, this paper proposes VAR-Net (Visual Affordance Reasoning Network), which models the edge weights as graph attention coefficients and learns them from the objects' semantic and visual features, which imply their affordances. VAR-Net achieves higher accuracy on the COCO-Tasks and ADE-Affordance datasets. Experiments also explain the meaning of the edge weights in VAR-Net: for a given affordance, the more an object supports it, the larger the weights of the edges from that object to other objects, and vice versa, which makes the objects' features more distinguishable for inferring affordances. © 2022 IEEE.
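The abstract does not give the formula VAR-Net uses for its edge weights. As a rough illustration of what "edge weights as graph attention coefficients" can mean, the sketch below computes coefficients in the common graph-attention style: a score for each edge from the concatenated features of its two endpoints, followed by a softmax over each node's neighbors. The function name `attention_coefficients`, the parameter `attn_vec`, and the toy features are assumptions made for illustration and are not taken from the paper.

```python
import math


def leaky_relu(x, slope=0.2):
    """Leaky ReLU, the nonlinearity commonly applied to raw attention scores."""
    return x if x > 0 else slope * x


def attention_coefficients(features, neighbors, attn_vec):
    """Return {(i, j): alpha_ij}, normalized over the neighbors j of each node i.

    features  -- list of per-object feature vectors (lists of floats)
    neighbors -- dict mapping node index i to a list of neighbor indices
    attn_vec  -- hypothetical learned vector of length 2 * feature_dim
    """
    alphas = {}
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # an isolated node has no edges to weight
        scores = []
        for j in nbrs:
            concat = features[i] + features[j]  # list concatenation [h_i || h_j]
            raw = sum(a * x for a, x in zip(attn_vec, concat))
            scores.append(leaky_relu(raw))
        # Numerically stable softmax over the neighborhood of node i.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        for j, e in zip(nbrs, exps):
            alphas[(i, j)] = e / total
    return alphas


# Toy scene with three objects and two-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
attn_vec = [0.5, -0.5, 1.0, 0.25]

alphas = attention_coefficients(features, neighbors, attn_vec)
```

In this toy setup the coefficients of each node's edges sum to one, so a larger coefficient on one edge means that neighbor contributes a larger share of the information aggregated at the node.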
Year: 2022
Page: 283-290
Language: English