Indexed:
Abstract:
As a popular cross-modal reasoning task, Visual Question Answering (VQA) has achieved great progress in recent years. However, language bias continues to affect the reliability of VQA models. To address this problem, counterfactual learning methods have been proposed to learn more robust features that mitigate the bias. Current counterfactual learning approaches, however, mainly focus on generating synthesized samples and assigning answers to them, neglecting the relationship between factual and original data, which hinders robust feature learning for effective reasoning. To overcome this limitation, we propose a Self-supervised Knowledge Distillation approach in Counterfactual Learning for VQA, dubbed VQA-SkdCL, which uses a self-supervised constraint to exploit the hidden knowledge in factual samples, enhancing the robustness of VQA models. We demonstrate the effectiveness of the proposed approach on the VQA v2, VQA-CP v1, and VQA-CP v2 datasets, where it achieves excellent performance.
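The abstract gives no implementation details of VQA-SkdCL, so as background only, here is a minimal generic sketch of the knowledge-distillation loss that such approaches typically build on: a temperature-softened KL divergence between a teacher distribution (here, one derived from factual samples) and the student's predictions. All function names and the temperature value are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax over the last axis.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Generic KD objective: KL(teacher || student) on temperature-softened
    # distributions, scaled by T^2 as in standard knowledge distillation.
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (T ** 2) * kl.mean()
```

When the student matches the teacher exactly, the loss is zero; it grows as the two distributions diverge, which is what drives the student toward the teacher's "hidden knowledge".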
Keywords:
Corresponding author:
Email address:
Source:
PATTERN RECOGNITION LETTERS
ISSN: 0167-8655
年份: 2023
卷: 177
页码: 33-39
5.100 (JCR@2022)
Affiliated department: