Authors:

Xu, Kai | Wang, Lichun (Scholar: Wang Lichun) | Li, Shuang | Xin, Jianjia | Yin, Baocai

Indexed in:

EI, Scopus, SCIE

Abstract:

Compared with traditional knowledge distillation, self-distillation does not require a pre-trained teacher network and is therefore more concise. Among self-distillation approaches, data augmentation-based methods provide an elegant solution that requires neither modification of the network structure nor additional memory consumption. However, when data augmentation is applied in the input space, the forward propagation of augmented data incurs additional computation cost, and the augmentation method must be adapted to the modality of the input data. Meanwhile, we note that, from a generalization perspective, a dispersed intra-class feature distribution is superior to a compact one as long as the classes remain distinguishable from each other, especially for categories with larger sample differences. Based on the above considerations, this paper proposes a feature augmentation-based self-distillation method (FASD) built on the idea of feature extrapolation. For each source feature, two augmentations are generated by feature subtraction: one subtracts the temporary class center computed from samples of the same category, and the other subtracts the closest sample feature belonging to a different category. The predicted outputs of the augmented features are then constrained to be consistent with those of the source feature. The consistency constraint on the former augmented feature expands the learned class feature distribution, producing greater overlap with the unknown feature distribution of test samples and thereby improving the generalization performance of the network. The consistency constraint on the latter augmented feature increases the distance between samples of different categories, which enhances inter-class distinguishability. Experimental results on image classification tasks demonstrate the effectiveness and efficiency of the proposed method, and experiments on text and audio tasks confirm its universality for classification tasks of different modalities.
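The feature augmentation and consistency constraint described in the abstract can be sketched in code. Below is a minimal, hypothetical PyTorch illustration assuming access to the penultimate-layer features and the final linear classifier of the network; the function name fasd_consistency_loss, the batch-wise class centers, the KL-based consistency term, and the temperature value are assumptions made for illustration, not the authors' released implementation.

import torch
import torch.nn.functional as F

def fasd_consistency_loss(features, logits, labels, classifier, temperature=4.0):
    # features:   (B, D) penultimate-layer features of the current batch
    # logits:     (B, C) predictions for the source (un-augmented) features
    # labels:     (B,)   ground-truth class indices
    # classifier: the final linear layer of the same network
    with torch.no_grad():
        teacher = F.softmax(logits / temperature, dim=1)  # source outputs act as the "teacher"

    # Augmentation 1: subtract the temporary (batch-wise) class center of the same class.
    aug1 = features.clone()
    for c in labels.unique():
        mask = labels == c
        center = features[mask].mean(dim=0, keepdim=True)
        aug1[mask] = features[mask] - center

    # Augmentation 2: subtract the closest sample feature belonging to a different class.
    # (Assumes the batch contains more than one class.)
    dist = torch.cdist(features, features)            # pairwise distances (B, B)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    dist = dist.masked_fill(same, float('inf'))       # ignore same-class pairs
    nearest_other = dist.argmin(dim=1)
    aug2 = features - features[nearest_other]

    # Constrain predictions on the augmented features to stay consistent with the source.
    loss = features.new_zeros(())
    for aug in (aug1, aug2):
        student = F.log_softmax(classifier(aug) / temperature, dim=1)
        loss = loss + F.kl_div(student, teacher, reduction='batchmean') * temperature ** 2
    return loss

In training, such a consistency term would typically be added to the standard cross-entropy loss on the source logits; because the augmentation operates on features rather than inputs, no extra forward pass through the backbone is required.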

Keywords:

Knowledge distillation; classification task; Training; Predictive models; generalization performance; feature augmentation; Knowledge engineering; Feature extraction; self-distillation; Extrapolation; Data augmentation; Task analysis

Author Affiliations:

  • [ 1 ] [Xu, Kai]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 2 ] [Wang, Lichun]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 3 ] [Xin, Jianjia]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 4 ] [Yin, Baocai]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 5 ] [Li, Shuang]Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China

Corresponding Author:

  • [Wang, Lichun]Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

ISSN: 1051-8215

Year: 2024

Issue: 10

Volume: 34

Pages: 9578-9590

Impact Factor: 8.400 (JCR@2022)

Citations:

WoS Core Collection citations:

Scopus citations: 1

ESI highly cited papers listed: 0

Wanfang citations:

Chinese citations:

Views in the last 30 days: 1

Affiliated department:
