
Authors:

Xu, Kai | Wang, Lichun | Xin, Jianjia | Li, Shuang | Yin, Baocai

Indexed in:

EI, Scopus, SCIE

Abstract:

Knowledge distillation transfers knowledge learned by a teacher network to a student network. A common mode of knowledge transfer directly applies the teacher network's experience to all samples, without differentiating whether that experience is successful or not. Common sense suggests that experience should be used according to its nature: successful experience provides guidance, while failed experience prompts correction. Inspired by this, this paper analyzes the teacher's failures and proposes a reflective learning paradigm, which, besides following the authority of the teacher, additionally uses heuristic knowledge extracted from the teacher's failures. Specifically, this paper defines Mutual Error Distance (MED) based on the teacher's wrong predictions. MED measures the adequacy of the decision boundary learned by the teacher, thereby concretizing the teacher's failures. This paper then proposes DCGD (divide-and-conquer grouping distillation), which transfers the teacher's knowledge critically by grouping the target task into small-scale subtasks and designing multi-branch networks on the basis of MED. Finally, a switchable training mechanism is designed to integrate a regular student, providing a student network option with no added parameters compared with the multi-branch student network. Extensive experiments on three image classification benchmarks (CIFAR-10, CIFAR-100 and TinyImageNet) show the effectiveness of the proposed paradigm. On CIFAR-100 in particular, the average error of students trained with DCGD+DKD decreased by 4.28%. In addition, the experimental results show that the paradigm also applies to self-distillation.
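The abstract names MED and DCGD without giving their formal definitions. As a rough, non-authoritative illustration of the idea, the Python sketch below builds a hypothetical pairwise mutual-error count from a teacher's predictions (counting how often the teacher mistakes two classes for each other) and greedily groups the most confusable classes into small subtasks, one per student branch. The function names, the greedy grouping rule, and the use of raw error counts are all assumptions made for illustration; the paper's actual definitions of MED and of the DCGD grouping may differ.

    # Hypothetical sketch of a mutual-error measure and class grouping.
    # NOTE: the exact definitions of MED and DCGD are in the paper; this only
    # illustrates deriving class confusability from the teacher's failures.
    import numpy as np

    def mutual_error_matrix(labels: np.ndarray, teacher_preds: np.ndarray,
                            num_classes: int) -> np.ndarray:
        """Count, for every class pair (i, j), how often the teacher
        mislabels class i as j or class j as i (their mutual errors)."""
        errors = np.zeros((num_classes, num_classes), dtype=np.int64)
        for y, p in zip(labels, teacher_preds):
            if y != p:                   # only the teacher's failures matter
                errors[y, p] += 1
        return errors + errors.T         # symmetric mutual-error counts

    def group_confusable_classes(errors: np.ndarray, group_size: int) -> list:
        """Greedily group the classes the teacher confuses most often, so each
        small-scale subtask (one branch of a multi-branch student) focuses on
        a hard-to-separate subset of the classes."""
        unassigned = set(range(errors.shape[0]))
        groups = []
        while unassigned:
            # seed each group with the most-confused remaining class
            seed = max(unassigned, key=lambda c: errors[c].sum())
            unassigned.remove(seed)
            group = [seed]
            while len(group) < group_size and unassigned:
                # add the class most confusable with the current group
                nxt = max(unassigned, key=lambda c: sum(errors[c, g] for g in group))
                unassigned.remove(nxt)
                group.append(nxt)
            groups.append(group)
        return groups

    # Toy usage: 6 classes; the teacher confuses {0, 1} and {2, 3} pairwise.
    labels = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 5])
    preds  = np.array([0, 1, 0, 1, 3, 2, 2, 3, 4, 5])
    med = mutual_error_matrix(labels, preds, num_classes=6)
    print(group_confusable_classes(med, group_size=2))  # e.g. [[0, 1], [2, 3], [4, 5]]

Under this reading, class pairs with many mutual errors sit on a stretch of the teacher's decision boundary that was learned inadequately, which is consistent with the abstract's statement that DCGD forms its small-scale subtasks on the basis of MED.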

Keywords:

decision boundary; Training; mutual error distance; divide-and-conquer; Knowledge engineering; Task analysis; Birds; Dogs; Marine vehicles; reflective learning paradigm; Automobiles; Knowledge distillation

Author affiliations:

  • [ 1-5 ] [Xu, Kai; Wang, Lichun; Xin, Jianjia; Li, Shuang; Yin, Baocai] Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing, 100124, Peoples R China

Corresponding author:

  • [Wang, Lichun] Beijing Univ Technol, Beijing Artificial Intelligence Inst, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing, 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

ISSN: 1051-8215

Year: 2024

Issue: 1

Volume: 34

Pages: 384-396

Impact Factor: 8.400 (JCR@2022)

Citations:

WoS Core Collection citations:

Scopus citations: 9

ESI highly cited papers listed: 0

Wanfang citations:

Chinese citations:

Views in the last 30 days: 1

Affiliated department:
