Author:

Bao, Zhenshan | Liu, Jiayang | Zhang, Wenbo

Indexed by:

CPCI-S; EI

Abstract:

As the complexity of the problems being processed increases, deep neural networks require more computing and storage resources. At the same time, researchers have found that deep neural networks contain a great deal of redundancy, which causes unnecessary waste, so the network model needs to be further optimized. Based on these observations, researchers have in recent years turned their attention to building more compact and efficient models, so that deep neural networks can be better deployed on resource-constrained nodes to enhance their intelligence. Current deep neural network model compression methods include weight pruning, weight quantization, and knowledge distillation, among others. These three methods have their own characteristics, are independent of one another and can be applied on their own, and can be further optimized through effective combination. This paper constructs a deep neural network model compression framework based on weight pruning, weight quantization, and knowledge distillation. First, the model undergoes double coarse-grained compression with pruning and quantization; then the original network is used as the teacher network to guide training of the compressed student network, improving the student network's accuracy and thereby further accelerating and compressing the model with a smaller loss of accuracy. The experimental results show that the combination of the three algorithms can compress 80% of FLOPs while reducing accuracy by only 1%.
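The pipeline described in the abstract (coarse-grained pruning and quantization of the model, followed by using the original network as a teacher to fine-tune the compressed student) can be illustrated with a minimal PyTorch-style sketch. This is an assumption-laden illustration, not the authors' implementation: magnitude pruning, symmetric uniform quantization, the bit width, the temperature T, the weighting alpha, and all function names are hypothetical choices made only for this example.

```python
# Illustrative sketch of a prune -> quantize -> distill pipeline.
# Not the paper's code; thresholds, bit widths, and loss weights are placeholders.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def magnitude_prune(model: nn.Module, sparsity: float = 0.5) -> nn.Module:
    """Coarse-grained pruning: zero the smallest-magnitude weights in each conv/linear layer."""
    pruned = copy.deepcopy(model)
    for module in pruned.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            w = module.weight.data
            k = max(1, int(sparsity * w.numel()))
            threshold = w.abs().flatten().kthvalue(k).values
            module.weight.data = torch.where(w.abs() > threshold, w, torch.zeros_like(w))
    return pruned


def quantize_weights(model: nn.Module, bits: int = 8) -> nn.Module:
    """Simulated uniform weight quantization (per-layer, symmetric)."""
    quantized = copy.deepcopy(model)
    levels = 2 ** (bits - 1) - 1
    for module in quantized.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            w = module.weight.data
            scale = w.abs().max().clamp_min(1e-8) / levels
            module.weight.data = torch.round(w / scale) * scale
    return quantized


def distillation_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.7):
    """Knowledge distillation: KL divergence on softened logits plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard


# Usage sketch: the original network acts as the teacher, and the
# pruned-and-quantized copy is the student fine-tuned with the distillation loss.
# (pretrained_model, loader, and optimizer are assumed to exist.)
#
# teacher = pretrained_model.eval()
# student = quantize_weights(magnitude_prune(pretrained_model))
# for images, labels in loader:
#     with torch.no_grad():
#         t_logits = teacher(images)
#     loss = distillation_loss(student(images), t_logits, labels)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```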

Keyword:

Model compression; Pruning; Knowledge distillation; CNN; Quantization

Author Community:

  • [ 1 ] [Bao, Zhenshan]Beijing Univ Technol, Beijing, Peoples R China
  • [ 2 ] [Liu, Jiayang]Beijing Univ Technol, Beijing, Peoples R China
  • [ 3 ] [Zhang, Wenbo]Beijing Univ Technol, Beijing, Peoples R China

Reprint Author's Address:

  • [Bao, Zhenshan]Beijing Univ Technol, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 2019 2ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND MACHINE INTELLIGENCE (MLMI 2019)

Year: 2019

Page: 3-6

Language: English

Cited Count:

WoS CC Cited Count: 2

SCOPUS Cited Count: 1

ESI Highly Cited Papers on the List: 0
