A Dual-Path Multi-Scale Feature Fusion Decoder for SegFormer

Author：

Zhang, Hongyu | Wang, Suyu | Zhang, Yingying

Indexed by：

EI Scopus

Abstract：

The encoder-decoder structure underlies most semantic segmentation models. How to extract image features effectively and achieve high-precision mapping through the design of the encoder and decoder is a key issue in current research. SegFormer designs a high-performing encoder that extracts feature information at different semantic granularities with a large receptive field, so that excellent segmentation results can be achieved even with a simple fully connected decoder. However, this simplified decoder does not make full use of the advantages of the SegFormer encoder. This paper therefore redesigns the decoder according to the characteristics of the SegFormer encoder and proposes a decoder structure with dual-path multi-scale feature fusion. One path uses the layer-by-layer upsampling fusion module (LFM) to pass abstract global information layer by layer to the local detail information, gradually upsampling the feature maps obtained from the encoder; it then uses the channel fusion module to learn the importance of different channels in the deep abstract semantic feature map and the shallow local detail feature map, and fuses them dynamically to obtain a feature map containing both abstract semantic information and local details. The other path takes advantage of the large receptive field of the feature map output by the SegFormer encoder and uses the weighted hybrid multi-scale feature extraction module (WMF) to extract multi-scale features containing global semantics from the deep semantic feature map finally output by the encoder. Finally, the Deep Feature Fusion Module (DFM) fuses the outputs of the first two modules, fully mining the multi-scale global information in the encoder and obtaining feature maps with rich semantic information, which effectively improves model performance. © 2023 The Authors. Published by Elsevier B.V.
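
The abstract does not give formulas for the top-down path, so the following is only a minimal NumPy sketch of the general idea: a deep feature map is upsampled step by step and fused with the next shallower map using per-channel weights. The function names (`upsample2x`, `channel_fusion`, `top_down_path`), the sigmoid gating over globally pooled features, and the assumption that all encoder stages have already been projected to a common channel count are hypothetical choices for illustration, not the authors' LFM or channel fusion module.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def channel_fusion(deep, shallow):
    """Fuse two (C, H, W) maps with per-channel weights.

    Hypothetical gating: global average pooling followed by a sigmoid.
    A trained module would place learned layers before the sigmoid.
    """
    pooled = (deep + shallow).mean(axis=(1, 2))   # (C,) channel descriptor
    w = 1.0 / (1.0 + np.exp(-pooled))             # (C,) weights in (0, 1)
    w = w[:, None, None]                          # broadcast over H and W
    return w * deep + (1.0 - w) * shallow

def top_down_path(features):
    """Fuse maps ordered shallow -> deep, each half the previous resolution.

    Assumes every map has the same channel count C (an assumption of this
    sketch). Returns a map at the shallowest resolution.
    """
    fused = features[-1]
    for shallow in reversed(features[:-1]):
        fused = channel_fusion(upsample2x(fused), shallow)
    return fused

# Four stand-in encoder outputs at 32x32, 16x16, 8x8 and 4x4.
rng = np.random.default_rng(0)
C = 8
feats = [rng.standard_normal((C, 32 // 2**i, 32 // 2**i)) for i in range(4)]
out = top_down_path(feats)
print(out.shape)  # (8, 32, 32)
```

The convex combination `w * deep + (1 - w) * shallow` keeps each fused value between the two inputs, which is one simple way to let a channel lean toward either semantic or detail information.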

Keyword：

Semantics | Semantic Segmentation | Channel coding | Signal encoding | Decoding

Author Affiliations：

[ 1 ] [Zhang, Hongyu] Beijing University of Technology, Beijing 100124, China
[ 2 ] [Wang, Suyu] Beijing University of Technology, Beijing 100124, China
[ 3 ] [Zhang, Yingying] Shenyang University of Chemical Technology, Shenyang 110142, China

Related Articles：

Road Scene Segmentation Based on Multi-scale Attention Mechanism
2022, 5th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2022
TCBFormer: A General Architecture Based on Dual-Branch Feature Fusion for Polyp Segmentation
2024, 5th International Seminar on Artificial Intelligence, Networking and Information Technology, AINIT 2024
A Joint Entity and Relation Extraction Model Based on Encoder-Decoder
2023, 3rd IEEE International Conference on Information Technology, Big Data and Artificial Intelligence, ICIBA 2023
Transformer-based Temporal Knowledge Graph Completion
2023, 3rd IEEE International Conference on Computer Communication and Artificial Intelligence, CCAI 2023

Source ：

Year： 2023

Volume： 222

Page： 157-166

Language： English

Cited Count：

SCOPUS Cited Count： 2

ESI Highly Cited Papers on the List： 0
