SEB-Net: Revisiting Deep Encoder-Decoder Networks for Scene Understanding

Author：

Indexed by：

EI Scopus

Abstract：

As a research area of computer vision and deep learning, scene understanding has attracted a lot of attention in recent years. One major challenge is obtaining high segmentation accuracy while keeping down the computational cost and time associated with training or inference. Most current algorithms compromise one metric for the other depending on the intended devices. To address this problem, this paper proposes a novel deep neural network architecture called Segmentation Efficient Blocks Network (SEB-Net) that seeks to achieve the best possible balance between accuracy and computational cost as well as real-time inference speed. The model is composed of an encoder path and a decoder path in a symmetric structure. The encoder path consists of 16 convolution layers identical to those of a VGG-19 model, and the decoder path includes what we call E-blocks (Efficient Blocks), inspired by the bottleneck module of the widely popular ENet architecture with slight modifications. One advantage of this model is that max-unpooling in the decoder path is employed for the expansion and projection convolutions in the E-blocks, allowing for fewer learnable parameters and efficient computation (10.1 frames per second (fps) for a 480x320 input, 11x fewer parameters than DeconvNet, 52.4 GFLOPs for a 640x360 input on a Tesla K40 GPU device). Experimental results on two outdoor scene datasets, Cambridge-driving Labeled Video Database (CamVid) and Cityscapes, indicate that SEB-Net can achieve higher performance compared to Fully Convolutional Networks (FCN), SegNet, DeepLabV, and Dilation8 in most cases. Moreover, SEB-Net outperforms efficient architectures like ENet and LinkNet by 16.1 and 11.6 respectively in terms of instance-level intersection over union (iIoU). SEB-Net also shows better performance when further evaluated on SUN RGB-D, an indoor scene dataset. © 2020 ACM.
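The max-unpooling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual definition of max-unpooling (each pooled value is written back to the position of its maximum, and all other positions are zero), a single 2D feature map with even height and width, and 2x2 windows with stride 2. The function names `max_pool_2x2` and `max_unpool_2x2` are hypothetical and are not taken from the paper. The sketch only shows why the operation itself carries no learnable parameters; it is not the authors' implementation.

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2D list (even height and width assumed).

    Returns the pooled map and, for each pooled value, the (row, col)
    position of the maximum in the input.
    """
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h, 2):
        pooled_row, index_row = [], []
        for j in range(0, w, 2):
            # Candidates in the current 2x2 window as (value, (row, col)).
            window = [(x[r][c], (r, c)) for r in (i, i + 1) for c in (j, j + 1)]
            value, pos = max(window, key=lambda t: t[0])
            pooled_row.append(value)
            index_row.append(pos)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool_2x2(pooled, indices, out_h, out_w):
    """Place each pooled value back at its recorded position; zeros elsewhere.

    No weights are involved: the upsampling is driven by the stored indices.
    """
    out = [[0] * out_w for _ in range(out_h)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


# Illustrative input, not data from the paper.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]
pooled, indices = max_pool_2x2(feature_map)
restored = max_unpool_2x2(pooled, indices, 4, 4)
```

Here `pooled` is the downsampled map an encoder stage would pass on, and `restored` is the sparse upsampled map a decoder stage would then refine with convolutions.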

Keyword：

Network architecture; Network coding; Convolutional neural networks; Convolution; Deep neural networks; Deep learning; Decoding

Author Community：

[ 1 ] [Gadosey, Pius Kwao]Beijing University of Technology, Beijing; 100124, China
[ 2 ] [Li, Yujian]School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin; 541004, China
[ 3 ] [Zhang, Ting]Beijing University of Technology, Beijing; 100124, China
[ 4 ] [Liu, Zhaoying]Beijing University of Technology, Beijing; 100124, China
[ 5 ] [Too, Edna Chebet]Department of Computer Science and ICT, Chuka University, Kenya
[ 6 ] [Essaf, Firdaous]Beijing University of Technology, Beijing; 100124, China