Indexed in:
Abstract:
Semantic segmentation, a fundamental visual task ubiquitously employed in sectors ranging from transportation and robotics to healthcare, has long captivated the research community. In the wake of rapid advancements in large-model research, a foundation model for segmentation tasks, termed the Segment Anything Model (SAM), has been introduced. This model largely addresses the poor generalizability of previous segmentation models and the drawback of having to retrain the entire model on different datasets. Nonetheless, segmentation models built on SAM remain constrained by the inherent limitations of RGB sensors, particularly in scenarios with complex lighting conditions and high-speed motion. Motivated by these observations, a natural recourse is to adapt SAM to additional visual modalities without compromising its robust generalizability. To achieve this, we introduce a lightweight SAM-Event-Adapter (SE-Adapter) module, which incorporates event camera data into a SAM-based cross-modal learning architecture with only a limited increase in tunable parameters. Capitalizing on the high dynamic range and temporal resolution afforded by event cameras, our proposed multi-modal Event-RGB learning architecture effectively improves semantic segmentation performance. In addition, we propose a novel paradigm for representing event data in a patch format compatible with transformer-based models, employing multi-spatiotemporal-scale encoding to efficiently extract motion and semantic correlations from event representations. Extensive empirical evaluations on the DSEC-Semantic and DDD17 datasets validate the effectiveness and rationality of our proposed approach. © 2024 IEEE.
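The abstract outlines two technical ingredients: a lightweight adapter that injects event features into a frozen SAM encoder with few trainable parameters, and a patch-style, multi-temporal-bin event representation compatible with transformer backbones. The sketch below is a minimal PyTorch illustration of both ideas under stated assumptions; the names (SEAdapter, events_to_patches), the bottleneck size, the additive fusion, and the voxelization scheme are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SEAdapter(nn.Module):
    """Hypothetical bottleneck adapter that fuses event tokens into RGB tokens.
    Only the adapter is trainable; the SAM backbone is assumed frozen."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)  # project to low rank
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)    # project back to token dim
        nn.init.zeros_(self.up.weight)          # start as an identity residual
        nn.init.zeros_(self.up.bias)

    def forward(self, rgb_tokens: torch.Tensor, event_tokens: torch.Tensor) -> torch.Tensor:
        # Simple additive fusion of the two modalities, then a residual bottleneck.
        fused = rgb_tokens + event_tokens
        return rgb_tokens + self.up(self.act(self.down(fused)))


def events_to_patches(events: torch.Tensor, H: int, W: int,
                      bins: int = 4, patch: int = 16) -> torch.Tensor:
    """Toy voxelization of an (x, y, t, polarity) event stream into several
    temporal bins, then unfolding into transformer-compatible patch tokens."""
    x, y, t, p = events[:, 0].long(), events[:, 1].long(), events[:, 2], events[:, 3]
    t = (t - t.min()) / (t.max() - t.min() + 1e-9)      # normalize timestamps to [0, 1)
    b = (t * bins).clamp(max=bins - 1).long()           # temporal bin index per event
    voxel = torch.zeros(bins, H, W)
    voxel.index_put_((b, y, x), p, accumulate=True)     # accumulate signed polarity
    # Unfold the voxel grid into (num_patches, bins * patch * patch) tokens.
    tokens = voxel.unfold(1, patch, patch).unfold(2, patch, patch)
    return tokens.permute(1, 2, 0, 3, 4).reshape(-1, bins * patch * patch)


if __name__ == "__main__":
    # Smoke test with random data (illustrative dimensions only).
    ev = torch.rand(1000, 4)
    ev[:, 0] *= 63; ev[:, 1] *= 63
    ev[:, 3] = torch.sign(ev[:, 3] - 0.5)
    print(events_to_patches(ev, H=64, W=64).shape)      # (16, 1024)

    adapter = SEAdapter(dim=256)
    rgb, evt = torch.randn(1, 16, 256), torch.randn(1, 16, 256)
    print(adapter(rgb, evt).shape)                      # (1, 16, 256)
```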
Keywords:
Corresponding author:
Email address:
Source:
ISSN: 1050-4729
Year: 2024
Pages: 9093-9100
Language: English
Affiliated department: