
Author:

Yao, Bowen | Deng, Yongjian | Liu, Yuhan | Chen, Hao | Li, Youfu | Yang, Zhen

Indexed by:

EI; Scopus

Abstract:

Semantic segmentation, a fundamental visual task ubiquitously employed in sectors ranging from transportation and robotics to healthcare, has long captivated the research community. In the wake of rapid advances in large-model research, a foundation model for segmentation tasks, the Segment Anything Model (SAM), has been introduced. SAM largely resolves the poor generalizability of earlier segmentation models and removes the need to retrain the whole model for each new dataset. Nonetheless, segmentation models built on SAM remain constrained by the inherent limitations of RGB sensors, particularly under complex lighting conditions and high-speed motion. Motivated by these observations, a natural recourse is to adapt SAM to additional visual modalities without compromising its strong generalizability. To this end, we introduce a lightweight SAM-Event-Adapter (SE-Adapter) module, which incorporates event camera data into a SAM-based cross-modal learning architecture while adding only a small number of tunable parameters. Capitalizing on the high dynamic range and temporal resolution afforded by event cameras, the proposed multi-modal Event-RGB learning architecture effectively improves semantic segmentation performance. In addition, we propose a novel paradigm for representing event data in a patch format compatible with transformer-based models, employing multi-spatiotemporal-scale encoding to efficiently extract motion and semantic correlations from event representations. Extensive empirical evaluations on the DSEC-Semantic and DDD17 datasets validate the effectiveness and rationality of the proposed approach. © 2024 IEEE.
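To make the adapter idea concrete, the sketch below shows one way a small trainable bottleneck could fuse event-patch tokens into a frozen transformer encoder block. The module names, layer sizes, and residual fusion rule are illustrative assumptions for a generic adapter layout, not the paper's actual SE-Adapter implementation.

```python
# Minimal sketch (PyTorch): a frozen encoder block plus a small trainable
# adapter that fuses event tokens into RGB tokens. All names and sizes are
# assumptions for illustration, not the authors' SE-Adapter design.
import torch
import torch.nn as nn


class EventAdapter(nn.Module):
    """Trainable bottleneck that injects event-patch tokens into RGB tokens."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)  # project to a low-rank space
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)    # project back to encoder width
        nn.init.zeros_(self.up.weight)          # start as a near-identity residual
        nn.init.zeros_(self.up.bias)

    def forward(self, rgb_tokens: torch.Tensor, event_tokens: torch.Tensor) -> torch.Tensor:
        # Residual fusion: RGB tokens plus an adapted view of the event tokens.
        return rgb_tokens + self.up(self.act(self.down(event_tokens)))


class FrozenBlockWithAdapter(nn.Module):
    """Wraps one frozen encoder block and applies the adapter after it."""

    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block
        for p in self.block.parameters():       # pretrained block stays frozen
            p.requires_grad = False
        self.adapter = EventAdapter(dim)        # only these weights are trained

    def forward(self, rgb_tokens: torch.Tensor, event_tokens: torch.Tensor) -> torch.Tensor:
        rgb_tokens = self.block(rgb_tokens)
        return self.adapter(rgb_tokens, event_tokens)


if __name__ == "__main__":
    dim, tokens = 256, 196
    block = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
    layer = FrozenBlockWithAdapter(block, dim)
    rgb = torch.randn(2, tokens, dim)
    ev = torch.randn(2, tokens, dim)            # event patches embedded to the same width
    out = layer(rgb, ev)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(out.shape, trainable)                 # only the adapter parameters are trainable
```

The zero-initialized up-projection keeps the wrapped block's behavior unchanged at the start of training, so tuning the adapter adds event information without disturbing the pretrained weights; this mirrors the abstract's goal of extending SAM to a new modality with only a small parameter increment.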

Keyword:

Semantic Segmentation; Image Coding

Author Community:

  • [ 1 ] [Yao, Bowen]Beijing University of Technology, College of Computer Science, Beijing, China
  • [ 2 ] [Deng, Yongjian]Beijing University of Technology, College of Computer Science, Beijing, China
  • [ 3 ] [Liu, Yuhan]Beijing University of Technology, College of Computer Science, Beijing, China
  • [ 4 ] [Chen, Hao]Key Lab of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China
  • [ 5 ] [Li, Youfu]City University of Hong Kong, Department of Mechanical Engineering, Kowloon, Hong Kong
  • [ 6 ] [Yang, Zhen]Beijing University of Technology, College of Computer Science, Beijing, China

Source:

ISSN: 1050-4729

Year: 2024

Page: 9093-9100

Language: English

SCOPUS Cited Count: 3

ESI Highly Cited Papers on the List: 0

