Isotropic Self-Supervised Learning for Driver Drowsiness Detection With Attention-Based Multimodal Fusion

Indexed by：

EI Scopus SCIE

Abstract：

Driver drowsiness is an important cause of traffic accidents. Many studies using computer vision techniques to detect driver drowsiness states, such as slow blinking, yawning, and nodding, have demonstrated excellent potential. Although existing studies have made significant progress, the number of samples in the training corpora is small, which makes it difficult for a model to learn effective drowsiness representations from images or videos. To address this issue, we develop an isotropic self-supervised learning (IsoSSL) approach to learn powerful representations of images without relying on human-provided annotations and propose an IsoSSL-MoCo model by combining IsoSSL with momentum contrast (MoCo). To exploit the complementarity of multimodal data, an attention-based multimodal fusion model is also proposed to fuse features from the eye, mouth, and optical flow of the head. Specifically, we first use the IsoSSL-MoCo model to pretrain the image encoders for the three modalities in other datasets. Then, these encoders are fine-tuned and integrated into the proposed fusion model. The feature vectors generated by the image encoders of the three modalities are fed into the recursive layer to extract temporal information. To capture the importance degrees of the effects of temporal features from the three modalities on drowsiness detection, an attention mechanism is introduced to automatically weigh the feature vectors from the recursive layer to improve detection accuracy. Finally, a vector representation is generated by the attention layer and is used to detect driver drowsiness states. Experimental results based on two challenging datasets show that our method outperforms the baseline methods and the latest existing methods.
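
The abstract describes two mechanisms that a small example may make more concrete: weighting per-modality temporal features with attention before fusing them, and the momentum update that MoCo applies to its key encoder. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the dot-product scoring against a single `query` vector, and the toy feature values are all hypothetical, and the real model operates on learned encoder and recurrent-layer outputs rather than hand-written lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse one feature vector per modality into a single vector.

    features: list of equal-length vectors (e.g. eye, mouth, head optical flow).
    query:    hypothetical learned scoring vector (an assumption of this sketch).
    Returns (fused_vector, attention_weights).
    """
    # Score each modality by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Weighted sum of the modality vectors, dimension by dimension.
    fused = [sum(w * feat[d] for w, feat in zip(weights, features))
             for d in range(dim)]
    return fused, weights


def momentum_update(key_params, query_params, m=0.999):
    """MoCo-style update: key <- m * key + (1 - m) * query, elementwise."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


if __name__ == "__main__":
    # Toy temporal features for the three modalities (made-up values).
    eye, mouth, head = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
    fused, weights = attention_fuse([eye, mouth, head], query=[0.5, -0.5])
    print("attention weights:", weights)
    print("fused vector:", fused)
    print("updated key params:", momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9))
```

In this sketch a larger score gives a modality more influence on the fused vector, which is the general idea behind letting the model decide how much the eye, mouth, and head-motion streams each contribute to the final drowsiness decision.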

Keyword：

Feature extraction; Convolutional neural networks; Vehicles; momentum contrast (MoCo); Computational modeling; Attention; Videos; Dictionaries; multimodal fusion model; driver drowsiness detection; Hidden Markov models; isotropic self-supervised learning (IsoSSL)

Author Affiliations：

[ 1 ] [Mou, Luntian]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[ 2 ] [Zhou, Chao]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[ 3 ] [Zhao, Pengfei]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[ 4 ] [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[ 5 ] [Xie, Pengtao]Univ Calif San Diego, La Jolla, CA 92093 USA
[ 6 ] [Jain, Ramesh]Univ Calif Irvine, Inst Future Hlth, Bren Sch Informat & Comp Sci, Irvine, CA 92697 USA
[ 7 ] [Gao, Wen]Peking Univ, Inst Digital Media, Beijing 100871, Peoples R China
[ 8 ] [Gao, Wen]Peking Univ, Shenzhen Grad Sch, Sch Elect & Comp Engn, Shenzhen 518055, Peoples R China

Reprint Author's Address：

[Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China

Source ：

IEEE TRANSACTIONS ON MULTIMEDIA

ISSN： 1520-9210

Year： 2023

Volume： 25

Page： 529-542

Impact Factor： 7.300 (JCR@2022)

ESI Discipline： COMPUTER SCIENCE

ESI HC Threshold：19

Cited Count：

WoS CC Cited Count： 24

SCOPUS Cited Count： 33

ESI Highly Cited Papers on the List： 0
