
Authors:

Mou, Luntian | Zhou, Chao | Xie, Pengtao | Zhao, Pengfei | Jain, Ramesh | Gao, Wen | Yin, Baocai (Scholar: 尹宝才)

Indexed in:

EI, Scopus, SCIE

Abstract:

Driver drowsiness is an important cause of traffic accidents. Many studies using computer vision techniques to detect driver drowsiness states, such as slow blinking, yawning, and nodding, have demonstrated excellent potential. Although existing studies have made significant progress, the number of samples in the training corpora is small, which makes it difficult for a model to learn effective drowsiness representations from images or videos. To address this issue, we develop an isotropic self-supervised learning (IsoSSL) approach to learn powerful representations of images without relying on human-provided annotations and propose an IsoSSL-MoCo model by combining IsoSSL with momentum contrast (MoCo). To exploit the complementarity of multimodal data, an attention-based multimodal fusion model is also proposed to fuse features from the eye, mouth, and optical flow of the head. Specifically, we first use the IsoSSL-MoCo model to pretrain the image encoders for the three modalities in other datasets. Then, these encoders are fine-tuned and integrated into the proposed fusion model. The feature vectors generated by the image encoders of the three modalities are fed into the recursive layer to extract temporal information. To capture the importance degrees of the effects of temporal features from the three modalities on drowsiness detection, an attention mechanism is introduced to automatically weigh the feature vectors from the recursive layer to improve detection accuracy. Finally, a vector representation is generated by the attention layer and is used to detect driver drowsiness states. Experimental results based on two challenging datasets show that our method outperforms the baseline methods and the latest existing methods.
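The attention step described in the abstract, which weighs the temporal feature vectors of the three modalities and merges them into one representation, can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the `query` vector (standing in for learned attention parameters), the function name `attention_fuse`, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse per-modality feature vectors with attention weights.

    features: dict mapping modality name -> feature vector (list of floats)
    query:    hypothetical learned vector used to score each modality
    Returns (weights per modality, fused vector).
    """
    names = list(features)
    # Score each modality by a dot product with the query (assumed scoring rule).
    scores = [sum(q * x for q, x in zip(query, features[n])) for n in names]
    weights = softmax(scores)
    dim = len(query)
    # The fused vector is the attention-weighted sum of the modality vectors.
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), fused


# Toy temporal features, as if produced by the recurrent layer (made-up values).
features = {
    "eye": [0.9, 0.1, 0.3],
    "mouth": [0.2, 0.8, 0.1],
    "head_flow": [0.4, 0.4, 0.5],
}
query = [1.0, 0.5, -0.5]

weights, fused = attention_fuse(features, query)
print(weights)
print(fused)
```

In a trained model the scoring parameters would be learned jointly with the encoders, and the fused vector would be passed to a classifier that predicts the drowsiness state.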

Keywords:

Feature extraction; Convolutional neural networks; Vehicles; momentum contrast (MoCo); Computational modeling; Attention; Videos; Dictionaries; multimodal fusion model; driver drowsiness detection; Hidden Markov models; isotropic self-supervised learning (IsoSSL)

Author affiliations:

  • [ 1 ] [Mou, Luntian]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
  • [ 2 ] [Zhou, Chao]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
  • [ 3 ] [Zhao, Pengfei]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
  • [ 4 ] [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
  • [ 5 ] [Xie, Pengtao]Univ Calif San Diego, La Jolla, CA 92093 USA
  • [ 6 ] [Jain, Ramesh]Univ Calif Irvine, Inst Future Hlth, Bren Sch Informat & Comp Sci, Irvine, CA 92697 USA
  • [ 7 ] [Gao, Wen]Peking Univ, Inst Digital Media, Beijing 100871, Peoples R China
  • [ 8 ] [Gao, Wen]Peking Univ, Shenzhen Grad Sch, Sch Elect & Comp Engn, Shenzhen 518055, Peoples R China

Corresponding author:

  • [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA

ISSN: 1520-9210

Year: 2023

Volume: 25

Pages: 529-542

Impact factor: 7.300 (JCR@2022)

ESI discipline: COMPUTER SCIENCE

ESI highly cited threshold: 19

Citations:

Scopus citation count: 29

ESI highly cited paper listings: 0
