收录:
摘要:
In recent years, human activity recognition (HAR) methods are developing rapidly. However, most existing methods base on single input data modality, and suffers from accuracy and robustness issues. In this paper, we present a novel multi-modal HAR architecture which fuses signals from both RGB visual data and Inertial Measurement Units (IMU) data. As for the RGB modality, the speed-weighted star RGB representation is proposed to aggregate the temporal information, and a convolutional network is employed to extract features; As for the IMU modality, Fast Fourier transform and multi-layer perceptron are employed to extract the dynamical features of IMU data. As for the feature fusion scheme, the global soft attention layer is designed to adjust the weights according to the concatenated features, and the L-softmax with soft voting is adopted to classify activities. The proposed method is evaluated on the UP-Fall dataset, the F1-scores are 0.92 and 1.00 for 11 classes classification task and fall/non-fall binary classification task respectively. © 2023, China Computer Federation (CCF).
关键词:
通讯作者信息:
电子邮件地址:
来源 :
CCF Transactions on Pervasive Computing and Interaction
ISSN: 2524-521X
年份: 2023
期: 3
卷: 5
页码: 321-332
归属院系: