Abstract:
In this paper, we propose a novel descriptor for human action recognition from depth video sequences. The method makes action recognition more flexible by using local multiresolution pyramids in feature space. During feature extraction, we extract polynormals at different scales and combine them into pyramid-polynormals that express the multilayer appearance information of a local subcube, which improves the discriminative power of the descriptor. We also present a novel group-sparse-constrained dictionary learning method that reduces the correlations between sub-dictionaries and yields a sparse, more discriminative dictionary. The learned dictionary is used to compute sparse low-level coding features, and spatial average pooling and temporal max pooling aggregate the sparse coefficients into a high-dimensional feature. The feature vectors extracted from each grid are then concatenated to form the final P-SNV descriptor. The descriptor preserves the local multilayer appearance information of human actions while suppressing content shared across action categories, which improves the recognition rate. Experimental results on four public benchmark datasets show that our algorithm outperforms state-of-the-art algorithms. © 2015 by Binary Information Press.
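The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes the sparse codes have already been computed and pools them directly; the function names, the nested-list layout (`grid[cell][frame][feature]`), and the requirement that every frame holds at least one local feature are illustrative assumptions, not details taken from the paper.

```python
from typing import List

Vector = List[float]


def spatial_average_pool(frame_codes: List[Vector]) -> Vector:
    """Average the sparse codes of all local features in one frame of a cell.

    Assumes frame_codes is non-empty and all codes have the same length.
    """
    n = len(frame_codes)
    k = len(frame_codes[0])
    return [sum(code[j] for code in frame_codes) / n for j in range(k)]


def temporal_max_pool(per_frame: List[Vector]) -> Vector:
    """Take the element-wise maximum over the per-frame pooled vectors."""
    return [max(column) for column in zip(*per_frame)]


def pool_cell(cell_codes: List[List[Vector]]) -> Vector:
    """cell_codes[t][i] is the sparse code of the i-th local feature in frame t."""
    return temporal_max_pool([spatial_average_pool(f) for f in cell_codes])


def build_descriptor(grid: List[List[List[Vector]]]) -> Vector:
    """Concatenate the pooled vectors of all grid cells into one descriptor."""
    descriptor: Vector = []
    for cell_codes in grid:
        descriptor.extend(pool_cell(cell_codes))
    return descriptor


# Hypothetical toy input: two cells, dictionary size 2.
cell_a = [
    [[1.0, 0.0], [3.0, 0.0]],  # frame 0: two local features
    [[0.0, 2.0], [0.0, 4.0]],  # frame 1: two local features
]
cell_b = [[[0.0, 1.0]]]        # one frame, one local feature
descriptor = build_descriptor([cell_a, cell_b])
print(descriptor)  # [2.0, 3.0, 0.0, 1.0]
```

For `cell_a`, the per-frame averages are `[2.0, 0.0]` and `[0.0, 3.0]`, and their element-wise maximum is `[2.0, 3.0]`. Concatenating this with the pooled vector of `cell_b` gives a descriptor whose length is the number of cells times the dictionary size.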