Indexed in:
Abstract:
Crowd counting algorithms play an important role in public safety management. Most current mainstream crowd counting methods are based on deep convolutional neural networks (CNNs), which use multi-column or multi-scale convolutional structures to capture contextual information in images and compensate for the impact of perspective distortion on counting results. However, because convolution is locally connected, these methods cannot capture enough global context, which often leads to misidentification in complex background regions and reduces counting accuracy. To address this problem, we first design a double recursive sparse self-attention module, which better captures long-distance dependencies and mitigates background false detection while reducing the amount of computation and the number of parameters. Second, we design a Transformer structure based on a feature pyramid as the feature extraction module of the crowd counting algorithm, which effectively improves the algorithm's ability to extract global information. Experimental results on public datasets show that the proposed algorithm outperforms current mainstream crowd counting methods and effectively alleviates the background false detection problem in complex scene images. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords:
Corresponding author:
Email address:
Source:
ISSN: 0302-9743
Year: 2022
Volume: 13534 LNCS
Pages: 722-734
Language: English
Affiliated department: