Abstract:
Target searching, i.e., quickly locating target objects in images or videos, has attracted much attention in computer vision. A comprehensive understanding of the factors that influence human visual search is essential for designing target searching algorithms for computer vision systems. In this paper, we propose a combined model that generates scan paths for a computer vision system to follow when searching for targets in images. The model integrates three factors that influence human visual search: top-down target information, spatial context, and bottom-up visual saliency. The effectiveness of the combined model is evaluated by comparing the generated scan paths with the fixation sequences of human observers locating targets in the same images. The same evaluation strategy is used to learn the optimal weighting coefficients of the factors through linear search. The performance of each individual factor and of every combination of factors is also examined. Extensive experiments show that top-down target information is the most important factor influencing the accuracy of target searching, while the effect of bottom-up visual saliency is limited. Any combination of the three factors performs better than each of its component factors alone. The scan paths produced by the proposed combined model are the best of those examined, in the sense that they are the most similar to the human fixation sequences. © 2014 IEEE.
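
The abstract does not spell out how the three factors are combined or how scan paths are compared, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each factor is available as a 2-D priority map, that the maps are min-max normalized and summed with weights, that a scan path is read off greedily with inhibition of return, and that path similarity is the mean Euclidean distance between corresponding fixations. The weight search is shown as an exhaustive scan over a coarse grid of weights summing to 1. All function names (`combine_maps`, `scan_path`, `path_distance`, `search_weights`) are hypothetical.

```python
import numpy as np

def combine_maps(target, context, saliency, weights):
    """Weighted sum of three min-max normalized maps (assumed combination rule)."""
    normed = []
    for m in (target, context, saliency):
        m = np.asarray(m, dtype=float)
        span = m.max() - m.min()
        # A constant map carries no information, so it contributes zeros.
        normed.append((m - m.min()) / span if span > 0 else np.zeros_like(m))
    return sum(w * m for w, m in zip(weights, normed))

def scan_path(priority, n_fixations=5, radius=1):
    """Greedy path: fixate the peak, suppress its neighborhood, repeat."""
    p = np.array(priority, dtype=float)
    path = []
    for _ in range(n_fixations):
        r, c = np.unravel_index(np.argmax(p), p.shape)
        path.append((int(r), int(c)))
        # Inhibition of return (assumed): block the visited neighborhood.
        p[max(0, r - radius): r + radius + 1,
          max(0, c - radius): c + radius + 1] = -np.inf
    return path

def path_distance(a, b):
    """Mean Euclidean distance between corresponding fixations (assumed measure)."""
    n = min(len(a), len(b))
    return float(np.mean([np.hypot(a[i][0] - b[i][0], a[i][1] - b[i][1])
                          for i in range(n)]))

def search_weights(target, context, saliency, human_path, step=0.25):
    """Scan a coarse grid of weights summing to 1; keep the closest path."""
    best = None
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    for wt in grid:
        for wc in grid:
            ws = 1.0 - wt - wc
            if ws < -1e-9:
                continue  # weights must stay non-negative
            w = (float(wt), float(wc), max(float(ws), 0.0))
            path = scan_path(combine_maps(target, context, saliency, w),
                             n_fixations=len(human_path))
            d = path_distance(path, human_path)
            if best is None or d < best[0]:
                best = (d, w)
    return best  # (distance, (w_target, w_context, w_saliency))
```

Calling `search_weights(target_map, context_map, saliency_map, human_fixations)` returns the smallest distance found and the weight triple that produced it. In practice the distance would be averaged over many images and observers rather than computed for a single path.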