Abstract:
Detecting "hot" topics from the enormous volume of user-generated content (UGC) on the web poses two main difficulties that conventional approaches can barely handle: 1) poor feature representations from noisy images or short texts, and 2) uncertain roles of the modalities, where the visual content is either highly or weakly relevant to the textual cues because UGC is loosely constrained. In this paper, following the detection-by-ranking approach, we address the above challenges by learning a robust latent representation from multiple noisy features that are very likely to be complementary. Both the textual and the visual features are encoded into a k-nearest-neighbor hybrid similarity graph (HSG), on which nonnegative matrix factorization using random walk is introduced to generate topic candidates. Multiple HSGs are then fused efficiently by a latent Poisson deconvolution, which consists of a Poisson deconvolution with sparse basis similarities for each edge. Experiments on two public datasets show that the proposed approach is significantly more accurate than state-of-the-art methods.
Source:
IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Year: 2016
Issue: 12
Volume: 18
Pages: 2482-2493
7.300 (JCR@2022)
ESI subject: COMPUTER SCIENCE
ESI highly cited threshold: 167
Chinese Academy of Sciences (CAS) journal ranking tier: 1