Indexed in:
Abstract:
Because online forums are anonymous and open to anyone, it is very hard for humans to distinguish deceptive reviews from truthful ones. This paper proposes a deep learning approach to text representation, called DCWord (Deep Context representation by Word vectors), for deceptive review identification. The basic idea is that deceptive and truthful reviews are written by people without and with real experience of the goods or services purchased online, respectively, so the contextual information of words should differ between the two. Unlike state-of-the-art techniques, which seek the best linguistic features for representation, we use word vectors to characterize the contextual information of words in deceptive and truthful reviews automatically. An average-pooling strategy (called DCWord-A) and a max-pooling strategy (called DCWord-M) are used to produce review vectors from word vectors. Experimental results on the Spam dataset and the Deception dataset demonstrate that the DCWord-M representation with LR (Logistic Regression) produces the best performance and outperforms state-of-the-art techniques on deceptive review identification. Moreover, the DCWord-M strategy outperforms the DCWord-A strategy in review representation for deceptive review identification. The outcome of this study has potential implications for online review management and for business intelligence applications of deceptive review identification.
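The two pooling strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 3-dimensional toy embeddings and the names `TOY_VECTORS`, `lookup`, `average_pool`, and `max_pool` are hypothetical, and skipping out-of-vocabulary words is an assumption. The sketch only shows how a review vector could be derived from word vectors by dimension-wise averaging (in the spirit of DCWord-A) or dimension-wise maximum (in the spirit of DCWord-M).

```python
from typing import Dict, List, Sequence

Vector = List[float]

# Hypothetical toy embeddings; real word vectors would be learned from a review corpus.
TOY_VECTORS: Dict[str, Vector] = {
    "great": [0.9, 0.1, 0.0],
    "hotel": [0.2, 0.8, 0.1],
    "clean": [0.7, 0.3, 0.5],
}


def lookup(tokens: Sequence[str], vectors: Dict[str, Vector]) -> List[Vector]:
    """Return the vectors of the known tokens (unknown words are skipped, by assumption)."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        raise ValueError("no token of the review has a word vector")
    return found


def average_pool(word_vectors: Sequence[Vector]) -> Vector:
    """Dimension-wise mean of the word vectors (average-pooling)."""
    n = len(word_vectors)
    return [sum(dim) / n for dim in zip(*word_vectors)]


def max_pool(word_vectors: Sequence[Vector]) -> Vector:
    """Dimension-wise maximum of the word vectors (max-pooling)."""
    return [max(dim) for dim in zip(*word_vectors)]


review = "great clean hotel".split()
word_vecs = lookup(review, TOY_VECTORS)
review_avg = average_pool(word_vecs)  # one fixed-length vector per review
review_max = max_pool(word_vecs)
```

Either pooled vector has the same length as a single word vector regardless of review length, which is what allows it to be passed to a standard classifier such as logistic regression.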
Keywords:
Corresponding author information:
Email address: