Abstract:
Topic detection has become a prominent research direction in natural language processing in recent years. Online public opinion has a significant influence on large numbers of internet users, which makes effective detection and tracking of public opinion very important. We select effective features from each story document and represent the text document with a vector center model. A distance-based clustering algorithm is then applied for public opinion detection: it identifies the emergence of new events and merges each story into its corresponding cluster. Finally, we evaluate performance using the F-value and entropy. The system achieves an F-value of 76% on the test set. Topic detection techniques can monitor information sources in various languages and can provide efficient guidance for identifying hot spots on the web. © Springer-Verlag Berlin Heidelberg 2011.
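
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature selection, the distance measure, or the threshold, so this example assumes plain term-frequency features, cosine similarity against each cluster's center vector, and a hypothetical `threshold` parameter. The function names `vectorize`, `cosine`, and `single_pass_cluster` are likewise illustrative.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector (assumed stand-in for the paper's feature selection)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(stories, threshold=0.3):
    """Assign each story to the nearest cluster, or open a new one.

    Each cluster keeps the sum of its member vectors. Cosine similarity
    ignores vector length, so comparing against the sum is equivalent to
    comparing against the cluster center (the mean vector).
    """
    clusters = []  # each entry: {"sum": Counter, "members": [story indices]}
    for index, text in enumerate(stories):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["sum"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Close enough to an existing topic: merge the story into it.
            best["sum"].update(vec)
            best["members"].append(index)
        else:
            # No sufficiently similar cluster: treat the story as a new event.
            clusters.append({"sum": Counter(vec), "members": [index]})
    return [cluster["members"] for cluster in clusters]
```

In this sketch, a lower `threshold` merges stories more aggressively and a higher one reports more new events; a suitable value would have to be tuned on labeled data, for example against the F-value mentioned in the abstract.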