Indexed in:
Abstract:
To address the challenges of entity recognition in fine-grained domains, namely the limited number of domain entities and the relative scarcity of corpus samples, an unsupervised method for narrow-domain entity recognition was proposed that integrates word frequency and context information. First, by fusing word frequency and context information, a new term-corpus relevance hypothesis was designed, and the probability of the hypothesis was calculated using the log-likelihood ratio to obtain the domain discrimination degree of the candidate entities. Second, based on the relative domain ratio of the head word of each candidate entity in the corpus, a domain dependence function was constructed to identify the domain tendency of the candidate entities. Finally, by combining the domain discrimination degree and the domain dependence, the domain relevance measurement of each candidate entity was calculated, and the candidate entities whose domain relevance measurement was greater than the threshold were selected as narrow-domain entities. The experimental results show that the proposed method can improve the accuracy of narrow-domain entity recognition and reduce manual intervention in the recognition process. © 2018, Editorial Department of Journal of Beijing University of Technology. All rights reserved.
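The scoring pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything here is an assumption: the log-likelihood ratio is the standard two-corpus binomial form (it uses word frequency only, not the context information the paper fuses in), the dependence function is a simple normalized head-word frequency ratio, and the two are combined by multiplication. All function names, parameters, counts, and the threshold are hypothetical.

```python
import math


def _log_l(k, n, p):
    """Binomial log-likelihood k*log(p) + (n-k)*log(1-p), treating 0*log(0) as 0."""
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total


def log_likelihood_ratio(k1, n1, k2, n2):
    """Standard two-corpus log-likelihood ratio.

    k1, n1: term count and total token count in the domain corpus.
    k2, n2: term count and total token count in the background corpus.
    """
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)  # pooled rate under the "same rate" hypothesis
    return 2.0 * (_log_l(k1, n1, p1) + _log_l(k2, n2, p2)
                  - _log_l(k1, n1, p) - _log_l(k2, n2, p))


def head_word_dependence(h1, n1, h2, n2):
    """Assumed dependence function: share of the head word's relative
    frequency that falls in the domain corpus, a value in [0, 1]."""
    r1, r2 = h1 / n1, h2 / n2
    if r1 + r2 == 0:
        return 0.0
    return r1 / (r1 + r2)


def domain_relevance(k1, h1, n1, k2, h2, n2):
    """Assumed combination: discrimination degree times head-word dependence.

    The ratio is symmetric, so terms that are more frequent in the
    background corpus get a discrimination degree of 0 here.
    """
    discrimination = log_likelihood_ratio(k1, n1, k2, n2)
    if k1 / n1 <= k2 / n2:
        discrimination = 0.0
    return discrimination * head_word_dependence(h1, n1, h2, n2)


def select_entities(candidates, n1, n2, threshold):
    """Keep candidates whose relevance exceeds the threshold.

    candidates maps a term to (k1, h1, k2, h2): term and head-word counts
    in the domain corpus, then in the background corpus.
    """
    return [term for term, (k1, h1, k2, h2) in candidates.items()
            if domain_relevance(k1, h1, n1, k2, h2, n2) > threshold]
```

With made-up counts, a call such as `select_entities({"term a": (50, 80, 5, 20)}, n1=1000, n2=1000, threshold=10.0)` would keep the terms that are both markedly more frequent in the domain corpus and whose head word leans toward that corpus. The threshold has to be tuned per corpus; the abstract does not report a value.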
Keywords:
Corresponding author information:
Email address: