
Authors:

Wang, Aiqing | Zhang, Sen

Indexed in:

EI | Scopus

Abstract:

In Chinese and many other Asian languages written in non-ASCII scripts, words are not delimited by whitespace (spaces, tabs, etc.), so word boundaries must be reconstructed. Subsequent syntactic analysis depends on the output of word segmentation. Ambiguity and unregistered (out-of-dictionary) words are the most important problems in Chinese word segmentation. In this paper we analyzed the causes of ambiguity and presented a one-pass scan method for detecting and correcting ambiguous cases. To deal with unregistered words and special words (such as names), we proposed a combined method that can recognize new words and thereby increase accuracy. In the implementation, we used bisection (binary) search to look up words in a large dictionary (more than 40,000 entries); the average search cost per word is less than 16 operations, so the speed is satisfactory when the system is embedded in Chinese understanding systems or Chinese speech processing systems. © 2007 IEEE.
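
The lookup cost quoted in the abstract can be checked with a small sketch. This is not the authors' code: the lexicon below is a hypothetical stand-in of 40,000 synthetic strings, the function name `bisect_lookup` is invented for illustration, and one "operation" is assumed to mean one probe of the middle entry. Under those assumptions, a bisection search over a sorted list of n entries needs at most floor(log2 n) + 1 probes, which is 16 for n = 40,000.

```python
import math


def bisect_lookup(sorted_words, target):
    """Classic bisection search over a sorted list.

    Returns (found, probes), where probes counts how many middle
    entries were compared against the target.
    """
    lo, hi = 0, len(sorted_words) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        entry = sorted_words[mid]
        if entry == target:
            return True, probes
        if entry < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return False, probes


# Hypothetical stand-in for the dictionary: 40,000 distinct sorted strings.
lexicon = sorted(f"w{i:05d}" for i in range(40000))

# Probe counts for looking up every entry that is in the lexicon.
probe_counts = [bisect_lookup(lexicon, w)[1] for w in lexicon]
worst = max(probe_counts)
average = sum(probe_counts) / len(probe_counts)

# Theoretical ceiling on probes for a list of this size.
bound = math.floor(math.log2(len(lexicon))) + 1
```

With a real Chinese lexicon the same function applies unchanged, since Python compares strings by code point; the only requirement is that the list is sorted with the same ordering used for comparison.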

Keywords:

Character recognition | Image segmentation | Natural language processing systems | Speech processing | Syntactics

Author affiliations:

  • [ 1 ] [Wang, Aiqing] Department of Mathematics, Qingdao Technological University, Qingdao, China
  • [ 2 ] [Zhang, Sen] Information and Computing Sci. Lab., Beijing University of Technology, China

Year: 2007

Volume: 3

Pages: 738-743

Language: English

WoS Core Collection citations: 0
