Authors:

Zhao, Qiang | Shi, Yuliang | Qing, Zepeng

Indexed in:

CPCI-S, EI, Scopus

Abstract:

Many clustering algorithms work well on small data sets of fewer than 200 data objects. However, a large database may contain millions of objects, and clustering such a large data set may lead to biased results. As data volumes and availability continue to grow, so does the need for large-dataset analytics. Among the most commonly used clustering algorithms, k-means has proved to be one of the most popular choices, providing acceptable results in a reasonable amount of time. In this paper, we present an improved k-means algorithm with better initial centroids and implement the modified algorithm on the Hadoop platform. Experiments show that the improved k-means algorithm converges faster than classic k-means and that its average execution time is lower.

Keywords:

clustering; Hadoop; k-means; MapReduce

Author affiliations:

  • [ 1 ] [Zhao, Qiang]Beijing Univ Technol, Sch Software Engn, 34 100 Pingyuan, Beijing, Peoples R China
  • [ 2 ] [Shi, Yuliang]Beijing Univ Technol, Sch Software Engn, 34 100 Pingyuan, Beijing, Peoples R China
  • [ 3 ] [Qing, Zepeng]Beijing Univ Technol, Sch Software Engn, 34 100 Pingyuan, Beijing, Peoples R China

Corresponding author:

  • [Shi, Yuliang]Beijing Univ Technol, Sch Software Engn, 34 100 Pingyuan, Beijing, Peoples R China

Source:

FOURTH INTERNATIONAL WORKSHOP ON PATTERN RECOGNITION

ISSN: 0277-786X

Year: 2019

Volume: 11198

Language: English

Citation counts:

Web of Science Core Collection citations: 6

Scopus citations: 7

ESI Highly Cited Papers listings: 0

Wanfang citations:

Chinese-language citations:

Views in the last 30 days: 1