Abstract:
MapReduce is an effective programming model for analyzing large-scale data, and Hadoop, a distributed processing system, is widely used today. Improving task parallelism can be key to improving MapReduce performance in Hadoop. In this paper, we address the problem in two ways. First, we run tasks with dynamic configurations. Second, because tasktrackers differ from one another, we use a mathematical method to predict each tasktracker's CPU utilization and assign tasks accordingly. Experimental results for both approaches show that improving task parallelism improves performance in Hadoop.