Abstract:
With the arrival of the big data era, the distributed computing framework Hadoop has become the mainstream solution for processing big data. Practitioners usually improve distributed computing performance by adding new computing nodes to the cluster. As the cluster grows, however, it consumes a large amount of power when no reasonable management strategy is in place. How to make full use of a cluster's computing resources, improving overall system performance while reducing power consumption, has therefore become a major research direction in both academia and industry. To this end, this paper first proposes tuning the configuration parameters that Hadoop provides; compared with Hadoop's default configuration, the tuned configuration yields better performance. The paper then proposes a task scheduling mechanism based on memory usage prediction. The mechanism predicts the future memory usage of each computing node by analyzing its past usage, and relieves memory pressure by allocating fewer tasks to a node that is under memory pressure. A configurable memory usage threshold makes the mechanism more flexible. By making fuller use of computing resources, this prediction-based mechanism can improve system performance. © Springer Nature Singapore Pte Ltd. 2018.
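The scheduling idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which predictor the paper uses or how many tasks it withholds, so everything here is an assumption made for illustration: the class name `MemoryAwareSlotAllocator`, the linear-trend predictor over a sliding window, the default threshold of 0.8, and the linear scale-down rule are all hypothetical and are not part of Hadoop's API or the paper's implementation.

```python
from collections import deque


class MemoryAwareSlotAllocator:
    """Hypothetical per-node helper; not a Hadoop class."""

    def __init__(self, max_slots, threshold=0.8, window=5):
        self.max_slots = max_slots            # slots offered when memory is not under pressure
        self.threshold = threshold            # usage fraction treated as "memory pressure"
        self.history = deque(maxlen=window)   # most recent usage samples, each in [0, 1]

    def record(self, usage):
        """Store one observed memory usage sample (fraction of node memory in use)."""
        self.history.append(usage)

    def predict(self):
        """Assumed predictor: last sample plus the average change across the window."""
        if not self.history:
            return 0.0
        samples = list(self.history)
        if len(samples) == 1:
            return samples[0]
        trend = (samples[-1] - samples[0]) / (len(samples) - 1)
        # Clamp to a valid usage fraction.
        return min(1.0, max(0.0, samples[-1] + trend))

    def slots_to_offer(self):
        """Offer fewer task slots when predicted usage exceeds the threshold."""
        predicted = self.predict()
        if predicted <= self.threshold:
            return self.max_slots
        # Assumed rule: scale slots down linearly between the threshold and 100% usage.
        overshoot = (predicted - self.threshold) / (1.0 - self.threshold)
        return max(0, int(self.max_slots * (1.0 - overshoot)))


if __name__ == "__main__":
    node = MemoryAwareSlotAllocator(max_slots=4)
    for usage in (0.30, 0.35, 0.40):
        node.record(usage)
    print("predicted usage:", node.predict(), "slots:", node.slots_to_offer())
```

Raising `threshold` makes the node accept tasks longer before throttling, while lowering it throttles earlier; this mirrors the flexibility the abstract attributes to the memory usage threshold.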