One Self-Adaptive Memory Scheduling Algorithm for the Shuffle Process in Spark Platform

Authors:

Xu, Jungang | Huang, Shanshan | Liu, Renfeng | Li, Pengfei (李鹏飞)

Indexed in:

CPCI-S

Abstract:

The Shuffle module is one of the core modules of the Spark platform, and its performance directly influences the performance and throughput of the whole platform. The existing memory scheduling algorithm for the Shuffle process simply allocates memory equally according to the number of tasks, without considering the different memory requirements of different tasks, which lowers memory utilization and running efficiency when data is skewed. To solve this problem, this paper proposes a self-adaptive memory scheduling algorithm for the Shuffle process (SAMSAS), which does not require task processing priorities to be set in advance. Instead, it adjusts memory allocation self-adaptively by constantly monitoring and learning the actual memory requirements of task execution. The experimental results show that the SAMSAS algorithm can improve the utilization rate of the entire memory pool and the running efficiency of each task, and in particular it can effectively improve the running efficiency of the Spark platform when processing skewed data.
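The contrast the abstract draws between equal allocation and demand-aware allocation can be illustrated with a minimal sketch. This is not the SAMSAS algorithm itself, whose details the abstract does not give. It assumes per-task memory demands are already known (in SAMSAS they would be learned by monitoring), and it uses a simple max-min fair split as a stand-in for the adaptive policy. The function names, the pool size, and the workload figures are all hypothetical.

```python
def equal_share(pool_mb, demands_mb):
    """Cap every task at pool / number_of_tasks, regardless of what it needs."""
    cap = pool_mb / len(demands_mb)
    return [min(d, cap) for d in demands_mb]


def demand_aware(pool_mb, demands_mb):
    """Max-min fair split: memory a small task does not need goes to larger ones.

    Stand-in policy for illustration only, not the paper's algorithm.
    """
    grants = [0.0] * len(demands_mb)
    remaining = float(pool_mb)
    # Visit tasks from smallest to largest demand.
    order = sorted(range(len(demands_mb)), key=lambda i: demands_mb[i])
    for pos, i in enumerate(order):
        fair = remaining / (len(order) - pos)  # equal share of what is left
        grants[i] = min(demands_mb[i], fair)
        remaining -= grants[i]
    return grants


def utilization(pool_mb, grants):
    """Fraction of the memory pool that is actually handed out."""
    return sum(grants) / pool_mb


# Hypothetical skewed workload: three small tasks and one large one (MB).
demands = [100, 100, 100, 900]
pool = 1000
print("equal share :", equal_share(pool, demands), utilization(pool, equal_share(pool, demands)))
print("demand aware:", demand_aware(pool, demands), utilization(pool, demand_aware(pool, demands)))
```

With this skewed input, the equal split leaves part of the pool idle while the large task is capped, whereas the demand-aware split hands the unused share to the large task. That is the kind of utilization gap the abstract attributes to skewed data.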

Author affiliations:

[ 1 ] [Xu, Jungang]Univ Chinese Acad Sci, Beijing, Peoples R China
[ 2 ] [Liu, Renfeng]Univ Chinese Acad Sci, Beijing, Peoples R China
[ 3 ] [Li, Pengfei]Univ Chinese Acad Sci, Beijing, Peoples R China
[ 4 ] [Huang, Shanshan]Beijing Univ Technol, Beijing, Peoples R China

Corresponding author:

[Xu, Jungang]Univ Chinese Acad Sci, Beijing, Peoples R China

Email addresses:

xujg@ucas.ac.cn |
huangss118@emails.bjut.edu.cn |
liurenfeng16@mails.ucas.ac.cn |
lipengfei@ucas.ac.cn
