• Complex
  • Title
  • Keyword
  • Abstract
  • Scholars
  • Journal
  • ISSN
  • Conference
搜索

Author:

Liang, Yi (Liang, Yi.) | Zeng, Shaokang (Zeng, Shaokang.) | Xu, Xiaoxian (Xu, Xiaoxian.) | Chang, Shilu (Chang, Shilu.) | Su, Xing (Su, Xing.)

Indexed by:

EI Scopus SCIE

Abstract:

Spark is the most popular in-memory processing framework for big data analytics. Memory is the crucial resource for workloads to achieve performance acceleration on Spark. The extant memory capacity configuration approach in Spark is to statically configure the memory capacity for workloads based on user's specifications. However, without the deep knowledge of the workload's system-level characteristics, users in practice often conservatively overestimate the memory utilizations of their workloads and require resource manager to grant more memory share than that they actually need, which leads to the severe waste of memory resources. To address the above issue, SMConf, an automated memory capacity configuration solution for in-memory computing workloads in Spark is proposed. SMConf is designed based on the observation that, though there is not one-size -fit-all proper configuration, the one-size-fit-bunch configuration can be found for in-memory computing workloads. SMConf classifies typical Spark workloads into categories based on metrics across layers of Spark system stack. For each workload category, an individual memory requirement model is learned from the workload's input data size and the strong-correlated configuration parameters. For an ad-hoc workload, SMConf matches its memory requirement signature to one of the workload categories with small-sized input data and determines its proper memory capacity configuration with the corresponding memory requirement model. Experimental results demonstrate that, compared to the conservative default configuration, SMConf can reduce the memory resource provision to Spark workloads by up to 69% with the slight performance degradation, and reduce the average turnaround time of Spark workloads by up to 55% in the multi-tenant environments.

Keyword:

automated configuration memory capacity Spark

Author Community:

  • [ 1 ] [Liang, Yi]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 2 ] [Zeng, Shaokang]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 3 ] [Chang, Shilu]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 4 ] [Su, Xing]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 5 ] [Xu, Xiaoxian]Univ Zurich, Dept Informat, CH-8050 Zurich, Switzerland

Reprint Author's Address:

  • [Liang, Yi]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Show more details

Related Keywords:

Source :

CMC-COMPUTERS MATERIALS & CONTINUA

ISSN: 1546-2218

Year: 2021

Issue: 2

Volume: 66

Page: 1697-1717

3 . 1 0 0

JCR@2022

ESI HC Threshold:87

JCR Journal Grade:2

Cited Count:

WoS CC Cited Count: 0

SCOPUS Cited Count:

ESI Highly Cited Papers on the List: 0 Unfold All

WanFang Cited Count:

Chinese Cited Count:

30 Days PV: 2

Affiliated Colleges:

Online/Total:1101/5360450
Address:BJUT Library(100 Pingleyuan,Chaoyang District,Beijing 100124, China Post Code:100124) Contact Us:010-67392185
Copyright:BJUT Library Technical Support:Beijing Aegean Software Co., Ltd.