Journal article

Performance and Cost-Efficient Spark Job Scheduling Based on Deep Reinforcement Learning in Cloud Computing Environments

Muhammed Tawfiqul Islam, Shanika Karunasekera, Rajkumar Buyya

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | IEEE COMPUTER SOC | Published: 2022

Abstract

Big data frameworks such as Spark and Hadoop are widely adopted to run analytics jobs in both research and industry. The cloud offers affordable compute resources that are easier to manage. Hence, many organizations are shifting towards a cloud deployment of their big data computing clusters. However, job scheduling is a complex problem in the presence of various Service Level Agreement (SLA) objectives such as monetary cost reduction and job performance improvement. Most of the existing research does not address multiple objectives together and fails to capture the inherent cluster and workload characteristics. In this paper, we formulate the job scheduling problem of a cloud-deployed Spark clu..
