Energy-aware scheduling for spark job based on deep reinforcement learning in cloud
Article written by: Li, Hongjian; Lu, Liang; Shi, Wenhu; Tan, Gangfan; Luo, Hao
Abstract: Big data frameworks such as Storm, Spark and Hadoop are widely deployed in commercial and research applications, and the energy consumption of the cloud data centers that support these big data processing platforms is becoming increasingly significant. Job scheduling, however, is a complex problem in the presence of various service level agreement (SLA) goals, such as cost reduction and job performance improvement. The highly heterogeneous nature of clusters and the variability of resource requirements across workloads make energy-efficient scheduling on big data platforms extremely complex under SLA constraints. Existing performance-based models and heuristic scheduling methods rely excessively on historical data and are difficult to optimize or adapt to changes in load and in the cluster. In this paper, we construct an energy consumption model based on resource utilization and a reinforcement learning model for energy-efficient scheduling under SLA constraints for Spark clusters, and design two Deep Reinforcement Learning (DRL) algorithms. The cluster scheduler designed and implemented on this model automatically captures different load characteristics and inherent cluster characteristics, finds an appropriate executor creation policy for resource allocation, and reduces cluster energy consumption under a constraint on job execution time. Experimental results show that the proposed DRL scheduler reduces energy consumption by up to about 33% under different load characteristics.
Language:
English
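
Illustration: the abstract mentions an energy consumption model based on resource utilization and a scheduler that reduces energy under a job execution time constraint, but does not give their form. The sketch below is one possible reading, not the authors' model. It assumes a linear relation between CPU utilization and node power, and a reward equal to negative energy with a fixed penalty when a deadline is missed. The names `Node`, `node_power`, `job_energy_joules`, `reward` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical per-node power figures in watts; not taken from the paper.
    idle_watts: float
    peak_watts: float


def node_power(node: Node, cpu_utilization: float) -> float:
    """Assumed linear model: power grows from idle to peak with utilization."""
    u = min(max(cpu_utilization, 0.0), 1.0)  # clamp utilization to [0, 1]
    return node.idle_watts + (node.peak_watts - node.idle_watts) * u


def job_energy_joules(node: Node, utilization_samples: list[float],
                      interval_s: float) -> float:
    """Sum power over fixed-length sampling intervals to estimate energy."""
    return sum(node_power(node, u) * interval_s for u in utilization_samples)


def reward(energy_joules: float, runtime_s: float, deadline_s: float,
           sla_penalty: float = 1e6) -> float:
    """Assumed reward: negative energy, minus a fixed penalty on a missed deadline."""
    penalty = sla_penalty if runtime_s > deadline_s else 0.0
    return -(energy_joules + penalty)
```

Under these assumptions, a reinforcement learning agent that chooses where to create executors would receive a higher reward for placements that keep total energy low while still finishing the job before its deadline.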