A Task Scheduling Method Based on a PUSH Mechanism

Source journal: Journal of Central South University (Science and Technology), 2016, No. 7

Authors: ZHANG Xiaohong, SUN Jiangfeng, ZHAO Wentao

Pages: 2334-2341

Key words: data locality; performance optimization; task scheduling; MapReduce

Abstract: To reduce the data access latency of tasks and thereby improve system performance in Hadoop MapReduce, a task scheduling method based on a PUSH mechanism is proposed. Based on the distribution of the input data, the method proactively pushes each task to a node that stores its input data. A task executing on such a node reads its input directly from the local disk and thus avoids remote data access latency. The method was implemented in hadoop-0.20.2 and evaluated on a real cluster. The results show that, compared with the original scheduling approach, the method reduces job execution time by 8% on average and by 14.3% in the best case.
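
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of pushing tasks toward the nodes that hold their input data. The function name `push_schedule`, the dictionary-based inputs, and the greedy "fewest replicas first" ordering are all assumptions made for illustration; they are not taken from the paper or from the Hadoop API.

```python
def push_schedule(task_replicas, free_slots):
    """Greedy, locality-first assignment (illustrative sketch only).

    task_replicas: dict mapping a task id to the list of nodes that
                   store a replica of the task's input split.
    free_slots:    dict mapping a node name to its number of free task slots.

    Returns (assignment, deferred), where assignment maps task -> node and
    deferred lists the tasks that found no free slot on a data-local node.
    """
    free = dict(free_slots)  # work on a copy so the caller's dict is untouched
    assignment = {}
    deferred = []

    # Assumption: handle tasks with the fewest replica locations first,
    # since they have the fewest chances of running data-locally.
    ordered = sorted(task_replicas, key=lambda t: (len(task_replicas[t]), t))

    for task in ordered:
        candidates = [n for n in task_replicas[task] if free.get(n, 0) > 0]
        if not candidates:
            deferred.append(task)  # no local slot; left to a fallback policy
            continue
        # Push the task to the local node with the most free slots;
        # ties are broken by node name to keep the result deterministic.
        target = min(candidates, key=lambda n: (-free[n], n))
        assignment[task] = target
        free[target] -= 1

    return assignment, deferred
```

For example, with `task_replicas = {"t1": ["n1", "n2"], "t2": ["n1"], "t3": ["n3"]}` and `free_slots = {"n1": 1, "n2": 1, "n3": 0}`, the sketch places `t2` on `n1` and `t1` on `n2`, and defers `t3` because its only data-local node has no free slot. What happens to deferred tasks (waiting, or running remotely) is a policy decision that the abstract does not describe.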
