One of the primary issues in using distributed computing efficiently and effectively is resource management and scheduling. Because resource failures are common in distributed systems, integrating scheduling with fault tolerance becomes of paramount importance. To this end, we propose a fault-tolerant dynamic scheduling policy that loosely couples dynamic job scheduling with a job replication scheme so that jobs are executed efficiently and reliably. The novelty of the proposed algorithm is that it uses a passive replication approach under high system load and an active replication approach under low system load. The switch between these two replication methods is performed dynamically and transparently. We present a performance evaluation of the proposed fault-tolerant scheduler and a comparison with a similar fault-tolerant scheduling policy; the results show that the proposed policy performs better than the existing approach.
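The load-dependent switch described in the abstract can be illustrated with a minimal sketch. This is not the chapter's algorithm: the abstract does not say how load is measured or where the switch occurs, so the load metric (fraction of busy nodes), the 0.7 threshold, and all names below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ReplicationMode(Enum):
    ACTIVE = "active"    # all replicas of a job run concurrently
    PASSIVE = "passive"  # one primary runs; a backup starts only after a failure


@dataclass
class Cluster:
    total_nodes: int  # assumed to be greater than zero
    busy_nodes: int

    @property
    def load(self) -> float:
        # Assumed load metric: fraction of nodes currently busy.
        return self.busy_nodes / self.total_nodes


# Hypothetical switch point; the abstract gives no concrete value.
LOAD_THRESHOLD = 0.7


def choose_mode(cluster: Cluster, threshold: float = LOAD_THRESHOLD) -> ReplicationMode:
    """Pick passive replication under high load, active under low load."""
    if cluster.load >= threshold:
        return ReplicationMode.PASSIVE
    return ReplicationMode.ACTIVE


def nodes_needed(mode: ReplicationMode, replicas: int) -> int:
    """Nodes a job occupies at dispatch time under the given mode."""
    return replicas if mode is ReplicationMode.ACTIVE else 1


if __name__ == "__main__":
    for busy in (2, 9):
        cluster = Cluster(total_nodes=10, busy_nodes=busy)
        mode = choose_mode(cluster)
        print(busy, mode.value, nodes_needed(mode, replicas=3))
```

Re-evaluating `choose_mode` at every scheduling decision is one simple way to make the switch dynamic: a lightly loaded system spends spare nodes on concurrent replicas, while a heavily loaded one keeps only the primary running.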
Presented at ICA3PP international workshops and symposiums. Zhangjiajie, China, November 18-20, 2015
Publication classification
B Book chapter, B1 Book chapter
Copyright notice
2015, Springer
Extent
77
Editor/Contributor(s)
Wang G, Zomaya A, Perez GM, Li K
Publisher
Springer International Publishing
Place of publication
Cham, Switzerland
Title of book
Algorithms and architectures for parallel processing : ICA3PP international workshops and symposiums, Zhangjiajie, China, November 18-20, 2015, proceedings