Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems
Zhao, Laiping, Ren, Yizhi, Xiang, Yang and Sakurai, Kouichi 2010, Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems, in HPCC 2010 : Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, IEEE, Piscataway, N.J., pp. 434-441.
(Some files may be inaccessible until you login with your DRO credentials)
In the existing studies on fault-tolerant scheduling, the active replication schema makes use of ε + 1 replicas for each task to tolerate ε failures. However, in this paper, we show that it does not always lead to a higher reliability with more replicas. Besides, the more replicas implies more resource consumption and higher economic cost. To address this problem, with the target to satisfy the user’s reliability requirement with minimum resources, this paper proposes a new fault tolerant scheduling algorithm: MaxRe. In the algorithm, we incorporate the reliability analysis into the active replication schema and the theoretical analysis and experiments prove that the MaxRe algorithm’s schedule can certainly satisfy user’s reliability requirements. And the MaxRe scheduling algorithm can achieve the corresponding reliability with at most 70% fewer resources than the FTSA algorithm.
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.
Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO.
If you believe that your rights have been infringed by this repository, please contact firstname.lastname@example.org.