Deakin University

File(s) under permanent embargo

Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems

conference contribution
posted on 2010-01-01, 00:00, authored by L Zhao, Y Ren, Yang Xiang, K Sakurai

Existing studies on fault-tolerant scheduling use an active replication schema that creates ε + 1 replicas of each task to tolerate ε failures. In this paper, however, we show that more replicas do not always lead to higher reliability. Moreover, more replicas imply more resource consumption and higher economic cost. To address this problem, and with the goal of satisfying the user’s reliability requirement with minimum resources, this paper proposes a new fault-tolerant scheduling algorithm: MaxRe. The algorithm incorporates reliability analysis into the active replication schema. Theoretical analysis and experiments prove that the schedule produced by MaxRe always satisfies the user’s reliability requirement, and that MaxRe achieves the corresponding reliability with up to 70% fewer resources than the FTSA algorithm.
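
The idea of sizing the replica set from a reliability target, rather than fixing it at ε + 1, can be illustrated with a minimal sketch. This is not the MaxRe algorithm from the paper. It assumes an exponential failure model (a replica with failure rate λ and execution time t succeeds with probability e^(-λt)), assumes replicas fail independently, and ignores communication and scheduling effects. The function names, processor names, and numbers are hypothetical.

```python
import math


def replica_reliability(failure_rate, exec_time):
    """Success probability of one replica (assumed exponential failure model)."""
    return math.exp(-failure_rate * exec_time)


def task_reliability(replica_reliabilities):
    """A replicated task succeeds if at least one replica succeeds
    (assumes independent replica failures)."""
    prob_all_fail = 1.0
    for r in replica_reliabilities:
        prob_all_fail *= 1.0 - r
    return 1.0 - prob_all_fail


def choose_replicas(candidates, target):
    """Greedily add the most reliable candidate until the target is met.

    candidates: list of (processor_name, failure_rate, exec_time) tuples.
    Returns (chosen processor names, achieved reliability).
    Raises ValueError if the target cannot be reached.
    """
    ranked = sorted(
        candidates,
        key=lambda c: replica_reliability(c[1], c[2]),
        reverse=True,
    )
    chosen, rels = [], []
    for name, rate, exec_time in ranked:
        chosen.append(name)
        rels.append(replica_reliability(rate, exec_time))
        achieved = task_reliability(rels)
        if achieved >= target:
            return chosen, achieved
    raise ValueError("target not reachable with the given processors")


# Hypothetical heterogeneous processors: (name, failure rate, execution time).
processors = [("p1", 1e-3, 100.0), ("p2", 5e-3, 100.0), ("p3", 1e-2, 100.0)]

for target in (0.90, 0.95, 0.97):
    names, achieved = choose_replicas(processors, target)
    print(target, names, round(achieved, 4))
```

Under these assumptions the number of replicas grows only as the requested reliability rises, so a modest requirement is met with a single replica instead of a fixed ε + 1.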

History

Event

IEEE International Conference on High Performance Computing and Communications (12th : 2010 : Melbourne, Vic.)

Pagination

434 - 441

Publisher

IEEE

Location

Melbourne, Vic.

Place of publication

Piscataway, N.J.

Start date

2010-09-01

End date

2010-09-03

ISBN-13

9780769542140

ISBN-10

076954214X

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2010, IEEE

Title of proceedings

HPCC 2010 : Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications
