Openly accessible

Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems

Zhao, Laiping, Ren, Yizhi, Xiang, Yang and Sakurai, Kouichi 2010, Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems, in HPCC 2010 : Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, IEEE, Piscataway, N.J., pp. 434-441.

Attached Files
Name Description MIMEType Size Downloads
xiang-faulttolerant-2010.pdf Published version application/pdf 1.26MB 7

Title Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems
Author(s) Zhao, Laiping
Ren, Yizhi
Xiang, Yang
Sakurai, Kouichi
Conference name IEEE International Conference on High Performance Computing and Communications (12th : 2010 : Melbourne, Vic.)
Conference location Melbourne, Vic.
Conference dates 1-3 Sep. 2010
Title of proceedings HPCC 2010 : Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications
Editor(s) [Unknown]
Publication date 2010
Conference series International Conference on High Performance Computing and Communications
Start page 434
End page 441
Total pages 8
Publisher IEEE
Place of publication Piscataway, N.J.
Keyword(s) resource scheduling
fault-tolerance
reliability
heterogeneous system
Summary

In existing studies on fault-tolerant scheduling, the active replication scheme uses ε + 1 replicas of each task to tolerate ε failures. In this paper, however, we show that more replicas do not always lead to higher reliability. Moreover, more replicas imply more resource consumption and higher economic cost. To address this problem, and with the goal of satisfying the user’s reliability requirement with minimum resources, this paper proposes a new fault-tolerant scheduling algorithm, MaxRe. The algorithm incorporates reliability analysis into the active replication scheme, and theoretical analysis and experiments confirm that a MaxRe schedule satisfies the user’s reliability requirements. MaxRe achieves the corresponding reliability with up to 70% fewer resources than the FTSA algorithm.
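The idea of choosing the number of replicas from a reliability target, rather than fixing it at ε + 1, can be illustrated with a minimal sketch. This is not the MaxRe algorithm from the paper. It assumes a deliberately simplified model in which each processor has a known, independent probability of completing the task, and a task succeeds if at least one replica succeeds. The function names `task_reliability` and `min_replicas` and the example numbers are hypothetical. The model ignores task dependencies and communication, so it does not reproduce the paper's observation that extra replicas can fail to raise reliability.

```python
from typing import List, Sequence


def task_reliability(replica_reliabilities: Sequence[float]) -> float:
    """Probability that at least one replica succeeds.

    Assumption: replicas fail independently of one another.
    """
    all_fail = 1.0
    for r in replica_reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


def min_replicas(processor_reliabilities: Sequence[float], target: float) -> List[int]:
    """Return indices of the fewest processors whose combined reliability
    meets `target`, picking the most reliable processors first.

    Raises ValueError if even using every processor falls short.
    """
    # Most reliable processors first; under the independence assumption,
    # the top-k processors give the highest reliability for any k.
    order = sorted(
        range(len(processor_reliabilities)),
        key=lambda i: processor_reliabilities[i],
        reverse=True,
    )
    chosen: List[int] = []
    for i in order:
        chosen.append(i)
        achieved = task_reliability([processor_reliabilities[j] for j in chosen])
        if achieved >= target:
            return chosen
    raise ValueError("target reliability cannot be met with the available processors")


if __name__ == "__main__":
    # Hypothetical per-processor reliabilities for one task.
    reliabilities = [0.90, 0.99, 0.80, 0.95]
    print(min_replicas(reliabilities, target=0.999))
```

With these made-up numbers, two replicas are enough to reach a 0.999 target, whereas a fixed ε + 1 policy with ε = 2 would always allocate three.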

ISBN 076954214X
9780769542140
Language eng
Field of Research 080503 Networking and Communications
Socio Economic Objective 890202 Application Tools and System Utilities
HERDC Research category E1 Full written paper - refereed
HERDC collection year 2010
Copyright notice ©2010, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30034376

Document type: Conference Paper
Collections: School of Information Technology
Open Access Collection
 
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.

Access Statistics: 156 Abstract Views, 13 File Downloads
Created: Mon, 18 Apr 2011, 15:09:02 EST by Sandra Dunoon
