The effect of under-estimated job lengths (for parallel applications) on backfill-based scheduling is ignored in the current literature because users want to avoid having their jobs killed when the requested time expires, and therefore prefer to over-estimate the length of their jobs. This paper shows the impact of under-estimated job lengths on execution performance in a system based on EASY-backfill scheduling. We have developed a batch job scheduler for Linux clusters that implements an enhanced EASY-backfilling algorithm in which a job with an under-estimated execution time is not killed unless it would delay other jobs. We evaluated performance by scheduling static workloads of well-known MPI parallel applications on a real cluster. Our results show that most jobs do not have to be aborted even though their lengths are under-estimated, while the slowdown of jobs and the throughput of the system are only slightly degraded.
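The kill rule described in the abstract (a job that outruns its requested time is aborted only if it would delay other jobs) can be illustrated with a minimal Python sketch. This is not the scheduler's actual implementation: the names `Reservation` and `must_kill`, and the simplification to a single reservation held by the first waiting job, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reservation:
    """Hypothetical reservation held by the first waiting job."""
    start_time: float   # time at which the waiting job is due to start
    nodes_needed: int   # number of nodes the waiting job requires


def must_kill(now: float,
              free_nodes: int,
              reservation: Optional[Reservation]) -> bool:
    """Decide whether a job that has exceeded its requested time must be aborted.

    Assumed rule: the overrunning job keeps running unless a reserved
    job is due to start and cannot get enough free nodes.
    """
    if reservation is None:
        return False  # nobody is waiting, so nobody can be delayed
    if now < reservation.start_time:
        return False  # the reservation is not due yet
    # The reservation is due: abort only if the waiting job cannot
    # start on the nodes that are currently free.
    return free_nodes < reservation.nodes_needed
```

A scheduler loop built on this sketch would call `must_kill` for each overrunning job at every scheduling event, so that a job survives for as long as its extra run time costs no waiting job its reserved start.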
Event
Euromicro Conference on Parallel, Distributed and Network-based Processing (16th : 2008 : Toulouse, France)