Deakin University
Browse

Efficient and low-delay task scheduling for big data clusters in a theoretical perspective

Version 2 2024-06-05, 05:27
Version 1 2017-04-21, 10:55
conference contribution
posted on 2024-06-05, 05:27 authored by Y Gao, H Yu, S Luo, S Yu
In big data clusters, task dispatchers assign arriving tasks to one of many workers (servers) for load balancing. Workers schedule task executions for rapidly completing queueing tasks. Both dispatchers and workers are important for optimizing task/job-completion-time (TCT/JCT). Current dispatchers probe loads on workers before assigning every task/job, which incurs expensive message overheads and significant delays. Besides, they use simple First-In-First-Out (FIFO) scheduling on workers, which further harms their TCT/JCT performance due to head-of-line blocking. In our TASCO scheduler, workers report their loads to dispatchers so that dispatchers avoid to probe them, which significantly reduces expensive overheads and delays on current dispatchers. Motivated by recent observations that more than 60% tasks in big data clusters are recurring with predictable task service time, we also use delay- optimal smallest-task-first (STF) scheduling to improve current simple FIFO scheduling on workers. We also derive the average TCT of TASCO based on its equivalence to an M/G/1/STF queue and the insight that workers reporting loads to dispatchers follows a Poisson process in the large- system limit. Our theories and simulation results demonstrate that the average TCT/JCT of TASCO outperforms state-of-art schedulers from 5.3% to 55.9%.

History

Pagination

2242-2247

Location

Washington, D.C.

Start date

2016-12-04

End date

2016-12-08

ISBN-13

9781509013289

Language

eng

Publication classification

E Conference publication, E1.1 Full written paper - refereed

Copyright notice

2016, IEEE

Title of proceedings

GLOBECOM 2016: Proceedings of the 59th IEEE Global Communications Conference

Event

IEEE Global Communications Conference (59th : 2016 : Washington, D.C.)

Publisher

IEEE

Place of publication

Piscataway, N.J.

Usage metrics

    Research Publications

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC