Deakin University

Rethinking elastic online scheduling of big data streaming applications over high-velocity continuous data streams

journal contribution
posted on 2018-02-01, 00:00 authored by D Sun, H Yan, Shang Gao, X Liu, R Buyya
Online scheduling plays a key role for big data streaming applications in a big data stream computing environment, as the arrival rate of a high-velocity continuous data stream may fluctuate over time. In this paper, an elastic online scheduling framework for big data streaming applications (E-Stream) is proposed, exhibiting the following features. (1) It profiles the mathematical relationships between system response time, fairness among multiple applications, and the online features of high-velocity continuous streams. (2) It scales a data stream graph out or in by quantifying computation and communication cost and the vertex semantics for the arrival rate of the data stream, and adjusts the degree of parallelism of vertices in the graph. Subgraphs are further constructed to minimize the data dependencies among them. (3) It elastically schedules a graph by a priority-based earliest-finish-time-first online scheduling strategy, and schedules multiple graphs by a max–min fairness strategy. (4) It is evaluated against the objectives of low system response time and acceptable application fairness in a real-world big data stream computing environment. Experimental results demonstrate that the proposed E-Stream provides better system response time and application fairness than the existing Storm framework.
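The abstract names a max–min fairness strategy for scheduling multiple graphs but does not spell it out. As a rough illustration of the general idea only, the following Python sketch computes a textbook max–min fair split of a single shared resource among competing applications. It is not the E-Stream algorithm: the function name `max_min_fair_share`, the single scalar `capacity`, and the per-application `demands` are all assumptions made for this example.

```python
def max_min_fair_share(capacity, demands):
    """Textbook max-min fair allocation of one shared resource.

    capacity: total amount of the resource (hypothetical unit, e.g. CPU slots).
    demands:  dict mapping an application name to the amount it requests.
    Returns a dict mapping each application name to its granted amount.

    Illustrative assumption only; not taken from the E-Stream paper.
    """
    allocation = {}
    remaining = float(capacity)
    # Serve the smallest demands first so that any capacity they leave
    # unused is redistributed among the larger demands.
    pending = sorted(demands.items(), key=lambda item: item[1])
    for index, (name, demand) in enumerate(pending):
        # Equal split of what is left among applications not yet served.
        fair_share = remaining / (len(pending) - index)
        granted = min(demand, fair_share)
        allocation[name] = granted
        remaining -= granted
    return allocation


if __name__ == "__main__":
    shares = max_min_fair_share(10, {"a": 2, "b": 2.6, "c": 4, "d": 5})
    for name in sorted(shares):
        print(name, round(shares[name], 3))
```

In this sketch, applications that ask for less than an equal share receive their full demand, and the remainder is divided evenly among the applications whose demand cannot be met in full.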

History

Journal

Journal of Supercomputing

Volume

74

Pagination

615-636

Location

Berlin, Germany

ISSN

0920-8542

eISSN

1573-0484

Language

English

Publication classification

C Journal article, C1 Refereed article in a scholarly journal

Copyright notice

2017, Springer Science + Business Media

Issue

2

Publisher

Springer