Performance evaluation and analysis of multiple scenarios of big data stream computing on storm platform
journal contribution
posted on 2018-07-31, 00:00 authored by D Sun, H Yan, Shang Gao, Z Zhou
© 2018 KSII. In the big data era, fresh data grows rapidly every day. More than 30,000 gigabytes of data are created every second, and the rate is accelerating. Many organizations rely heavily on real-time streaming, and big data stream computing helps them spot opportunities and risks in real-time big data. Storm, one of the most common online stream computing platforms, has been used for big data stream computing, with response times ranging from milliseconds to sub-seconds. The performance of Storm plays a crucial role in different application scenarios; however, few studies have evaluated it. In this paper, we investigate the performance of Storm under different application scenarios. Our experimental results show that the throughput and latency of Storm are greatly affected by the number of instances of each vertex in the task topology and by the amount of resources available in the data center. The fault-tolerant mechanism of Storm works well in most big data stream computing environments. As a result, we suggest that a dynamic topology, an elastic scheduling framework, and a memory-based fault-tolerant mechanism are necessary for providing high-throughput, low-latency services on the Storm platform.
Journal
KSII transactions on internet and information systems
Volume
12
Pagination
2977-2997
Location
Seoul, South Korea
Publisher DOI
Open access
- Yes
ISSN
1976-7277
eISSN
2288-1468
Language
eng
Publication classification
C1 Refereed article in a scholarly journal
Copyright notice
2018, KIIS
Issue
7
Publisher
Korean Society for Internet Information