Dynamic on-the-fly minimum cost benchmarking for storing generated scientific datasets in the cloud

Yuan, Dong, Liu, Xiao and Yang, Yun 2015, Dynamic on-the-fly minimum cost benchmarking for storing generated scientific datasets in the cloud, IEEE transactions on computers, vol. 64, no. 10, pp. 2781-2795, doi: 10.1109/TC.2015.2389801.


Title Dynamic on-the-fly minimum cost benchmarking for storing generated scientific datasets in the cloud
Author(s) Yuan, Dong
Liu, Xiao (ORCID iD: orcid.org/0000-0001-8400-5754)
Yang, Yun
Journal name IEEE transactions on computers
Volume number 64
Issue number 10
Start page 2781
End page 2795
Total pages 15
Publisher IEEE
Place of publication Piscataway, N.J.
Publication date 2015-01-09
ISSN 0018-9340
Keyword(s) cloud computing
minimum cost benchmarking
datasets storage and regeneration
scientific applications
Summary The massive computation power and storage capacity of cloud computing systems enable users either to store large generated scientific datasets in the cloud or to delete them and regenerate them whenever they are reused. Under the pay-as-you-go model, the more datasets we store, the more storage cost we pay. Alternatively, we can delete some generated datasets to save storage cost, but more computation cost is then incurred for regeneration whenever those datasets are reused. Hence, there is a trade-off between computation and storage in the cloud, where different storage strategies lead to different total costs. The minimum cost, which reflects the best trade-off, is an important benchmark for evaluating the cost-effectiveness of different storage strategies. However, the current benchmarking approach is neither efficient nor practical enough to be applied on the fly at runtime. In this paper, we propose a novel Partitioned Solution Space based approach with efficient algorithms for dynamic yet practical on-the-fly minimum cost benchmarking of storing generated datasets in the cloud. In this approach, we pre-calculate all the possible minimum cost storage strategies and save them in different partitioned solution spaces. The minimum cost storage strategy represents the minimum cost benchmark, and whenever the dataset storage cost changes at runtime in the cloud (e.g. new datasets are generated and/or existing datasets' usage frequencies change), our algorithms can efficiently retrieve the current minimum cost storage strategy from the partitioned solution space and update the benchmark. Because the benchmark is kept up to date dynamically, our approach can be used in practice on the fly at runtime in the cloud, so that the minimum cost benchmark can be either proactively reported or returned instantly upon request. Case studies and experimental results based on the Amazon cloud show the efficiency, scalability and practicality of our approach.
Language eng
DOI 10.1109/TC.2015.2389801
Field of Research 080303 Computer System Security
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category C1.1 Refereed article in a scholarly journal
ERA Research output type C Journal article
Copyright notice ©2015, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30082899
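
Illustrative note: the store-versus-regenerate trade-off described in the Summary can be made concrete with a small sketch. The code below is not the paper's Partitioned Solution Space algorithm; it is a brute-force enumeration over every storage strategy for a tiny example. It assumes a linear chain of datasets in which each dataset is generated from its predecessor, and the prices, sizes, generation times and usage frequencies are all hypothetical values chosen for illustration (they are not Amazon's prices and do not come from the paper).

```python
from itertools import product

# Hypothetical prices (assumptions for illustration only).
STORAGE_PRICE = 0.10  # dollars per GB per month
COMPUTE_PRICE = 0.50  # dollars per compute hour

# Hypothetical linear chain: each dataset is generated from the one before it.
# Each entry is (size_gb, gen_hours_from_predecessor, uses_per_month).
CHAIN = [
    (100.0, 10.0, 1.0),
    (5.0, 2.0, 4.0),
    (500.0, 1.0, 0.5),
]


def cost_rate(chain, stored):
    """Monthly cost of one storage strategy (stored[i] is True if dataset i is kept)."""
    total = 0.0
    regen_hours = 0.0  # compute hours needed since the nearest stored predecessor
    for (size_gb, gen_hours, uses), keep in zip(chain, stored):
        if keep:
            # A stored dataset costs storage only and resets the regeneration path.
            total += size_gb * STORAGE_PRICE
            regen_hours = 0.0
        else:
            # A deleted dataset is regenerated on every use, starting from the
            # nearest stored predecessor (or from the raw input for the first one).
            regen_hours += gen_hours
            total += uses * regen_hours * COMPUTE_PRICE
    return total


def min_cost_benchmark(chain):
    """Brute force over all 2^n strategies; only feasible for very small n."""
    strategies = product([False, True], repeat=len(chain))
    best = min(strategies, key=lambda s: cost_rate(chain, s))
    return best, cost_rate(chain, best)


if __name__ == "__main__":
    print(min_cost_benchmark(CHAIN))

    # A runtime change: the first dataset is now used 10 times per month,
    # so the minimum cost strategy (the benchmark) has to be recomputed.
    changed = [(100.0, 10.0, 10.0)] + CHAIN[1:]
    print(min_cost_benchmark(changed))
```

With these made-up numbers, the cheapest strategy initially stores only the small, frequently used middle dataset; once the first dataset's usage frequency rises, storing it as well becomes cheaper. Exhaustive search grows exponentially with the number of datasets, which is one way to see why an approach that can update the benchmark efficiently at runtime is of interest.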

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Cited 2 times in TR Web of Science
Cited 8 times in Scopus
Access Statistics: 240 Abstract Views, 2 File Downloads
Created: Tue, 19 Jul 2016, 11:48:56 EST
