Massive computation power and storage capacity of cloud computing systems enable users to either store large generated scientific datasets in the cloud or delete and then regenerate them whenever reused. Due to the pay-as-you-go model, the more datasets we store, the more storage cost we need to pay, alternatively, we can delete some generated datasets to save the storage cost but more computation cost is incurred for regeneration whenever the datasets are reused. Hence, there should exist a trade-off between computation and storage in the cloud, where different storage strategies lead to different total costs. The minimum cost, which reflects the best trade-off, is an important benchmark for evaluating the cost-effectiveness of different storage strategies. However, the current benchmarking approach is neither efficient nor practical to be applied on the fly at runtime. In this paper, we propose a novel Partitioned Solution Space based approach with efficient algorithms for dynamic yet practical on-the-fly minimum cost benchmarking of storing generated datasets in the cloud. In this approach, we pre-calculate all the possible minimum cost storage strategies and save them in different partitioned solution spaces. The minimum cost storage strategy represents the minimum cost benchmark, and whenever the datasets storage cost changes at runtime in the cloud (e.g. new datasets are generated and/or existing datasets' usage frequencies are changed), our algorithms can efficiently retrieve the current minimum cost storage strategy from the partitioned solution space and update the benchmark. By dynamically keeping the benchmark updated, our approach can be practically utilised on the fly at runtime in the cloud, based on which the minimum cost benchmark can be either proactively reported or instantly responded upon request. Case studies and experimental results based on Amazon cloud show the efficiency, scalability and practicality of our approach.
Field of Research
080303 Computer System Security
Socio Economic Objective
970108 Expanding Knowledge in the Information and Computing Sciences
Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact email@example.com.