Metric selection and anomaly detection for cloud operations using log and metric correlation analysis

Farshchi, M, Schneider, JG, Weber, I and Grundy, John 2018, Metric selection and anomaly detection for cloud operations using log and metric correlation analysis, Journal of Systems and Software, pp. 531-549, doi: 10.1016/j.jss.2017.03.012.

Title Metric selection and anomaly detection for cloud operations using log and metric correlation analysis
Author(s) Farshchi, M
Schneider, JG
Weber, I
Grundy, John
Journal name Journal of Systems and Software
Start page 531
End page 549
Total pages 19
Publisher Elsevier
Place of publication New York, N.Y.
Publication date 2018-03
ISSN 0164-1212
Keyword(s) cloud application operations
cloud monitoring
metric selection
anomaly detection
error detection
log analysis
Summary Cloud computing systems provide the facilities to make application services resilient against failures of individual computing resources. However, resiliency is typically limited by a cloud consumer's use and operation of cloud resources. In particular, system operations have been reported as one of the leading causes of system-wide outages. This applies specifically to DevOps operations, such as backup, redeployment, upgrade, customized scaling, and migration, which are executed at much higher frequencies now than a decade ago. We address this problem by proposing a novel approach to detect errors in the execution of these kinds of operations, in particular for rolling upgrade operations. Our regression-based approach leverages the correlation between operations' activity logs and the effect of operation activities on cloud resources. First, we present a metric selection approach based on regression analysis. Second, the output of a regression model of selected metrics is used to derive assertion specifications, which can be used for runtime verification of running operations. We have conducted a set of experiments with different configurations of an upgrade operation on Amazon Web Services, with and without randomly injected faults, to demonstrate the utility of our new approach.
Language eng
DOI 10.1016/j.jss.2017.03.012
Field of Research 080309 Software Engineering
0803 Computer Software
0806 Information Systems
Socio Economic Objective 890202 Application Tools and System Utilities
HERDC Research category C1 Refereed article in a scholarly journal
ERA Research output type C Journal article
Copyright notice ©2017, Elsevier

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Web of Science: cited 5 times
Scopus: cited 12 times
Access Statistics: 193 Abstract Views, 2 File Downloads
Created: Thu, 13 Apr 2017, 20:01:57 EST
