Experience report: anomaly detection of cloud application operations using log and cloud metric correlation analysis

Farshchi, Mostafa, Schneider, Jean-Guy, Weber, Ingo and Grundy, John 2015, Experience report: anomaly detection of cloud application operations using log and cloud metric correlation analysis, in ISSRE 2015 : Proceedings of the IEEE Software Reliability Engineering 2015 International Symposium, IEEE, Piscataway, N.J., pp. 24-34, doi: 10.1109/ISSRE.2015.7381796.

Title Experience report: anomaly detection of cloud application operations using log and cloud metric correlation analysis
Author(s) Farshchi, Mostafa
Schneider, Jean-Guy
Weber, Ingo
Grundy, John (ORCID iD: orcid.org/0000-0003-4928-7076)
Conference name IEEE Software Reliability Engineering. International Symposium (26th : 2015 : Gaithersburg, Md.)
Conference location Gaithersburg, Md.
Conference dates 2-5 Nov. 2015
Title of proceedings ISSRE 2015 : Proceedings of the IEEE Software Reliability Engineering 2015 International Symposium
Editor(s) [Unknown]
Publication date 2015
Conference series IEEE Software Reliability Engineering International Symposium
Start page 24
End page 34
Total pages 11
Publisher IEEE
Place of publication Piscataway, N.J.
Keyword(s) Cloud application operations
DevOps
Cloud monitoring
Anomaly detection
Error detection
Log analysis
Summary Failure of application operations is one of the main causes of system-wide outages in cloud environments. This particularly applies to DevOps operations, such as backup, redeployment, upgrade, customized scaling, and migration that are exposed to frequent interference from other concurrent operations, configuration changes, and resources failure. However, current practices fail to provide a reliable assurance of correct execution of these kinds of operations. In this paper, we present an approach to address this problem that adopts a regression-based analysis technique to find the correlation between an operation’s activity logs and the operation activity’s effect on cloud resources. The correlation model is then used to derive assertion specifications, which can be used for runtime verification of running operations and their impact on resources. We evaluated our proposed approach on Amazon EC2 with 22 rounds of rolling upgrade operations while other types of operations were running and random faults were injected. Our experiment shows that our approach successfully managed to raise alarms for 115 random injected faults, with a precision of 92.3%.
ISBN 9781509004058
Language eng
DOI 10.1109/ISSRE.2015.7381796
Field of Research 080309 Software Engineering
Socio Economic Objective 890201 Application Software Packages (excl. Computer Games)
HERDC Research category E1.1 Full written paper - refereed
ERA Research output type E Conference publication
Copyright notice ©2015, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30082725
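
Illustrative sketch (not taken from the paper): the summary describes fitting a regression between an operation's log activity and its effect on cloud resources, then deriving assertions for runtime checks. The snippet below shows one way that idea could look in miniature. The single-variable least-squares model, the function names `fit_line` and `make_assertion`, the tolerance value, and the sample data are all assumptions made for illustration; this record does not state the paper's actual model, metrics, or thresholds.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return a, b


def make_assertion(a, b, tolerance):
    """Build a check: is the observed metric within tolerance of the prediction?"""
    def check(log_event_count, observed_metric):
        expected = a + b * log_event_count
        return abs(observed_metric - expected) <= tolerance
    return check


# Hypothetical training data: for each time window, the number of
# "instance terminated" log events and the observed drop in running instances.
log_counts = [0, 1, 2, 3, 4]
metric_changes = [0.0, 1.0, 2.0, 3.0, 4.0]

a, b = fit_line(log_counts, metric_changes)
check = make_assertion(a, b, tolerance=0.5)  # tolerance is an assumed value

# At runtime, a failed check would be treated as a possible anomaly.
print(check(2, 2.0))  # observed effect matches what the logs imply
print(check(2, 4.0))  # observed effect deviates from what the logs imply
```

In this toy setup a failed check plays the role of an alarm: the resource metric changed by more (or less) than the logged activity accounts for.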

Document type: Conference Paper
Collections: School of Information Technology
2018 ERA Submission
 
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Cited 6 times in TR Web of Science; cited 13 times in Scopus
Access Statistics: 200 Abstract Views, 4 File Downloads
Created: Fri, 15 Jul 2016, 17:30:07 EST

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.