Predicting Failures of Autoscaling Distributed Applications (FSE 2024 - Research Papers)

Mon 15 - Fri 19 July 2024 Porto de Galinhas, Brazil

Who

Giovanni Denaro, Noura El Moussa, Rahim Heydarov, Francesco Lomio, Mauro Pezze, Ketai Qiu

Track

FSE 2024 Research Papers

When

Fri 19 Jul 2024 11:54 - 12:12 at Pitomba - AI4SE 4 Chair(s): Wesley Assunção

Abstract

Predicting failures in production environments allows service providers to activate countermeasures before the failures harm the users of the applications. The most successful current approaches predict failures from error states, which they identify as anomalies in time series of fixed sets of KPI values collected at runtime. They cannot handle time series of KPI sets whose size varies over time. These approaches therefore work with applications that run on statically configured sets of components and computational nodes, and do not scale up to the many popular cloud applications that exploit autoscaling.

This paper proposes PREFACE, a novel approach to predict failures in cloud applications that exploit autoscaling. PREFACE augments the neural-network-based failure predictors that have been used successfully on statically configured applications with a Rectifier layer. The Rectifier handles KPI sets of highly variable size, such as those collected in autoscaling cloud applications, and reduces them to a fixed-size set of rectified-KPIs that can be fed to the neural-network predictor. It computes the rectified-KPIs as descriptive statistics of the original KPIs, for each logical component of the target application. The descriptive statistics shrink the highly variable sets of KPIs collected at different timestamps to a fixed set of values compatible with the input nodes of the neural-network failure predictor. The neural network can then reveal anomalies that correspond to error states before they propagate to failures that harm the users of the applications. Experiments on both a commercial application and a widely used academic exemplar confirm that PREFACE can predict many harmful failures early enough to activate proper countermeasures.
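
The reduction step described in the abstract can be illustrated with a minimal sketch. It is not the PREFACE implementation: the function name `rectify`, the snapshot layout, the choice of statistics (min, max, mean, population standard deviation), and the zero fill for components with no replicas are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how per-component descriptive statistics turn a KPI set whose size depends on the number of replicas into a vector of constant length.

```python
from statistics import mean, pstdev

# Hypothetical choice of descriptive statistics; the abstract does not
# say which statistics the PREFACE Rectifier actually computes.
STATS = {
    "min": min,
    "max": max,
    "mean": mean,
    "std": pstdev,
}


def rectify(snapshot, components, kpi_names):
    """Reduce a variable-size KPI snapshot to a fixed-size vector.

    snapshot   -- dict: logical component name -> list of per-replica
                  dicts mapping KPI name -> value (assumed layout)
    components -- fixed, ordered list of logical component names
    kpi_names  -- fixed, ordered list of KPI names

    The result always has len(components) * len(kpi_names) * len(STATS)
    entries, regardless of how many replicas each component has.
    """
    vector = []
    for component in components:
        replicas = snapshot.get(component, [])
        for kpi in kpi_names:
            values = [r[kpi] for r in replicas if kpi in r]
            for stat in STATS.values():
                # Assumption: fill with 0.0 when no replica reports the KPI.
                vector.append(float(stat(values)) if values else 0.0)
    return vector


components = ["frontend", "cart"]
kpi_names = ["cpu", "mem"]

# Timestamp t0: one replica per component.
t0 = {
    "frontend": [{"cpu": 0.2, "mem": 0.5}],
    "cart": [{"cpu": 0.1, "mem": 0.3}],
}
# Timestamp t1: the autoscaler has added two frontend replicas.
t1 = {
    "frontend": [
        {"cpu": 0.2, "mem": 0.5},
        {"cpu": 0.6, "mem": 0.7},
        {"cpu": 0.4, "mem": 0.6},
    ],
    "cart": [{"cpu": 0.1, "mem": 0.3}],
}

v0 = rectify(t0, components, kpi_names)
v1 = rectify(t1, components, kpi_names)
print(len(v0), len(v1))
```

Both snapshots yield vectors of the same length even though the second one contains three times as many frontend KPI values, which is the property a fixed-input neural-network predictor needs.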

Link to Preprint

https://star.inf.usi.ch/media/papers/2024-fse-ketai-preface.pdf

DOI

https://doi.org/10.1145/3660794

Giovanni Denaro

University of Milano - Bicocca

Noura El Moussa

USI Università della Svizzera Italiana & SIT Schaffhausen Institute of Technology

Switzerland

Rahim Heydarov

USI Università della Svizzera Italiana

Francesco Lomio

SIT Schaffhausen Institute of Technology

Mauro Pezze

USI Università della Svizzera Italiana & SIT Schaffhausen Institute of Technology

Switzerland

Ketai Qiu

USI Università della Svizzera Italiana

Switzerland

Session Program

Fri 19 Jul
Displayed time zone: (GMT-03:00) Brasilia, Distrito Federal, Brazil

11:00 - 12:30	AI4SE 4 Research Papers at Pitomba Chair(s): Wesley Assunção North Carolina State University

11:00 18m Talk		Improving the Learning of Code Review Successive Tasks with Cross-Task Knowledge Distillation Research Papers Oussama Ben Sghaier DIRO, Université de Montréal, Houari Sahraoui DIRO, Université de Montréal
11:18 18m Talk		Learning to Detect and Localize Multilingual Bugs Research Papers Haoran Yang Washington State University, Yu Nong Washington State University, Tao Zhang Macau University of Science and Technology, Xiapu Luo The Hong Kong Polytechnic University, Haipeng Cai Washington State University
11:36 18m Talk		Mining Action Rules for Defect Reduction Planning Research Papers Khouloud Oueslati Polytechnique Montréal, Canada, Gabriel Laberge Polytechnique Montréal, Canada, Maxime Lamothe Polytechnique Montréal, Foutse Khomh Polytechnique Montréal
11:54 18m Talk		Predicting Failures of Autoscaling Distributed Applications Research Papers Giovanni Denaro University of Milano - Bicocca, Noura El Moussa USI Università della Svizzera Italiana & SIT Schaffhausen Institute of Technology, Rahim Heydarov USI Università della Svizzera Italiana, Francesco Lomio SIT Schaffhausen Institute of Technology, Mauro Pezze USI Università della Svizzera Italiana & SIT Schaffhausen Institute of Technology, Ketai Qiu USI Università della Svizzera Italiana
12:12 18m Talk		RavenBuild: Context, Relevance, and Dependency Aware Build Outcome Prediction Research Papers Gengyi Sun University of Waterloo, Sarra Habchi Ubisoft Montréal, Shane McIntosh University of Waterloo