Incident management for large cloud services is a complex and tedious process that requires a significant amount of manual effort from on-call engineers (OCEs). OCEs typically leverage data from different stages of the software development lifecycle (SDLC), e.g., code, configuration, monitor data, service properties, service dependencies, and troubleshooting documents, to generate insights for detecting, root-causing, and mitigating incidents. Recent advances in large language models (LLMs), e.g., ChatGPT, GPT-4, and Gemini, have created opportunities to automatically generate contextual recommendations that help OCEs quickly identify and mitigate critical issues. However, existing research typically takes a siloed view, solving a single incident management task by leveraging data from a single stage of the SDLC. In this paper, we demonstrate that augmenting additional contextual data from different stages of the SDLC improves performance on two critically important and practically challenging tasks: (1) automatically generating root cause recommendations for dependency-failure-related incidents, and (2) identifying the ontology of service monitors used for automatically detecting incidents. Using datasets of 353 incidents and 260 monitors from Microsoft, we demonstrate that augmenting contextual information from different stages of the SDLC improves performance over state-of-the-art methods.
Thu 18 Jul | Displayed time zone: Brasilia, Distrito Federal, Brazil
16:00 - 18:00 | AI4SE 3 | Industry Papers / Demonstrations / Journal First at Pitomba | Chair(s): Maliheh Izadi (Delft University of Technology)
16:00 | 18m | Talk | Rethinking Software Engineering in the Era of Foundation Models | Industry Papers | Ahmed E. Hassan (Queen’s University), Dayi Lin (Centre for Software Excellence, Huawei Canada), Gopi Krishnan Rajbahadur (Centre for Software Excellence, Huawei, Canada), Keheliya Gallaba (Centre for Software Excellence, Huawei Canada), Filipe Cogo (Centre for Software Excellence, Huawei Canada), Boyuan Chen (Centre for Software Excellence, Huawei Canada), Haoxiang Zhang (Huawei), Kishanthan Thangarajah (Centre for Software Excellence, Huawei Canada), Gustavo Oliva (Centre for Software Excellence, Huawei Canada), Jiahuei (Justina) Lin (Centre for Software Excellence, Huawei Canada), Wali Mohammad Abdullah (Centre for Software Excellence, Huawei Canada), Zhen Ming (Jack) Jiang (York University)
16:18 | 18m | Talk | LM-PACE: Confidence Estimation by Large Language Models for Effective Root Causing of Cloud Incidents | Industry Papers | Shizhuo Zhang (University of Illinois Urbana-Champaign), Xuchao Zhang (Microsoft), Chetan Bansal (Microsoft Research), Pedro Las-Casas (Microsoft), Rodrigo Fonseca (Microsoft Research), Saravan Rajmohan (Microsoft)
16:36 | 18m | Talk | Application of Quantum Extreme Learning Machines for QoS Prediction of Elevators' Software in an Industrial Context | Industry Papers | Xinyi Wang (Simula Research Laboratory and University of Oslo), Shaukat Ali (Simula Research Laboratory and Oslo Metropolitan University), Aitor Arrieta (Mondragon University), Paolo Arcaini (National Institute of Informatics), Maite Arratibel (Orona)
16:54 | 18m | Talk | X-lifecycle Learning for Cloud Incident Management using LLMs | Industry Papers | Drishti Goel (Microsoft), Fiza Husain (Microsoft), Aditya Kumar Singh (Microsoft), Supriyo Ghosh (Microsoft), Anjaly Parayil (Microsoft), Chetan Bansal (Microsoft Research), Xuchao Zhang (Microsoft), Saravan Rajmohan (Microsoft)
17:12 | 18m | Talk | Neat: Mobile App Layout Similarity Comparison based on Graph Convolutional Networks | Industry Papers | Zhu Tao (ByteDance), Yongqiang Gao (ByteDance), Jiayi Qi (ByteDance), Chao Peng (ByteDance, China), Qinyun Wu (Bytedance Ltd.), Xiang Chen (ByteDance), Ping Yang (Bytedance Network Technology)
17:30 | 18m | Talk | Transformers and Meta-Tokenization in Sentiment Analysis for Software Engineering | Journal First | Nathan Cassee (Eindhoven University of Technology), Andrei Agaronian (Eindhoven University of Technology), Eleni Constantinou (University of Cyprus), Nicole Novielli (University of Bari), Alexander Serebrenik (Eindhoven University of Technology)
17:48 | 9m | Talk | EM-Assist: Safe automated ExtractMethod refactoring with LLMs | Demonstrations | Dorin Pomian (University of Colorado Boulder), Abhiram Bellur (University of Colorado Boulder), Malinda Dilhara (University of Colorado Boulder), Zarina Kurbatova (JetBrains Research), Egor Bogomolov (JetBrains Research), Andrey Sokolov (JetBrains Research), Timofey Bryksin (JetBrains Research), Danny Dig (University of Colorado Boulder, JetBrains Research)