LILAC: Log Parsing using LLMs with Adaptive Parsing Cache (FSE 2024 - Research Papers)

Who

Zhihan Jiang, Jinyang Liu, Zhuangbin Chen, Yichen LI, Junjie Huang, Yintong Huo, Pinjia He, Jiazhen Gu, Michael Lyu

Track

FSE 2024 Research Papers

Time Zone

The program is currently displayed in (GMT-03:00) Brasilia, Distrito Federal, Brazil.

When

Thu 18 Jul 2024, 17:21 - 17:39, at Acerola (session: Log Analysis and Debugging). Chair(s): Domenico Bianculli

Abstract

Log parsing transforms log messages into structured formats, serving as the prerequisite step for various log analysis tasks. Although a variety of log parsing approaches have been proposed, their performance on complicated log data remains compromised due to the use of human-crafted rules or learning-based models with limited training data. The recent emergence of powerful large language models (LLMs) demonstrates their vast pre-trained knowledge related to code and logging, making it promising to apply LLMs for log parsing. However, their lack of specialized log parsing capabilities currently hinders their accuracy in parsing. Moreover, their inherently inconsistent answers and substantial overhead prevent the practical adoption of LLM-based log parsing.

To address these challenges, we propose LILAC, the first practical log parsing framework using LLMs with adaptive parsing cache. To facilitate accurate and robust log parsing, LILAC leverages the in-context learning (ICL) capability of the LLM by performing a hierarchical candidate sampling algorithm and selecting high-quality demonstrations. Furthermore, LILAC incorporates a novel component, an adaptive parsing cache, to store and refine the templates generated by the LLM. The cache mitigates the LLM's inefficiency by enabling rapid retrieval of previously processed log templates. In this process, LILAC adaptively updates the templates within the parsing cache to ensure the consistency of parsed results. Extensive evaluation on public large-scale datasets shows that LILAC outperforms state-of-the-art methods by 69.5% in terms of the average F1 score of template accuracy. In addition, LILAC reduces the query times to LLMs by several orders of magnitude, achieving efficiency comparable to the fastest baseline.
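
The idea of a parsing cache described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the flat list of templates, the regex-based matching, the `<*>` wildcard token, the one-token merge heuristic in `merge_templates`, and the `fake_llm` stand-in are all assumptions made for illustration. The sketch only shows the general pattern: look up a cached template first, query the model only on a miss, and refine cached templates when a new answer nearly duplicates an existing one.

```python
import re
from typing import Callable, List, Optional

WILDCARD = "<*>"  # assumed placeholder token for variable parts of a log message


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Compile a template into a regex; each wildcard matches any non-empty text."""
    literal_parts = [re.escape(part) for part in template.split(WILDCARD)]
    return re.compile("(.+?)".join(literal_parts))


def merge_templates(a: str, b: str) -> Optional[str]:
    """Toy refinement rule (an assumption): if two templates have the same
    number of tokens and differ in exactly one token, wildcard that token."""
    tokens_a, tokens_b = a.split(), b.split()
    if len(tokens_a) != len(tokens_b):
        return None
    diff = [i for i, (x, y) in enumerate(zip(tokens_a, tokens_b)) if x != y]
    if len(diff) != 1:
        return None
    merged = list(tokens_a)
    merged[diff[0]] = WILDCARD
    return " ".join(merged)


class ParsingCache:
    """Flat template cache placed in front of a (hypothetical) LLM query function."""

    def __init__(self, query_llm: Callable[[str], str]) -> None:
        self._query_llm = query_llm
        self.templates: List[str] = []
        self.llm_calls = 0

    def lookup(self, message: str) -> Optional[str]:
        """Return the first cached template that fully matches the message."""
        for template in self.templates:
            if template_to_regex(template).fullmatch(message):
                return template
        return None

    def parse(self, message: str) -> str:
        cached = self.lookup(message)
        if cached is not None:
            return cached  # cache hit: no model query needed
        self.llm_calls += 1
        template = self._query_llm(message)
        # Refine the cache if the new template nearly duplicates a cached one.
        for i, existing in enumerate(self.templates):
            merged = merge_templates(existing, template)
            if merged is not None:
                self.templates[i] = merged
                return merged
        self.templates.append(template)
        return template


def fake_llm(message: str) -> str:
    """Hypothetical stand-in for an LLM call: it only abstracts numbers."""
    return re.sub(r"\d+", WILDCARD, message)
```

With `cache = ParsingCache(fake_llm)`, parsing `"User 42 logged in"` and then `"User 7 logged in"` triggers a single model call, since the second message matches the cached template. The merge heuristic is deliberately crude and can over-generalize (for example, it would merge templates that differ only in a verb), so it should be read as a placeholder for whatever update rule a real system uses.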

Link to Preprint

https://arxiv.org/abs/2310.01796

DOI

https://doi.org/10.1145/3643733

Zhihan Jiang

The Chinese University of Hong Kong

Jinyang Liu

The Chinese University of Hong Kong

China

Zhuangbin Chen

School of Software Engineering, Sun Yat-sen University

Yichen LI

The Chinese University of Hong Kong

China

Junjie Huang

The Chinese University of Hong Kong

Yintong Huo

The Chinese University of Hong Kong

Hong Kong SAR China

Pinjia He

Chinese University of Hong Kong, Shenzhen

China

Jiazhen Gu

The Chinese University of Hong Kong

China

Michael Lyu

The Chinese University of Hong Kong

Session Program

Thu 18 Jul
Displayed time zone: Brasilia, Distrito Federal, Brazil

16:00 - 18:00	Log Analysis and Debugging (Research Papers / Industry Papers) at Acerola. Chair(s): Domenico Bianculli (University of Luxembourg)

16:00 18m Talk		Go Static: Contextualized Logging Statement Generation (Research Papers). Yichen LI (The Chinese University of Hong Kong), Yintong Huo (The Chinese University of Hong Kong), Renyi Zhong (The Chinese University of Hong Kong), Zhihan Jiang (The Chinese University of Hong Kong), Jinyang Liu (The Chinese University of Hong Kong), Junjie Huang (The Chinese University of Hong Kong), Jiazhen Gu (The Chinese University of Hong Kong), Pinjia He (Chinese University of Hong Kong, Shenzhen), Michael Lyu (The Chinese University of Hong Kong)
16:18 18m Talk		DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing (Research Papers). Sabaat Haroon (Virginia Tech), Chris Brown (Virginia Tech), Muhammad Ali Gulzar (Virginia Tech)
16:36 18m Talk		DTD: Comprehensive and Scalable Testing for Debuggers (Research Papers). Hongyi Lu (Southern University of Science and Technology/Hong Kong University of Science and Technology), Zhibo Liu (The Hong Kong University of Science and Technology), Shuai Wang (The Hong Kong University of Science and Technology), Fengwei Zhang (Southern University of Science and Technology)
16:54 9m Talk		Decoding Anomalies! Unraveling Operational Challenges in Human-in-the-Loop Anomaly Validation (Industry Papers). Dong Jae Kim (Concordia University), Steven Locke, Tse-Hsun (Peter) Chen (Concordia University), Andrei Toma (ERA Environmental Management Solutions), Sarah Sajedi (ERA Environmental Management Solutions), Steve Sporea, Laura Weinkam
17:03 18m Talk		A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques (Research Papers). Max Landauer (AIT Austrian Institute of Technology), Florian Skopik (AIT Austrian Institute of Technology), Markus Wurzenberger (AIT Austrian Institute of Technology)
17:21 18m Research paper		LILAC: Log Parsing using LLMs with Adaptive Parsing Cache (Research Papers). Zhihan Jiang (The Chinese University of Hong Kong), Jinyang Liu (The Chinese University of Hong Kong), Zhuangbin Chen (School of Software Engineering, Sun Yat-sen University), Yichen LI (The Chinese University of Hong Kong), Junjie Huang (The Chinese University of Hong Kong), Yintong Huo (The Chinese University of Hong Kong), Pinjia He (Chinese University of Hong Kong, Shenzhen), Jiazhen Gu (The Chinese University of Hong Kong), Michael Lyu (The Chinese University of Hong Kong)
17:39 18m Talk		TraStrainer: Adaptive Sampling for Distributed Traces with System Runtime State (Research Papers). Haiyu Huang (Sun Yat-sen University), Xiaoyu Zhang (HUAWEI CLOUD COMPUTING TECHNOLOGIES CO. LTD.), Pengfei Chen (Sun Yat-sen University), Zilong He (Sun Yat-sen University), Zhiming Chen (Sun Yat-sen University), Guangba Yu (Sun Yat-sen University), Hongyang Chen (Sun Yat-sen University), Chen Sun (Huawei)