Thu 18 Jul 2024 16:18 - 16:36 at Acerola - Log Analysis and Debugging Chair(s): Domenico Bianculli

Data-intensive scalable computing (DISC) frameworks, such as Apache Spark, support runtimes in many popular languages. Yet SQL is still the most commonly used front-end language for DISC applications due to its broad presence in new and legacy workflows and its shallow learning curve. However, DISC-backed SQL introduces several layers of abstraction that significantly reduce the visibility and transparency of workflows, making it challenging for developers to find and fix errors in a query. When a query returns incorrect outputs, it takes non-trivial manual effort to comprehend every stage of the query execution and find the root cause of bugs in the input data or the complex SQL query. We aim to bring the benefits of step-through interactive debugging to DISC-powered SQL with DeSQL. When a SQL query is executed on a DISC system, DeSQL automatically decomposes it into subqueries and closely monitors the execution to identify the precise intermediate data corresponding to every constituent subquery. This enables a complete interactive debugging experience with full access to the intermediate query states. We evaluate DeSQL’s scalability, overhead, and efficiency against two baselines. The experiment results show that DeSQL can provide a complete debugging view in 13% less time than the original job time while incurring an average overhead of 10%, in addition to retaining Apache Spark’s scale-out and scale-up properties. Through a user study comprising 10 participants engaged in two debugging tasks, we find that participants utilizing DeSQL identify the root cause behind a wrong query output in 75% less time than with de facto manual debugging.
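To make the idea of inspecting intermediate data per subquery concrete, here is a minimal sketch. It is not DeSQL's implementation: it uses Python's built-in `sqlite3` instead of a DISC system such as Apache Spark, the decomposition is written by hand rather than derived automatically, and the `orders` table, its rows, and the `run_steps` helper are all hypothetical.

```python
import sqlite3


def run_steps(conn, steps):
    """Run each (label, sql) pair and collect its rows, so that every
    intermediate result of a nested query can be inspected on its own."""
    results = {}
    for label, sql in steps:
        results[label] = conn.execute(sql).fetchall()
    return results


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ann", 120.0), (2, "bob", 40.0), (3, "ann", 30.0), (4, "cy", 75.0)],
)

# The query under inspection: customers whose total spend exceeds 100.
# It is split by hand into its inner subquery and the full outer query.
inner = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
outer = f"SELECT customer FROM ({inner}) WHERE total > 100 ORDER BY customer"

steps = [
    ("inner", inner + " ORDER BY customer"),  # intermediate: totals per customer
    ("outer", outer),                         # final: customers above the threshold
]

views = run_steps(conn, steps)
for label, rows in views.items():
    print(label, rows)
```

Printing the rows of each step shows where a wrong final answer first appears, which is the kind of visibility the abstract describes, although here each step re-executes its subquery instead of observing a single monitored run.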

Thu 18 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

16:00 - 18:00
Log Analysis and Debugging - Research Papers / Industry Papers at Acerola
Chair(s): Domenico Bianculli University of Luxembourg
16:00
18m
Talk
Go Static: Contextualized Logging Statement Generation
Research Papers
Yichen LI The Chinese University of Hong Kong, Yintong Huo The Chinese University of Hong Kong, Renyi Zhong The Chinese University of Hong Kong, Zhihan Jiang The Chinese University of Hong Kong, Jinyang Liu The Chinese University of Hong Kong, Junjie Huang The Chinese University of Hong Kong, Jiazhen Gu The Chinese University of Hong Kong, Pinjia He Chinese University of Hong Kong, Shenzhen, Michael Lyu The Chinese University of Hong Kong
16:18
18m
Talk
DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing
Research Papers
Sabaat Haroon Virginia Tech, Chris Brown Virginia Tech, Muhammad Ali Gulzar Virginia Tech
16:36
18m
Talk
DTD: Comprehensive and Scalable Testing for Debuggers
Research Papers
Hongyi Lu Southern University of Science and Technology/Hong Kong University of Science and Technology, Zhibo Liu The Hong Kong University of Science and Technology, Shuai Wang The Hong Kong University of Science and Technology, Fengwei Zhang Southern University of Science and Technology
16:54
9m
Talk
Decoding Anomalies! Unraveling Operational Challenges in Human-in-the-Loop Anomaly Validation
Industry Papers
Dong Jae Kim Concordia University, Steven Locke, Tse-Hsun (Peter) Chen Concordia University, Andrei Toma ERA Environmental Management Solutions, Sarah Sajedi ERA Environmental Management Solutions, Steve Sporea, Laura Weinkam
17:03
18m
Talk
A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques
Research Papers
Max Landauer AIT Austrian Institute of Technology, Florian Skopik AIT Austrian Institute of Technology, Markus Wurzenberger AIT Austrian Institute of Technology
17:21
18m
Research paper
LILAC: Log Parsing using LLMs with Adaptive Parsing Cache
Research Papers
Zhihan Jiang The Chinese University of Hong Kong, Jinyang Liu The Chinese University of Hong Kong, Zhuangbin Chen School of Software Engineering, Sun Yat-sen University, Yichen LI The Chinese University of Hong Kong, Junjie Huang The Chinese University of Hong Kong, Yintong Huo The Chinese University of Hong Kong, Pinjia He Chinese University of Hong Kong, Shenzhen, Jiazhen Gu The Chinese University of Hong Kong, Michael Lyu The Chinese University of Hong Kong
17:39
18m
Talk
TraStrainer: Adaptive Sampling for Distributed Traces with System Runtime State (Distinguished Paper Award)
Research Papers
Haiyu Huang Sun Yat-sen University, Xiaoyu Zhang HUAWEI CLOUD COMPUTING TECHNOLOGIES CO. LTD., Pengfei Chen Sun Yat-sen University, Zilong He Sun Yat-sen University, Zhiming Chen Sun Yat-sen University, Guangba Yu Sun Yat-sen University, Hongyang Chen Sun Yat-sen University, Chen Sun Huawei