DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing (FSE 2024 - Research Papers) - FSE 2024

Mon 15 - Fri 19 July 2024 Porto de Galinhas, Brazil

Who

Sabaat Haroon, Chris Brown, Muhammad Ali Gulzar

Track

FSE 2024 Research Papers

Time Zone

The program is currently displayed in (GMT-03:00) Brasilia, Distrito Federal, Brazil.

When

Thu 18 Jul 2024 16:18 - 16:36 at Acerola - Log Analysis and Debugging Chair(s): Domenico Bianculli

Abstract

Data-intensive scalable computing (DISC) frameworks, such as Apache Spark, support runtimes in many popular languages. Yet, SQL is still the most commonly used front-end language for DISC applications due to its broad presence in new and legacy workflows and shallow learning curve. However, DISC-backed SQL introduces several layers of abstraction that significantly reduce the visibility and transparency of workflows, making it challenging for developers to find and fix errors in a query. When a query returns incorrect outputs, it takes a non-trivial, manual effort to comprehend every stage of the query execution and find the root cause of bugs among the input data and complex SQL query. We aim to bring the benefits of step-through interactive debugging to DISC-powered SQL with DeSQL. When a SQL query is executed on a DISC system, DeSQL automatically decomposes it into subqueries and closely monitors the execution to identify the precise intermediate data corresponding to every constituent subquery. This enables a complete interactive debugging experience with full access to the intermediate query states. We evaluate DeSQL’s scalability, overhead, and efficiency against two baselines. The experiment results show that DeSQL can provide a complete debugging view in 13% less time than the original job time while incurring an average overhead of 10% in addition to retaining Apache Spark’s scale-out and scale-up properties. Through a user study comprising 10 participants engaged in two debugging tasks, we find that participants utilizing DeSQL identify the root cause behind a wrong query output in 75% less time than the de-facto, manual debugging.
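
The decomposition idea in the abstract can be illustrated with a small sketch. This is not DeSQL's implementation and does not use Apache Spark: it assumes Python's built-in `sqlite3` module, a hypothetical `orders` table, and subqueries that are split out by hand. It only shows what it means to look at the intermediate data behind each constituent subquery of a nested query.

```python
import sqlite3


def inspect_stages(conn, stages):
    """Run every (label, sql) stage and collect its rows for inspection."""
    snapshots = {}
    for label, sql in stages:
        # Wrap each stage so rows come back in a deterministic order.
        wrapped = f"SELECT * FROM ({sql}) ORDER BY 1"
        snapshots[label] = conn.execute(wrapped).fetchall()
    return snapshots


# Hypothetical input data, including one suspicious negative amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'ann', 120.0),
        (2, 'bob', 40.0),
        (3, 'ann', -15.0),
        (4, 'cat', 300.0);
""")

# The full query is built from three nested subqueries (hand-decomposed here).
filtered = "SELECT id, customer, amount FROM orders WHERE amount > 0"
totals = f"SELECT customer, SUM(amount) AS total FROM ({filtered}) GROUP BY customer"
final = f"SELECT customer, total FROM ({totals}) WHERE total > 100"

snapshots = inspect_stages(
    conn, [("filtered", filtered), ("totals", totals), ("final", final)]
)
for label, rows in snapshots.items():
    print(label, rows)
```

Each stage here is re-executed from scratch, which is the kind of manual effort the abstract describes. By contrast, the abstract states that DeSQL derives the subqueries automatically and monitors the execution of the query to obtain the intermediate data.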

Sabaat Haroon

Virginia Tech

Chris Brown

Virginia Tech

United States

Muhammad Ali Gulzar

Virginia Tech

United States

Session Program

Thu 18 Jul
Displayed time zone: Brasilia, Distrito Federal, Brazil

	16:00 - 18:00	Log Analysis and Debugging (Research Papers / Industry Papers) at Acerola Chair(s): Domenico Bianculli University of Luxembourg

	16:00 18m Talk		Go Static: Contextualized Logging Statement Generation Research Papers Yichen LI The Chinese University of Hong Kong, Yintong Huo The Chinese University of Hong Kong, Renyi Zhong The Chinese University of Hong Kong, Zhihan Jiang The Chinese University of Hong Kong, Jinyang Liu The Chinese University of Hong Kong, Junjie Huang The Chinese University of Hong Kong, Jiazhen Gu The Chinese University of Hong Kong, Pinjia He Chinese University of Hong Kong, Shenzhen, Michael Lyu The Chinese University of Hong Kong
	16:18 18m Talk		DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing Research Papers Sabaat Haroon Virginia Tech, Chris Brown Virginia Tech, Muhammad Ali Gulzar Virginia Tech
	16:36 18m Talk		DTD: Comprehensive and Scalable Testing for Debuggers Research Papers Hongyi Lu Southern University of Science and Technology/Hong Kong University of Science and Technology, Zhibo Liu The Hong Kong University of Science and Technology, Shuai Wang The Hong Kong University of Science and Technology, Fengwei Zhang Southern University of Science and Technology
	16:54 9m Talk		Decoding Anomalies! Unraveling Operational Challenges in Human-in-the-Loop Anomaly Validation Industry Papers Dong Jae Kim Concordia University, Steven Locke, Tse-Hsun (Peter) Chen Concordia University, Andrei Toma ERA Environmental Management Solutions, Sarah Sajedi ERA Environmental Management Solutions, Steve Sporea, Laura Weinkam
	17:03 18m Talk		A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques Research Papers Max Landauer AIT Austrian Institute of Technology, Florian Skopik AIT Austrian Institute of Technology, Markus Wurzenberger AIT Austrian Institute of Technology
	17:21 18m Research paper		LILAC: Log Parsing using LLMs with Adaptive Parsing Cache Research Papers Zhihan Jiang The Chinese University of Hong Kong, Jinyang Liu The Chinese University of Hong Kong, Zhuangbin Chen School of Software Engineering, Sun Yat-sen University, Yichen LI The Chinese University of Hong Kong, Junjie Huang The Chinese University of Hong Kong, Yintong Huo The Chinese University of Hong Kong, Pinjia He Chinese University of Hong Kong, Shenzhen, Jiazhen Gu The Chinese University of Hong Kong, Michael Lyu The Chinese University of Hong Kong
	17:39 18m Talk		TraStrainer: Adaptive Sampling for Distributed Traces with System Runtime State Research Papers Haiyu Huang Sun Yat-sen University, Xiaoyu Zhang HUAWEI CLOUD COMPUTING TECHNOLOGIES CO. LTD., Pengfei Chen Sun Yat-sen University, Zilong He Sun Yat-sen University, Zhiming Chen Sun Yat-sen University, Guangba Yu Sun Yat-sen University, Hongyang Chen Sun Yat-sen University, Chen Sun Huawei