TraStrainer: Adaptive Sampling for Distributed Traces with System Runtime State
Distributed tracing has been widely adopted in microservice systems and plays an important role in monitoring and analyzing them. However, trace data often come in large volumes, incurring substantial computational and storage costs. To reduce the volume of traces, trace sampling has become a prominent topic, and several methods have been proposed in prior work. To attain higher-quality sampling outcomes, biased sampling has gained more attention than random sampling. Previous biased sampling methods primarily judged the importance of traces by diversity, aiming to sample more edge-case traces and fewer common-case traces. However, we contend that relying solely on trace diversity for sampling is insufficient: system runtime state is another crucial factor that needs to be considered, especially during system failures. In this study, we introduce TraStrainer, an online sampler that takes into account both system runtime state and trace diversity. TraStrainer employs an interpretable and automated encoding method to represent traces as vectors. Simultaneously, it adaptively determines sampling preferences by analyzing system runtime metrics. When sampling, it combines the results of system bias and diversity bias through a dynamic voting mechanism. Experimental results demonstrate that TraStrainer achieves higher-quality sampling results and significantly improves the performance of downstream root cause analysis (RCA) tasks, with an average increase of 32.63% in Top-1 RCA accuracy compared to four baselines on two datasets.
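The abstract does not spell out how the dynamic voting mechanism works, so the following is only a minimal sketch of one plausible reading, not the paper's algorithm. It assumes that each bias produces a sampling probability in [0, 1], and that the vote is strict (both must agree) when the observed sampling rate exceeds a target rate and lenient (either suffices) otherwise. The function name `dynamic_vote` and all of its parameters are hypothetical.

```python
import random


def dynamic_vote(
    system_p: float,
    diversity_p: float,
    observed_rate: float,
    target_rate: float,
    rng: random.Random,
) -> bool:
    """Decide whether to keep a trace (illustrative assumption, not the paper's method).

    system_p      -- hypothetical keep-probability from the system-state bias
    diversity_p   -- hypothetical keep-probability from the diversity bias
    observed_rate -- fraction of recent traces that were kept
    target_rate   -- desired fraction of traces to keep
    """
    # Each bias casts an independent yes/no vote according to its probability.
    system_vote = rng.random() < system_p
    diversity_vote = rng.random() < diversity_p

    if observed_rate > target_rate:
        # Over budget: keep the trace only if both biases agree.
        return system_vote and diversity_vote
    # At or under budget: keep the trace if either bias wants it.
    return system_vote or diversity_vote


if __name__ == "__main__":
    rng = random.Random(0)
    # A trace favored by the system-state bias only, while under budget.
    print(dynamic_vote(1.0, 0.0, observed_rate=0.05, target_rate=0.10, rng=rng))
```

Under this assumption, switching between the two rules lets the sampler stay close to its budget while still admitting traces that only one of the two biases considers important.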
Thu 18 Jul | Displayed time zone: Brasilia, Distrito Federal, Brazil
16:00 - 18:00 | Log Analysis and Debugging | Research Papers / Industry Papers at Acerola | Chair(s): Domenico Bianculli University of Luxembourg
16:00 | 18m | Talk | Go Static: Contextualized Logging Statement Generation | Research Papers | Yichen LI The Chinese University of Hong Kong, Yintong Huo The Chinese University of Hong Kong, Renyi Zhong The Chinese University of Hong Kong, Zhihan Jiang The Chinese University of Hong Kong, Jinyang Liu The Chinese University of Hong Kong, Junjie Huang The Chinese University of Hong Kong, Jiazhen Gu The Chinese University of Hong Kong, Pinjia He Chinese University of Hong Kong, Shenzhen, Michael Lyu The Chinese University of Hong Kong
16:18 | 18m | Talk | DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing | Research Papers
16:36 | 18m | Talk | DTD: Comprehensive and Scalable Testing for Debuggers | Research Papers | Hongyi Lu Southern University of Science and Technology/Hong Kong University of Science and Technology, Zhibo Liu The Hong Kong University of Science and Technology, Shuai Wang The Hong Kong University of Science and Technology, Fengwei Zhang Southern University of Science and Technology
16:54 | 9m | Talk | Decoding Anomalies! Unraveling Operational Challenges in Human-in-the-Loop Anomaly Validation | Industry Papers | Dong Jae Kim Concordia University, Steven Locke, Tse-Hsun (Peter) Chen Concordia University, Andrei Toma ERA Environmental Management Solutions, Sarah Sajedi ERA Environmental Management Solutions, Steve Sporea, Laura Weinkam
17:03 | 18m | Talk | A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques | Research Papers | Max Landauer AIT Austrian Institute of Technology, Florian Skopik AIT Austrian Institute of Technology, Markus Wurzenberger AIT Austrian Institute of Technology
17:21 | 18m | Research paper | LILAC: Log Parsing using LLMs with Adaptive Parsing Cache | Research Papers | Zhihan Jiang The Chinese University of Hong Kong, Jinyang Liu The Chinese University of Hong Kong, Zhuangbin Chen School of Software Engineering, Sun Yat-sen University, Yichen LI The Chinese University of Hong Kong, Junjie Huang The Chinese University of Hong Kong, Yintong Huo The Chinese University of Hong Kong, Pinjia He Chinese University of Hong Kong, Shenzhen, Jiazhen Gu The Chinese University of Hong Kong, Michael Lyu The Chinese University of Hong Kong
17:39 | 18m | Talk | TraStrainer: Adaptive Sampling for Distributed Traces with System Runtime State | Research Papers | Haiyu Huang Sun Yat-sen University, Xiaoyu Zhang HUAWEI CLOUD COMPUTING TECHNOLOGIES CO. LTD., Pengfei Chen Sun Yat-sen University, Zilong He Sun Yat-sen University, Zhiming Chen Sun Yat-sen University, Guangba Yu Sun Yat-sen University, Hongyang Chen Sun Yat-sen University, Chen Sun Huawei