Wed 17 Jul 2024 11:54 - 12:12 at Acerola - Software Maintenance and Comprehension 1 Chair(s): Wesley Assunção

Because of the naturalness of software, developers often repeat the same code changes within a project or across different projects. We call these “code change patterns” (CPATs). Automating CPATs is crucial to expedite the software development process. While current Transformation by Example (TBE) techniques can automate CPATs, they are limited by the quality and quantity of the provided input examples. Thus, they miss transforming code variations that do not have the exact syntax, data-, or control-flow of the provided input examples, despite being semantically similar. Large Language Models (LLMs) are pre-trained on extensive datasets of source code. If we can harness LLMs’ creativity to produce semantically equivalent, yet previously unseen variants of the original CPAT, we can significantly increase the effectiveness of TBE systems.
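To make the idea of "semantically similar variations" concrete, here is a small illustrative sketch. The pattern (replacing an accumulation loop with the built-in `sum`) and all function names are hypothetical and are not taken from the paper. They only show how code with different syntax, data flow, or control flow can express the same change opportunity.

```python
# Hypothetical CPAT, for illustration only:
# "replace a hand-written accumulation loop with sum()".

def total_before(values):
    # Original form: the kind of example a developer might give a TBE tool.
    total = 0
    for v in values:
        total += v
    return total


def total_after(values):
    # The same code after the change pattern has been applied.
    return sum(values)


def total_variant_dataflow(values):
    # Variant with different data flow: an extra temporary variable.
    total = 0
    for v in values:
        item = v
        total = total + item
    return total


def total_variant_controlflow(values):
    # Variant with different control flow: an index-based while loop.
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
    return total
```

A rule inferred only from `total_before` would likely miss the two variants, even though all four functions compute the same result.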

In this paper, we discover best practices for harnessing LLMs to generate code variants that meet three criteria: correctness (semantic equivalence to the original CPAT), usefulness (absence of hallucinations), and applicability (aligning with the primary intent of the original CPAT). We instantiate these practices into our tool PyCraft, which synergistically combines static code analysis, dynamic analysis, and LLM capabilities. By employing chain-of-thought reasoning, PyCraft generates both variations of input examples and comprehensive test cases that can identify correct variations with an F-measure of 96.6%. Our algorithm uses a fixed-point iteration to create relevant variations and expands the original input examples by a factor of 44x. Using these richly generated examples, we inferred transformation rules and then automated these changes, resulting in an increase of up to 39x, with an average increase of 14x in target codes compared to a previous state-of-the-art tool that relies solely on static analysis. We submitted patches generated by PyCraft to a range of projects, notably esteemed ones like microsoft/DeepSpeed and IBM/inFairness. Their developers accepted and merged 83% of the 86 CPAT instances submitted through 44 pull requests. This confirms the usefulness of these changes.
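The generate-then-validate loop described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not PyCraft's actual algorithm: the LLM is replaced by a deterministic stub (`propose`) that applies a few hard-coded textual rewrites, one of which is deliberately wrong to stand in for a hallucination, and validation is reduced to running each candidate against a handful of test inputs. All names (`expand_to_fixed_point`, `propose`, `passes_tests`, `REWRITES`) are hypothetical.

```python
# Seed example: the original input example, kept as source text.
SEED = "def f(xs):\n    t = 0\n    for x in xs:\n        t += x\n    return t\n"

# Stand-in for LLM output (assumption): simple textual rewrites.
REWRITES = [
    ("t += x", "t = t + x"),
    ("t = t + x", "t = x + t"),
    ("for x in xs:", "for x in list(xs):"),
    ("t += x", "t -= x"),  # deliberately incorrect "hallucinated" variant
]

TEST_INPUTS = [[], [1, 2, 3], [-4, 10]]


def propose(code):
    # Return candidate variants of one example (stub for an LLM call).
    return {code.replace(old, new, 1) for old, new in REWRITES if old in code}


def passes_tests(code):
    # Dynamic check: run the candidate and compare with the expected result.
    namespace = {}
    try:
        exec(code, namespace)
        return all(namespace["f"](xs) == sum(xs) for xs in TEST_INPUTS)
    except Exception:
        return False


def expand_to_fixed_point(seeds, propose, is_valid, max_rounds=10):
    # Grow the example set until a round yields no new valid variant.
    accepted = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        new = set()
        for example in frontier:
            for candidate in propose(example):
                if candidate not in accepted and is_valid(candidate):
                    new.add(candidate)
        if not new:  # fixed point reached
            break
        accepted |= new
        frontier = new
    return accepted


variants = expand_to_fixed_point({SEED}, propose, passes_tests)
```

With these toy rewrites, the loop stops after a few rounds with six validated variants (including the seed), and every candidate containing `t -= x` is rejected by the tests. The `max_rounds` cap is a safety bound added for this sketch, since a real generator is not guaranteed to converge.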

Wed 17 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

11:00 - 12:30
Software Maintenance and Comprehension 1 (Research Papers / Ideas, Visions and Reflections / Demonstrations) at Acerola
Chair(s): Wesley Assunção North Carolina State University
11:00
18m
Talk
Enhancing Function Name Prediction using Votes-Based Name Tokenization and Multi-Task Learning
Research Papers
Xiaoling Zhang Institute of Information Engineering, Chinese Academy of Sciences, School of Cyber Security, University of Chinese Academy of Sciences, Zhengzi Xu Nanyang Technological University, Shouguo Yang Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China, Zhi Li Institute of Information Engineering, Chinese Academy of Sciences, China, Zhiqiang Shi Institute of Information Engineering, Chinese Academy of Sciences, School of Cyber Security, University of Chinese Academy of Sciences, Limin Sun Institute of Information Engineering, Chinese Academy of Sciences, School of Cyber Security, University of Chinese Academy of Sciences
11:18
18m
Talk
Only diff is Not Enough: Generating Commit Messages Leveraging Reasoning and Action of Large Language Model (Distinguished Paper Award)
Research Papers
Jiawei Li University of California, Irvine, David Faragó Innoopract GmbH & QPR Technologies, Christian Petrov Innoopract GmbH, Iftekhar Ahmed University of California, Irvine
11:36
18m
Talk
Towards Efficient Build Ordering for Incremental Builds with Multiple Configurations
Research Papers
Jun Lyu Nanjing University, Shanshan Li Software Institute, Nanjing University, He Zhang Nanjing University, Lanxin Yang Nanjing University, Bohan Liu Nanjing University, Manuel Rigger National University of Singapore
11:54
18m
Talk
Unprecedented Code Change Automation: The Fusion of LLMs and Transformation by Example
Research Papers
Malinda Dilhara University of Colorado Boulder, Abhiram Bellur University of Colorado Boulder, Timofey Bryksin JetBrains Research, Danny Dig University of Colorado Boulder, JetBrains Research
12:12
9m
Talk
Variability-Aware Differencing with DiffDetective (Best Demo Paper)
Demonstrations
Paul Maximilian Bittner Paderborn University, Alexander Schultheiß Paderborn University, Benjamin Moosherr University of Ulm, Timo Kehrer University of Bern, Thomas Thüm Paderborn University
12:21
9m
Talk
From Models to Practice: Enhancing OSS Project Sustainability with Evidence-Based Advice
Ideas, Visions and Reflections
Nafiz Imtiaz Khan Department of Computer Science, University of California, Davis, Vladimir Filkov University of California at Davis, USA