Unprecedented Code Change Automation: The Fusion of LLMs and Transformation by Example (FSE 2024 - Research Papers)

Who

Malinda Dilhara, Abhiram Bellur, Timofey Bryksin, Danny Dig

Track

FSE 2024 Research Papers

Time Zone

Times are shown in the conference time zone: (GMT-03:00) Brasilia, Distrito Federal, Brazil.

When

Wed 17 Jul 2024 11:54 - 12:12 at Acerola - Software Maintenance and Comprehension 1 Chair(s): Wesley Assunção

Abstract

Because of the naturalness of software, developers often repeat the same code changes within a project or across different projects. We call these “code change patterns” (CPATs). Automating CPATs is crucial to expedite the software development process. While current Transformation by Example (TBE) techniques can automate CPATs, they are limited by the quality and quantity of the provided input examples. Thus, they miss transforming code variations that do not have the exact syntax, data-, or control-flow of the provided input examples, despite being semantically similar. Large Language Models (LLMs) are pre-trained on extensive datasets of source code. If we can harness LLMs’ creativity to produce semantically equivalent, yet previously unseen variants of the original CPAT, we can significantly increase the effectiveness of TBE systems.

In this paper, we discover best practices for harnessing LLMs to generate code variants that meet three criteria: correctness (semantic equivalence to the original CPAT), usefulness (absence of hallucinations), and applicability (aligning with the primary intent of the original CPAT). We instantiate these practices into our tool PyCraft, which synergistically combines static code analysis, dynamic analysis, and LLM capabilities. By employing chain-of-thought reasoning, PyCraft generates both variations of input examples and comprehensive test cases that can identify correct variations with an F-measure of 96.6%. Our algorithm uses a fixed-point iteration to create relevant variations and expands the original input examples by a factor of 44x. Using these richly generated examples, we inferred transformation rules and then automated these changes, resulting in an increase of up to 39x, with an average increase of 14x in target codes compared to a previous state-of-the-art tool that relies solely on static analysis. We submitted patches generated by PyCraft to a range of projects, notably esteemed ones like microsoft/DeepSpeed and IBM/inFairness. Their developers accepted and merged 83% of the 86 CPAT instances submitted through 44 pull requests. This confirms the usefulness of these changes.
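
To make the idea of a fixed-point iteration over generated variants more concrete, here is a minimal Python sketch. It is not PyCraft's implementation: the names `expand_variants`, `toy_propose`, `is_valid_python`, and `passes_tests` are hypothetical, the LLM is replaced by a canned lookup table, and the static and dynamic checks are reduced to a syntax check plus a couple of hand-written tests. The loop keeps asking the proposer for variants of newly accepted examples and stops once a round yields nothing new.

```python
import ast
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical seed example and hand-written candidates (stand-ins for LLM output).
SEED = "def total(xs):\n    result = 0\n    for x in xs:\n        result += x\n    return result\n"
V1 = "def total(xs):\n    result = 0\n    for i in range(len(xs)):\n        result += xs[i]\n    return result\n"
V2 = ("def total(xs):\n    result = 0\n    i = 0\n    while i < len(xs):\n"
      "        result += xs[i]\n        i += 1\n    return result\n")
BAD_SYNTAX = "def total(xs)\n    return 0\n"   # missing colon: rejected statically
WRONG = "def total(xs):\n    return 0\n"       # parses, but fails the tests

CANNED: Dict[str, List[str]] = {SEED: [V1, BAD_SYNTAX, WRONG], V1: [V2, SEED], V2: []}


def toy_propose(example: str) -> List[str]:
    """Assumed stand-in for an LLM call: returns canned variants of an example."""
    return CANNED.get(example, [])


def is_valid_python(code: str) -> bool:
    """Static check: the candidate must at least parse."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def passes_tests(code: str, tests: Iterable[Callable[[dict], bool]]) -> bool:
    """Dynamic check: execute the candidate and run every test against it."""
    namespace: dict = {}
    try:
        exec(code, namespace)
        return all(test(namespace) for test in tests)
    except Exception:
        return False


def expand_variants(seed: str,
                    propose: Callable[[str], List[str]],
                    tests: List[Callable[[dict], bool]],
                    max_rounds: int = 10) -> Set[str]:
    """Grow the example set until a round adds nothing new (a fixed point)."""
    accepted = {seed}
    frontier = {seed}
    for _ in range(max_rounds):
        new: Set[str] = set()
        for example in frontier:
            for candidate in propose(example):
                if candidate in accepted or candidate in new:
                    continue
                if is_valid_python(candidate) and passes_tests(candidate, tests):
                    new.add(candidate)
        if not new:  # fixed point reached
            break
        accepted |= new
        frontier = new
    return accepted


TESTS = [
    lambda ns: ns["total"]([1, 2, 3]) == 6,
    lambda ns: ns["total"]([]) == 0,
]

variants = expand_variants(SEED, toy_propose, TESTS)
print(len(variants))
```

In this toy run the seed expands to three accepted examples, while the unparsable and the behaviorally wrong candidates are filtered out. A real system would replace `toy_propose` with model calls and would need far stronger validation than two assertions.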

Link to Preprint

https://arxiv.org/abs/2402.07138

Malinda Dilhara

University of Colorado Boulder

United States

Abhiram Bellur

University of Colorado Boulder

Timofey Bryksin

JetBrains Research

Cyprus

Danny Dig

University of Colorado Boulder, JetBrains Research