CORE: Resolving Code Quality Issues Using LLMs (FSE 2024 - Posters)

Who

Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, Sriram Rajamani

Track

FSE 2024 Posters

Time Zone

The program is currently displayed in (GMT-03:00) Brasilia, Distrito Federal, Brazil.

When

Thu 18 Jul 2024 15:30 - 16:00 at Lounge - Poster Session 3

Abstract

As software projects progress, code quality becomes paramount because it affects the reliability, maintainability, and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality issues. However, developers must spend extra effort revising their code to address the tool findings. In this work, we investigate the use of (instruction-following) large language models (LLMs) to assist developers in revising code to resolve code quality issues.

We present a tool, CORE (short for COde REvisions), architected as a duo of LLMs: a proposer and a ranker. Providers of static analysis tools recommend ways to mitigate the tool warnings, and developers follow them to revise their code. The proposer LLM of CORE takes the same set of recommendations and applies them to generate candidate code revisions. The candidates that pass the static quality checks are retained. However, the LLM may introduce subtle, unintended functionality changes that may go undetected by the static analysis. The ranker LLM evaluates the changes made by the proposer using a rubric that closely follows the acceptance criteria that a developer would enforce. CORE uses the scores assigned by the ranker LLM to rank the candidate revisions before presenting them to the developer.
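The propose, filter, and rank flow described above can be sketched roughly as follows. This is only an illustrative outline, not the CORE implementation: the `revise` function, the `Candidate` type, and the three callables (`propose`, `passes_checks`, `rank`) are hypothetical names, and the toy stand-ins below replace the real LLM calls and static analysis tool with simple string operations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    code: str
    score: float = 0.0


def revise(
    source: str,
    recommendation: str,
    propose: Callable[[str, str], List[str]],  # hypothetical wrapper around a proposer LLM
    passes_checks: Callable[[str], bool],      # hypothetical wrapper around a static analysis tool
    rank: Callable[[str, str], float],         # hypothetical wrapper around a ranker LLM
) -> List[Candidate]:
    # 1. The proposer applies the tool provider's recommendation to the source.
    proposals = propose(source, recommendation)
    # 2. Only candidates that pass the static quality checks are retained.
    retained = [Candidate(code) for code in proposals if passes_checks(code)]
    # 3. The ranker scores each retained candidate against the original.
    for candidate in retained:
        candidate.score = rank(source, candidate.code)
    # 4. Candidates are presented best-first.
    return sorted(retained, key=lambda c: c.score, reverse=True)


# Toy stand-ins (assumptions for illustration only; no real LLM or analyzer is called).
SOURCE = "def f(x):\n    if x == None:\n        return 0\n    return x\n"
GOOD = SOURCE.replace("== None", "is None")
DRIFTED = GOOD.replace("return 0", "return 1")  # fixes the warning but changes behavior


def toy_propose(source: str, recommendation: str) -> List[str]:
    # Returns a drifted fix, a clean fix, and an unchanged copy.
    return [DRIFTED, GOOD, source]


def toy_passes_checks(code: str) -> bool:
    # Stand-in for a single quality check: flag comparisons to None with ==.
    return "== None" not in code


def toy_rank(source: str, revised: str) -> float:
    # Stand-in rubric: full score only if nothing beyond the targeted fix changed.
    return 1.0 if revised == source.replace("== None", "is None") else 0.2


ranked = revise(SOURCE, "Use 'is None' instead of '== None'",
                toy_propose, toy_passes_checks, toy_rank)
for candidate in ranked:
    print(candidate.score)
```

In this toy run the unchanged copy is dropped by the static check, and the revision that alters the return value is ranked below the clean fix, which mirrors the role the abstract assigns to the ranker.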

We conduct a variety of experiments on two public benchmarks to show the ability of CORE: 1) to generate code revisions acceptable to both static analysis tools and human reviewers (the latter evaluated with a human study on a subset of the Python benchmark), 2) to reduce human review effort by detecting and eliminating revisions with unintended changes, 3) to readily work across multiple languages (Python and Java), static analysis tools (CodeQL and SonarQube) and quality checks (52 and 10 checks, respectively), and 4) to achieve a fix rate comparable to a rule-based automated program repair tool but with much less engineering effort (on the Java benchmark).

CORE could revise 59.2% of the Python files (across 52 quality checks) so that they pass scrutiny by both a tool and a human reviewer. The ranker LLM is able to reduce false positives by 25.8% in these cases. CORE produced revisions that passed the static analysis tool in 76.8% of the Java files (across 10 quality checks), comparable to the 78.3% of a specialized program repair tool, with significantly less engineering effort.

Nalin Wadhwa

Microsoft Research, India

Jui Pradhan

Microsoft Research, India

Atharv Sonwane

Microsoft Research, India

Surya Prakash Sahu

Microsoft Research, India

Nagarajan Natarajan

Microsoft Research, India

Aditya Kanade

Microsoft Research, India

Suresh Parthasarathy

Microsoft Research, India

Sriram Rajamani

Microsoft Research, India

Session Program

Thu 18 Jul
Displayed time zone: Brasilia, Distrito Federal, Brazil

15:30 - 16:00	Poster Session 3, Posters, at Lounge

15:30 30m Poster		Can GPT-4 Replicate Empirical Software Engineering Research? Posters Jenny T. Liang Carnegie Mellon University, Carmen Badea Microsoft Research, Christian Bird Microsoft Research, Robert DeLine Microsoft Research, Denae Ford Microsoft Research, Nicole Forsgren Microsoft Research, Thomas Zimmermann Microsoft Research
15:30 30m Poster		Evaluating Directed Fuzzers: Are We Heading in the Right Direction? Posters Tae Eun Kim KAIST, Jaeseung Choi Sogang University, Seongjae Im KAIST, Kihong Heo KAIST, Sang Kil Cha KAIST
15:30 30m Poster		Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection Posters Yuxi Li Huazhong University of Science and Technology, Yi Liu Nanyang Technological University, Gelei Deng Nanyang Technological University, Ying Zhang Virginia Tech, Wenjia Song Virginia Tech, Ling Shi Nanyang Technological University, Kailong Wang Huazhong University of Science and Technology, Yuekang Li The University of New South Wales, Yang Liu Nanyang Technological University, Haoyu Wang Huazhong University of Science and Technology
15:30 30m Poster		Do Words Have Power? Understanding and Fostering Civility in Code Review Discussion Posters Md Shamimur Rahman University of Saskatchewan, Canada, Zadia Codabux University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Canada
15:30 30m Poster		CodePlan: Repository-level Coding using LLMs and Planning Posters Ramakrishna Bairi Microsoft Research, India, Atharv Sonwane Microsoft Research, India, Aditya Kanade Microsoft Research, India, Vageesh D C Microsoft Research, India, Arun Iyer Microsoft Research, India, Suresh Parthasarathy Microsoft Research, India, Sriram Rajamani Microsoft Research, India, B. Ashok Microsoft Research, India, Shashank Shet Microsoft Research, India
15:30 30m Poster		Understanding and Detecting Annotation-induced Faults of Static Analyzers Posters Huaien Zhang The Hong Kong Polytechnic University, Yu Pei The Hong Kong Polytechnic University, Shuyun Liang Southern University of Science and Technology, Shin Hwei Tan Concordia University
15:30 30m Poster		Partial Solution Based Constraint Solving Cache in Symbolic Execution Posters Ziqi Shuai School of Computer, National University of Defense Technology, China, Zhenbang Chen College of Computer, National University of Defense Technology, Kelin Ma School of Computer, National University of Defense Technology, China, Kunlin Liu School of Computer, National University of Defense Technology, China, Yufeng Zhang Hunan University, Jun Sun School of Information Systems, Singapore Management University, Singapore, Ji Wang School of Computer, National University of Defense Technology, China
15:30 30m Poster		Characterizing Python Library Migrations Posters Mohayeminul Islam University of Alberta, Ajay Jha North Dakota State University, Ildar Akhmetov Northeastern University, Sarah Nadi New York University Abu Dhabi, University of Alberta
15:30 30m Poster		DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing Posters Sabaat Haroon Virginia Tech, Chris Brown Virginia Tech, Muhammad Ali Gulzar Virginia Tech
15:30 30m Poster		BARO: Robust Root Cause Analysis for Microservices via Multivariate Bayesian Online Change Point Detection Posters Luan Pham RMIT University, Huong Ha RMIT University, Hongyu Zhang Chongqing University
15:30 30m Poster		Less Cybersickness, Please: Demystifying and Detecting Stereoscopic Visual Inconsistencies in Virtual Reality Applications Posters Shuqing Li The Chinese University of Hong Kong, Cuiyun Gao Harbin Institute of Technology, Jianping Zhang The Chinese University of Hong Kong, Yujia Zhang Harbin Institute of Technology, Yepang Liu Southern University of Science and Technology, Jiazhen Gu The Chinese University of Hong Kong, Yun Peng The Chinese University of Hong Kong, Michael Lyu The Chinese University of Hong Kong
15:30 30m Poster		CORE: Resolving Code Quality Issues Using LLMs Posters Nalin Wadhwa Microsoft Research, India, Jui Pradhan Microsoft Research, India, Atharv Sonwane Microsoft Research, India, Surya Prakash Sahu Microsoft Research, India, Nagarajan Natarajan Microsoft Research, India, Aditya Kanade Microsoft Research, India, Suresh Parthasarathy Microsoft Research, India, Sriram Rajamani Microsoft Research, India
15:30 30m Poster		Abstraction-Aware Inference of Metamorphic Relations Posters Agustin Nolasco University of Rio Cuarto, Facundo Molina IMDEA Software Institute, Renzo Degiovanni Luxembourg Institute of Science and Technology, Alessandra Gorla IMDEA Software Institute, Diego Garbervetsky Departamento de Computación, FCEyN, UBA, Mike Papadakis University of Luxembourg, Sebastian Uchitel Imperial College and University of Buenos Aires, Nazareno Aguirre University of Rio Cuarto and CONICET, Marcelo F. Frias Dept. of Software Engineering Instituto Tecnológico de Buenos Aires
15:30 30m Poster		State Reconciliation Defects in Infrastructure as Code Posters Md Mahadi Hassan Auburn University, John Salvador Auburn University, Shubhra Kanti Karmaker Santu Auburn University, Akond Rahman Auburn University

Information for Participants

Thu 18 Jul 2024 15:30 - 16:00 at Lounge - Poster Session 3

Info for room Lounge:

This room is conjoined with the Foyer to provide additional space for the coffee break, and hold poster presentations throughout the event.