AI-assisted Code Authoring at Scale: Fine-tuning, deploying, and mixed methods evaluation
The rise of large language models (LLMs) has unlocked various applications of this technology in software development. In particular, generative LLMs have been shown to effectively power AI-based code authoring tools that can suggest entire statements or blocks of code during code authoring. In this paper we present ComposeCode, an AI-assisted code authoring tool developed and deployed internally at CompanyA. ComposeCode is based on the InCoder LLM, which merges generative capabilities with bi-directionality. We have scaled up ComposeCode to serve tens of thousands of developers at CompanyA, across 9 programming languages and several coding surfaces. We present our experience in making design decisions about the model and system architecture for ComposeCode that address the challenges of deploying at this scale.
To release an LLM at this scale, we first needed to ensure that it is sufficiently accurate. In a random sample of 20K source code files, depending on the language, we are able to reproduce hidden lines between 40% and 58% of the time, an improvement of between 1.4× and 4.1× over a model trained only on public data.
We gradually rolled ComposeCode out to developers. At the time of this writing, 16K developers have used it, with 8% of their code coming directly from ComposeCode.
To triangulate our numerical findings, we conducted a thematic analysis of the feedback from 70 developers. We find that 91.5% of the feedback is positive, with the most common themes being discovering APIs, dealing with boilerplate code, and accelerating coding. CompanyA continues to integrate this feedback into ComposeCode.