Can Large Language Models Transform Natural Language Intent into Formal Method Postconditions? (FSE 2024 - Research Papers)

Mon 15 - Fri 19 July 2024 Porto de Galinhas, Brazil

Who

Madeline Endres, Sarah Fakhoury, Saikat Chakraborty, Shuvendu K. Lahiri

Track

FSE 2024 Research Papers

Time Zone

The program is currently displayed in (GMT-03:00) Brasilia, Distrito Federal, Brazil.

When

Thu 18 Jul 2024 11:18 - 11:36 at Pitanga - Program Analysis and Performance 2 Chair(s): Rahul Purandare

Abstract

Informal natural language that describes code functionality, such as code comments or function documentation, may contain substantial information about a program’s intent. However, there is typically no guarantee that a program’s implementation and natural language documentation are aligned. In the case of a conflict, leveraging information in code-adjacent natural language has the potential to enhance fault localization, debugging, and code trustworthiness. In practice, however, this information is often underutilized due to the inherent ambiguity of natural language, which makes natural language intent challenging to check programmatically. The “emergent abilities” of Large Language Models (LLMs) have the potential to facilitate the translation of natural language intent to programmatically checkable assertions. However, it is unclear whether LLMs can correctly translate informal natural language specifications into formal specifications that match programmer intent. Additionally, it is unclear whether such translation could be useful in practice.

In this paper, we describe LLM4nl2post, the problem of leveraging LLMs for transforming informal natural language to formal method postconditions, expressed as program assertions. We introduce and validate metrics to measure and compare different LLM4nl2post approaches, using the correctness and discriminative power of generated postconditions. We then use qualitative and quantitative methods to assess the quality of LLM4nl2post postconditions, finding that they are generally correct and able to discriminate incorrect code. Finally, we find that LLM4nl2post via LLMs has the potential to be helpful in practice; specifications generated from natural language were able to catch 70 real-world historical bugs from Defects4J.
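
To make the two properties named in the abstract concrete, the following is a minimal sketch rather than the authors' tooling or their exact metric definitions. It assumes a hypothetical function `remove_duplicates` with a one-line docstring, a hand-written postcondition of the kind an LLM might produce from that docstring, and two hypothetical buggy variants. "Correctness" is approximated here as the postcondition holding for the reference implementation on a small set of inputs, and "discriminative power" as the fraction of buggy variants on which the postcondition fails for at least one input.

```python
from typing import Callable, List, Sequence

Impl = Callable[[List[int]], List[int]]
Post = Callable[[List[int], List[int]], bool]


def remove_duplicates(xs: List[int]) -> List[int]:
    """Return the elements of xs that occur exactly once, in original order."""
    return [x for x in xs if xs.count(x) == 1]


def buggy_keep_first(xs: List[int]) -> List[int]:
    # Hypothetical bug: keeps the first occurrence of every element.
    return list(dict.fromkeys(xs))


def buggy_always_empty(xs: List[int]) -> List[int]:
    # Hypothetical bug: ignores the input entirely.
    return []


def postcondition(xs: List[int], result: List[int]) -> bool:
    # Illustrative postcondition: every returned element occurs once in xs.
    # It says nothing about completeness, so it is deliberately weak.
    return all(xs.count(x) == 1 for x in result)


def holds_on(impl: Impl, post: Post, inputs: Sequence[List[int]]) -> bool:
    """True if the postcondition holds for impl on every test input."""
    return all(post(xs, impl(list(xs))) for xs in inputs)


def kill_rate(post: Post, variants: Sequence[Impl],
              inputs: Sequence[List[int]]) -> float:
    """Fraction of buggy variants rejected by the postcondition."""
    killed = sum(1 for v in variants if not holds_on(v, post, inputs))
    return killed / len(variants)


TEST_INPUTS = [[], [1, 2, 3], [1, 2, 2, 3], [4, 4, 4]]

if __name__ == "__main__":
    print(holds_on(remove_duplicates, postcondition, TEST_INPUTS))
    print(kill_rate(postcondition,
                    [buggy_keep_first, buggy_always_empty], TEST_INPUTS))
```

In this sketch the postcondition is satisfied by the reference implementation and rejects `buggy_keep_first`, but it accepts `buggy_always_empty` because it never checks that unique elements are actually returned. That gap is the kind of difference a discriminative-power measure is meant to expose between two postconditions that are both correct.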

Madeline Endres

University of Massachusetts Amherst

United States

Sarah Fakhoury

Microsoft Research

United States

Saikat Chakraborty

Microsoft Research

United States

Shuvendu K. Lahiri

Microsoft Research

United States

Session Program

Thu 18 Jul
Displayed time zone: Brasilia, Distrito Federal, Brazil

11:00 - 12:30	Program Analysis and Performance 2 (Research Papers) at Pitanga. Chair(s): Rahul Purandare (University of Nebraska-Lincoln)

11:00 18m Talk		Adapting Multi-objectivized Software Configuration Tuning (Research Papers). Tao Chen (University of Birmingham), Miqing Li (University of Birmingham). Pre-print.
11:18 18m Talk		Can Large Language Models Transform Natural Language Intent into Formal Method Postconditions? (Research Papers). Madeline Endres (University of Massachusetts Amherst), Sarah Fakhoury (Microsoft Research), Saikat Chakraborty (Microsoft Research), Shuvendu K. Lahiri (Microsoft Research).
11:36 18m Talk		Analyzing Quantum Programs with LintQ: A Static Analysis Framework for Qiskit (Research Papers). Matteo Paltenghi (University of Stuttgart), Michael Pradel (University of Stuttgart). Pre-print.
11:54 18m Talk		Abstraction-Aware Inference of Metamorphic Relations (Research Papers). Agustin Nolasco (University of Rio Cuarto), Facundo Molina (IMDEA Software Institute), Renzo Degiovanni (Luxembourg Institute of Science and Technology), Alessandra Gorla (IMDEA Software Institute), Diego Garbervetsky (Departamento de Computación, FCEyN, UBA), Mike Papadakis (University of Luxembourg), Sebastian Uchitel (Imperial College and University of Buenos Aires), Nazareno Aguirre (University of Rio Cuarto and CONICET), Marcelo F. Frias (Dept. of Software Engineering, Instituto Tecnológico de Buenos Aires).
12:12 18m Talk		Predicting Configuration Performance in Multiple Environments with Sequential Meta-Learning (Research Papers). Jingzhi Gong (Loughborough University), Tao Chen (University of Birmingham). Pre-print.