PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models
In recent years, a plethora of Large Code Generation Models (LCGMs) have been proposed, showing significant potential for assisting developers with complex programming tasks. Amid this surge of proposals, a critical aspect of code generation research is effectively benchmarking the programming capabilities of each model. Benchmarking LCGMs requires a diverse set of programming problems, each comprising a prompt, a canonical solution, and test inputs.

Existing methods for constructing such problem sets fall into two categories: manual construction and perturbation of existing problems. Both have major limitations. Manual construction requires substantial human effort and does not scale, and manually created problem sets struggle to maintain long-term data integrity because LCGMs greedily collect training data, so published problems eventually leak into training corpora. Perturbation-based approaches, in contrast, mainly produce semantically homogeneous problems whose canonical solutions are identical to those of the seed problems, and they tend to introduce typos into the prompt that IDEs easily detect, making the perturbed problems unrealistic.

Addressing these limitations raises three challenges: (1) automatically generating semantically diverse canonical solutions, (2) ensuring long-term data integrity, and (3) generating grammatically correct programming problems. For the first challenge, our key insight is to view a program as a mapping from an input domain to an output domain: the output of one program can serve as the input of another. Building on this insight, we propose programming problem merging, which combines two existing programming problems into a semantically new one. For the second challenge, we introduce randomness into the generation process; because the random search space is large, two independent generation trials repeat a problem only with negligible probability (for a uniform search space of size N, the collision probability is 1/N). For the third challenge, we propose the Lambda Programming Problem: a concise, grammatically correct one-sentence task description in natural language paired with a corresponding program implementation. Because each appended sentence is grammatically correct, the merged prompt remains grammatically correct, and the tool additionally uses return-value type analysis to verify the correctness of newly created canonical solutions.

In our empirical evaluation, we apply our tool to two widely used datasets and compare it against six baseline methods on eight code generation models. The results demonstrate that our tool generates challenging, diverse, and natural coding problems, surpassing the baselines.
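To make the merging idea concrete, here is a minimal Python sketch of how two problems could be composed. This is our illustration of the technique described in the abstract, not the paper's implementation; the `Problem`, `LambdaProblem`, and `merge_problems` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Problem:
    prompt: str                     # natural-language task description
    solution: Callable[[Any], Any]  # canonical solution
    test_inputs: List[Any]          # inputs used to exercise the solution

@dataclass
class LambdaProblem:
    description: str                # one grammatically correct sentence
    func: Callable[[Any], Any]      # its program implementation

def merge_problems(seed: Problem, lam: LambdaProblem) -> Problem:
    """Feed the seed solution's output into the lambda problem's program,
    producing a new problem whose canonical solution differs from the seed's."""
    new_prompt = seed.prompt.rstrip() + " " + lam.description
    new_solution = lambda x: lam.func(seed.solution(x))
    return Problem(new_prompt, new_solution, seed.test_inputs)

# Example: a seed that sums a list, merged with a squaring lambda problem.
seed = Problem("Return the sum of the given list of integers.",
               lambda xs: sum(xs), [[1, 2, 3], [4, 5]])
square = LambdaProblem("Then return the square of that result.",
                       lambda y: y * y)
merged = merge_problems(seed, square)
assert merged.solution([1, 2, 3]) == 36  # (1 + 2 + 3) ** 2
```

Because the merged prompt is the seed prompt plus one grammatically correct sentence, no typos are introduced, unlike perturbation-based approaches.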
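The return-value type analysis could similarly be approximated with a dynamic check before merging. The sketch below reuses the records from the example above; `returns_type` is again a hypothetical name, and the real tool may well analyze types statically.

```python
def returns_type(seed: Problem, expected: type) -> bool:
    """Dynamically check that the seed solution's return values match the
    input type a candidate lambda problem expects, before merging."""
    return all(isinstance(seed.solution(x), expected) for x in seed.test_inputs)

# The sum-of-list seed returns ints, valid inputs for the squaring lambda.
assert returns_type(seed, int)
```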
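The data-integrity claim can also be sketched with back-of-the-envelope numbers (ours, not the paper's): chaining k lambda problems drawn uniformly from a pool of size m yields m ** k distinct variants, so two independent trials coincide with probability 1 / m ** k.

```python
import random

def random_chain(pool, k, rng=random):
    """Sample k lambda problems uniformly (with replacement); the resulting
    search space has size len(pool) ** k."""
    return [rng.choice(pool) for _ in range(k)]

# A pool of 50 lambda problems chained 4 deep gives 50 ** 4 (6,250,000)
# variants, so two trials repeat a problem with probability about 1.6e-7.
```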
Fri 19 Jul, 14:00 - 15:30 (times in Brasilia, Distrito Federal, Brazil)

14:00 (18 min, Talk) PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models. Research Papers. Simin Chen (University of Texas at Dallas), XiaoNing Feng (Taiyuan University of Technology), Xiaohong Han (Taiyuan University of Technology), Cong Liu (University of California, Riverside), Wei Yang (University of Texas at Dallas)

14:18 (18 min, Talk) Demystifying Invariant Effectiveness for Securing Smart Contracts. Research Papers. Zhiyang Chen (University of Toronto), Ye Liu (Nanyang Technological University), Sidi Mohamed Beillahi (University of Toronto), Yi Li (Nanyang Technological University), Fan Long (University of Toronto)

14:36 (18 min, Talk) Static Application Security Testing (SAST) Tools for Smart Contracts: How Far Are We? Research Papers. Kaixuan Li (East China Normal University), Yue Xue (Metatrust Labs), Sen Chen (Tianjin University), Han Liu (East China Normal University), Kairan Sun (Nanyang Technological University), Ming Hu (Singapore Management University), Haijun Wang (Xi'an Jiaotong University), Yang Liu (Nanyang Technological University), Yixiang Chen (East China Normal University)

14:54 (18 min, Talk) On the Contents and Utility of IoT Cybersecurity Guidelines. Research Papers. Jesse Chen (University of Arizona), Dharun Anandayuvaraj (Purdue University), James C. Davis (Purdue University), Sazzadur Rahaman (University of Arizona)

15:12 (18 min, Talk) CVECenter: Industry Practice of Automated Vulnerability Management for Linux Distribution Community. Industry Papers. Jing Luo (Central South University), Heyuan Shi (Central South University), Yongchao Zhang (Alibaba), Runzhe Wang (Alibaba Group), Yuheng Shen (Tsinghua University), Yuao Chen (Alibaba), Rongkai Liu (Central South University), Xiaohai Shi (Alibaba Group), Chao Hu (Central South University), Yu Jiang (Tsinghua University)