Automated code translation tools, namely transpilers, perform source-to-source translation (e.g., Java to Python). Current state-of-the-art learning-based transpilers (e.g., TransCoder) have demonstrated impressive improvements in both translation accuracy and readability over rule-based counterparts (e.g., j2py), largely owing to their task-specific pre-training on extensive monolingual corpora. Despite these advancements, however, their performance remains unsatisfactory for practical deployment, and the associated training resources are prohibitively expensive. Large Language Models (LLMs), pre-trained on huge amounts of human-written code/text, have shown remarkable performance in many software engineering fields (e.g., code generation and program repair) thanks to their strong generality, even without task-specific re-training/fine-tuning. Thus, LLMs can potentially circumvent the above limitations, but their capability in code translation has not been thoroughly explored yet.

In this paper, we perform the first extensive study of five LLMs and three state-of-the-art learning-based transpilers on automated code translation tasks between Python, Java, and C++. Our investigation finds that, although certain LLMs outperform current transpilers, they still exhibit accuracy issues. Taking GPT-3.5, one of the state-of-the-art LLMs, as an example, we carry out an in-depth analysis and categorization of its failures. The results show that most failures are induced by (1) a lack of comprehension of the source programs (38.51%), (2) missing clear instructions on Input/Output (I/O) types during translation (14.94%), and (3) ignoring the discrepancies between source and target programs (41.38%).
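As a hypothetical illustration of the third failure category (this example is ours, not drawn from the study's data), consider floor division: a literal Python-to-Java translation that maps Python's \texttt{//} to Java's \texttt{/} silently changes the semantics for negative operands.

\begin{verbatim}
# Python floor division rounds toward negative infinity:
print(-7 // 2)   # -4
# A literal Java translation, -7 / 2, truncates toward zero
# and yields -3, so the translated program diverges on
# negative inputs even though both lines look equivalent.
\end{verbatim}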

Motivated by the above findings, we further propose \textbf{UniTrans}, a \textbf{Uni}fied code \textbf{Trans}lation framework applicable to various LLMs, to unleash their power in this field. Specifically, \textbf{UniTrans} first crafts a series of test cases for the target programs with the assistance of the source programs. Next, since test cases encode the programs' behavioral requirements and carry explicit I/O type information, \textbf{UniTrans} harnesses them to augment code translation and then evaluates the correctness of translated programs via execution. Afterward, to alleviate failures caused by ignored source-target discrepancies, \textbf{UniTrans} further repairs incorrectly translated programs, prompted by test-case execution results, with an option of iterative repair for practitioners. Extensive experiments are conducted on six translation settings between Python, Java, and C++. Three state-of-the-art LLMs of diverse sizes, i.e., GPT-3.5, LLaMA-13B, and LLaMA-7B, are tested with \textbf{UniTrans}, and all achieve substantial improvements in terms of Computational Accuracy (CA) and Exact Match Accuracy (EM Acc) across almost all translation settings, demonstrating the universal effectiveness of \textbf{UniTrans} in practice.
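To make the pipeline concrete, the following is a minimal sketch of the three \textbf{UniTrans} stages in Python. The \texttt{llm} and \texttt{run\_tests} callables, the prompt wording, and all identifiers are illustrative placeholders under our own assumptions, not the paper's actual implementation.

\begin{verbatim}
from typing import Callable, List, Tuple

def unitrans(source: str, src_lang: str, tgt_lang: str,
             llm: Callable[[str], str],
             run_tests: Callable[[str, List[str]], Tuple[bool, str]],
             max_repairs: int = 3) -> str:
    # Stage 1: craft test cases for the target program with the
    # assistance of the source program.
    tests = llm(
        f"Given this {src_lang} program, write test cases for its "
        f"{tgt_lang} counterpart, covering input/output types:\n{source}"
    ).splitlines()

    # Stage 2: translate with the test cases in the prompt, so the
    # model sees concrete I/O types and behavioral requirements.
    translation = llm(
        f"Translate this {src_lang} program into {tgt_lang}. It must "
        f"pass these tests:\n" + "\n".join(tests) + f"\n\n{source}"
    )

    # Stage 3: evaluate correctness via execution and repair failed
    # translations, prompted by the test execution feedback;
    # iterate up to max_repairs times.
    for _ in range(max_repairs):
        passed, feedback = run_tests(translation, tests)
        if passed:
            break
        translation = llm(
            f"This {tgt_lang} translation fails its tests.\n"
            f"Execution feedback:\n{feedback}\nFix it:\n{translation}"
        )
    return translation
\end{verbatim}

In this sketch, a translation is accepted once it passes all generated tests; setting \texttt{max\_repairs} to one corresponds to disabling the iterative-repair option.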