Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase involve pervasive edits across an entire code repository. We formulate these activities as repository-level coding tasks.
Recent tools like GitHub Copilot, which are powered by Large Language Models (LLMs), have succeeded in offering high-quality solutions to localized coding problems. Repository-level coding tasks are more involved and cannot be solved directly with LLMs, since code within a repository is inter-dependent and the entire repository may be too large to fit into the prompt. We frame repository-level coding as a planning problem and present a task-agnostic, neuro-symbolic framework called CodePlan to solve it. CodePlan synthesizes a multi-step chain of edits (a plan), where each step results in a call to an LLM on a code location, with context derived from the entire repository, previous code changes, and task-specific instructions. CodePlan is based on a novel combination of incremental dependency analysis, change may-impact analysis, and an adaptive planning algorithm (the symbolic components) with neural LLMs.
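The adaptive planning loop described above can be sketched as a worklist algorithm: edit a location with the LLM, then use dependency analysis to discover further locations the edit may impact, and keep going until the frontier is empty. The sketch below is a minimal illustration based only on this abstract; the function names, data structures, and the dictionary-based dependency graph are assumptions for illustration, not the authors' actual implementation.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Change:
    """A single edit in the chain-of-edits plan (hypothetical shape)."""
    location: str
    new_text: str

def impacted_locations(dep_graph, change):
    # Placeholder change may-impact analysis: here, simply the direct
    # dependents of the edited location in an adjacency-list graph.
    return dep_graph.get(change.location, [])

def code_plan(seed_locations, dep_graph, instructions, edit_fn, gather_context):
    """Adaptive planning: repeatedly edit a location via the LLM (edit_fn),
    then propagate to locations the edit may impact."""
    frontier = deque(seed_locations)   # locations still needing edits
    seen = set(seed_locations)         # avoid revisiting (handles cycles)
    plan = []                          # the synthesized chain of edits
    while frontier:
        loc = frontier.popleft()
        # Context is derived from the repository and previous changes.
        change = edit_fn(loc, gather_context(loc, plan), instructions)
        plan.append(change)
        for nxt in impacted_locations(dep_graph, change):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return plan
```

In practice the edit function would be an LLM call and the impact analysis an incremental program analysis; the stubs here only show how the symbolic components drive which locations the LLM is asked to edit next.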
We evaluate the effectiveness of CodePlan on two repository-level tasks: package migration (C#) and temporal code edits (Python). Each task is evaluated on multiple code repositories, each of which requires inter-dependent changes to many files (2 to 97 files). Coding tasks of this level of complexity have not been automated using LLMs before. Our results show that CodePlan achieves a better match with the ground truth than the baselines. CodePlan gets 5/7 repositories to pass the validity checks (i.e., to build without errors and make correct code edits), whereas the baselines (without planning but with the same type of contextual information as CodePlan) cannot get any of the repositories to pass them. We are making our (non-proprietary) data and evaluation scripts available for review and will open-source them upon publication.
Wed 17 Jul (displayed time zone: Brasilia, Distrito Federal, Brazil)
11:00 - 12:30 | Code Search and Completion | Industry Papers / Research Papers at Pitomba | Chair(s): Akond Rahman (Auburn University)
11:00 | 18m Talk | Leveraging Large Language Models for the Auto-remediation of Microservice Applications - An Experimental Study | Industry Papers | Komal Sarda (York University), Zakeya Namrud (York University), Marin Litoiu (York University, Canada), Larisa Shwartz (IBM T.J. Watson Research), Ian Watts (IBM Canada)
11:18 | 18m Talk | CodePlan: Repository-level Coding using LLMs and Planning | Research Papers | Ramakrishna Bairi (Microsoft Research, India), Atharv Sonwane (Microsoft Research, India), Aditya Kanade (Microsoft Research, India), Vageesh D C (Microsoft Research, India), Arun Iyer (Microsoft Research, India), Suresh Parthasarathy (Microsoft Research, India), Sriram Rajamani (Microsoft Research, India), B. Ashok (Microsoft Research, India), Shashank Shet (Microsoft Research, India)
11:36 | 18m Talk | An Empirical Study of Code Search in Intelligent Coding Assistant: Perceptions, Expectations, and Directions | Industry Papers | Chao Liu (Chongqing University), Xindong Zhang (Alibaba Cloud Computing Co. Ltd.), Hongyu Zhang (Chongqing University), Zhiyuan Wan (Zhejiang University), Zhan Huang (Chongqing University), Meng Yan (Chongqing University)
11:54 | 18m Talk | DeciX: Explain Deep Learning Based Code Generation Applications | Research Papers | Simin Chen (University of Texas at Dallas), Zexin Li (University of California, Riverside), Wei Yang (University of Texas at Dallas), Cong Liu (University of California, Riverside)
12:12 | 18m Talk | IRCoCo: Immediate Rewards-Guided Deep Reinforcement Learning for Code Completion | Research Papers | Bolun Li (Shandong Normal University), Zhihong Sun (Shandong Normal University), Tao Huang (Shandong Normal University), Hongyu Zhang (Chongqing University), Yao Wan (Huazhong University of Science and Technology), Chen Lyu (Shandong Normal University), Ge Li (Peking University), Zhi Jin (Peking University)