DiffCoder: Enhancing Large Language Models on API Invocation via Analogical Code Exercises
The task of code generation aims to produce code solutions for given programming problems. Recently, code large language models (code LLMs) have shed new light on this task, owing to their formidable code generation capabilities. While these models are powerful, existing work seldom focuses on further improving the accuracy of library-oriented API invocation, even though programmers frequently invoke APIs in routine coding tasks. In this paper, we aim to enhance the API invocation proficiency of existing code LLMs by mimicking \textit{analogical learning}, a critical strategy by which humans learn from the differences among multiple instances. Motivated by this, we propose a simple yet effective approach, namely DiffCoder, which improves API invocation by training on the differences (diffs) between analogical code exercises. To assess the API invocation capabilities of code LLMs, we conduct experiments on seven existing benchmarks that focus on mono-library API invocation. Additionally, we construct a new benchmark, namely PanNumEval, to evaluate multi-library API invocation. Extensive experiments on the eight benchmarks demonstrate the impressive performance of DiffCoder. Furthermore, we develop a VSCode plugin for DiffCoder, and feedback from twelve invited participants further confirms its practicality.