Thu 18 Jul 2024 14:54 - 15:12 at Pitomba - Software Maintenance and Comprehension 3 Chair(s): Xin Xia

Code summaries are pivotal in software engineering, serving to improve code readability, maintainability, and collaboration. While recent advancements in Large Language Models (LLMs) have opened new avenues for automatic code summarization, existing metrics for evaluating summary quality, such as BLEU and BERTScore, have notable limitations. Specifically, these metrics either fail to capture the nuances of semantic meaning in summaries or are limited in understanding the domain-specific terminology and expressions prevalent in code summaries. In this paper, we introduce SimLLM, a novel LLM-based approach designed to more precisely evaluate the semantic similarity of code summaries. Built upon an autoregressive LLM using a specialized pretraining task on permuted inputs and a pooling-based pairwise similarity measure, SimLLM overcomes the shortcomings of existing metrics. Our empirical evaluations demonstrate that SimLLM not only outperforms existing metrics but also correlates strongly with human ratings.
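The abstract mentions a pooling-based pairwise similarity measure without giving details. As a rough illustration of the general idea only, and not the SimLLM implementation, the sketch below mean-pools per-token embedding vectors into one vector per summary and compares two summaries by cosine similarity. The function names and the tiny two-dimensional embeddings are hypothetical; in practice the vectors would come from a language model.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(token_embeddings: Sequence[Sequence[float]],
              mask: Sequence[int]) -> Vector:
    """Average the token vectors whose mask entry is non-zero (padding is skipped)."""
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    kept = 0
    for vec, keep in zip(token_embeddings, mask):
        if keep:
            kept += 1
            for i, x in enumerate(vec):
                pooled[i] += x
    if kept == 0:
        return pooled  # all positions masked: return the zero vector
    return [x / kept for x in pooled]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def summary_similarity(emb_a, mask_a, emb_b, mask_b) -> float:
    """Pool each summary's token embeddings, then compare the pooled vectors."""
    return cosine_similarity(mean_pool(emb_a, mask_a), mean_pool(emb_b, mask_b))


# Hypothetical toy embeddings (assumption: a real model would produce these).
summary_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]  # last row is padding
mask_a = [1, 1, 0]
summary_b = [[2.0, 2.0]]
mask_b = [1]
summary_c = [[1.0, -1.0]]
mask_c = [1]

# a and b pool to parallel vectors, so their similarity is close to 1;
# a and c pool to orthogonal vectors, so their similarity is 0.
print(summary_similarity(summary_a, mask_a, summary_b, mask_b))
print(summary_similarity(summary_a, mask_a, summary_c, mask_c))
```

Unlike n-gram overlap, a measure of this kind can score two summaries as similar even when they share no words, provided the underlying embeddings place them close together.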

Thu 18 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
Software Maintenance and Comprehension 3 (Research Papers / Journal First) at Pitomba
Chair(s): Xin Xia Huawei Technologies
14:00
18m
Talk
Revealing Software Development Work Patterns with PR-Issue Graph Topologies
Research Papers
Cleidson de Souza Federal University of Pará, Brazil, Emilie Ma University of British Columbia, Jesse Wong University of British Columbia, Dongwook Yoon University of British Columbia, Ivan Beschastnikh University of British Columbia
14:18
18m
Talk
Using acceptance tests to predict merge conflict risk
Journal First
Thaís Rocha UFAPE - Universidade Federal do Agreste de Pernambuco, Paulo Borba Federal University of Pernambuco
14:36
18m
Talk
Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions
Research Papers
Tao Xiao Nara Institute of Science and Technology, Hideaki Hata Shinshu University, Christoph Treude Singapore Management University, Kenichi Matsumoto Nara Institute of Science and Technology
14:54
18m
Talk
SimLLM: Measuring Semantic Similarity in Code Summaries Using a Large Language Model-Based Approach
Research Papers
Xin Jin Meta, Zhiqiang Lin The Ohio State University
15:12
18m
Talk
Sharing Software-Evolution Datasets: Practices, Challenges, and Recommendations
Research Papers
David Broneske DZHW Hannover, Germany, Sebastian Kittan Otto-von-Guericke University Magdeburg, Germany, Jacob Krüger Eindhoven University of Technology