Thu 18 Jul 2024 15:12 - 15:30 at Pitomba - Software Maintenance and Comprehension 3 Chair(s): Xin Xia

Sharing research artifacts (e.g., software, data, protocols) is an immensely important topic for improving transparency, replicability, and reusability in research, and has recently gained more and more traction in software engineering. For instance, recent studies have focused on artifact reviewing, the impact of open science, and specific legal or ethical issues of sharing artifacts. Most such studies are concerned with artifacts created by the researchers themselves (e.g., scripts, algorithms, tools) and with processes for assuring the quality of these artifacts (e.g., through artifact-evaluation committees). In contrast, the practices and challenges of sharing software-evolution datasets (i.e., republished version-control data with person-related information) have only been touched upon in such work. To tackle this gap, we conducted a meta study of software-evolution datasets published at the International Conference on Mining Software Repositories from 2017 until 2021 and snowballed a set of papers that build upon these datasets. Investigating 200 papers, we elicited what types of software-evolution datasets have been shared, following what practices, and what challenges researchers experienced with sharing or using the datasets. We discussed our findings with an authority on research-data management and ethics reviews through a semi-structured interview to put the practices and challenges into context. Through our meta study, we provide an overview of the sharing practices for software-evolution datasets and the corresponding challenges. The expert interview enriched this analysis by discussing how to solve the challenges and defining recommendations for sharing software-evolution datasets in the future. Our results extend and complement current research, and we are confident that they help researchers share software-evolution datasets (as well as datasets involving the same types of data) in a reliable, ethical, and trustworthy way.

Thu 18 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
Software Maintenance and Comprehension 3
Research Papers / Journal First at Pitomba
Chair(s): Xin Xia Huawei Technologies
14:00
18m
Talk
Revealing Software Development Work Patterns with PR-Issue Graph Topologies
Research Papers
Cleidson de Souza Federal University of Pará, Brazil, Emilie Ma University of British Columbia, Jesse Wong University of British Columbia, Dongwook Yoon University of British Columbia, Ivan Beschastnikh University of British Columbia
14:18
18m
Talk
Using acceptance tests to predict merge conflict risk
Journal First
Thaís Rocha UFAPE - Universidade Federal do Agreste de Pernambuco, Paulo Borba Federal University of Pernambuco
Pre-print
14:36
18m
Talk
Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions
Research Papers
Tao Xiao Nara Institute of Science and Technology, Hideaki Hata Shinshu University, Christoph Treude Singapore Management University, Kenichi Matsumoto Nara Institute of Science and Technology
Pre-print Media Attached
14:54
18m
Talk
SimLLM: Measuring Semantic Similarity in Code Summaries Using a Large Language Model-Based Approach
Research Papers
Xin Jin Meta, Zhiqiang Lin The Ohio State University
15:12
18m
Talk
Sharing Software-Evolution Datasets: Practices, Challenges, and Recommendations
Research Papers
David Broneske DZHW Hannover, Germany, Sebastian Kittan Otto-von-Guericke University Magdeburg, Germany, Jacob Krüger Eindhoven University of Technology