Thu 18 Jul 2024 12:12 - 12:30 at Pitomba - Software Maintenance and Comprehension 2 Chair(s): Denys Poshyvanyk

Modern software systems are increasingly dependent upon code from external packages (i.e., dependencies). Building upon external packages boosts developer productivity and allows software reuse to seamlessly span across projects. Package maintainers regularly release updated versions to provide new features, fix defects, and address security vulnerabilities. Due to the potential for regression, managing dependencies is not just a trivial matter of selecting the latest versions. Since it is perceived to be less risky to retain a dependency than remove it, as projects evolve, they tend to accrue dependencies, exacerbating the difficulty of dependency management. It is not uncommon for a considerable proportion of external packages to be unused by the projects that list them as a dependency. Although such unused dependencies are not required to build and run the project, updates to their dependency specifications will still trigger Continuous Integration (CI) builds. The CI builds that are initiated by updates to unused dependencies are fundamentally wasteful. Considering that CI build time is a finite resource that is directly associated with project development and service operational costs, understanding the consequences of unused dependencies within this CI context is of practical importance.

In this paper, we conduct the first study of the CI waste that is generated by updates to unused dependencies. We collect a dataset of 20,743 commits that solely update dependency specifications (i.e., the package.json file) and their corresponding builds, spanning 1,487 projects that adopt Node Package Manager (NPM) for managing their dependencies. Our findings illustrate that 55.88% of the build time that is associated with dependency updates is triggered only by unused dependencies. At the project level, the median project spends 54.45% of its dependency-related build time on updates to unused dependencies. Moreover, we find that automated bots are the primary producers of dependency-induced CI waste, contributing 89.12% of the build time that is spent on unused dependencies. The popular Dependabot is responsible for updates to unused dependencies that account for 74.52% of that waste.
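The two headline figures above measure different things: one is a share of all dependency-related build time pooled across projects, the other is the median of per-project shares. A minimal sketch of that distinction follows; the record layout and function names are hypothetical, and the toy numbers in the usage note are invented purely for illustration and are not taken from the study.

```python
from statistics import median

def wasted_share(builds):
    """Fraction of build time spent on unused-dependency updates.

    `builds` is a list of (project, duration_seconds, is_unused_dep_update)
    tuples; this layout is an assumption made for the example.
    """
    total = sum(duration for _, duration, _ in builds)
    wasted = sum(duration for _, duration, unused in builds if unused)
    return wasted / total if total else 0.0

def median_project_share(builds):
    """Median, across projects, of each project's own wasted share."""
    by_project = {}
    for record in builds:
        by_project.setdefault(record[0], []).append(record)
    return median(wasted_share(records) for records in by_project.values())
```

With made-up records such as `[("a", 100, True), ("a", 100, False), ("b", 300, True), ("b", 100, False), ("c", 50, False)]`, the pooled share is 400/650 while the median project share is 0.5, which is why the two percentages need not agree.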

To mitigate the impact of unused dependencies on CI resources, we introduce Dep-sCImitar, an approach to cut down wasted CI time by identifying and skipping CI builds that are triggered by unused-dependency commits. A retrospective evaluation of the 20,743 studied commits shows that Dep-sCImitar cuts wasted CI build time by 68.34%, skipping wasteful builds with a precision of 94%. We make this approach available as a prototype tool that can be integrated with any JavaScript project that uses NPM for handling dependencies to automatically skip CI builds that are triggered by unused-dependency commits.
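The abstract does not say how Dep-sCImitar decides that a dependency is unused, so the following is only a rough sketch of the general idea and not the authors' implementation. It assumes a package counts as "used" when some source file imports it via `require(...)` or `import`, detected with a simplified regular expression; all function names are hypothetical. A real tool would also need to handle packages that are invoked only from npm scripts or configuration files, which this heuristic would wrongly treat as unused.

```python
import json
import re

# Simplified heuristic (an assumption): matches require('pkg'),
# import ... from 'pkg', and import 'pkg'.
IMPORT_RE = re.compile(
    r"""(?:require\(\s*|from\s+|import\s+)['"]([^'"]+)['"]"""
)

DEP_FIELDS = ("dependencies", "devDependencies")

def package_name(specifier):
    """Map an import specifier to a package name; None for relative paths."""
    if specifier.startswith((".", "/")):
        return None
    parts = specifier.split("/")
    if specifier.startswith("@"):
        return "/".join(parts[:2])  # scoped package, e.g. @scope/pkg
    return parts[0]

def used_packages(sources):
    """Collect package names imported anywhere in the given source texts."""
    used = set()
    for text in sources:
        for specifier in IMPORT_RE.findall(text):
            name = package_name(specifier)
            if name:
                used.add(name)
    return used

def changed_dependencies(old_manifest, new_manifest):
    """Names whose version spec was added, removed, or modified."""
    changed = set()
    for field in DEP_FIELDS:
        old = old_manifest.get(field, {})
        new = new_manifest.get(field, {})
        for name in old.keys() | new.keys():
            if old.get(name) != new.get(name):
                changed.add(name)
    return changed

def without_deps(manifest):
    """The manifest with its dependency fields removed."""
    return {k: v for k, v in manifest.items() if k not in DEP_FIELDS}

def can_skip_build(old_json, new_json, sources):
    """True if the package.json change touches only unused dependencies."""
    old_manifest = json.loads(old_json)
    new_manifest = json.loads(new_json)
    # Any change outside the dependency fields is treated as build-relevant.
    if without_deps(old_manifest) != without_deps(new_manifest):
        return False
    changed = changed_dependencies(old_manifest, new_manifest)
    return bool(changed) and changed.isdisjoint(used_packages(sources))
```

In a CI pipeline, a check like `can_skip_build` would run on the old and new contents of package.json before the build starts. The sketch errs toward building whenever anything other than a dependency version changes.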

Thu 18 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

11:00 - 12:30
Software Maintenance and Comprehension 2 - Research Papers at Pitomba
Chair(s): Denys Poshyvanyk William & Mary
11:00
18m
Talk
Bloat beneath Python's Scales: A Fine-Grained Inter-Project Dependency Analysis
Research Papers
Georgios-Petros Drosos ETH Zurich, Thodoris Sotiropoulos ETH Zurich, Diomidis Spinellis Athens University of Economics and Business & Delft University of Technology, Dimitris Mitropoulos University of Athens
DOI Pre-print
11:18
18m
Research paper
Characterizing Python Library Migrations
Research Papers
Mohayeminul Islam University of Alberta, Ajay Jha North Dakota State University, Ildar Akhmetov Northeastern University, Sarah Nadi New York University Abu Dhabi, University of Alberta
DOI Pre-print
11:36
18m
Talk
PyRadar: Towards Automatically Retrieving and Validating Source Code Repository Information for PyPI Packages
Research Papers
Kai Gao University of Science and Technology Beijing, Weiwei Xu Peking University, Wenhao Yang Peking University, Minghui Zhou Peking University
DOI Pre-print
11:54
18m
Talk
Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models
Research Papers
Zejun Zhang Australian National University, Zhenchang Xing CSIRO's Data61, Xiaoxue Ren Zhejiang University, Qinghua Lu Data61, CSIRO, Xiwei (Sherry) Xu Data61, CSIRO
12:12
18m
Talk
Dependency-Induced Waste in Continuous Integration: An Empirical Study of Unused Dependencies in the NPM Ecosystem
Research Papers
Nimmi Weeraddana University of Waterloo, Mahmoud Alfadel University of Waterloo, Shane McIntosh University of Waterloo
DOI Pre-print