Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow (FSE 2024 - Journal First)

Mon 15 - Fri 19 July 2024, Porto de Galinhas, Brazil

Who

Florian Tambon, Amin Nikanjam, Le An, Foutse Khomh, Giuliano Antoniol

Track

FSE 2024 Journal First

When

Thu 18 Jul 2024 14:36 - 14:54 at Acerola - Empirical Studies 3 Chair(s): Shane McIntosh

Abstract

Deep Learning (DL) frameworks are now widely used, simplifying the creation of complex models and their integration into various applications, even for non-DL experts. However, like any other program, they are prone to bugs. This paper deals with a subcategory of bugs called silent bugs: they lead to wrong behavior but do not cause system crashes or hangs, nor show an error message to the user. Such bugs are even more dangerous in DL applications and frameworks due to the “black-box” and stochastic nature of DL systems (i.e., the end user cannot understand how the model makes decisions). This paper presents the first empirical study of silent bugs in TensorFlow, specifically its high-level API Keras, and their impact on users’ programs. We extracted closed issues related to the Keras API from the TensorFlow GitHub repository. Out of the 1,168 issues that we gathered, 77 were reproducible silent bugs affecting users’ programs. We categorized the bugs based on their effects on users’ programs and the components where the issues occurred, using information from the issue reports. We then derived a threat level for each issue, based on its impact on users’ programs. To assess the relevance of the identified categories and the impact scale, we conducted an online survey with 103 DL developers. The participants generally agreed that silent bugs in DL frameworks have a significant impact on users, and acknowledged our findings (i.e., the categories of silent bugs and the proposed impact scale).

Link to Publication

https://link.springer.com/article/10.1007/s10664-023-10389-6

Authorizer Link

https://rdcu.be/dFsN0

DOI

https://doi.org/10.1007/s10664-023-10389-6

Florian Tambon

Polytechnique Montréal

Canada

Amin Nikanjam

École Polytechnique de Montréal

Canada

Le An

Polytechnique Montréal

Canada

Foutse Khomh

Polytechnique Montréal

Canada

Giuliano Antoniol

Polytechnique Montréal

Canada

Session Program

Thu 18 Jul
Displayed time zone: (GMT-03:00) Brasilia, Distrito Federal, Brazil

14:00 - 15:30	Empirical Studies 3 (Research Papers / Journal First) at Acerola. Chair(s): Shane McIntosh, University of Waterloo

14:00 18m Talk	Understanding the Impact of APIs Behavioral Breaking Changes on Client Applications (Research Papers). Dhanushka Jayasuriya (University of Auckland), Valerio Terragni (University of Auckland), Jens Dietrich (Victoria University of Wellington), Kelly Blincoe (University of Auckland)
14:18 18m Talk	Analyzing the BizDev Interface in an Enterprise Context: A Case of Developers Acting in Business (Journal First). Breno de França (UNICAMP), Caique Moreira (Instituto de Computação - Universidade Estadual de Campinas), Tayana Conte (Universidade Federal do Amazonas)
14:36 18m Talk	Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow (Journal First). Florian Tambon (Polytechnique Montréal), Amin Nikanjam (École Polytechnique de Montréal), Le An (Polytechnique Montréal), Foutse Khomh (Polytechnique Montréal), Giuliano Antoniol (Polytechnique Montréal)
14:54 18m Talk	AROMA: Automatic Reproduction of Maven Artifacts (Research Papers). Mehdi Keshani (Delft University of Technology), Tudor-Gabriel Velican (Delft University of Technology), Gideon Bot (Delft University of Technology), Sebastian Proksch (Delft University of Technology)
15:12 18m Talk	An Empirical Study of Task Infections in Ansible Scripts (Journal First). Akond Rahman (Auburn University), Dibyendu Brinto Bose (Graduate Student), Yue Zhang (Auburn University), Rahul Pandita (GitHub, Inc.)