A Miss Is as Good as A Mile: Metamorphic Testing for Deep Learning Operators (FSE 2024 - Research Papers)

Who

Jinyin Chen, Chengyu Jia, Yunjie Yan, Jie Ge, haibin zheng, Yao Cheng

Track

FSE 2024 Research Papers

Time Zone

Times are shown in the conference time zone: (GMT-03:00) Brasilia, Distrito Federal, Brazil. The GMT offset reflects the offset at the time of the conference.

When

Thu 18 Jul 2024 16:54 - 17:12 at Pitanga - Testing 3 Chair(s): Qi Xin

Abstract

Deep learning (DL) is a critical tool for real-world applications, and comprehensive testing of DL models is vital to ensure their quality before deployment. However, recent studies have shown that even subtle deviations in DL operators can have catastrophic consequences, underscoring the importance of testing these components rigorously. Unlike testing other DL system components, operator analysis poses unique challenges because of complex inputs and uncertain outputs. The existing DL operator testing approach is limited in testing efficiency and error localization. In this paper, we propose Meta, a novel operator testing framework based on metamorphic testing that automatically tests operators and assists bug localization using metamorphic relations (MRs). Meta distinguishes itself in three key ways: (1) it considers both parameters and input tensors when detecting operator errors, enabling it to identify both implementation errors and precision errors; (2) it uses MRs to guide the generation of more effective inputs (i.e., tensors and parameters) in less time; (3) it assists precision error localization by tracing the error to the input level of the operator based on MR violations. We designed 21 MRs for testing 10 widely used DL operators. To assess the effectiveness of Meta, we conducted experiments on 9 released versions of 5 popular DL libraries. Our results show that Meta detected 32 bugs, including 14 new ones that were reported to the respective platforms and partially confirmed. Meta was also efficient: it detected about 1.89 times more errors than the baseline while requiring only about 0.5 times the baseline's cost.
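To make the idea of a metamorphic relation for an operator concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the shift-invariance relation for softmax, the deliberately faulty `clamped_softmax` operator, and the helper `check_shift_invariance` are all hypothetical, chosen only to illustrate how an MR violation can flag a faulty operator without a ground-truth output and point at the most affected input position.

```python
import math


def softmax(xs):
    """Reference softmax with max-subtraction for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def clamped_softmax(xs, limit=10.0):
    """Hypothetical faulty operator: silently saturates inputs above `limit`."""
    exps = [math.exp(min(x, limit)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def check_shift_invariance(op, xs, shift, tol=1e-9):
    """Check the assumed MR op(x + c) == op(x) for a softmax-like operator.

    Returns (violated, worst_index, worst_deviation), where worst_index is
    the position whose output changed most between the source run and the
    follow-up run.
    """
    source = op(xs)
    follow_up = op([x + shift for x in xs])
    deviations = [abs(a - b) for a, b in zip(source, follow_up)]
    worst_index = max(range(len(deviations)), key=deviations.__getitem__)
    worst = deviations[worst_index]
    return worst > tol, worst_index, worst


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    for op in (softmax, clamped_softmax):
        violated, index, _ = check_shift_invariance(op, xs, shift=20.0)
        print(op.__name__, "violates MR:", violated, "worst index:", index)
```

The check never needs the "correct" softmax value. It only compares two runs of the same operator on related inputs, which is what makes metamorphic testing usable when outputs are uncertain.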

Jinyin Chen

Zhejiang University of Technology

China

Chengyu Jia

Zhejiang University of Technology

China

Yunjie Yan

Zhejiang University of Technology

China

Jie Ge

Zhejiang University of Technology

China

haibin zheng

Zhejiang University of Technology

China

Yao Cheng

TÜV SÜD Asia Pacific Pte. Ltd.

Session Program

Thu 18 Jul
Displayed time zone: Brasilia, Distrito Federal, Brazil

16:00 - 18:00	Testing 3 (Ideas, Visions and Reflections / Demonstrations / Research Papers / Journal First) at Pitanga. Chair(s): Qi Xin (Wuhan University)

16:00 18m Talk		Search-based Software Testing Driven by Automatically Generated and Manually Defined Fitness Functions. Journal First. Federico Formica (McMaster University), Tony Fan (McMaster University), Claudio Menghi (University of Bergamo; McMaster University)
16:18 9m Talk		Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is Significantly More Executed. Ideas, Visions and Reflections. Andre Hora (UFMG)
16:36 18m Talk		Finding and Understanding Defects in Static Analyzers by Constructing Automated Oracles. Research Papers. weigang he (East China Normal University / University of Technology Sydney), Peng Di (Ant Group), Mengli Ming (East China Normal University), Chengyu Zhang (ETH Zurich), Ting Su (East China Normal University), Shijie Li (Ant Group), Yulei Sui (UNSW)
16:54 18m Talk		A Miss Is as Good as A Mile: Metamorphic Testing for Deep Learning Operators. Research Papers. Jinyin Chen (Zhejiang University of Technology), Chengyu Jia (Zhejiang University of Technology), Yunjie Yan (Zhejiang University of Technology), Jie Ge (Zhejiang University of Technology), haibin zheng (Zhejiang University of Technology), Yao Cheng (TÜV SÜD Asia Pacific Pte. Ltd.)
17:12 9m Talk		ExLi: An Inline-Test Generation Tool for Java. Demonstrations. Yu Liu (University of Texas at Austin), Aditya Thimmaiah (The University of Texas at Austin), Owolabi Legunsen (Cornell University), Milos Gligoric (The University of Texas at Austin)
17:21 9m Talk		ATheNA-S: a Testing Tool for Simulink Models Driven by Software Requirements and Domain Expertise. Demonstrations. Federico Formica (McMaster University), Mohammad Mahdi Mahboob (McMaster University), Mehrnoosh Askarpour (McMaster University), Claudio Menghi (University of Bergamo; McMaster University)
17:30 9m Talk		Test Polarity: Detecting Positive and Negative Tests. Ideas, Visions and Reflections. Andre Hora (UFMG)
17:39 18m Talk		Java JIT Testing with Template Extraction. Research Papers. Zhiqiang Zang (The University of Texas at Austin), Fu-Yao Yu (The University of Texas at Austin), Aditya Thimmaiah (The University of Texas at Austin), August Shi (The University of Texas at Austin), Milos Gligoric (The University of Texas at Austin)