Wed 17 Jul 2024 16:54 - 17:03 at Pitanga - Testing 2 Chair(s): Wing Lam

Benchmarks are among the main drivers of progress in software engineering research, especially in software testing and debugging. However, current benchmarks in this field are not well suited to some research tasks: they rely on weak system oracles such as crash detection, come with only a few unit tests, need further elaboration before they can be used, or cannot verify the outcome of system tests.

Our Tests4Py benchmark addresses these issues. It is derived from the popular BugsInPy benchmark and includes 61 bugs from 7 real-world Python applications and, in addition, 6 bugs from 4 example programs. Each subject in Tests4Py comes with an oracle that verifies the functional correctness of system inputs. Tests4Py also enables the generation of system tests and unit tests, which supports both qualitative studies of essential aspects of test sets and extensive evaluations. These opportunities make Tests4Py a next-generation benchmark for research in test generation, debugging, and automatic program repair.
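The difference between a crash-detection oracle and a functional oracle for system inputs can be illustrated with a minimal sketch. This is not the Tests4Py API: the toy subject `middle`, its seeded bug, and the helper names `run_system_test`, `crash_oracle`, and `functional_oracle` are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class Verdict(Enum):
    PASSING = "passing"
    FAILING = "failing"


def middle(x: int, y: int, z: int) -> int:
    """Hypothetical toy subject with a seeded bug: should return the median."""
    if y < z:
        if x < y:
            return y
        if x < z:
            return y  # seeded bug: the median here is x, not y
        return z
    if x > y:
        return y
    if x > z:
        return x
    return z


def run_system_test(system_input: str) -> str:
    """Run the subject on a textual system input such as '2 1 3'."""
    x, y, z = (int(tok) for tok in system_input.split())
    return str(middle(x, y, z))


def crash_oracle(system_input: str) -> Verdict:
    """Weak oracle: a run fails only if the subject raises an exception."""
    try:
        run_system_test(system_input)
    except Exception:
        return Verdict.FAILING
    return Verdict.PASSING


def functional_oracle(system_input: str) -> Verdict:
    """Functional oracle: compare the output with the true median."""
    expected = sorted(int(tok) for tok in system_input.split())[1]
    actual = run_system_test(system_input)
    return Verdict.PASSING if actual == str(expected) else Verdict.FAILING


if __name__ == "__main__":
    for system_input in ("1 2 3", "2 1 3"):
        print(system_input, crash_oracle(system_input), functional_oracle(system_input))
```

In this sketch, the input `2 1 3` triggers the seeded bug without raising an exception, so the crash oracle reports it as passing while the functional oracle reports it as failing.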

Wed 17 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

16:00 - 18:00
16:00
18m
Talk
Metamorphic Testing of Secure Multi-Party Computation (MPC) Compilers
Research Papers
Dongwei Xiao Hong Kong University of Science and Technology, Zhibo Liu The Hong Kong University of Science and Technology, Qi Pang Carnegie Mellon University, Shuai Wang The Hong Kong University of Science and Technology, Yichen Li Hong Kong University of Science and Technology
16:18
18m
Talk
Mobile Bug Report Reproduction via Global Search on the App UI Model
Research Papers
Zhaoxu Zhang University of Southern California, Fazle Mohammed Tawsif University of Southern California, Komei Ryu University of Southern California, Tingting Yu University of Connecticut, William G.J. Halfond University of Southern California
16:36
18m
Talk
FinHunter: Improved Search-based Test Generation for Structural Testing of FinTech Systems
Industry Papers
Xuanwen Ding East China Normal University, Qingshun Wang East China Normal University, Dan Liu East China Normal University, Lihua Xu New York University Shanghai, Jun Xiao Ant Group Co. Ltd., Bojun Zhang Ant Group Co. Ltd., Xue Li Ant Group Co. Ltd., Liang Dou East China Normal University, Liang He East China Normal University, Tao Xie Peking University
16:54
9m
Talk
Tests4Py: A Benchmark for System Testing
Demonstrations
Marius Smytzek CISPA Helmholtz Center for Information Security, Martin Eberlein Humboldt University of Berlin, Batuhan Serce CISPA Helmholtz Center for Information Security, Lars Grunske Humboldt-Universität zu Berlin, Andreas Zeller CISPA Helmholtz Center for Information Security
17:03
9m
Talk
On Polyglot Program Testing
Ideas, Visions and Reflections
Philémon Houdaille DIVERSE Team, IRISA-INRIA, CNRS, Université Rennes 1, Djamel Eddine Khelladi CNRS, IRISA, University of Rennes, Benoit Combemale University of Rennes, Inria, CNRS, IRISA, Gunter Mussbacher McGill University
17:12
9m
Talk
Ctest4J: A Practical Configuration Testing Framework for Java
Demonstrations
Shuai Wang University of Illinois at Urbana-Champaign, Xinyu Lian University of Illinois at Urbana-Champaign, Qingyu Li University of Illinois at Urbana-Champaign, Darko Marinov University of Illinois at Urbana-Champaign, Tianyin Xu University of Illinois at Urbana-Champaign
17:21
9m
Talk
Predicting Test Results without Execution
Ideas, Visions and Reflections
17:30
9m
Talk
Py-holmes: Causal Testing for Deep Neural Networks in Python
Demonstrations
Wren McQueary George Mason University, Sadia Afrin Mim George Mason University, Nishat Raihan George Mason University, Justin Smith Lafayette College, Brittany Johnson George Mason University
17:39
9m
Talk
AndroLog: Android Instrumentation and Code Coverage Analysis
Demonstrations
Jordan Samhi CISPA Helmholtz Center for Information Security, Andreas Zeller CISPA Helmholtz Center for Information Security
17:48
9m
Talk
PathSpotter: Exploring Tested Paths to Discover Missing Tests
Demonstrations