An Empirical Study on Focal Methods in Deep-Learning-Based Approaches for Assertion Generation
Unit testing is widely recognized as an essential aspect of the software development process. Generating high-quality assertions automatically is one of the most important and challenging problems in automatic unit test generation, and deep-learning-based approaches for assertion generation (DLAGs) have been proposed in recent years to address it. For state-of-the-art DLAGs, the focal method (i.e., the main method under test) of a unit test case is a required part of the input. To use DLAGs in practice, there are two main ways to provide a focal method: (1) manually providing the developer-intended focal method, or (2) identifying a likely focal method from the given test prefix (i.e., the complete unit test code excluding assertions) with test-to-code traceability techniques. However, state-of-the-art DLAGs are all evaluated on the ATLAS dataset, where the focal method of a test case is assumed to be the last non-JUnit-API method invoked in the complete unit test code (i.e., code from both the test prefix and the assertion portion). The existing empirical evaluations of DLAGs suffer from two issues that cause inaccurate assessment of DLAGs toward adoption in practice. First, it is unclear whether the last-call-before-assert (LCBA) technique accurately reflects developer-intended focal methods. Second, when applying DLAGs in practice, the assertion portion of a unit test is not available as part of the input to DLAGs (it is in fact their output); thus, the assumption made by the ATLAS dataset does not hold in practical scenarios of applying DLAGs. To address the first issue, we conduct a study of seven test-to-code traceability techniques in the scenario of assertion generation. We find that the LCBA technique is not the best among the seven techniques, identifying focal methods with only 43.38% precision and 38.42% recall; thus, LCBA cannot accurately reflect developer-intended focal methods, raising a concern about using the ATLAS dataset for evaluation. To address the second issue, along with the concern raised by the preceding finding, we apply each of the seven test-to-code traceability techniques to identify focal methods automatically from only test prefixes, and we construct a new dataset named ATLAS+ by replacing the focal methods in the existing ATLAS with those identified by each traceability technique. On a test set from ATLAS+, we evaluate four state-of-the-art DLAGs trained on a training set from the existing ATLAS. We find that all four DLAGs achieve lower accuracy on a test set in ATLAS+ than on the corresponding test set in the existing ATLAS, indicating that DLAGs should be (re)evaluated with test sets in ATLAS+, which better reflect practical scenarios of providing focal methods. In addition, we evaluate three state-of-the-art DLAGs trained on training sets in ATLAS+. We find that training on ATLAS+ effectively improves the accuracy of the ATLAS approach and the T5 approach over the same approaches trained on the corresponding training set from the existing ATLAS.
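For readers unfamiliar with these terms, the following minimal JUnit sketch illustrates the test prefix, the assertion portion, and why a focal method inferred from the complete test can differ from one inferred from the prefix alone. The BoundedStack class and all names are hypothetical and only for illustration; they are not drawn from the paper or the ATLAS dataset.

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class BoundedStackTest {

    // Minimal hypothetical class under test, defined inline so the example is self-contained.
    static class BoundedStack {
        private final int[] items;
        private int count;

        BoundedStack(int capacity) { items = new int[capacity]; }

        void push(int value) { items[count++] = value; }

        int size() { return count; }
    }

    @Test
    public void pushShouldIncreaseSize() {
        // ---- Test prefix: the complete test code excluding the assertion. ----
        BoundedStack stack = new BoundedStack(10);
        stack.push(42); // developer-intended focal method: push()

        // ---- Assertion portion: the output a DLAG is asked to generate. ----
        // Under the ATLAS assumption, the last non-JUnit-API call in the complete
        // test (stack.size(), inside the assertion) would be labeled the focal
        // method; given only the prefix, the last call is push() instead.
        assertEquals(1, stack.size());
    }
}

This mismatch between focal methods derived from the complete test and those derivable from the test prefix alone is what motivates replacing the ATLAS focal methods with prefix-derived ones in ATLAS+.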
Wed 17 Jul (displayed time zone: Brasilia, Distrito Federal, Brazil)
14:00 - 15:30 | Empirical Studies 1 (Industry Papers / Research Papers / Journal First) at Mandacaru. Chair(s): Ronnie de Souza Santos (University of Calgary)
14:00 (18m, Talk) | An Empirical Study on Focal Methods in Deep-Learning-Based Approaches for Assertion Generation (Research Papers). Yibo He (Peking University), Jiaming Huang (Peking University), Hao Yu (Peking University), Tao Xie (Peking University)
14:18 (18m, Talk) | Less Cybersickness, Please: Demystifying and Detecting Stereoscopic Visual Inconsistencies in Virtual Reality Applications (Research Papers). Shuqing Li (The Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology), Jianping Zhang (The Chinese University of Hong Kong), Yujia Zhang (Harbin Institute of Technology), Yepang Liu (Southern University of Science and Technology), Jiazhen Gu (The Chinese University of Hong Kong), Yun Peng (The Chinese University of Hong Kong), Michael Lyu (The Chinese University of Hong Kong)
14:36 (18m, Talk) | Decision Making for Managing Automotive Platforms: An Interview Survey on the State-of-Practice (Industry Papers). Philipp Zellmer (Volkswagen AG & Harz University of Applied Sciences), Jacob Krüger (Eindhoven University of Technology), Thomas Leich (Harz University of Applied Sciences, Germany)
14:54 (18m, Talk) | Evaluation framework for autonomous systems: the case of Programmable Electronic Medical Systems (Journal First). Andrea Bombarda (University of Bergamo), Silvia Bonfanti (University of Bergamo), Martina De Sanctis (Gran Sasso Science Institute), Angelo Gargantini (University of Bergamo), Patrizio Pelliccione (Gran Sasso Science Institute, L'Aquila, Italy), Elvinia Riccobene (Computer Science Dept., University of Milan), Patrizia Scandurra (University of Bergamo, Italy)
15:12 (18m, Talk) | Insights into Transitioning towards Electrics/Electronics Platform Management in the Automotive Industry (Industry Papers). Lennart Holsten (Volkswagen AG & Harz University of Applied Sciences), Jacob Krüger (Eindhoven University of Technology), Thomas Leich (Harz University of Applied Sciences, Germany)