Bin2Summary: Beyond Function Name Prediction in Stripped Binaries with Functionality-specific Code Embeddings
Closed-source software distributed only as stripped binaries still dominates the ecosystem, which hinders understanding of software functionality and subsequent security analysis. To address this need, prior research has focused on predicting function names, which provide only fragmented and abbreviated information about functionality. To advance the state of the art, this paper presents Bin2Summary, which automatically summarizes the functionality of functions in stripped binaries as natural-language sentences. Specifically, the framework includes a functionality-specific code embedding module that facilitates fine-grained similarity detection and an attention-based seq2seq model that generates the summaries. Using 16 widely used projects (e.g., Coreutils), we evaluated Bin2Summary on 38,167 functions, filtered from 162,406 functions so that each has a high-quality comment. Bin2Summary achieves 0.728 precision and 0.729 recall on our datasets, and the functionality-specific embedding module improves the existing assembly language model by up to 109.5% in precision and 109.9% in recall. The experiments also demonstrate that Bin2Summary transfers well to cross-architecture (i.e., x64 and x86) and cross-environment (i.e., Cygwin and MSYS2) binaries. Finally, a case study illustrates how Bin2Summary outperforms existing work by providing functionality summaries with richer semantics than function names.
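The abstract reports precision and recall for generated summaries but does not say how they are computed. As a rough illustration only, the sketch below assumes a token-overlap definition: precision is the share of generated tokens that also appear in the reference comment, and recall is the share of reference tokens recovered. The function name `token_precision_recall` and the whitespace tokenization are hypothetical choices made for this example, not the paper's evaluation code.

```python
from collections import Counter
from typing import Tuple


def token_precision_recall(generated: str, reference: str) -> Tuple[float, float]:
    """Token-overlap precision and recall (an assumed metric, not the paper's)."""
    # Assumption: lowercase, whitespace-split tokens; the real tokenizer may differ.
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()

    # Multiset intersection counts each shared token at most as often
    # as it occurs in both sequences.
    overlap = sum((Counter(gen_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(gen_tokens) if gen_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    return precision, recall


if __name__ == "__main__":
    p, r = token_precision_recall(
        "copy a string",            # hypothetical generated summary
        "copy a string to buffer",  # hypothetical reference comment
    )
    print(f"precision={p:.3f} recall={r:.3f}")
```

Under this assumed definition, a short summary that is entirely correct scores high precision but lower recall, which is why the two numbers are reported together.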
Fri 19 Jul. Displayed time zone: Brasilia, Distrito Federal, Brazil
14:00 - 15:30 | Program Analysis and Performance 3 | Research Papers at Mandacaru | Chair(s): Shaukat Ali (Simula Research Laboratory and Oslo Metropolitan University)
14:00 | 18m Talk | Bin2Summary: Beyond Function Name Prediction in Stripped Binaries with Functionality-specific Code Embeddings | Research Papers | Zirui Song (The Chinese University of Hong Kong), Jiongyi Chen (National University of Defense Technology), Kehuan Zhang (The Chinese University of Hong Kong)
14:18 | 18m Talk | Active Monitoring Mechanism for Control-based Self-Adaptive Systems | Research Papers | Yi Qin, Yanxiang Tong, Yifei Xu, Chun Cao, Xiaoxing Ma (all State Key Laboratory for Novel Software Technology, Nanjing University)
14:36 | 18m Talk | Cut to the Chase: An Error-Oriented Approach to Detect Error-Handling Bugs | Research Papers | Haoran Liu (National University of Defense Technology), Zhouyang Jia (National University of Defense Technology), Shanshan Li (National University of Defense Technology), Yan Lei (Chongqing University), Yue Yu (National University of Defense Technology), Yu Jiang (Tsinghua University), Xiaoguang Mao (National University of Defense Technology), Liao Xiangke (National University of Defense Technology)
14:54 | 18m Talk | DAInfer: Inferring API Aliasing Specifications from Library Documentation via Neurosymbolic Optimization | Research Papers | Chengpeng Wang (The Hong Kong University of Science and Technology), Jipeng Zhang (The Hong Kong University of Science and Technology), Rongxin Wu (School of Informatics, Xiamen University), Charles Zhang (The Hong Kong University of Science and Technology)
15:12 | 18m Talk | Decomposing Software Verification Using Distributed Summary Synthesis | Research Papers