Abstract

Sentiment analysis has been used to study aspects of software engineering, such as issue resolution, toxicity, and self-admitted technical debt. The automatic classification of software engineering texts into three different polarity classes (negative, neutral, and positive) makes it possible to understand how developers communicate. To address the peculiarities of software engineering texts, sentiment analysis tools often consider the specific technical lingo practitioners use. With the emergence of more advanced deep-learning models, it has become increasingly important to understand the performance and limitations of sentiment analysis tools when applied to software engineering data. This is especially true because existing replications of software engineering studies that apply sentiment analysis tools show that tool choice can influence the conclusions obtained. Moreover, we believe that it is important to assess the performance of newer deep-learning tools and models and compare their performance to that of existing tools.

Therefore, we validated two existing recommendations from the software engineering literature: the recommendation to use pre-trained transformer models to classify sentiment, and the recommendation to replace non-natural language elements with meta-tokens.
We validated both recommendations through a set of rigorous benchmarks.
We selected five sentiment analysis tools, taking care to pick a diverse set: two pre-trained transformer models and three machine learning tools.
Because recent benchmarks show that ChatGPT is not competitive with fine-tuned tools on sentiment analysis, we did not include it in these benchmarks.
To train and evaluate the selected tools, we used two state-of-the-art, manually labeled datasets sampled from GitHub and StackOverflow.
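
As an illustration of the first recommendation, the following Python sketch shows how a pre-trained transformer could be applied to classify the sentiment of a developer comment. The model checkpoint name is hypothetical and does not refer to any of the tools benchmarked in the article.

\begin{verbatim}
from transformers import pipeline

# Hypothetical checkpoint name; any transformer fine-tuned for
# three-class (negative/neutral/positive) sentiment would do.
classifier = pipeline("sentiment-analysis",
                      model="some-org/se-sentiment-model")

print(classifier("This patch finally fixes the flaky test, great work!"))
# e.g. [{'label': 'positive', 'score': 0.97}]
\end{verbatim}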

Based on the results of the benchmarks, we conclude that these ``common-knowledge'' guidelines do not hold as previously believed.
We find that pre-trained transformers outperform the best machine learning tool on only one of the two datasets,
and that even there the performance difference is just a few percentage points.
Therefore, we recommend that software engineering researchers not consider predictive performance alone when selecting a sentiment analysis tool, because the best-performing sentiment analysis tools perform very similarly to each other (within 4 percentage points).
Additionally, we find that meta-tokenization, the practice of pre-processing datasets to replace non-natural language elements with meta-tokens (sketched below), does not further improve the predictive performance of sentiment analysis tools.
These findings are relevant to researchers who apply sentiment analysis tools to software engineering data, as they can help them select an appropriate tool.
These findings also help builders of sentiment analysis tools who seek to further adapt their tools to the software engineering domain.
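
As an illustration of the second recommendation, the sketch below shows one way meta-tokenization could be implemented in Python. The regular expressions and token names are illustrative assumptions, not the exact ones used by the benchmarked tools.

\begin{verbatim}
import re

# Illustrative patterns and token names; not necessarily those
# used in the benchmarked tools.
URL = re.compile(r"https?://\S+")
INLINE_CODE = re.compile(r"`[^`]+`")

def meta_tokenize(text: str) -> str:
    """Replace non-natural language elements with meta-tokens."""
    text = URL.sub("URLTOKEN", text)
    text = INLINE_CODE.sub("CODETOKEN", text)
    return text

print(meta_tokenize("See https://example.com and call `foo()` first."))
# -> 'See URLTOKEN and call CODETOKEN first.'
\end{verbatim}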

Information

The article was accepted for publication by the Springer Journal of Empirical Software Engineering (EMSE) on the 23rd of February 2024. Currently, the article has not yet been published by EMSE itself; however, the camera-ready version submitted to the Author Services can be accessed online.\footnote{\url{https://cassee.dev/files/meta-tokenization-transformers.pdf}} The article is an original, journal-first article. It has not been presented at, nor is it under consideration for, any other journal-first program or conference.