This program is tentative and subject to change.

Thu 18 Jul 2024 17:12 - 17:30 at Baobá 1 - SE4AI 2

Automated Machine Learning (AutoML) toolkits are low/no-code software that aims to democratize ML system application development by enabling rapid prototyping of ML models and collaboration across different stakeholders in ML system design (e.g., domain experts and data scientists). It is thus important to know the state of current AutoML toolkits and the challenges ML practitioners face while using them. In this paper, we first offer a characterization of currently available AutoML toolkits by analyzing 37 top AutoML tools and platforms. We find that the top AutoML platforms are mostly cloud-based and that most of the tools are optimized for the adoption of shallow ML models. Second, we present an empirical study of 14.3K AutoML-related posts from Stack Overflow (SO), which we analyzed using the topic modeling algorithm LDA (Latent Dirichlet Allocation) to understand the challenges ML practitioners face while using AutoML toolkits. We find 13 topics in the AutoML-related discussions on SO. The 13 topics are grouped into four categories: MLOps (43% of all questions), Model (28%), Data (27%), and Documentation (2%). Most questions are asked during the Model training (29%) and Data preparation (25%) phases. AutoML practitioners find the MLOps topic category most challenging. Topics related to the MLOps category are the most prevalent and popular for cloud-based AutoML toolkits. Based on our study findings, we provide 15 recommendations to improve the adoption and development of AutoML toolkits.

Thu 18 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

16:00 - 18:00
16:00
18m
Talk
Natural Is The Best: Model-Agnostic Code Simplification for Pre-trained Large Language Models
Research Papers
Yan Wang Central University of Finance and Economics, Xiaoning Li Central University of Finance and Economics, Tien N. Nguyen University of Texas at Dallas, Shaohua Wang Central University of Finance and Economics, Chao Ni School of Software Technology, Zhejiang University, Ling Ding Central University of Finance and Economics
16:18
18m
Talk
On Reducing Undesirable Behavior in Deep-Reinforcement-Learning-Based Software
Research Papers
Ophir Carmel The Hebrew University of Jerusalem, Guy Katz The Hebrew University of Jerusalem
16:36
9m
Talk
GAISSALabel: A tool for energy labeling of ML models
Demonstrations
Pau Duran Universitat Politècnica de Catalunya (UPC), Joel Castaño Fernández Universitat Politècnica de Catalunya (UPC), Cristina Gómez Universitat Politècnica de Catalunya, Silverio Martínez-Fernández UPC-BarcelonaTech
16:45
9m
Talk
Decide: Knowledge-based Version Incompatibility Detection in Deep Learning Stacks
Demonstrations
Zihan Zhou The University of Hong Kong, Zhongkai Zhao National University of Singapore, Bonan Kou Purdue University, Tianyi Zhang Purdue University
16:54
18m
Talk
Test input prioritization for Machine Learning Classifiers
Journal First
Xueqi Dang University of Luxembourg, Yinghua LI University of Luxembourg, Mike Papadakis University of Luxembourg, Jacques Klein University of Luxembourg, Tegawendé F. Bissyandé University of Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg
17:12
18m
Talk
How Far Are We with Automated Machine Learning? Characterization and Challenges of AutoML Toolkits
Journal First
Md Abdullah Al Alamin University of Calgary, Gias Uddin York University, Canada