Data-Centric AutoML and Benchmarks with Optimal Transport

Prabhant Singh

Research Engineer at TU Eindhoven

Automated machine learning (AutoML) aims to make machine learning algorithms easier to use and more accessible to researchers with varying levels of expertise. However, AutoML systems, including classical ones such as Auto-Sklearn and Neural Architecture Search methods (NSGANet, ENAS, DARTS), still have to start their search process from scratch, a limitation commonly referred to as the cold-start problem. This lengthens the search phase and makes these systems difficult to use under time constraints or with limited data.
To address these problems, our project aims to develop methods and systems that use Optimal Transport-based measures to compute dataset similarity in these scenarios. Such similarity measures can be used to warm-start AutoML systems on downstream tasks in both supervised and unsupervised learning. The project is a collaboration between Prabhant Singh, an AutoML researcher and the proposed visitor, and Dr. Carola Doerr from the LIP6 Lab at Sorbonne Université in Paris, an expert in black-box optimization and benchmarking. It will contribute to several TAILOR objectives, in particular WP7 Auto AI’s main challenges such as ‘Ever learning AI’ and ‘Beyond standard supervised learning’.
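To make the idea of warm-starting from dataset similarity concrete, here is a minimal sketch in Python. It is not the project's method: it assumes a deliberately simple stand-in, the one-dimensional Wasserstein-1 distance averaged over features, and it assumes that datasets have the same features and the same number of samples. The function names (`wasserstein_1d`, `dataset_distance`, `warm_start_config`) and the toy configurations are hypothetical and chosen only for illustration.

```python
from statistics import mean


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal-size empirical distributions on the real line, an optimal
    transport plan matches the i-th smallest value of one sample to the
    i-th smallest value of the other, so sorting is enough.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return mean(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))


def dataset_distance(d1, d2):
    """Average per-feature distance (assumption: a crude dataset similarity).

    Each dataset is a list of feature columns, in the same feature order.
    """
    if len(d1) != len(d2):
        raise ValueError("this sketch assumes the same number of features")
    return mean(wasserstein_1d(c1, c2) for c1, c2 in zip(d1, d2))


def warm_start_config(new_dataset, history):
    """Return the stored configuration of the most similar past dataset.

    `history` is a list of (dataset, configuration) pairs (hypothetical format).
    """
    _, config = min(history, key=lambda item: dataset_distance(new_dataset, item[0]))
    return config


# Toy example: two previously seen datasets with their best-known configurations.
history = [
    ([[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]], {"model": "random_forest"}),
    ([[5.0, 6.0, 7.0], [0.0, 0.5, 1.0]], {"model": "svm"}),
]
new_dataset = [[0.1, 1.1, 2.1], [10.0, 11.0, 12.5]]

# The search on the new dataset would start from this configuration
# instead of from scratch.
initial_config = warm_start_config(new_dataset, history)
```

A realistic system would likely use a multivariate Optimal Transport distance that also accounts for labels and would handle datasets with different feature spaces; the sketch only shows where a similarity measure plugs into the warm-start step.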

More information about Prabhant: