IFML Seminar: Contextual Battling Bandits: Learning to Make User-Dependent Predictions Through Preference Elicitation

Aadirupa Saha, Research Scientist, Apple

12:15 - 1 pm

The University of Texas at Austin
Gates Dell Complex (GDC 6.302)
United States

Abstract: Customer statistics collected in several real-world systems suggest that users often prefer to express their liking for a given pair of items, say (A,B), through relative queries such as "Do you prefer Item A over B?", rather than their absolute counterparts: "How much do you score items A and B on a scale of [0-10]?".

This observation, together with the search for a more effective feedback collection mechanism, led to the well-known formulation of Dueling Bandits (DB), a widely studied online learning framework for efficient information aggregation from relative/comparative feedback. However, despite the novel objective, most existing DB techniques are limited to simpler settings of finite decision spaces and stochastic environments, which are unrealistic in practice.

In this talk, we will start with the basic problem formulations for DB and familiarize ourselves with some of the breakthrough results. Following this, we will dive deep into a more practical framework of contextual dueling bandits (C-DB), where the goal of the learner is to make customized predictions based on the user contexts. We will see a new algorithmic approach that can efficiently achieve the optimal O(\sqrt T) regret performance for this problem, resolving an open problem from Dudík et al. [COLT, 2015]. We will conclude the talk with some interesting open problems.

[A major part of the talk on the C-DB setup is based on joint work with Akshay Krishnamurthy, "Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability", ALT 2022.]

Bio: Aadirupa is currently a research scientist at Apple ML research, broadly working in the area of Machine Learning theory. She just finished a short-term research visit at Toyota Technological Institute at Chicago (TTIC), and completed her postdoc stint at Microsoft Research New York City before that. Aadirupa obtained her Ph.D. from the Department of Computer Science, Indian Institute of Science, Bangalore, advised by Aditya Gopalan and Chiranjib Bhattacharyya, and interned at Microsoft Research, INRIA, and Google AI.

Her research interests include Bandits, Reinforcement Learning, Optimization, Learning theory, and Algorithms. Of late, she is also very interested in working on problems at the intersection of ML and Game theory, Algorithmic fairness, and Privacy.
