Online learning is a canonical paradigm that sheds light on how to act effectively at the face of uncertainty. Classical results on non-stochastic online learning show robust worst-case guarantees on the performance without making any assumption on the input. Recently there is emerging interest to understand how one can use the existence of some nice structure in the data to obtain enhanced guarantees when this structure indeed exists without sacrificing the worst-case performance.
In this talk, I will focus on one such nice structure: the existence of an alternative with very good performance and aim for guarantees that degrade with its performance (small-loss bounds). I will focus on a general graph-based feedback setting where, at each round, a decision-maker selects an action from a finite set of alternatives and receives feedback based on a combinatorial graph-based feedback model introduced by Mannor and Shamir. This encapsulates as special cases important partial feedback settings such as bandits, online routing, and contextual bandits. Small-loss bounds are of particular interest in these partial-feedback settings as, to be satisfied, they require algorithms that do not suffer from an important drawback of classical worst-case techniques: the need to explore often all alternatives, including suboptimal ones. I will provide a general black-box reduction for these settings achieving high-probability small-loss bounds for the graph-based feedback setting and illustrate how our technique actively avoids this over-exploration.
The talk will be mostly based on joint work with Karthik Sridharan and Eva Tardos that appeared in COLT 18 and is currently under revision in Mathematics of Operations Research.
Thodoris Lykouris is a Postdoctoral Researcher in the Machine Learning group of Microsoft Research NYC. He recently graduated with a Ph.D. in Computer Science from Cornell University where he was advised by Eva Tardos. His research spans across the areas of machine learning, algorithms, optimization, and economics, with a particular emphasis on enhancing online decision-making in complex multi-agent systems. Thodoris is a recipient of the 2018 Google Ph.D. Fellowship on "Algorithms, Optimization, and Markets" and a finalist in the 2018 INFORMS Nicholson and 2017 Applied Probability Society Student Paper Competitions. He was a research intern at Microsoft Research Redmond (summer 2018), Google Research NYC (summer 2017), and Microsoft Research India (summer 2015). He was also a visiting student at TTI-Chicago (March-May 2018) and the Simons Institute at Berkeley (fall 2015, spring 2018). Thodoris received his Diploma from the EECS department of the National Technical University of Athens.