Algorithms research has traditionally been problem-centric, with an emphasis on rigorous guarantees for worst-case instances. This approach, however, often leads to pessimistic predictions that do not accurately reflect the nature of real-life instances.
The availability of data at a large scale across numerous domains presents a golden opportunity: to develop new frameworks and algorithms for machine learning that use the nature of the data to gain a more fine-grained understanding of the problem at hand. I call this data-aware modeling.
I will showcase how data-aware models and algorithms can lead to new algorithmic techniques with rigorous guarantees for a variety of problems in machine learning. In particular, I will present specific examples involving classical problems such as k-means and spectral clustering, as well as more modern applications in semi-supervised and interactive learning.