The last few years have witnessed the rise of the big data era, which is characterized by data sets that are high-dimensional, noisy, and dynamically generated. As a consequence, the gap between limited computational resources and the rapid pace of data generation has become ubiquitous in real-world applications, which in turn makes it indispensable to develop provable learning algorithms that are computationally efficient, economical in memory usage, and tolerant to noise.
Our work is motivated by practical large-scale problems. It aims to understand the fundamental limits imposed by the characteristics of these problems (e.g., high-dimensional, noisy, or sequential data), to explore the benefits of geometric structures (e.g., sparsity, low rank), and to offer scalable optimization tools that balance the trade-off among model accuracy, computational efficiency, and sample complexity.
The dissertation investigates three main problem areas: sparse recovery, online and stochastic optimization, and estimation from quantized data. The predefense will describe these topics and some of the results I have obtained.