In machine learning, many inference problems require knowledge of a structure
that is unobserved in the real world. This structure is usually expressed
as a graphical model, which provides a way to represent entities along with
their dependencies. Structure learning refers to the problem of learning
the topology (and parameters) of the graphical model from observed data.
My research in this area focuses on developing computationally and
statistically efficient algorithms, understanding their behavior through
concepts such as convergence and sample complexity, and designing new
modeling paradigms, such as models rooted in game theory.
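As a minimal illustration of structure learning, the sketch below recovers the
graph of a Gaussian MRF from samples via the graphical lasso in scikit-learn.
The toy precision matrix, regularization strength, and threshold are
illustrative assumptions, not values from the talk.

    import numpy as np
    from sklearn.covariance import GraphicalLasso

    # Toy example: recover the edge set of a Gaussian MRF.
    # A nonzero off-diagonal entry (i, j) of the precision
    # matrix means an edge between variables i and j.
    rng = np.random.default_rng(0)
    true_precision = np.array([[2.0, 0.6, 0.0],
                               [0.6, 2.0, 0.6],
                               [0.0, 0.6, 2.0]])
    cov = np.linalg.inv(true_precision)
    X = rng.multivariate_normal(np.zeros(3), cov, size=2000)

    # L1-regularized maximum likelihood (graphical lasso);
    # alpha is an illustrative regularization strength.
    est = GraphicalLasso(alpha=0.05).fit(X).precision_

    # Edges = nonzero off-diagonal entries of the estimate.
    edges = [(i, j) for i in range(3) for j in range(i + 1, 3)
             if abs(est[i, j]) > 1e-3]
    print(edges)  # expected: [(0, 1), (1, 2)]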
This talk will first focus on the problem of learning probabilistic and
game-theoretic structures from data. Motivated by these goals, it
introduces recent techniques for the approximate optimization of a
discontinuous, NP-hard objective function, as well as convergence rates
for a class of objective functions for which only biased estimates of the
gradient are available. On the applied side, we will discuss issues of
interpretability and regularization, and present findings in
political science (U.S. congressional voting records), neuroscience
(cocaine addiction), and genetics (cancer).
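To make the biased-gradient setting concrete, here is a toy sketch (not the
talk's algorithm or objective) of stochastic gradient descent on a simple
quadratic, where a hypothetical oracle returns the true gradient plus a fixed
bias and noise; the iterates settle in a neighborhood of the optimum whose
size is governed by the bias.

    import numpy as np

    # Minimize f(w) = ||w - w_star||^2 / 2 when the gradient
    # oracle returns (w - w_star) + bias + noise.
    rng = np.random.default_rng(1)
    w_star = np.array([1.0, -2.0])
    bias = np.array([0.1, 0.1])           # illustrative fixed bias

    w = np.zeros(2)
    for t in range(1, 5001):
        noise = 0.01 * rng.standard_normal(2)
        grad = (w - w_star) + bias + noise  # biased estimate
        w -= (1.0 / t) * grad               # 1/t step size

    # The iterate approaches w_star - bias, i.e. a point within
    # a bias-dependent distance of the true optimum w_star.
    print(w, np.linalg.norm(w - w_star))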
The above learning problems are instances of a more general problem:
regularized loss minimization (RLM). Understanding the behavior of this
general problem is of great importance, given that RLM is a common
approach to solving many machine learning problems. This talk introduces
novel results on loss consistency, norm consistency, sparsistency, and sign
consistency for RLM. These results pertain to several problems in the
literature, such as the estimation of exponential family distributions
(e.g. learning the graph structure of Gaussian or discrete MRFs),
generalized linear models (e.g. linear regression, compressed sensing),
matrix factorization problems (e.g. exponential-family PCA, max-margin
matrix factorization), nonparametric linear regression (with an infinite
set of basis functions), nonparametric clustering with exponential
families (where the number of clusters is not fixed and possibly
infinite), PAC-Bayes learning (e.g. classification, structured prediction)
as well as several regularization techniques.
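As a concrete instance of RLM and of the consistency notions above, the sketch
below fits an L1-regularized linear model (the lasso) and checks sign
consistency: whether the estimated coefficients recover the signs of the true
sparse coefficients. The data, regularization level, and sample size are
illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import Lasso

    # RLM instance: lasso = squared loss + L1 regularization.
    rng = np.random.default_rng(2)
    n, d = 500, 10
    w_true = np.zeros(d)
    w_true[:3] = [2.0, -1.5, 1.0]         # sparse ground truth
    X = rng.standard_normal((n, d))
    y = X @ w_true + 0.1 * rng.standard_normal(n)

    # alpha is an illustrative regularization strength.
    w_hat = Lasso(alpha=0.05).fit(X, y).coef_

    # Sign consistency: do the estimated signs match the truth?
    # Expected True for this well-conditioned toy problem.
    print(np.array_equal(np.sign(w_hat), np.sign(w_true)))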