# DIMACS Theory of Computation Seminar - Spring 2016

Timings : Wednesdays, 11:00 am - 12:00 noon

Venue : CoRE 301

Organizers : Eric Allender, Pranjal Awasthi, Michael Saks and Mario Szegedy

Seminar webpages for previous semesters can be found here.

## Talks

• January 27, Periklis Papakonstantinou, Rutgers Business School

• Title : Efficient depth reduction for composites is possible
• Abstract :

In 1989, Allender and Hertrampf showed that every circuit of depth $d$ with AND, OR, NOT, and MOD$_p$ gates can be reduced to a depth-3 circuit of size $2^{(\log n)^{O(d)}}$. The question for MOD$_m$ gates was handled a year later by Yao, and subsequently by Beigel and Tarui, with a triply exponential size bound, i.e. $2^{(\log n)^{2^{O(d)}}}$.

We resolve the question for composites, obtaining the same asymptotic result as Allender-Hertrampf.

Depth reduction is a fundamental question in its own right. It also has significant implications. For example, one of its immediate consequences is an exponential depth improvement in Williams' program for separations of NEXP.

This is joint work with Shiteng Chen.

• February 3, Siyao Guo, Courant Institute (New York University)

• Title : Negation-Limited Formulas
• Abstract :

We give an efficient structural decomposition theorem for formulas that depends on their negation complexity and demonstrate its power with the following applications.

• We prove that every formula that contains $t$ negation gates can be shrunk using a random restriction to a formula of size $O(t)$ with the shrinkage exponent of monotone formulas. As a result, the shrinkage exponent of formulas that contain a constant number of negation gates is equal to the shrinkage exponent of monotone formulas.
• We give an efficient transformation of formulas with $t$ negation gates to circuits with $\log(t)$ negation gates. This transformation provides a generic way to cast results for negation-limited circuits to the setting of negation-limited formulas. For example, using a result of Rossman (CCC '15), we obtain an average-case lower bound for formulas of polynomial-size on $n$ variables with $n^{1/2-\epsilon}$ negations.
In addition, we prove a lower bound on the number of negations required to compute one-way permutations by polynomial-size formulas.

Joint work with Ilan Komargodski.

• Presentation available here.

• February 10, Lin F. Yang, Johns Hopkins University

• Title : Streaming Symmetric Norms via Measure Concentration
• Abstract :

We characterize the streaming space complexity of every symmetric norm $\ell$ (a norm on $R^n$ invariant under sign-flips and coordinate-permutations), by relating this space complexity to the measure-concentration characteristics of $\ell$. Specifically, we provide matching upper and lower bounds (up to polylog factors) on the space complexity of approximating the norm of the stream, where both bounds depend on the median and maximum of $\ell(x)$ when $x$ is drawn uniformly from the $\ell_2$ unit sphere. The same quantity governs many phenomena in high-dimensional spaces, such as large-deviation bounds and the critical dimension in Dvoretzky's Theorem. The family of symmetric norms contains several well-studied norms, such as all $\ell_p$ norms, and indeed we provide a new explanation for the disparity in space complexity between $p \le 2$ and $p>2$. In addition, we apply our general results to easily derive bounds for several norms that were not studied before in the streaming model, including for example the top-$k$ norm and the $k$-support norm, which was recently shown to be effective for machine learning tasks.
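The two quantities the bounds depend on are easy to probe empirically: draw points uniformly from the $\ell_2$ unit sphere and look at the median and maximum of $\ell(x)$. A minimal Monte Carlo sketch (the $\ell_p$ instantiation and sample sizes are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 2000

def sym_norm(x, p):
    """Any symmetric norm fits the framework; the l_p norm is one example."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

# draw points uniformly from the l_2 unit sphere in R^n
X = rng.standard_normal((trials, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)

for p in (1, 2, 4):
    vals = np.array([sym_norm(x, p) for x in X])
    print(f"l_{p}: median ~ {np.median(vals):.3f}, max ~ {vals.max():.3f}")
```

For $p=2$ the median is exactly 1, while for $p=1$ it concentrates near $\sqrt{2n/\pi}$, a first glimpse of the $p \le 2$ versus $p > 2$ disparity the talk explains.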

Joint work with:
Jaroslaw Blasiok, Vladimir Braverman, Stephen R. Chestnut, Robert Krauthgamer

The current version of the paper can be found at: http://arxiv.org/abs/1511.01111

• February 17, Alexander Golovnev, NYU

• Title : Generalizations of the gate elimination method
• Abstract :

In this talk we consider Boolean circuits over the full binary basis. It is known that almost all Boolean functions of $n$ variables require circuits of size $\Theta(2^n/n)$. At the same time, the best known lower bound of $3n-o(n)$ for an explicit Boolean function (that is, a function from NP) was given by Blum in 1984. This bound, as well as most of the previous bounds, is based on a simple method called gate elimination. The idea of the method is as follows: roughly, to prove a lower bound of $cn$ for a function from a certain class one shows that in any circuit computing a function from the class one can find a substitution of a constant to an input variable that eliminates at least $c$ gates from a circuit and keeps the function in the same class; the bound then follows by induction.
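As a toy illustration of the mechanism (not the talk's constructions; the circuit and simplification rules are deliberately minimal), one can watch gates disappear under a substitution. Over the full binary basis, a gate is eliminated once at least one of its inputs becomes constant, since it then computes a constant, an input wire, or its negation:

```python
def count_eliminated(gates, n, var, const):
    """Substitute x_var = const into a circuit given as a gate list and
    return how many gates are eliminated. A gate is eliminated when at
    least one input becomes constant (over the full binary basis it then
    reduces to a constant, a wire, or a negation)."""
    value = {("x", i): None for i in range(n)}   # None = still non-constant
    value[("x", var)] = const
    eliminated = 0
    for g, (op, a, b) in enumerate(gates):
        va, vb = value[a], value[b]
        if op == "AND":
            out = 0 if 0 in (va, vb) else (1 if (va, vb) == (1, 1) else None)
        elif op == "OR":
            out = 1 if 1 in (va, vb) else (0 if (va, vb) == (0, 0) else None)
        else:  # XOR
            out = None if None in (va, vb) else va ^ vb
        value[("g", g)] = out
        if va is not None or vb is not None or out is not None:
            eliminated += 1
    return eliminated

# (x0 AND x1) OR (x0 XOR x2): substituting x0 = 0 collapses the whole circuit
gates = [("AND", ("x", 0), ("x", 1)),
         ("XOR", ("x", 0), ("x", 2)),
         ("OR",  ("g", 0), ("g", 1))]
print(count_eliminated(gates, 3, 0, 0))  # → 3
```

A lower-bound proof shows that *some* such substitution always eliminates at least $c$ gates while keeping the residual function in the hard class, and then induction on the remaining variables gives size at least $cn - O(1)$.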

In this talk, we discuss generalizations of gate elimination: we consider more involved substitutions, circuit complexity measures, and generalized circuit models that simplify and improve the analysis.

We show that affine dispersers are not computable by circuits of size $3.01n$. We also show that an explicit construction of dispersers for quadratic varieties (currently unknown) implies stronger lower bounds.

The talk is based on joint works with Magnus G. Find, Edward A. Hirsch, and Alexander S. Kulikov:
http://eccc.hpi-web.de/report/2015/166/
http://eccc.hpi-web.de/report/2015/170/

• March 2, Morteza Monemizadeh, Rutgers University

• Title : Matching in Data Streams
• Abstract :

As noted by Lovasz and Plummer in their classic book:

"Matching Theory is a central part of graph theory, not only because of its applications, but also because it is the source of important ideas developed during the rapid growth of combinatorics during the last several decades."

We consider variants of matching in data streams, where insertions and deletions of edges are revealed in a streaming fashion. In particular:
• When the input graph is planar, we present a simple and elegant streaming algorithm that with high probability estimates the size of a maximum matching within a constant factor using $O(n^{2/3})$ space, where n is the number of vertices. The approach generalizes to the family of graphs that have bounded arboricity, which include graphs with an excluded constant-size minor.
• We further reduce the required memory size to $O(\sqrt{n})$ for three restricted settings: (i) when the input graph is a forest; (ii) when we have 2-passes and the input graph has bounded arboricity; and (iii) when the edges arrive in random order and the input graph has bounded arboricity.
• We present a simple but powerful subgraph sampling primitive that is applicable in a variety of computational models, including dynamic graph streams (where the input graph is defined by a sequence of edge/hyperedge insertions and deletions) and distributed systems such as MapReduce. In the case of dynamic graph streams, we use this primitive to prove that there exists an $O(k^2)$-space algorithm that returns the edges of a maximum matching under the assumption that its cardinality is at most $k$. The best previous algorithm used $O(kn)$ space; we prove our result is optimal up to logarithmic factors. Our algorithm has $O(1)$ update time.
• We also show that there exists an $O(n^2/\alpha^3)$-space algorithm that returns an $\alpha$-approximation for matchings of arbitrary size. In independent work, Assadi et al.~(SODA 2016) proved this approximation algorithm is optimal and provided an alternative algorithm. We generalize our exact and approximate algorithms to weighted matching. For graphs with low arboricity such as planar graphs, the space required for constant approximation can be further reduced. While there has been a substantial amount of work on approximate matching in insert-only graph streams, these are the first non-trivial results in the dynamic setting.
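One core trick that makes insertion/deletion streams workable is 1-sparse recovery: two counters per cell suffice to recover an edge exactly when a single edge survives the stream. (The subgraph-sampling primitive above is richer; this is only a minimal sketch of the deletion-handling ingredient.)

```python
class OneSparseCell:
    """Recover the unique edge surviving a stream of insertions/deletions.
    Maintains only a count and a sum of edge identifiers."""

    def __init__(self):
        self.count = 0      # number of surviving edges
        self.id_sum = 0     # sum of surviving edge identifiers

    def update(self, edge_id, delta):   # delta = +1 insert, -1 delete
        self.count += delta
        self.id_sum += delta * edge_id

    def recover(self):
        """Return the surviving edge id if exactly one remains, else None."""
        if self.count == 1:
            return self.id_sum
        return None

cell = OneSparseCell()
cell.update(17, +1)
cell.update(42, +1)
cell.update(17, -1)      # edge 17 is deleted later in the stream
print(cell.recover())    # → 42
```

In practice such cells are combined with hashing so that, with good probability, some cell ends up holding exactly one edge.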

• March 9, Neil Lutz, Rutgers

• Title : Algorithmic information, plane Kakeya sets, and conditional dimension
• Abstract :

We formulate the conditional Kolmogorov complexity of $x$ given $y$ at precision $r$, where $x$ and $y$ are points in Euclidean spaces and $r$ is a natural number. We demonstrate the utility of this notion in two ways.

1. We prove a point-to-set principle that enables one to use the (relativized, constructive) dimension of a single point in a set $E$ in a Euclidean space to establish a lower bound on the (classical) Hausdorff dimension of $E$. We then use this principle, together with conditional Kolmogorov complexity in Euclidean spaces, to give a new proof of the known, two-dimensional case of the Kakeya conjecture. This theorem of geometric measure theory, proved by Davies in 1971, says that every plane set containing a unit line segment in every direction has Hausdorff dimension 2.

2. We use conditional Kolmogorov complexity in Euclidean spaces to develop the lower and upper conditional dimensions dim($x|y$) and Dim($x|y$) of $x$ given $y$, where $x$ and $y$ are points in Euclidean spaces. Intuitively these are the lower and upper asymptotic algorithmic information densities of $x$ conditioned on the information in $y$. We prove that these conditional dimensions are robust and that they have the correct information-theoretic relationships with the well studied dimensions dim($x$) and Dim($x$) and mutual dimensions mdim($x:y$) and Mdim($x:y$).
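For context, the point-to-set principle of part 1 can be written (up to notation; $\dim^A$ denotes constructive dimension relative to an oracle $A$) as

$$\dim_H(E) \;=\; \min_{A \subseteq \mathbb{N}} \; \sup_{x \in E} \dim^A(x),$$

so a lower bound on the relativized dimension of a single, suitably chosen point of $E$ lower-bounds the classical Hausdorff dimension of $E$.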

Joint work with Jack Lutz.

• March 23, Andrew McGregor, University of Massachusetts

• Title : Algorithms for Massive Graphs via Linear Sketching
• Abstract :

In this talk, we will survey recent work on using random linear projections, a.k.a. sketches, to analyze massive graphs. Sketches are useful in a variety of computational models, including the dynamic graph stream model, where the input is defined by a stream of edge insertions and deletions that need to be processed in small space. A large number of problems have now been studied in this model, including edge and vertex connectivity, spectral sparsification, triangle counting, densest subgraph, correlation clustering, vertex cover, and matching.
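The reason sketches cope with deletions is pure linearity: if the sketch of a stream with frequency vector $x$ is $Ax$, then a deletion is just a signed update, and sketches of substreams merge by addition. A minimal illustration (the random projection here is a placeholder, not any particular sketch from the survey):

```python
import numpy as np

rng = np.random.default_rng(1)
num_edges, rows = 10_000, 50                 # edge slots, sketch size
A = rng.standard_normal((rows, num_edges))   # placeholder linear projection

sketch = np.zeros(rows)

def update(edge, delta):                     # delta = +1 insertion, -1 deletion
    global sketch
    sketch = sketch + delta * A[:, edge]

update(3, +1)
update(7, +1)
update(3, -1)   # edge 3 deleted later in the stream
# the sketch is now indistinguishable from a stream that only inserted edge 7
assert np.allclose(sketch, A[:, 7])
```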

• March 30, Aravindan Vijayaraghavan, Northwestern University

• Title : Learning Communities in the Presence of Errors
• Abstract :

The Stochastic Block Model or the Planted Partition Model is the most widely used probabilistic model for community detection and clustering graphs in various fields, including machine learning, statistics, and social sciences. Many existing algorithms successfully learn the communities or clusters when the data is drawn exactly according to the model, but they do not work well in the presence of errors. In this talk, I will address the following question: Can we design robust polynomial time algorithms for learning probabilistic models for community detection that work in the presence of adversarial modelling errors?

I will present robust algorithms for (partially) recovering communities or clusters in graphs drawn from the Stochastic Block Model, in the presence of modeling errors or noise. These algorithms allow two types of adversarial errors: edge outlier errors (where an adversary can corrupt an arbitrary $\epsilon$ fraction of the edges), and any amount of Feige-Kilian or monotone errors. Mossel, Neeman and Sly (STOC 2015) posed the open question of whether almost-exact recovery is possible when the adversary is allowed to add $o(n)$ edges. Our work answers this question affirmatively, even in the case of $k>2$ communities.

Finally, I will describe how our algorithms recover the clusters even when the modeling error is captured using Kullback-Leibler (KL) divergence: these algorithms work when the instances come from any distribution of graphs that is $\epsilon m$-close to the Stochastic Block Model in KL divergence (this result also handles adversarial errors).
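A minimal generator for the kind of corrupted instance described above (the parameters are arbitrary, and the "adversary" here merely flips random edge slots, whereas the talk allows worst-case corruptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, eps = 200, 0.5, 0.1, 0.01          # illustrative parameters
labels = rng.integers(0, 2, size=n)          # two hidden communities

# stochastic block model: edge prob p within a community, q across
same = labels[:, None] == labels[None, :]
probs = np.where(same, p, q)
A = (rng.random((n, n)) < probs).astype(int)
A = np.triu(A, 1)
A = A + A.T                                  # simple undirected graph

# an "adversary" corrupts roughly an eps fraction of the edges
m = A.sum() // 2
flips = rng.integers(0, n, size=(int(eps * m), 2))
for i, j in flips:
    if i != j:
        A[i, j] = A[j, i] = 1 - A[i, j]
```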

This is based on joint work with Konstantin Makarychev and Yury Makarychev.

• April 6, Yingyu Liang, Princeton University

• Title : Recovery Guarantee of Weighted Low-Rank Approximation via Alternating Minimization
• Abstract :

Many applications require recovering a ground truth low-rank matrix from noisy observations of the entries. In practice, this is typically formulated as a weighted low-rank approximation problem and solved using non-convex optimization heuristics such as alternating minimization. Such non-convex techniques have few guarantees. Even worse, weighted low-rank approximation is NP-hard even in the simplest case, when the ground truth is a rank-1 matrix.

Under moderate assumptions on the weight matrix and the ground truth, we prove that the vanilla alternating minimization algorithm with a simple and cheap "clipping" step, which zeroes out rows with high $\ell_2$ norms in the intermediate solutions, recovers the ground truth. In particular, we bound the spectral norm of the difference between the recovered matrix and the ground truth by the spectral norm of the weighted noise plus an additive error term that decreases exponentially with the number of rounds of alternating minimization. This provides the first recovery guarantee for weighted low-rank approximation via alternating minimization with non-binary deterministic weights. It is a significant generalization of the results for matrix completion, the special case with binary weights, since our assumptions are similar to or weaker than those made in existing works. Additionally, our proof applies for random initialization, unlike prior approaches that typically require an SVD-based initialization.
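A minimal sketch of the scheme just described, on synthetic data (the weight values, clipping threshold, and round count are illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, rounds = 100, 3, 25

# synthetic ground truth, non-binary deterministic-style weights, noisy data
L = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))
W = np.where(rng.random((n, n)) < 0.7, 1.0, 0.2)   # hypothetical weight matrix
M = L + 0.01 * rng.standard_normal((n, n))
sw = np.sqrt(W)                                    # weights enter the LS solves

V = rng.standard_normal((n, k))
for _ in range(rounds):
    # fix V, solve a weighted least-squares problem for each row of U
    U = np.zeros((n, k))
    for i in range(n):
        U[i] = np.linalg.lstsq(sw[i, :, None] * V, sw[i] * M[i], rcond=None)[0]
    # "clipping": zero out rows of U with unusually large l2 norm
    norms = np.linalg.norm(U, axis=1)
    U[norms > 5 * np.median(norms)] = 0
    # fix U, solve for V symmetrically
    V = np.zeros((n, k))
    for j in range(n):
        V[j] = np.linalg.lstsq(sw[:, j, None] * U, sw[:, j] * M[:, j],
                               rcond=None)[0]

rel_err = np.linalg.norm(M - U @ V.T) / np.linalg.norm(M)
```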

• April 13, Rocco Servedio, Columbia University

• Title : Deterministic approximate counting for low-degree polynomial threshold functions
• Abstract :

Consider the following algorithmic problem: You are given a degree-$d$ real polynomial over $n$ variables $x_1,...,x_n$. You would like to (approximately) compute the fraction of points in the Boolean hypercube $\{0,1\}^n$ on which the polynomial takes a non-negative value. (Equivalently, you are given a degree-$d$ polynomial threshold function, and you would like to approximately count its satisfying assignments.)

This problem is trivial to solve using random sampling...but what if you are not allowed to use randomness? In this talk we describe an efficient deterministic algorithm for this approximate counting problem. The algorithm employs many ingredients including invariance principles, new central limit theorems for low-degree polynomials, and new structural decompositions of such polynomials. No prior background will be assumed for the talk.
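For contrast, the randomized baseline really is a few lines (the example polynomial and sample count are arbitrary; an additive $\pm\epsilon$ estimate needs $O(1/\epsilon^2)$ samples by standard Chernoff bounds):

```python
import numpy as np

rng = np.random.default_rng(0)
n, samples = 20, 100_000

def p(x):
    """An example degree-2 polynomial in x_1, ..., x_n (chosen arbitrarily)."""
    s = x.sum(axis=-1)
    return s ** 2 - 3 * s + 1

X = rng.integers(0, 2, size=(samples, n))   # uniform samples from {0,1}^n
estimate = np.mean(p(X) >= 0)               # fraction with p(x) >= 0
```

The talk's contribution is doing this deterministically, without any such sampling.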

Joint work with Anindya De.

• April 20, Pranjal Awasthi

• Title : Learning Mixtures of Ranking Models
• Abstract :

Probabilistic modeling of ranking data is an extensively studied problem with applications ranging from understanding user preferences in electoral systems and social choice theory, to more modern learning tasks in online web search, crowd-sourcing and recommendation systems. This work concerns learning the Mallows model -- one of the most popular probabilistic models for analyzing ranking data. In this model, the user's preference ranking is generated as a noisy version of an unknown central base ranking. The learning task is to recover the base ranking and the model parameters using access to noisy rankings generated from the model.

Although well understood in the setting of a homogeneous population (a single base ranking), the case of a heterogeneous population (a mixture of multiple base rankings) has so far resisted algorithms with guarantees on worst-case instances. In this talk I will present a polynomial-time algorithm which provably learns the parameters and the unknown base rankings of a mixture of two Mallows models.
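To make the generative model concrete, here is a hedged sketch of sampling from a two-component Mallows mixture via the standard repeated-insertion procedure for the Kendall-tau Mallows model (the base rankings, dispersion parameters, and mixing weight below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mallows(base, phi):
    """Repeated insertion: item i of the base ranking is inserted at
    position j (0-indexed) with probability proportional to phi**(i - j),
    so phi -> 0 concentrates on the base ranking itself."""
    ranking = []
    for i, item in enumerate(base):
        weights = phi ** (i - np.arange(i + 1))
        j = rng.choice(i + 1, p=weights / weights.sum())
        ranking.insert(j, item)
    return ranking

def sample_mixture(base1, base2, phi1, phi2, w):
    """Draw a ranking from a two-component mixture with weight w on base1."""
    if rng.random() < w:
        return sample_mallows(base1, phi1)
    return sample_mallows(base2, phi2)

print(sample_mixture([0, 1, 2, 3, 4], [4, 3, 2, 1, 0], 0.3, 0.3, 0.5))
```

The learning problem runs this process in reverse: given many such samples, recover both base rankings and both dispersion parameters.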

Joint work with Avrim Blum, Or Sheffet and Aravindan Vijayaraghavan.

• April 27, Daniel Hsu, Columbia University

• Title : Reducing contextual bandits to supervised learning
• Abstract :

I'll describe a fast and simple algorithm for the contextual bandit learning problem, where the learner repeatedly takes an action in response to the observed context, observing the reward only for that action. The algorithm assumes access to an oracle for solving cost-sensitive classification problems and achieves the statistically optimal regret guarantee with only $\tilde{O}(\sqrt{T})$ oracle calls across all $T$ rounds. Joint work with Alekh Agarwal, Satyen Kale, John Langford, Lihong Li, and Rob Schapire.
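The talk's algorithm is far more refined (statistically optimal regret with only $\tilde{O}(\sqrt{T})$ oracle calls); as a point of comparison, here is a much simpler oracle-based baseline, epsilon-greedy with a per-action least-squares "oracle". All names and parameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, K, eps = 2000, 5, 3, 0.1
theta = rng.standard_normal((K, d))        # hidden per-action reward model

data = {a: ([], []) for a in range(K)}     # per-action (contexts, rewards)
w = np.zeros((K, d))                       # the oracle's current predictors
total = 0.0
for t in range(T):
    x = rng.standard_normal(d)             # observe a context
    if rng.random() < eps:
        a = int(rng.integers(K))           # explore uniformly
    else:
        a = int(np.argmax(w @ x))          # exploit the oracle's policy
    r = float(theta[a] @ x) + 0.1 * rng.standard_normal()  # bandit feedback
    total += r                             # reward seen only for action a
    data[a][0].append(x)
    data[a][1].append(r)
    if len(data[a][1]) >= d:               # refit the "oracle" for action a
        X, y = np.array(data[a][0]), np.array(data[a][1])
        w[a] = np.linalg.lstsq(X, y, rcond=None)[0]
```

Epsilon-greedy calls its oracle every round and pays for uniform exploration; the talk's algorithm instead maintains a distribution over policies that balances exploration and exploitation near-optimally.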