Approximating the Smallest Grammar: Kolmogorov Complexity in Natural Models

Manoj Prabhakaran, Dept. of Computer Science, Princeton University

May 7, 4:30 PM, Rutgers Univ. CORE 431

Abstract.

We consider the problem of finding the smallest context-free grammar that generates exactly one given string of length n. The size of this grammar is of theoretical interest as an efficiently computable variant of Kolmogorov complexity. The problem is of practical importance in areas such as data compression and pattern extraction.
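To make the problem concrete, here is a minimal sketch (the grammar and helper functions are illustrative, not from the talk): a context-free grammar that generates exactly one string can be stored as a dictionary of productions, and its size measured as the total number of symbols on all right-hand sides.

```python
def expand(grammar, symbol):
    """Recursively expand a nonterminal into the unique string it derives."""
    if symbol not in grammar:  # terminal symbol
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])

def grammar_size(grammar):
    """Size of the grammar: total symbols over all right-hand sides."""
    return sum(len(rhs) for rhs in grammar.values())

# A straight-line grammar for the 8-character string "abababab":
#   S -> A A,  A -> B B,  B -> a b
grammar = {"S": ["A", "A"], "A": ["B", "B"], "B": ["a", "b"]}

print(expand(grammar, "S"))   # abababab
print(grammar_size(grammar))  # 6
```

For highly repetitive strings the smallest such grammar can be exponentially smaller than the string itself, which is why its size serves as a computable proxy for Kolmogorov complexity.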

The smallest grammar is known to be hard to approximate to within a constant factor, and an o(log n/log log n) approximation would require progress on a long-standing algebraic problem. Previously, the best proven approximation ratio was O(n^{1/2}), achieved by the Bisection algorithm. Our main result is an exponential improvement of this ratio: we give an O(log(n/g*))-approximation algorithm, where g* is the size of the smallest grammar.

We then consider other computable variants of Kolmogorov complexity. In particular, we give an O(log^2 n) approximation for the smallest non-deterministic finite automaton with advice that produces a given string. We also apply our techniques to "advice-grammars" and "edit-grammars", two other natural models of string complexity.

Joint work with Moses Charikar, Eric Lehman, Ding Liu, Rina Panigrahy, April Rasala, Amit Sahai, and Abhi Shelat.


