PhD Defense
10/29/2009 11:00 am
CoRE B (Room 305)

Private Approximation and Streaming Protocols: Theory and Applications

Andre Madeira, Rutgers University

Defense Committee: S. Muthukrishnan (chair), Rebecca Wright, Danfeng Yao and Moti Yung(Columbia University and Google)

Abstract

This thesis studies efficient and private approximations to function problems on massive datasets.

Increasingly, the number of applications generating, collecting and computing on vast quantities of digital data is becoming widespread. As non-trivial analyses of these datasets typically require expensive computations, the search for efficient solutions has been the predominant theme and concern of several different areas of applied and theoretical Computer Science research. Another important and growing concern is the privacy implications of computing on data considered sensitive; specifically regarding distributed computations among mutually distrustful yet collaborating parties. Although substantial work on both fronts exists independently, these issues have rarely been addressed simultaneously.

In this thesis, we tackle both challenges concurrently. We present a general feasibility result that gives sufficient conditions for a deterministic function $f$ to admit an efficient private approximation $g$. A common theme among this and our other results is proving functional privacy, an inherent and necessary property of any protocol computing a private approximation. Our approaches to ensuring functional privacy while keeping  resources  at  a minimal might be of independent interest and prove useful to the design of other private approximations. The specific protocols presented in this thesis are:

- Efficient private approximation protocols for the $L_p$ distance, $p\in(0,2]$, of two $n$-dimensional real-valued vectors using polylogarithmic communication  in $n$.  Our first protocol, for $p=2$, improves upon a previous result by   reducing a quadratic computation to a quasi-linear one. Our more generic   result for $p\in(0,2]$, however, applies also to the case where the input is  presented as a data stream. The latter is the first known private approximation protocol for any streaming problem.  We believe the  techniques used can spur other private streaming results.

- Efficient private approximation protocols for a variety of hard counting problems, including:  
 - the #P-complete problem #DNF, or estimating the number of satisfiable  assignments to a Boolean formula in disjunctive normal form,  
 - every problem within two subclasses of #P similar to Ron Fagin's logic-based  characterization of NP), and  
 - few fixed parameter tractable (FPT) problems.

- Efficient protocols for the 1-out-of-$n$ single-server Computationally-Private   Information Retrieval (CPIR) problem for $\ell$-bit strings and its symmetric  variant. Unlike previous results, we achieve simultaneous   polylogarithmic storage, communication, and local computation by relaxing   accuracy requirements, a novel approach.

Print Login