

Instructor 
Sepehr Assadi 
Credits 
3 units 
Schedule 
Tuesdays 12:00 PM  3:00 PM in
TIL116 (Livingston campus) 
Prerequisites 
Undergraduate courses on algorithms, complexity theory, discrete mathematics, and probability; mathematical maturity. 
Syllabus 
The full course syllabus is available here. This webpage contains the highlights of course syllabus that are potentially updated as the semester progresses. 
With the emergence of massive datasets across different application domains, there is a rapidly growing
interest in solving various problems over immense amounts of data. However, even most basic algorithms can become computationally prohibitive when processing massive datasets as the inputs are often too large
to be stored in one place or read even once. As a result, a new set of algorithmic tools and ideas are needed for computing with exteremly constrained resources. This is the focus of
sublinear algorithms, namely, algorithms whose resource requirements (e.g. time or space) are substantially smaller than the size of the input that they operate on.
We will study various advanced algorithmic ideas through the lens of sublinear algorithms in this course. In particular, we consider two most canonical models of sublinear algorithms, namely, sublinear time algorithms and streaming algorithms,
and cover several key algorithmic techniques in these (and related) models, as well as discuss limitations inherent to computing with constrained resources.
Important Update: Following
Rutgers's response to COVID19 situation, all inperson instructions for this course are suspended
and we will continue our course through online lectures and online meetings for office hours. Please see the
Canvas page for the
course for more information (send me an email if you do not have access to Canvas).
 Instructor: Sepehr Assadi
 Instructor Email: sepehr.assadi@rutgers.edu
 Lecture Schedule: Tuesdays 12:00 PM  3:00 PM
(
TIL116  online lectures)
 Office hours: Mondays 3:00 PM to 4:00 PM, or by appointment (
CoRE 310  online meetings)
This course has no recitation sections.
The following is a tentative list of topics that will be covered in this course.
 Sublinear Time Algorithms: Which problems can be solved in time faster than even reading the entire input once?
We will cover sublinear time algorithms for property testing, distribution testing, and graph problems. We will also examine query complexity as a main tool for proving lower bound on
the performance of sublinear time algorithms.
 Streaming Algorithms: Which problems can be solved in space smaller than what is needed to store the entire input? We will cover streaming algorithms for statistical estimation, numerical linear algebra, and graph problems. We will also examine communication complexity as a main tool for proving lower bound on the performance of streaming algorithms.
Along the way, we will learn about various key ideas such as probabilistic analysis of algorithms, compressed sensing, dimensionality reduction, sparsification, sketching, coresets, etc. that are used extensively in algorithm design as a whole and sublinear algorithms in particular.
The final grade for the course will be based on the following weights:
 40% Problem sets
 40% Project
 20% Scribe notes and participation
Students are expected to follow
Rutgers academic integrity policy for all their work in this course.
See the
course syllabus for more information.
Problem sets: There will be
three two problem sets in the course and a tentative schedule of release and due dates are available on the
course calendar.
Solutions must be typeset in LaTeX and submitted via Canvas by 11:59pm EST on
Tuesday the problem set is due. Problem sets can (and probably should) be done in teams of up to three students. However, (1) the students should write their solutions completely independently (in particular, you should understand and be able to explain everything that is written in your solution); (2) you should include the name of your collaborators in your solutions.
Update: Due to the recent changes in the course to account for the COVID19 situation, we are going to only have two problem sets. Instead, each lecture will also contain
a single practice problem on the topics of the lecture.
You do not need to turn in a solution for these practice problems.
Project: There is a final project that will consist of exploring a topic of interest related to this course. This particularly involves reading one or two recent research papers in complete details to get a sense of the background on a research problem and then exploring ideas for addressing this problem. More details on the project will be released later in the semester
in the
Project section.
Scribe notes and participation:
For each lecture, there will be one team (of one or two students) in charge of taking detailed notes, typing them in LaTeX, preparing any needed figures, and sending them to the Instructor
by 11:59pm on Friday after the lecture to be posted on the course website. When preparing scribe notes, please use this LaTeX
template  just edit it to
include your notes. Further instruction on scribing the notes is available in the template.
The
course syllabus has further information about each of these assignments.
The schedule below the
red line is
tentative and subject to change.
# 
Date 
Topics 
References 
Lecture notes and Remarks 
1 
Tue 01/21 
Introduction, Course Policy, Probabilistic Analysis 
 
Lecture Notes 1 
2 
Tue 01/28 
Sublinear Time Algorithms: Connected Components, Average Degree 
CRT05, F06, GR08, S15 
Lecture Notes 2 
3 
Tue 02/04 
Query Complexity: OR Function and Connectivity 
BW02 
Lecture Notes 3  Pset 1 release: [PDF] 
4 
Tue 02/11 
Property Testing: Testing Sortedness, Uniform Distribution Testing 
EKKRV98, BFRSW00 
Lecture Notes 4 
5 
Tue 02/18 
Local Computation Algorithms (LCA): Maximal Independent Set 
PR07, RTVX11 
Lecture Notes 5 
6 
Tue 02/25 
Compressed Sensing and Sparse Recovery 
BHRRS18, RSW18 
Lecture Notes 6  Pset 1 due

7 
Tue 03/03 
Streaming Algorithms: Frequency Moments Estimation 
AMS96, BJKST02 
Lecture Notes 7 
Pset 2 release: [PDF] 
8 
Tue 03/10 
Communication Complexity: Equality, Index 
A96, T16 
Lecture Notes 8


Tue 03/17 
No Class: Spring Recess 


9 
Tue 03/24 
Streaming Algorithms: Regression via Dimensionality Reduction 
CW09 
Lecture Notes 9 
Pset 2 due

10 
Tue 03/31 
Streaming Algorithms: Clustering via Coresets 
GMMMO03, G09 
Lecture Notes 10,
[Practice Problem] 
Pset 2 due 
11 
Tue 04/07 
Graph Streaming Algorithms: Connectivity, Shortest Paths, Coloring 
FKMSZ04, ACK19 
[Practice Problem]

12 
Tue 04/14 
Detour: The Multiplicative Weight Update Method (MWU) 
AHK12 

13 
Tue 04/21 
MultiPass Streaming Algorithms for Bipartite Matching via MWU 
AG11 

14 
Tue 04/28 
Graph Sketching: AGM Sketch for Connectivity 
AGM12 

The project can take one of the following forms:
 Solve an open theory problem, formulate a new problem, or make some other contribution to the study of sublinear algorithms and/or lower bounds.
 Write a survey on a few related papers. A good approach is to first try to solve an open problem, which generally requires reading several background papers first, then switching to a survey
if the problem evades solution. However, even in this case, recording any partial progress is very important.
A list of project ideas (including open theory problems and some directions to explore) will be posted sometime in March. However, you are strongly encouraged to approach the Instructor with any project idea you have on sublinear algorithms before this date to pick as your own project  note that your project does not need to be limited to the topics discussed in class as long as it is loosly related to sublinear algorithms.
Project policies:
 The projects are done in teams of three students. Each team needs to submit a single writeup for the project and the presentation time will be split evenly between the team members.
The team members receive the same grade for their project. You are encouraged to discuss your progress on your project with the Instructor and get feedback.
 The writeup for the project is due by the end of the semester
and the presentation is done in the last lecture.
Updated Project Expectations and Requirements
 You should learn a significant amount about your chosen project topic. This will involve closely and carefully reading literature on your specific project topic (likely to be a paper or two).
You will demonstrate this aspect of your project in the "Background" section of your project report, which should be a clear exposition of the topic in your own words. You can format this according to the scribe notes of the class  basically, for this portion of the project you should turn in a polished and highquality set of scribe notes, as if you had been the scribe for a lecture on your chosen topic.
 You should gain research experience in this area; i.e. make a serious effort to contribute to the state of knowledge on your project topic by (i) identifying an interesting open question or direction for future research related to your project topic; (ii) coming up with an approach to make progress; and (iii) working to carry out your approach. You will demonstrate this aspect of your project by explaining in detail what you did for (i), (ii) and (iii) in the rest of your project report
and during your project presentation.
 New: Instead of the problem set three and the project presentation, you are now required to provide weekly reports on your project following the
updated timetable below. Exactly one report per group is required and it should be emailed to the Instructor by 11:59PM on the date specified in the timetable.
Updated Timetable for Projects:
 Tuesday, March 17: For those of you who are planning to work on the suggested project ideas, send me an email listing your top three favorite projects.
 Tuesday, March 24: For those of you who have sent their favorite projects from the list of suggested project ideas, I will send out an assignment of the project.
For the remainder of you, turn in a brief (12 paragraph) project proposal to me by email, describing your chosen topic, the sources you will use, and the portions of those sources that you will cover.
Update: The following parts are updated to account for the new changes in the course structure.
 Tuesday, April 7: A one to two page document on the following: (1) The formal description of the problem you are going to work on during your project; (2) a quick literature review including the main results known for your problem and the main paper(s) in the literature you are going to read for this project; (3) a summary of concrete goals you would like to pursue during the project.
 Tuesday, April 14: A five to ten page document on the following: A technical summary of the main prior work in the literature, including (1) the formal statement of their result, (2) a high level overview of their proof idea, (3) a lower level description of their proof including the main technical lemmas and claims, and (4) either complete proofs or proof sketches of these technical parts in your own words  this document should be at a level that one could get the full picture of the result you are describing solely based on your writeup.
 Tuesday, April 21: A two to ten page document on the following: (1) A high level description of a concrete plan for approaching your problem and the type of result you hope to obtain, (2) a lower level set of lemmas and claims that if proven would imply your desired result.

Tuesday, April 28: Presentations in the class.
 Tuesday, April 28: A two to ten page document on the following: (1) The proofs of the technnical lemmas and claims from the previous part, or (2) concrete and technical reasons for why your attempts failed to prove the desired technical results and any possible update that you came along the way that may allow you to bypass these barriers. This document should be at a level that one (including yourself in near future!) could pickup the project from this part and continue making progress on the main problem.
 Wednesday, May 6: Email the instructor your final project report by combining the reports above and make any final changes (including any new results since the last report). Your report would be held to the standard of a formal publication.
There is no official textbook for this course and all required materials will be posted on this webpage.
The following is a list of some helpful supplementary materials (this list is by no means comprehensive):
 Background on Randomized Algorithms:
 Randomized Algorithms by Motwani and Raghavan;
 The Probabilistic Method by Alon and Spencer;
 Concentration of Measure for the Analysis of Randomised Algorithms by Dubhashi and Panconesi.
 Useful Books and Surveys:
 Related Courses:
And last but not the least, you should definitely check the
List of Open Problems in Sublinear Algorithms as one of the best places to get recent pointers on sublinear algorithms.
This is a (rather incomprehensive) list of the papers related to the topics discussed in the lectures. The list will be updated after each lecture to add the new relevant papers.
A96 
Farid M. Ablayev,
Lower Bounds for OneWay Probabilistic Communication Complexity and Their Application to Space Complexity. Theor. Comput. Sci. 1996, ICALP 1993. 
AG11 
Kook Jin Ahn, Sudipto Guha,
Linear Programming in the Semistreaming Model with Application to the Maximum Matching Problem. ICALP 2011. 
AGM12 
Kook Jin Ahn, Sudipto Guha, Andrew McGregor,
Analyzing Graph Structure via Linear Measurements. SODA 2012. 
AMS96 
Noga Alon, Yossi Matias, and Mario Szegedy,
The space complexity of approximating the frequency moments. STOC 1996. 
AHK12 
Sanjeev Arora, Elad Hazan, Satyen Kale,
The Multiplicative Weights Update Method: a MetaAlgorithm and Applications. Theory of Computing 2012. 
ACK19 
Sepehr Assadi, Yu Chen, Sanjeev Khanna,
Sublinear Algorithms for (Δ+1) Vertex Coloring. SODA 2019. 
BFRSW00 
Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, Patrick White,
Testing that distributions are close. FOCS 2000. 
BJKST02 
Ziv BarYossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, Luca Trevisan,
Counting Distinct Elements in a Data Stream.
RANDOM 2002. 
BHRRS18 
Paul Beame, Sariel HarPeled, Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian, Makrand Sinha,
Edge Estimation with Independent Set Oracles.
ITCS 2018. 
BW02 
Harry Buhrman, Ronald de Wolf, Complexity Measures and Decision Tree Complexity: A Survey.
Theor. Comput. Sci., 2002. 
CW09 
Kenneth L. Clarkson, David P. Woodruff,
Numerical Linear Algebra in the Streaming Model. STOC 2009. 
CRT05 
Bernard Chazelle, Ronitt Rubinfeld, Luca Trevisan,
Approximating the Minimum Spanning Tree Weight in Sublinear Time. SIAM Journal of Computing 2005, ICALP 2001. 
EKKRV98 
Funda Ergün, Sampath Kannan, Ravi Kumar, Ronitt Rubinfeld, Mahesh Viswanathan,
SpotCheckers. STOC 1998. 
F06 
Uriel Feige, On Sums of Independent Random Variables with Unbounded Variance and Estimating the Average Degree in a Graph. SIAM Journal of Computing 2006, STOC 2004. 
FKMSZ04 
Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, Jian Zhang On Graph Problems in a Semistreaming Model. ICALP 2004. 
G09 
Sudipto Guha, Tight Results for Clustering and Summarizing Data Streams. ICDT 2009. 
GMMMO03 
Sudipto Guha, Adam Meyerson, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan, Clustering Data Streams: Theory and Practice. IEEE Trans. Knowl. Data Eng. 2003, FOCS 2000. 
GR08 
Oded Goldreich, Dana Ron, Approximating average parameters of graphs. Random Structures and Algorithms 2006, APPROXRANDOM 2006. 
PR07 
Michal Parnas, Dana Ron, Approximating the Minimum Vertex Cover in Sublinear Time and a Connection
to Distributed Algorithms. Theor. Comput. Sci., 2007. 
RSW18 
Aviad Rubinstein, Tselil Schramm, S. Matthew Weinberg, Computing Exact Minimum Cuts Without Knowing the Graph. ITCS 2018. 
RTVX11 
Ronitt Rubinfeld, Gil Tamir, Shai Vardi, Ning Xie, Fast Local Computation Algorithms. I(T)CS 2011. 
S15 
C. Seshadhri, A simpler sublinear algorithm for approximating the triangle count. available on arXiv. 
T16 
Tim Roughgarden, Communication Complexity (for Algorithm Designers). Foundations and Trends in Theoretical Computer Science 2016. 
You can download LaTeX for free
here. For the purpose of this course, you do not even need to install LaTeX and
can instead use an online LaTeX editor such as
Overleaf.
Two great introductory resources for LaTeX are
A Short Introduction to LaTeX by
Allin Cottrell
(for general purpose LaTeX)
and
LaTeX for Undergraduates by
Jim Hefferson (for undergraduates mathematics)
accompanied by the following
cheatsheet (note that this document use "\( MATH \)" notation compared to the perhaps more
widely used "$ MATH $"  both are completely fine in LaTeX).
You can also use this wonderful tool
Detexify by
Daniel Kirsch for finding the
LaTeX commands of a symbol (just draw the symbol!).
If you are interested in learning more about LaTeX (beyond what is needed for this course), check the
Wikibook on LaTeX and
the
Wikibook on LaTeX for Mathematics.