Title 
Code 
Notes 
CountMin Sketch 
countmin.h countmin.c 
CountMin sketches (Cormode,
Muthukrishnan '04). They support: Finding frequent items, Returning point estimates, Approximating Innerproducts Finding quantiles 
Group testing 
cgt.h cgt.c 
Described in What's Hot and
What's Not: Tracking Frequent Items Dynamically, PODS 2003. They support: finding frequent items. 
CCFC Sketch 
ccfc.h ccfc.c 
Count sketches from Charikar,
Chen, FarachColton '02. They support: finding frequent items, returning point estimates 
AMS Sketch 
ams.h ams.c 
"Tug of War" sketches due to
Alon, Matias and Szegedy 96, and
Alon, Gibbons, Matias and Szegedy, 99. Some hashing tricks used for faster implementation. They support: Returning point estimates, Approximating Innerproducts, Estimating the L2 norm of a vector 
Stable distributions and sketches 
stable.h stable.c 
Sketches based on stable
distributions.
Based on Indyk'00, Cormode, Indyk, Koudas, Muthukrishnan '02, and Cormode, Data, Indyk, Muthukrishnan, '02. They support estimating the L2 norm of a vector, estimating the L1 norm of a vector, estimating the L0 norm of a vector, estimating arbitrary Lp norms for 0 < p < 2 
Random number generators 
prng.h prng.c 

Common routines 
massdal.h massdal.c 

Change (deltoid) detection 
change.h change.c 
Implements the change detection
group testing for absolute, relative,
and variational changes
(CormodeMuthukrishnan, Infocom 2004). See also the sample file changewrapper.c for examples of usage. 
Frequent items 
frequent.h frequent.c 
Algorithm Frequent for insert only streams. They are derived from code used in the paper What's Hot and What's Not: Tracking Frequent Items Dynamically, PODS 2003. 
Lossy Counting 
lossycount.h lossycount.c 
Lossy Counting algorithm due to Manku and Motwani 
hotitems.c teststab.c changewrapper.c makefile 
Each of the .h files exports
various methods: Init  Initialize the data structure. Takes some parameters defining the accuracy and quality of the results. Returns a pointer to an object which must be supplied to the following routines. Update  Takes an item identifier. Some routines take an additional parameter of the count to add on for this arrival (positive or negative increment). Others interpret a positive count as an arrival, a negative count as a departure. Output  Takes an integer threshold, returns a list of items which should exceed this threshold. Size  Returns the space used by the data structure Destroy  Destroys the data structure and frees the associated memory. Example usage of the above routines, finding frequent items on a stream drawn from zipf distributions. Example usage of the above routines to estimate the L1, L2, and L0 norms of data streams. Example usage of the change.c library to find items with large changes. makefile A makefile for the above code 

Code BUNDLE 
massdalsketches.zip  All the above .c and .h files together in one easy to download .zip file. 
Title 
Code 
Notes 
Lossy Counting 
LossyCountingJava.zip 
This works for text streams.
README.txt file explains usage. 