CS442-Fall '19. WHAT TO READ BEFORE CLASS
This shows what I intend to cover in each class, and may be used
as a guide for preparation. After the class I will edit its entry - in
this way the entries will become a "diary" of what was actually covered.
1. September 4, 2019: Describe course and how it will
run. Informal discussion of some algorithmic problems that have been better
understood through a probabilistic approach. Begin a "brief" review of concepts
and notations from Probability: random experiment, sample space,
events, probability measure. Conditional probability and independence;
examples, including ordered sampling r times with and without
replacement from a set of n elements; the Monty Hall game.
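The Monty Hall discussion can be checked empirically. A minimal simulation (not from the lecture; the function names are mine) comparing the "stay" and "switch" strategies:

```python
import random

def monty_hall(trials, switch, rng):
    """Simulate the Monty Hall game; return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)       # prize placed uniformly at random
        pick = rng.randrange(3)      # contestant's initial choice
        # host opens a door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

rng = random.Random(0)
stay = monty_hall(100_000, switch=False, rng=rng)
swap = monty_hall(100_000, switch=True, rng=rng)
print(stay, swap)   # stay is near 1/3, switch is near 2/3
```

Switching wins exactly when the initial pick was wrong, which happens with probability 2/3; the simulation should reflect that.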
2. Sept. 9, 2019: ...continued. Conditional prob.; Bayes Rule, and
examples. Independence, k-wise independence.
3. Sept. 11, 2019: ... continued. Random permutations, derangements.
4. Sept. 16, 2019: Review random variables: frequency fn.; expectation
(interpreted both as the center of probability mass and as a linear
functional on random variables). Indicator random variables
and the "method of indicators" in calculating expected values.
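A standard instance of the method of indicators (my example, not necessarily the one used in class): the expected number of fixed points of a uniform random permutation is the sum of n indicators, each with expectation 1/n, hence exactly 1. This can be verified exactly by enumerating all permutations for small n:

```python
import math
from fractions import Fraction
from itertools import permutations

n = 5
# total number of fixed points summed over all n! permutations
total_fixed = sum(sum(1 for i, x in enumerate(p) if x == i)
                  for p in permutations(range(n)))
# E[# fixed points] = sum of n indicators, each with expectation 1/n
expected = Fraction(total_fixed, math.factorial(n))
print(expected)   # 1
```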
5. Sept. 18, 2019:...more examples of the method of indicators. Studied
V(X) - the variance of a random variable X, of aX+b (a linear function of a
random variable), and of the sum of independent random variables; explained
the fact that if X and Y are independent r.v.'s then V(X+Y)=V(X)+V(Y), but NOT
CONVERSELY. Derived Tchebycheff's inequality and a key consequence, the
(weak) Law of Large Numbers.
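The additivity of variance for independent r.v.'s can be checked exactly on a small example (mine, not from the lecture): one fair die has variance 35/12, and the sum of two independent dice has variance 35/6, computed from the exact distributions:

```python
from fractions import Fraction
from itertools import product

def var(dist):
    """Variance of a finite distribution {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum(p * (v - mean) ** 2 for v, p in dist.items())

die = {v: Fraction(1, 6) for v in range(1, 7)}   # one fair die
two = {}                                         # X + Y for independent dice
for a, b in product(range(1, 7), repeat=2):
    two[a + b] = two.get(a + b, 0) + Fraction(1, 36)

print(var(die))   # 35/12
print(var(two))   # 35/6, i.e. V(X) + V(Y)
```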
6a. Sept. 23, 2019: Cancelled for illness.
6b. Sept. 25, 2019: Talked about the assumption that we can generate
independent, uniformly distributed random numbers from (0,1) and, using
it, how to generate a random sample of size k ≤ n from a given n-element
set, as well as how to generate random permutations of an n-element set.
Then began the study of the Floyd-Rivest probabilistic selection algorithm.
When finished we began discussion of "random fingerprinting", illustrated by
Freivalds' probabilistic method to test whether, given n by n matrices A, B, and C,
C is the product of A and B.
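One standard way to generate random permutations and samples from a uniform source is the Fisher-Yates shuffle; a sketch (the lecture may have used a different method):

```python
import random

def random_permutation(items, rng):
    """Fisher-Yates shuffle: every permutation is equally likely."""
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)     # uniform index in 0..i
        a[i], a[j] = a[j], a[i]
    return a

def random_sample(items, k, rng):
    """Uniform sample of size k <= n: first k entries of a random permutation."""
    return random_permutation(items, rng)[:k]

rng = random.Random(1)
perm = random_permutation(range(10), rng)
samp = random_sample(range(10), 4, rng)
print(perm, samp)
```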
7. Sept. 30, 2019: Returned to some discussion of the Floyd-Rivest
selection alg. and aspects of its complexity. Then returned to the topic of
"Random Fingerprinting" and Freivalds' algorithm. The idea behind the alg.
is to check whether Dx = 0, where D is C-A*B and x is an n-vector, each
component of which is 0 or 1, and each component is determined independently
by the toss of a fair coin. If Dx is not 0, Freivalds' alg. will say "NO" and
this is correct. Otherwise it says "YES", and this is correct with prob.
at least 1/2, by virtue of how the random x was chosen. The algorithm has
quadratic complexity, to be compared to the cost of multiplying A and
B and comparing the product to C.
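A sketch of Freivalds' test as described above: the key trick is to compute Cx and A(Bx) as matrix-vector products, never forming A*B, so each trial costs O(n^2).

```python
import random

def freivalds(A, B, C, trials, rng):
    """Test whether C == A*B. A "NO" answer is always correct; a wrong
    "YES" occurs with probability at most 2**(-trials)."""
    n = len(A)
    def matvec(M, x):
        return [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    for _ in range(trials):
        x = [rng.randrange(2) for _ in range(n)]   # fair-coin 0-1 vector
        # check D*x = C*x - A*(B*x) = 0 without ever forming A*B
        if matvec(C, x) != matvec(A, matvec(B, x)):
            return False   # certainly C != A*B
    return True            # probably C == A*B

rng = random.Random(2)
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[19, 22], [43, 50]]       # the true product A*B
bad = [[19, 22], [43, 51]]     # one wrong entry
print(freivalds(A, B, C, 20, rng))    # True
print(freivalds(A, B, bad, 20, rng))  # False, except with prob. <= 2**-20
```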
8. Oct. 2, 2019: Discussed variations of Freivalds' alg. that decrease
the probability of error at the expense of a constant-factor increase in
its quadratic cost. Then covered an analogous application to decide if
C=A*B where here, A and B are degree-n polynomials and C is a given
degree-2n polynomial. Two deterministic algs. were described - both with
quadratic complexity - followed by a probabilistic one that runs in linear
time (in n).
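A sketch of the linear-time probabilistic check (my rendering, with an assumed evaluation domain): evaluate all three polynomials at one random point via Horner's rule. If C != A*B, the difference C - A*B has at most 2n roots, so a false "YES" has probability at most 2n/|domain|.

```python
import random

def horner(coeffs, x):
    """Evaluate a polynomial (coefficients low-to-high) in linear time."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def poly_product_check(A, B, C, rng, domain=10**9):
    """Probabilistic test of C == A*B via evaluation at a random point.
    A false "YES" needs the point to be a root of C - A*B: prob <= deg(C)/domain."""
    r = rng.randrange(domain)
    return horner(C, r) == horner(A, r) * horner(B, r)

rng = random.Random(3)
A = [1, 1]        # 1 + x
B = [1, -1]       # 1 - x
C = [1, 0, -1]    # 1 - x^2 = (1+x)(1-x)
print(poly_product_check(A, B, C, rng))           # True
print(poly_product_check(A, B, [1, 0, 1], rng))   # almost surely False
```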
9. Oct. 7, 2019: Another variation on Freivalds' using (instead of
random 0-1 n-vectors) n-vectors, each entry of which is chosen at random
from a set of m distinct reals, WITH replacement. Saw that this change
improves the complexity just by improving the base of a logarithm!
A final application in the spirit of Freivalds was "checking" supposed
univariate polynomial identities, for which we described two different
deterministic algs. with quadratic complexity and then one linear-time
(expected complexity) randomized alg.
10. Oct. 14, 2019: First discussed the "secretary problem" and the
optimal (probabilistic) algorithm for choosing the best of the n
candidates. Then introduced the Poisson(t) random variable and showed
that it is the limit (as n → infty) of the sum of n independent Bernoulli
(0,1) r.v.'s X_i, each being 1 with probability t/n. Next, began
the "Poisson Paradigm" or "heuristic", generally suggesting that a Poisson
limit might be expected even when some of the above conditions fail to
hold (e.g. maybe "the sum of n 'nearly independent', non-negative r.v.'s
X_i > 0 with small expectations might converge to a certain Poisson r.v."??).
An example is the n-hat experiment and the rand. var. N counting how
many of the n people get their own hat. N → Poisson(1) as n →
infty....
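The secretary problem's classical rule (observe roughly n/e candidates, then accept the first one better than all of them) can be simulated; a sketch under my own setup, with success probability near 1/e ≈ 0.368:

```python
import math
import random

def secretary_trial(n, rng):
    """One run of the classical stopping rule; True iff the overall
    best candidate (rank n-1) is the one accepted."""
    ranks = list(range(n))         # rank n-1 is the best candidate
    rng.shuffle(ranks)
    k = round(n / math.e)          # length of the observation phase
    best_seen = max(ranks[:k], default=-1)
    for r in ranks[k:]:
        if r > best_seen:
            return r == n - 1      # accept the first record after phase 1
    return False                   # never accepted anyone

rng = random.Random(4)
n, trials = 50, 20_000
rate = sum(secretary_trial(n, rng) for _ in range(trials)) / trials
print(rate)   # close to 1/e, about 0.37
```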
11. Oct. 16, 2019: ...Another example is the birthday paradox: QUESTION:
If n randomly chosen people are in a room, what's the probability two or
more have the same birthday? If you think of this as n-choose-2
B-trials with 1/365 success probability, the heuristic suggests that
the number of pairs having the same birthday (call it N) converges to
Poisson(t) with t=n(n-1)/[2(365)]. If so, the prob. that N=0 is
Poisson(0)=exp(-t), and this is less than 1/2 when n (# of people) >
22.47. All this says that when there are 23 OR MORE (RANDOMLY CHOSEN)
PEOPLE IN A ROOM AT LEAST ONE PAIR ARE BORN ON THE SAME DAY WITH
PROB. > .5!!! (Paradoxical Answer)...
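Both the exact probability and the Poisson approximation above are a few lines of arithmetic; the crossover at 23 people can be verified directly:

```python
import math

def p_no_match(n, days=365):
    """Exact probability that n people all have distinct birthdays."""
    p = 1.0
    for i in range(n):
        p *= (days - i) / days
    return p

def poisson_approx(n, days=365):
    """Poisson heuristic: P(N=0) = exp(-t) with t = n(n-1)/(2*days)."""
    return math.exp(-n * (n - 1) / (2 * days))

print(p_no_match(22), p_no_match(23))          # drops below 1/2 at n = 23
print(poisson_approx(22), poisson_approx(23))  # the heuristic agrees closely
```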
12. Oct. 28, 2019: ...Another (similar) example of the heuristic is when
placing r distinguishable balls into n distinguishable boxes, each ball
equally likely to go in any of the boxes: the question is "for which r > 1
does it first become likely (P > .5) that two or more balls will be in the
same box?" Applying the heuristic reveals that once r > 1.17*sqrt(n),
P(A)=P(some box has more than 1 ball) > .5, agreeing with direct computation
of P(A). Then began new topic(s) involving graphs, starting with
D. Karger's probabilistic alg. for min-cut in an undirected graph....
13. Oct. 30, 2019: ...A strong motivating idea is that a randomly
chosen edge is unlikely to be in a given min-cut [(n-2)/n prob.]. So the
alg. chooses a random edge e=uv and "contracts" it to make one super-vertex
whose neighbors are the union of those of u and of v. Continuing this
process until two super-vertices remain, we showed that the remaining
edges form a min-cut with probability at least 2/n^2 (>0).
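A minimal sketch of the contraction algorithm (my implementation, using a simple label array for super-vertices; the test graph is mine):

```python
import random

def karger_cut(n, edges, rng):
    """One run of random contraction; returns the size of the cut found."""
    label = list(range(n))        # label[v] = current super-vertex of v
    groups = n
    while groups > 2:
        u, v = edges[rng.randrange(len(edges))]   # uniform random edge
        if label[u] == label[v]:
            continue              # a self-loop inside a super-vertex; skip it
        a, b = label[u], label[v]
        for w in range(n):        # contract: merge super-vertex b into a
            if label[w] == b:
                label[w] = a
        groups -= 1
    # edges still crossing the two remaining super-vertices form the cut
    return sum(1 for u, v in edges if label[u] != label[v])

rng = random.Random(5)
# two triangles joined by one bridge edge: the min-cut has size 1
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
best = min(karger_cut(6, edges, rng) for _ in range(300))
print(best)   # 1
```

Repeating the run and keeping the smallest cut seen is exactly the "boosting" idea taken up in the next entry.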
14a. Nov. 4, 2019 cancelled.
14b. Nov. 6, 2019: Reviewed the alg and its analysis, then began
the topic of "boosting" its success probability: if the entire algorithm were
repeated independently at random cn^2 times, and if the smallest cut so
far observed was saved along the way, the last saved cut would be a min-cut
with probability at least 1 - e**(-2c) → 1 as c increases! The complexity of
Karger's algorithm is cn^4 (there are faster deterministic algs). Next,
discussed the structural approach of the alg. of Karger and Stein [without
details], but noting that at a computational cost of O(n^2 log(n))
it will return a min-cut with probability at least c/log(n). This can
be boosted so that after b*log(n) repetitions, the smallest cut
observed will be a min-cut with probability at least 1-e**(-c*b).
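The boosting bound follows from 1-x ≤ e^(-x): a single run succeeding with probability at least 2/n^2, repeated cn^2 times, fails every time with probability at most (1 - 2/n^2)^(cn^2) ≤ e^(-2c). A quick numerical check:

```python
import math

def boosted_success(n, c):
    """P(at least one of c*n^2 independent runs succeeds), assuming each
    run succeeds with probability exactly 2/n^2."""
    runs = c * n * n
    return 1 - (1 - 2 / (n * n)) ** runs

for n in (10, 100):
    for c in (1, 3):
        lower = 1 - math.exp(-2 * c)   # the bound 1 - e**(-2c)
        print(n, c, boosted_success(n, c), lower)
```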
15. Nov. 11, 2019: Begin random graphs and the two models: (i) G_{n,p}
has graphs G=(V,E) where the vertex set is V = {v_1,v_2,...v_n} and each
of the n-choose-2 possible edges v_iv_j, i < j, is in E with probability
p - decided independently for each i < j. The other model is G(m,n) which
generates a graph with n vertices and with m edges which are chosen
independently at random (without replacement). As edges are successively
added to the empty graph in G(m,n), it is interesting to study when certain
properties are likely to appear: we showed that for large n, G is
likely NOT to have any vertices of degree 2 until it has about .58*sqrt(n)
random edges. We also showed that G is not likely to be connected until it has
more than .5*n*log(n) edges.
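Generating one draw from each model is short; a sketch (function names are mine):

```python
import random
from itertools import combinations

def gnp(n, p, rng):
    """G_{n,p}: each of the n-choose-2 edges is present independently w.p. p."""
    return [e for e in combinations(range(n), 2) if rng.random() < p]

def gnm(n, m, rng):
    """G(m,n): m of the n-choose-2 edges chosen uniformly without replacement."""
    return rng.sample(list(combinations(range(n), 2)), m)

rng = random.Random(6)
g1 = gnp(20, 0.1, rng)
g2 = gnm(20, 30, rng)
print(len(g1), len(g2))   # g2 has exactly 30 edges; g1's count is random
```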
16. Nov. 13, 2019: Derived the "sharp threshold" in G(m,n) for
connectedness, namely at m=(n log(n))/2, and the corresponding one of
p=(log n)/n for G_{n,p}. Used the Poisson heuristic to derive the sharp
threshold in G(m,n) for the appearance of vertices of degree 2, namely
about .58*sqrt(n) edges. Continued with more examples of the probabilistic method
applied to graph theory, beginning e.g. with a simple special case of
a theorem of Paul Erdos: every undirected graph on 6 vertices has a
complete subgraph of size 3, or an "empty" subgraph of size 3, or
both. Discussed the general version of this statement.
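The 6-vertex statement (R(3,3) ≤ 6) is small enough to verify exhaustively over all 2^15 two-colorings of the edges of K_6; a brute-force check of my own:

```python
from itertools import combinations

vertices = range(6)
edges = list(combinations(vertices, 2))       # the 15 edges of K_6
triangles = list(combinations(vertices, 3))   # the 20 vertex triples

def has_mono_triangle(mask):
    """mask: bitmask over the 15 edges; bit value 0/1 is the edge's color."""
    color = {e: (mask >> i) & 1 for i, e in enumerate(edges)}
    return any(color[(a, b)] == color[(a, c)] == color[(b, c)]
               for a, b, c in triangles)

all_have = all(has_mono_triangle(m) for m in range(2 ** 15))
print(all_have)   # True: every 2-coloring of K_6 has a monochromatic triangle
```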
17. Nov. 20, 2019: Mentioned the geometric analog of Erdos'
theorem, then began topic of tournaments, and Hamilton paths. Showed
that each tournament has at least one Ham. path and proved Szele's thm.
that there exists an n-vertex tournament with at least n!/2^{n-1}
Ham-paths (i.e., "many"). Stated without proof Alon's upper bound on
the max. number of Hamilton paths possible in an n-vertex tournament.
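Szele's count n!/2^(n-1) is the expected number of Hamilton paths in a uniform random tournament (each of the n! vertex orderings is a path with probability 2^-(n-1)). For n=4 this average is 24/8 = 3, which can be confirmed by enumerating all 2^6 tournaments:

```python
from itertools import combinations, permutations

n = 4
pairs = list(combinations(range(n), 2))   # the 6 edges of K_4 to orient

def ham_paths(mask):
    """mask: bitmask; bit i set means pairs[i] = (u,v) is directed u -> v."""
    arc = {}
    for i, (u, v) in enumerate(pairs):
        arc[(u, v)] = bool((mask >> i) & 1)
        arc[(v, u)] = not arc[(u, v)]
    return sum(all(arc[(p[j], p[j + 1])] for j in range(n - 1))
               for p in permutations(range(n)))

total = sum(ham_paths(t) for t in range(2 ** len(pairs)))
average = total / 2 ** len(pairs)
print(average)   # 3.0 == 4!/2**3, matching n!/2**(n-1)
```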
18. Nov. 25, 2019: Finished topic of tournaments and Hamilton
paths with the question of how to find a tournament with many Ham. paths.
Then began the topic of Prob. Algs. in Applied Math. with Numerical
Integration - how to accurately approximate a given definite
integral - beginning with a little motivation. Then presented the
algorithm of Siegel and Zambuto, a randomized version of methods like
Simpson's rule or Gauss quadrature. Its virtue is that the
approximate integral is a random variable with expected error
ZERO!
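The Siegel-Zambuto algorithm itself isn't reproduced here; as a minimal stand-in illustrating the "expected error zero" idea (plain Monte Carlo, not the lecture's method), note that the sample average of f at uniform random points is an unbiased estimator of the integral:

```python
import random

def mc_integral(f, a, b, samples, rng):
    """Crude Monte Carlo: (b-a) times the average of f at uniform random
    points in (a,b). The estimate is a random variable whose expectation
    is exactly the integral, i.e. its expected error is zero."""
    total = sum(f(a + (b - a) * rng.random()) for _ in range(samples))
    return (b - a) * total / samples

rng = random.Random(7)
est = mc_integral(lambda x: x * x, 0.0, 1.0, 100_000, rng)
print(est)   # close to the true value 1/3
```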
19a. Dec. 2, 2019: Cancelled for weather.
19b. Dec. 4, 2019: Random number generation.
20. Dec. 9, 2019: Graph drawing: the crossing number and the Crossing
Number Lemma.