Math 290: Reading seminar in Theoretical Computer Science 

Randomized Algorithms

Winter 2005



Course: Math 290-66, CRN 80749
Title: Reading Seminar "Randomized Algorithms"
When: Tuesday 3 - 4:30 pm
Where: Kerr 693
Instructor: Roman Vershynin
Office: Kerr 671
Email: vershynin_at_math.ucdavis.edu
Office Hours: W 2:30-4

Randomized algorithms is a hot topic in computer science. They include steps at which a decision is made at random, according to a pre-designed probabilistic law. Randomized algorithms are often simpler and the faster than deterministic algorithms. However, they work correctly only with a certain probability, which is usually overwhelmingly big, much higher than the chance of not having a bug in a complex deterministic algorithm.

The seminar will be broad and focus on the mathematical problems that arise in randomized algorithms. Topics include: ideas from linear algebra, combinatorics, convex geometry, and analysis that lead to design of randomized algorithms, pseudorandomness and derandomization, statistical learning, optimization, signal processing; other topics suggested by the participants.

Students who attend regularly can register for one credit; students who present a topic can register for three credits. Please let me know if you are willing to make a presentation.

See below for the list of available topics and the current schedule.

Texts: the book "Randomized Algorithms" by R.Montavi and O.Raghavan and "The random projection method" by S.Vempala will be available in my mailbox. Please free to use them during the day, but return them into my mailbox by the end of the day.




Available topics

Schedule

January 11                    
Roman Vershynin                              
Sampling in massive data

I will begin this seminar by two lectures on massive data (large matrices). In practise, these arise in managing DNA microarray data, facial recognition, web search models, ets. The data is so huge that it can be stored only on disk, not in RAM. Therefore, computations on such matrices (i.e. multiplication or diagonalization) is impossible, because this would require many passes through the entire data on the disk, which is slow. The idea is to do a Monte-Carlo type algorithm. We sample a few random entries of these matrices, which form much smaller and thus handy submatrices, and do computation on these. The mathematical challenge is to prove that the result of such a computation  is a good approximation to the (practiacally imposible) computation on the whole matrices. Mathematics to solve this problem includes in particular analysis (noncommutative operator theory).
January 18
Roman Vershynin
Sampling in massive data II.

I will get to the more interesting part of my talk on sampling in massive data. How one can do a Monte-Carlo method for matrices rather than scalars? What is a Law of Large Numbers for matrices? This will bring us to the basics of non-commutative analysis.
January 25 Jaideep Mulherkar
Randomized sorting (QuickSort) from the Introduction to the book "Randomized Algorithms" by R.Montavi and O.Raghavan. Long path problem from the Introduction to randomized algorithms by A.Gupta.
February 1
Naoki Saito
The Spike Process: A Simple Test Case for Independent or Sparse Component Analysis.

We examine curious behaviors of the Independent Component Analysis (ICA) and Sparse Component Analysis (SCA) when they are applied to simple synthetic processes called the `simple' and `generalized' spike processes. Both processes put a single spike at a random location in an n-dimensional vector for each realization. The simple spike process puts a unit impulse at a time whereas the generalized spike process puts a sample from the standard normal distribution. We obtained interesting set of theorems for these processes. The behavior of SCA to these processes turned out to be much simpler than that of ICA.
Our results are useful for validating any ICA/SCA software package because it is very easy to simulate these processes and the correct answers are known from our theorems.
February 8
No seminar

February 15
Roger Wets Stochastic Quasi-Gradient Methods. Slides of the talk

Stochastic quasi-gradient methods were introduced to solve problems of the following type: Min E{f(w,x)} such that x in S, where S is a subset of R^n and `w' is a random N-vector. This problem is diffcult to solve mostly because obtaining any information about the function E{f(w,.)} at any point `x' requires the evaluation of multi-dimensional integrals; information means here: the value, the gradient (or subgradient), the Hessian (or a pseudo-Hessian).
   The stochastic quasi-gradient methods suggest finding a solution --a minimizer-- by replacing the gradient (at each iterate) by a stochastic estimate (the quasi-gradient). The lecture will describe the overall strategy, some more specific instances when(almost sure) convergence can be guaranteed, and some applications.
February 22
John Steinberger
Oded Goldreich, Randomized methods in computation. Lecture 1: Probability and the Probabilistic Method. Applications to Max-SAT problems.
March 1
Chip Martel
Analyzing Routing and Diameter in Kleinberg's Small World Model. Slides of the talk

In Small-World networks edges are much more likely to connect ``neighbor nodes'' than distant nodes. Small-world networks have been an active and common topic in many disciplines, including the social and natural sciences. These networks possess a striking property, the so called small-world phenomenon, also often spoken of as ``six degrees of separation" (between any two people in the United States), discovered by S. Milgram in his pioneering workin the 1960's. Since many real networks exhibit Small-world properties, a number of network models have been proposed as a framework to study this phenomenon.
     Recently, Kleinberg proposed a family of small-world networks to study another compelling aspect of Milgram's original findings: a decentralized algorithm operating only on local information can construct short paths.
     We present Kleinberg's model (and its varients), analyze the expected performance of a simple greedy routing algorithm, and analyze how the expected diameter of the graph changes as we progressively favor links to near nodes more heavily.

Web: http://www.math.ucdavis.edu/~vershynin/teaching/RFG04/Reading-winter.html