Schedule
| January 11
|
Roman Vershynin
|
Sampling in massive data I will begin this seminar by two lectures on massive data (large matrices). In practise, these arise in managing DNA microarray data, facial recognition, web search models, ets. The data is so huge that it can be stored only on disk, not in RAM. Therefore, computations on such matrices (i.e. multiplication or diagonalization) is impossible, because this would require many passes through the entire data on the disk, which is slow. The idea is to do a Monte-Carlo type algorithm. We sample a few random entries of these matrices, which form much smaller and thus handy submatrices, and do computation on these. The mathematical challenge is to prove that the result of such a computation is a good approximation to the (practiacally imposible) computation on the whole matrices. Mathematics to solve this problem includes in particular analysis (noncommutative operator theory). |
| January 18 |
Roman Vershynin |
Sampling in massive data II. I will get to the more interesting part of my talk on sampling in massive data. How one can do a Monte-Carlo method for matrices rather than scalars? What is a Law of Large Numbers for matrices? This will bring us to the basics of non-commutative analysis. |
| January 25 | Jaideep Mulherkar |
Randomized sorting (QuickSort) from the Introduction to the book "Randomized Algorithms" by R.Montavi and O.Raghavan. Long path problem from the Introduction to randomized algorithms by A.Gupta. |
| February 1 |
Naoki Saito |
The Spike Process: A Simple Test
Case for Independent or Sparse Component Analysis. We examine curious behaviors of the Independent Component Analysis (ICA) and Sparse Component Analysis (SCA) when they are applied to simple synthetic processes called the `simple' and `generalized' spike processes. Both processes put a single spike at a random location in an n-dimensional vector for each realization. The simple spike process puts a unit impulse at a time whereas the generalized spike process puts a sample from the standard normal distribution. We obtained interesting set of theorems for these processes. The behavior of SCA to these processes turned out to be much simpler than that of ICA. Our results are useful for validating any ICA/SCA software package because it is very easy to simulate these processes and the correct answers are known from our theorems. |
| February 8 |
No seminar |
|
| February 15 |
Roger Wets | Stochastic Quasi-Gradient
Methods. Slides of the talk Stochastic quasi-gradient methods were introduced to solve problems of the following type: Min E{f(w,x)} such that x in S, where S is a subset of R^n and `w' is a random N-vector. This problem is diffcult to solve mostly because obtaining any information about the function E{f(w,.)} at any point `x' requires the evaluation of multi-dimensional integrals; information means here: the value, the gradient (or subgradient), the Hessian (or a pseudo-Hessian). The stochastic quasi-gradient methods suggest finding a solution --a minimizer-- by replacing the gradient (at each iterate) by a stochastic estimate (the quasi-gradient). The lecture will describe the overall strategy, some more specific instances when(almost sure) convergence can be guaranteed, and some applications. |
| February 22 |
John Steinberger |
Oded Goldreich, Randomized methods in computation. Lecture 1: Probability and the Probabilistic Method. Applications to Max-SAT problems. |
| March 1 |
Chip Martel |
Analyzing Routing and Diameter
in Kleinberg's Small World Model. Slides
of the talk In Small-World networks edges are much more likely to connect ``neighbor nodes'' than distant nodes. Small-world networks have been an active and common topic in many disciplines, including the social and natural sciences. These networks possess a striking property, the so called small-world phenomenon, also often spoken of as ``six degrees of separation" (between any two people in the United States), discovered by S. Milgram in his pioneering workin the 1960's. Since many real networks exhibit Small-world properties, a number of network models have been proposed as a framework to study this phenomenon. Recently, Kleinberg proposed a family of small-world networks to study another compelling aspect of Milgram's original findings: a decentralized algorithm operating only on local information can construct short paths. We present Kleinberg's model (and its varients), analyze the expected performance of a simple greedy routing algorithm, and analyze how the expected diameter of the graph changes as we progressively favor links to near nodes more heavily. |