Mathematical Foundations for Big Data (Spring 2016)

Course: MAT 280
CRN: 49752
Title: Mathematical Foundations for Big Data
Class: MF 1:30pm-3:00pm, 2112 Math. Sci. Bldg.

Instructor: Thomas Strohmer
Office: 3144 MSB
Email: "my last name" at math.ucdavis.edu
Office Hours: By appointment

Course Objective:

Experiments, observations, and numerical simulations in many areas of science nowadays generate massive amounts of data. This rapid growth heralds an era of "data-centric science," which requires new paradigms addressing how data are acquired, processed, distributed, and analyzed. This course will cover mathematical models and concepts for developing algorithms that can deal with some of the challenges posed by Big Data.

Prerequisite:

Linear algebra and a basic background in probability as well as basic experience in programming (preferably Matlab) will be required. Some basic knowledge in optimization is recommended.

List of topics: (subject to minor changes)

Principal Component Analysis, Singular Value Decomposition.
Probability in high dimensions. Concentration of measure, matrix concentration inequalities. Curses and blessings of dimensionality.
Data clustering, community detection.
Dimension reduction. Johnson-Lindenstrauss, sketching, random projections.
Stochastic gradient descent.
Kernel regression.
Randomized numerical linear algebra.
Compressive sensing. Efficient acquisition of data, sparsity, low-rank matrix recovery.
Diffusion maps, manifold learning, intrinsic geometry of massive data sets.
Some basics on Deep Learning (if time permits).

Textbooks:

C. Bishop. Pattern Recognition and Machine Learning.
F. Cucker and D. X. Zhou. Learning Theory: an approximation theory viewpoint.
S. Foucart and H. Rauhut. A mathematical introduction to compressive sensing.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction.
M. Mahoney. Randomized Algorithms for Matrices and Data.

Grading Scheme:

10% Scribing Lectures
30% Homework
60% Final Project

Scribing Lectures:

Depending on the class size, each student may have to scribe 1-2 lectures. Scribe notes must be typeset in LaTeX. A template and more details will be posted later.

Homework:

here

Final Project:

Describe how some of the methods you learned in this course will be used in your research.
Find a practical application yourself (not copied from papers or books) that uses the methods you learned in this course; describe how to use them, explain the importance of the application, and state what impact you would expect if you are successful.
Write a report describing a thorough numerical comparison of existing algorithms related to one of the topics of this course for a specific application or problem.

More information about the class can be found here