Mathematics for Data Analysis & Decision Making

Class Time/Place: MWF 1:10 PM - 2:00 PM at Chemistry 176

Instructor: Jesus A. De Loera
TA: Ji Chen


Course Description: Mathematical models are at the heart of all data science applications such as information searching (Google), machine learning (e.g., face recognition algorithms), airline-crew scheduling, social network analysis, and more.

This course discusses the mathematics used in the analysis of data and the models used to make optimal decisions. Methods include advanced linear algebra, graph theory, optimization, probability, and geometry. These are some of the mathematical tools necessary for data classification, machine learning, clustering, and pattern recognition, as well as for planning, scheduling, and ranking.

The course should be useful to students interested in data science and decision models who wish to learn the basic mathematical theory used in algorithms and software.

WARNING: This course is intensely hands-on. The grade is based entirely on computer projects, and the course tries to simulate real-life working experience.

References:

Unfortunately, there is no single undergraduate textbook that contains all the relevant mathematics (yet!!). Some of my sources are listed below, but they are NOT required, so do not buy them! I will try to provide students with my notes.

1) Optimization Models, by G. Calafiore and L. El Ghaoui, Cambridge University Press, 2015.

2) Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms), by Lars Elden, SIAM.
Note that this textbook has an official website (the author's web site). There, you can find a lot of useful information (e.g., errata).

3) A Gentle Introduction to Optimization, by B. Guenin, J. Koenemann, and L. Tuncel, Cambridge University Press, 2015.

4) Who's #1? The Science of Rating and Ranking, by A. Langville and C.D. Meyer.

Here is the syllabus (order may change).

Some Data Analysis and Decision Projects


Prerequisite and Expectations
Grading:
The grades will be calculated using the average and standard deviation of the class. 100 points are possible, which will be divided as follows:

Some important rules will be followed:

SOFTWARE and other RESOURCES:

An introduction to ZIMPL (the language used to program SCIP) is available in the ZIMPL Manual. The best way to learn it is to follow the numerous examples provided in the text.

For MATLAB, please take a look at the following highly useful MATLAB primers and tutorials.



HOMEWORKS & HANDOUTS

  • Homework 1, due April 18th 11:55pm

  • Homework 2, due May 4th 11:55pm:

  • Homework 3, due May 23rd 11:55pm:

  • Homework 4, due Friday June 8th 11:55pm:

  • Final project, due June 11th 6pm