Mathematics for Data Analysis & Decision Making

Class Time/Place: MWF 10:00 AM - 10:50 AM at Wellman 226

Instructor: Jesus A. De Loera
TA: Ji Chen


Course Description: Mathematical models are at the heart of all data science applications such as information searching (Google),
machine learning (e.g., face recognition algorithms), and all logistic and planning challenges (e.g., airline-crew scheduling, social
network analysis). To make intelligent decisions, math is indispensable!

This course discusses the mathematics used in the analysis of data and the models used to make optimal decisions.
Methods include advanced linear algebra, graph theory, optimization, probability, and geometry. These are some
of the mathematical tools necessary for data classification, machine learning, clustering, and pattern recognition,
as well as for planning, scheduling, optimal allocation, and ranking.

This course is well suited to students who wish to learn the mathematical theory behind data science and decision-making algorithms and software.

References:

Unfortunately, there is no single undergraduate textbook that contains all the relevant mathematics (yet!).
I will share my notes with the class. Students who give me corrections will receive extra credit.

Most of my notes are based on the following sources, but you are NOT required to buy them!

1) Optimization Models, by G. Calafiore and L. El Ghaoui, Cambridge University Press, 2015.

2) Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms), by Lars Elden, SIAM.
This textbook has an official website (the author's web site), where you can find a lot of useful information (e.g., errata).

3) A Gentle Introduction to Optimization, by B. Guenin, J. Koenemann, and L. Tuncel, Cambridge University Press, 2015.

4) Who's #1? The Science of Rating and Ranking, by A. Langville and C.D. Meyer.

Here is the Syllabus (order may change)

Some Data Analysis and Decision Projects


Prerequisite and Expectations
Grading:
The grades will be calculated using the average and standard deviation of the class. 100 points are possible, which will be divided as follows:

Important rules will be followed:

SOFTWARE and other RESOURCES:

An introduction to ZIMPL (the language used to program SCIP) is available in the ZIMPL Manual. The best way to learn it is to follow the numerous examples provided in the text.

For MATLAB, please take a look at the following highly useful MATLAB primers and tutorials.