Department of Mathematics Syllabus

This syllabus is advisory only. For details on a particular instructor's syllabus (including books), consult the instructor's course page. For a list of what courses are being taught each quarter, refer to the Courses page.

MAT 160: Mathematics for Data Analytics and Decision Making
Approved: 2017-03-14, Jesus De Loera

Units/Lecture:

4

Suggested Textbook: (actual textbook varies by instructor; check your instructor)
Optimization Models by G. Calafiore and L. El Ghaoui
Search by ISBN on Amazon: 978-1107050877

Prerequisites:

MAT 167

Course Description:

This course discusses mathematical models used in analytics and operations research. The basic models discussed serve as an introduction to the analysis of data and methods for optimal decision and planning. Mathematical methods and algorithms discussed include advanced linear algebra, convex and discrete optimization, and probability. These are some of the tools necessary for the data classification, machine learning, clustering and pattern recognition, and for problems in planning, resource allocation, scheduling, and ranking. The course should be useful to those students interested in operations research, business analytics, and data sciences. This class targets seniors or advanced juniors with knowledge of how to write proofs. Computer programming is required: A modeling language such as ZIMPL/SCIP will be used. MATLAB knowledge is assumed from the start.

Suggested Schedule:

Lecture(s)

Comments/Topics

Week 1

Lecture 1: Overview of the course. What is data analytics? What is operations research? Discuss at least two examples of applications (e.g., optimal assignment marriage stable problem and medical school applicants). Singular value decomposition model for classification (briefly recall SVD if they do not remember it). Quick review of MATLAB.
First homework project is assigned: SVD and classification of digits.

Lecture 2: Convex sets and convex functions, convex optimization (Section 8.1- 8.3 in main textbook). Introduction to ZIMPL/SCIP software.

Lecture 3: Optimality conditions of convex problems (section 8.4 main textbook). Continue introduction to ZIMPL/SCIP

Week 2

Lecture 4: Duality: Lagrange dual, Karush-Kuhn Tucker (section 8.5 main textbook). More examples on using ZIMPL/SCIP.

Lecture 5 and 6: Linear and quadratic optimization models (section 9.2-9.5 main textbook)

Week 3

Lecture 7: Discuss basics of supervised learning LASSO (section 13.2 main textbook).

Lecture 8 and 9: Discuss basics of binary classification (Support vector machines) (Section 13.3 main textbook).
General supervised learning problem, kernel methods.
Assign second homework project: Breast cancer classification data.

Week 4

Lecture 10: Basics of robust optimization, (sections 10.1, 10.3 main textbook).

Lecture 11: Semidefinite models (discuss matrix completion problem) (section 11.1 and 11.2 main textbook).

Lecture 12: More on semidefinite programming applications (Section 11.4 main textbook).

Week 5 (halfway)

Lecture 13, 14: Discuss basics of unsupervised learning models PCA, graphical models. Non-negative Matrix factorization. (section 13.5 main textbook).

Lecture 15: Clustering models: K-means and first mention of discrete models for partitioning, packing problems, Knapsack models.
Assign third homework project: K-means for Cancer classification

Weeks 6

Lecture 16: Ranking algorithms. Massey and Colley's methods (Chapters 1 to 3 of Langville-Meyer book).

Lecture 17-18: Discrete models: Shortest path problem, minimum cost perfect matching. (Sections 1.4-1.5 in Guenin et al book).

Lecture 18: More discrete models: Traveling salesman, ranking ordering models through binary integer optimization (this portion is in in Chapter 15 Langville-Meyer book and chapter 6 of Guenin et al.)

Weeks 7

Lecture 19: Quick survey of convex optimization algorithms. Smooth unconstrained minimization. (Sections 12.1-12.2 main textbook).

Lecture 20: Algorithms for convex constrained optimization. Gradient descent. (Sections 12.3 and 12.4 main textbook).

Lecture 21: Coordinate descent methods. Brief mention of interior point methods (Section 12.5 main textbook).

Assign fourth homework project: Text feature extraction via LASSO and L1LR models

Weeks 8

Lecture 22: Integer Discrete models versus continuous models (see Guenin et al book in chapters 6,7 and appendix for this material.)

Lecture 23: Cutting planes and the simplex method. Branch and bound. (Guenin et al chapter 6).

Lecture 24: Discuss non-convex optimization problems. Karush-Kuhn Tucker for non-convex problems. (Guenin et al. chapter 7)

Weeks 9

Lecture 25: More Discrete models, polynomial versus exponential algorithms. NP-hardness. (appendix Guenin et al book).

Assign Fifth homework project: Facility location models.

Lectures 26-27: Business finance applications, portfolio optimization (section 14.1-14.2 main textbook).

Hand students the Final Project description by the end of this week (project due on day of the final exam).

Weeks 10

Lectures 28-30: These three extra days can be used to catch up, or for holding in class midterm(s), or to do course review, e.g., before an in class final exam, if the instructor chooses to have one instead of final project.

Additional Notes:

Other Texts: Who is number 1? The science of rating and ranking, by A. Langville and C Meyer, Princeton Univ. press. A gentle introduction to optimization, by Guenin, Koenemann, and Tuncel, Cambridge 2015. Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms), by Lars Elden, Published by SIAM. Note that this textbook has its official website: author's web site. There, you can find a lot of useful information and projects (e.g., errata).

Learning Goals:

Students will become familiar with key models used to extract information from data and the basic optimization models for making decisions. Students will also become familiar with the basic algorithms in linear algebra, convex optimization, and discrete optimization necessary to solve models from input data. Students will have opportunities to write mathematical models and compare them. Students will learn to recognize the different levels of difficulty of models (e.g. convex vs non-convex, continuous vs. combinatorial) and recognize the wide range of algorithms and heuristics available to solve them. Students will be able to model various applications through mathematical abstraction. Students will have opportunity to go through the process of manipulating real data. They will put to use and test many of the skills learned in other courses of the major in analytics and operation research, e.g., math 167 and math 168.

Assessment:

Because this course is tied to practical modeling, it is strongly recommended the grade be heavily based on projects associated to the material. Each project should include a main task, requiring to use some data and modeling, then followed by some theoretical analysis. Projects will make extensive use of software, such as ZIMPL/SCIP, and or MATLAB.