Mathematics Colloquia and Seminars

Stochastic Algorithms for Large-Scale Machine Learning Problems

Special Events

Speaker: Shiqian Ma, The Chinese Univ. of Hong Kong
Location: 2112 MSB
Start time: Mon, Jan 30 2017, 5:10PM

The stochastic gradient descent (SGD) method and its variants are the main approaches for solving machine learning problems that involve large-scale training datasets. This talk addresses two issues in SGD. (i) The first is how to choose the step size while the algorithm is running. Since the traditional line search technique does not apply to stochastic optimization algorithms, the common practice in SGD is either to use a diminishing step size or to tune a fixed step size by hand, which can be time consuming in practice. We propose to use the Barzilai-Borwein (BB) method to automatically compute step sizes for SGD and its variant, the stochastic variance reduced gradient (SVRG) method, which leads to two algorithms: SGD-BB and SVRG-BB. We prove that SVRG-BB converges linearly for strongly convex objective functions. Numerical results on standard machine learning problems are reported to demonstrate the advantages of our methods.
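
For context, the classical Barzilai-Borwein step size is eta_k = s^T s / (s^T y), with s = x_k - x_{k-1} and y = grad f(x_k) - grad f(x_{k-1}). Below is a minimal Python sketch of how such a step size could be plugged into SVRG at the epoch level, computed from successive snapshots and their full gradients; the 1/m scaling and the first-epoch fallback step size eta0 are illustrative assumptions, not necessarily the exact SVRG-BB rule from the talk.

```python
import numpy as np

def svrg_bb(grad_i, full_grad, x0, n, m, eta0, epochs, rng=None):
    """SVRG with a Barzilai-Borwein (BB) step size, recomputed once per epoch.

    grad_i(x, i) : stochastic gradient of the i-th component function at x
    full_grad(x) : full gradient (average over all n components) at x
    x0           : initial iterate (NumPy array)
    m            : number of inner SVRG updates per epoch
    eta0         : step size for the first epoch (BB needs two snapshots)
    """
    rng = np.random.default_rng() if rng is None else rng
    x_tilde = x0.copy()
    g_tilde = full_grad(x_tilde)
    prev_x, prev_g = None, None
    eta = eta0

    for _ in range(epochs):
        if prev_x is not None:
            # BB1-style step size from successive snapshots and their full
            # gradients, scaled by 1/m (this scaling is an assumption here).
            s = x_tilde - prev_x
            y = g_tilde - prev_g
            eta = np.dot(s, s) / (m * abs(np.dot(s, y)) + 1e-12)

        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Standard SVRG variance-reduced gradient estimate.
            v = grad_i(x, i) - grad_i(x_tilde, i) + g_tilde
            x = x - eta * v

        prev_x, prev_g = x_tilde, g_tilde
        x_tilde = x                      # take the last inner iterate as the new snapshot
        g_tilde = full_grad(x_tilde)

    return x_tilde
```

The first epoch uses a user-supplied eta0 because the BB formula requires two snapshots before it can produce a step size.
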
(ii) The second issue is how to incorporate second-order information into SGD. We propose a stochastic quasi-Newton method for solving nonconvex learning problems; note that existing stochastic quasi-Newton methods can only handle convex problems. Convergence and complexity results for our method are established. Numerical results on classification problems using SVMs and neural networks are reported.
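
The abstract does not spell out the update for part (ii), so the following is only a generic mini-batch L-BFGS-style sketch under the assumptions stated in the comments (curvature pairs built from the same mini-batch and a curvature-skipping safeguard); it is not necessarily the method presented in the talk.

```python
import numpy as np
from collections import deque

def two_loop(grad, pairs):
    """L-BFGS two-loop recursion: returns an approximation of H_k * grad."""
    q = grad.copy()
    alphas = []
    for s, y in reversed(pairs):            # newest curvature pair first
        a = np.dot(s, q) / np.dot(y, s)
        q -= a * y
        alphas.append(a)
    if pairs:                               # initial scaling H_0 = gamma * I
        s, y = pairs[-1]
        q *= np.dot(s, y) / np.dot(y, y)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest pair first
        b = np.dot(y, q) / np.dot(y, s)
        q += (a - b) * s
    return q

def stochastic_lbfgs(grad_batch, sample, x0, eta, iters, memory=10, eps=1e-4):
    """grad_batch(x, idx): mini-batch gradient at x; sample(): draws mini-batch indices."""
    x = x0.copy()
    pairs = deque(maxlen=memory)
    for _ in range(iters):
        idx = sample()
        g = grad_batch(x, idx)
        x_new = x - eta * two_loop(g, pairs)
        # Build the curvature pair with the SAME mini-batch at both points,
        # a common trick that reduces the noise in y.
        s = x_new - x
        y = grad_batch(x_new, idx) - g
        # Skip pairs violating a curvature condition; on nonconvex problems this
        # keeps the implicit Hessian approximation positive definite.
        if np.dot(s, y) >= eps * np.dot(s, s):
            pairs.append((s, y))
        x = x_new
    return x
```
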