Mathematics Colloquia and Seminars


Special Colloquium (Data Science): Implicit bias of optimization algorithms and generalization of over-parameterized neural networks

Special Events

Speaker: Chao Ma, Stanford
Location: 1147 MSB
Start time: Wed, Jan 25 2023, 3:10PM

Modern neural networks are usually over-parameterized: the number of parameters exceeds the number of training samples. In this case the loss function tends to have many (or even infinitely many) global minima, which poses a challenge of minima selection for optimization algorithms, in addition to convergence. Specifically, when training a neural network, the algorithm not only has to find a global minimum, but also needs to select one that generalizes well from among many others. We study the mechanisms that facilitate global minima selection by optimization algorithms, as well as their connection with good generalization performance. First, with a linear stability theory, we show that stochastic gradient descent (SGD) favors global minima with a flat and uniform landscape. Then, we build a theoretical connection between flatness and generalization performance based on a special multiplicative structure of neural networks. Connecting the two results, we develop generalization bounds for neural networks trained by SGD. Our bounds take the optimization process into consideration. Furthermore, we study the behavior of optimization algorithms around the manifold of minima and reveal how the algorithms explore from one minimum to another.
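
As a rough illustration of the linear stability idea mentioned in the abstract, the sketch below runs plain (full-batch) gradient descent near a minimum of a one-dimensional quadratic loss. This is a simplified assumption and not the speaker's analysis: it ignores the stochastic noise of SGD and the uniformity condition, and the function name, curvature values, and learning rate are hypothetical choices made for the example. For f(x) = 0.5 * a * x^2 the update is x <- (1 - lr * a) * x, so the minimum is stable only when a <= 2 / lr, which is one way to see why a fixed learning rate can rule out sharp minima.

```python
def gd_distance_after(curvature, lr, steps=50, x0=1e-3):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2.

    Starts at a small perturbation x0 from the minimum at 0 and
    returns the distance |x| from the minimum after `steps` updates.
    """
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x  # gradient of f at x is curvature * x
    return abs(x)


lr = 0.1  # hypothetical learning rate; stability threshold is 2 / lr = 20

# Flat minimum (curvature below the threshold): the perturbation shrinks.
flat = gd_distance_after(curvature=1.0, lr=lr)

# Sharp minimum (curvature above the threshold): the perturbation grows.
sharp = gd_distance_after(curvature=30.0, lr=lr)

print("flat minimum, final distance: ", flat)
print("sharp minimum, final distance:", sharp)
```

With these values the iterate contracts toward the flat minimum and moves away from the sharp one, even though both are global minima of their respective losses.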

Reception in the MSB lobby starting at 2:30pm