Mathematics Colloquia and Seminars


Learning Dynamics and Implicit Bias of Gradient Flow in Overparameterized Linear Models

Mathematics of Data & Decisions

Speaker: René Vidal, University of Pennsylvania
Location: 1025 PSEL
Start time: Tue, Apr 9 2024, 3:10PM

Contrary to the common belief that overparameterization may hurt generalization and optimization, recent work suggests that overparameterization may bias the optimization algorithm towards solutions that generalize well (a phenomenon known as implicit regularization or implicit bias) and may also accelerate convergence (a phenomenon known as implicit acceleration). This talk will provide a detailed analysis of the dynamics of gradient flow in overparameterized two-layer linear models, showing that the rate of convergence to equilibrium depends on the imbalance between input and output weights (a quantity that is conserved under gradient flow, and hence fixed at initialization) and on the margin of the initial solution. The talk will also provide an analysis of the implicit bias, showing that a large hidden-layer width, together with (properly scaled) random initialization, constrains the network parameters to converge to a solution close to the minimum-norm solution.
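The conservation law behind the "imbalance fixed at initialization" statement can be checked numerically. The sketch below (my own illustration, not code from the talk) trains a two-layer linear model f(x) = W2 W1 x by small-step gradient descent, which approximates gradient flow, and monitors the imbalance matrix W1 W1^T - W2^T W2; under exact gradient flow this matrix is constant, so with a small step size it should drift only slightly while the loss decreases. All variable names and the toy data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data from a linear teacher: y = 1.5 * x (1 feature, 20 samples).
X = rng.normal(size=(1, 20))
Y = 1.5 * X

# Two-layer linear model f(x) = W2 @ W1 @ x with hidden width h.
h = 8
W1 = 0.1 * rng.normal(size=(h, 1))   # input weights, small random init
W2 = 0.1 * rng.normal(size=(1, h))   # output weights, small random init

def loss(W1, W2):
    # Squared-error loss 0.5 * ||Y - W2 W1 X||^2.
    return 0.5 * np.sum((Y - W2 @ W1 @ X) ** 2)

def imbalance(W1, W2):
    # W1 W1^T - W2^T W2: conserved exactly under gradient flow.
    return W1 @ W1.T - W2.T @ W2

loss0 = loss(W1, W2)
D0 = imbalance(W1, W2)               # imbalance at initialization

eta = 1e-3                           # small step size approximates gradient flow
for _ in range(5000):
    E = Y - W2 @ W1 @ X              # residual
    gW1 = -W2.T @ E @ X.T            # dL/dW1
    gW2 = -E @ (W1 @ X).T            # dL/dW2
    W1 -= eta * gW1
    W2 -= eta * gW2

# Loss should have dropped sharply, while the imbalance matrix barely moves
# (its drift is only the O(eta^2) discretization error of the flow).
print("loss:", loss0, "->", loss(W1, W2))
print("max imbalance drift:", np.max(np.abs(imbalance(W1, W2) - D0)))
```

Initializing both layers at a small scale keeps the imbalance near zero, which is the "balanced" regime in which the product W2 W1 behaves well; the same experiment with a deliberately unbalanced initialization converges at a visibly different rate, which is the dependence the talk analyzes.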