Mathematics Colloquia and Seminars

Return to Colloquia & Seminar listing

Breaking the Sample Size Barrier in Reinforcement Learning

Mathematics of Data & Decisions

Speaker: Yuxin Chen, University of Pennsylvania
Location: Zoom
Start time: Tue, Apr 18 2023, 12:10PM

Emerging reinforcement learning (RL) applications necessitate the design of sample-efficient solutions to accommodate the explosive growth of problem dimensionality. Despite empirical successes, however, our understanding of the statistical limits of RL remains highly incomplete. In this talk, I will present recent progress towards settling the sample complexity in three RL scenarios. The first concerns RL in the presence of a simulator, where we demonstrate the minimax optimality of the model-based RL approach (a.k.a. the plug-in approach), without suffering from the sample size barrier present in all past works. The second part studies offline RL, which learns from pre-collected data and must accommodate distribution shift and limited data coverage. We prove that model-based offline RL achieves minimax-optimal sample complexity without any burn-in cost. The insights from offline RL further motivate optimal algorithm design in online RL with reward-agnostic exploration, a scenario where the learner is unaware of the reward functions during the exploration stage. (See https://arxiv.org/abs/2005.12900, https://arxiv.org/abs/2204.05275, and https://yuxinchen2020.github.io/publications/Reward-free-exploration.pdf for more details.)
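For context, the flavor of guarantee behind the first scenario can be sketched as follows (this formula is not stated in the abstract; it reflects the known minimax rate for generative-model RL, as established in work along the lines of the first linked paper). For an infinite-horizon γ-discounted MDP with state space 𝒮 and action space 𝒜, learning an ε-optimal policy from a simulator requires, up to logarithmic factors, a number of samples on the order of

```latex
% Minimax sample complexity (up to log factors) for learning an
% \varepsilon-optimal policy in a \gamma-discounted MDP with a
% generative model; sketched here for context, not quoted from the talk.
\widetilde{O}\!\left(\frac{|\mathcal{S}|\,|\mathcal{A}|}{(1-\gamma)^{3}\,\varepsilon^{2}}\right)
```

The "sample size barrier" refers to prior analyses attaining this rate only once the sample size exceeded a certain threshold; the result presented here removes that restriction, covering the full range of target accuracy levels ε.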

This is based on joint work with Gen Li, Laixi Shi, Yuling Yan, Yuejie Chi, Jianqing Fan, and Yuting Wei.