Randomized coordinate descent methods with arbitrary samplingAlgebra & Discrete Mathematics
|Peter Richtarik, University of Edinbrough
|Mon, Jun 1 2015, 4:00PM
Randomized coordinate descent methods with arbitrary sampling are optimization algorithms which at every iteration update a random subset of coordinates (i.i.d. throughout the iterations), with the distribution of this random set-valued mapping allowed to be arbitrary. It turns out that methods of this type work as long as every coordinate has a positive probability of being chosen (and hence updated) by the sampling. This is clearly a necessary condition if we want the method to converge from any starting point. However, it turns out it is also sufficient. Naturally, certain characteristics of the distribution of the random set-valued mapping (or "sampling", for simplicity) manifest itself in the complexity bound. This opens the possibility to design samplings optimizing the complexity bound. If we restrict our attention to the case of samplings picking a single coordinate at a time, the optimal distribution is known as importance sampling. Usually, the difference between the uniform and importance sampling in terms of complexity is in the replacement of the maximum of certain problem-dependent quantities in the bound by their average. The above general setup opens the possibility to efficiently solve optimization problems arising in applications where it is more natural to update structured subsets of variables (e.g., overlapping blocks) and in situations where the sampling is implicitly defined by the computing environment (e.g., faulty processors). I will talk about the first three methods of this type developed in the literature: NSync (unconstrained minimization of a smooth strongly convex function), Quartz (a primal-dual method regularized empirical loss minimization) and ALPHA (minimization of the sum of a smooth convex function and a separable regularizer).