Random coordinate descent

The randomized (block) coordinate descent method is an optimization algorithm popularized by Nesterov (2010) and Richtárik and Takáč (2011). Nesterov (2010) gave the first analysis of the method for the problem of minimizing a smooth convex function.^[1] In that analysis, the method must be applied to a quadratic perturbation of the original function with an unknown scaling factor. Richtárik and Takáč (2011) give iteration complexity bounds that do not require this perturbation, so the method is applied directly to the objective function. They also extend the framework to the problem of minimizing a composite function, namely the sum of a smooth convex function $f$ and a (possibly nonsmooth) convex block-separable function $\Psi$:

$F(x)=f(x)+\Psi (x),$

where $\Psi (x)=\sum _{i=1}^{n}\Psi _{i}(x^{(i)})$, the vector $x\in \mathbb{R}^{N}$ is decomposed into $n$ blocks of variables (coordinates) $x=(x^{(1)},\dots ,x^{(n)})$, and $\Psi _{1},\dots ,\Psi _{n}$ are (simple) convex functions.
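To make the composite objective concrete, the following is a minimal sketch of a randomized coordinate update, not the algorithm exactly as stated by the cited authors. It assumes a specific instance chosen here for illustration: $f(x)=\tfrac{1}{2}\|Ax-b\|_{2}^{2}$, $\Psi (x)=\lambda \|x\|_{1}$, blocks of size one, uniform sampling of the coordinate, and a step size given by the coordinate-wise curvature. The names `soft_threshold`, `objective`, and `random_coordinate_descent`, and the data `A`, `b`, `lam`, are hypothetical.

```python
import random

def soft_threshold(v, t):
    """Minimizer over z of 0.5*(z - v)**2 + t*|z|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def objective(A, b, x, lam):
    """F(x) = 0.5*||Ax - b||^2 + lam*||x||_1 (assumed example instance)."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(xj) for xj in x)

def random_coordinate_descent(A, b, lam, iters=500, seed=0):
    rng = random.Random(seed)
    m, N = len(A), len(A[0])
    x = [0.0] * N
    r = [-bi for bi in b]  # residual Ax - b at the starting point x = 0
    # Coordinate-wise curvature of f: squared norm of each column of A.
    L = [sum(A[k][j] ** 2 for k in range(m)) for j in range(N)]
    for _ in range(iters):
        j = rng.randrange(N)  # pick one coordinate uniformly at random
        if L[j] == 0.0:
            continue  # f does not depend on this coordinate
        g = sum(A[k][j] * r[k] for k in range(m))  # partial derivative of f
        new = soft_threshold(x[j] - g / L[j], lam / L[j])
        delta = new - x[j]
        for k in range(m):  # keep the residual consistent with the new x
            r[k] += A[k][j] * delta
        x[j] = new
    return x

A = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
b = [1.0, 2.0, 3.0, 4.0]
lam = 0.1
x = random_coordinate_descent(A, b, lam)
print(objective(A, b, [0.0, 0.0, 0.0], lam), objective(A, b, x, lam))
```

Because $\Psi$ is separable, each update touches only one coordinate of $x$ and one term of $\Psi$. In this instance the update minimizes $F$ exactly along the chosen coordinate, so the objective never increases.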

Example (block decomposition): If $x=(x_{1},x_{2},\dots ,x_{5})\in \mathbb{R}^{5}$ and $n=3$, one may choose $x^{(1)}=(x_{1},x_{3})$, $x^{(2)}=(x_{2},x_{5})$, and $x^{(3)}=x_{4}$.
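The decomposition in this example can be sketched in code as a list of index sets. This is only an illustration; the helper names `split_into_blocks` and `is_partition` and the sample values of `x` are hypothetical, and indices are 0-based, so $x_{1}$ corresponds to `x[0]`.

```python
# Block decomposition from the example: N = 5 coordinates, n = 3 blocks.
blocks = [
    [0, 2],  # x^(1) = (x_1, x_3)
    [1, 4],  # x^(2) = (x_2, x_5)
    [3],     # x^(3) = x_4
]

def split_into_blocks(x, blocks):
    """Return the list of sub-vectors x^(1), ..., x^(n)."""
    return [[x[j] for j in idx] for idx in blocks]

def is_partition(blocks, N):
    """Check that every coordinate 0..N-1 belongs to exactly one block."""
    flat = sorted(j for idx in blocks for j in idx)
    return flat == list(range(N))

x = [10.0, 20.0, 30.0, 40.0, 50.0]  # made-up sample vector
parts = split_into_blocks(x, blocks)
print(parts)  # [[10.0, 30.0], [20.0, 50.0], [40.0]]
```

The blocks need not consist of consecutive coordinates; all that matters is that each coordinate appears in exactly one block, which `is_partition` checks.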

Example (block-separable regularizers):

1. $n=N$; $\Psi (x)=\|x\|_{1}=\sum _{i=1}^{n}|x_{i}|$
2. $N=N_{1}+N_{2}+\dots +N_{n}$; $\Psi (x)=\sum _{i=1}^{n}\|x^{(i)}\|_{2}$, where $x^{(i)}\in \mathbb{R}^{N_{i}}$ and $\|\cdot \|_{2}$ is the standard Euclidean norm.
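Both regularizers can be evaluated block by block, which is what block-separability means in practice. The sketch below assumes blocks are given as lists of 0-based indices; the function names, the sample vector `x`, and the block sizes are hypothetical choices for illustration.

```python
import math

def l1_regularizer(x):
    """Psi(x) = ||x||_1, i.e. n = N blocks of size one."""
    return sum(abs(xi) for xi in x)

def group_regularizer(x, blocks):
    """Psi(x) = sum over blocks of the Euclidean norm of x^(i)."""
    return sum(math.sqrt(sum(x[j] ** 2 for j in idx)) for idx in blocks)

x = [3.0, -4.0, 0.0, 12.0, 5.0]  # made-up sample vector, N = 5
blocks = [[0, 1], [2], [3, 4]]   # assumed sizes N_1 = 2, N_2 = 1, N_3 = 2
singletons = [[j] for j in range(len(x))]

print(l1_regularizer(x))                # 24.0
print(group_regularizer(x, blocks))     # 5.0 + 0.0 + 13.0 = 18.0
print(group_regularizer(x, singletons)) # 24.0, matches the l1 norm
```

With singleton blocks the second regularizer reduces to the first, since the Euclidean norm of a one-dimensional block is the absolute value of its single entry.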

[1] Nesterov, Yurii (2010), "Efficiency of coordinate descent methods on huge-scale optimization problems", SIAM Journal on Optimization, 22 (2): 341–362, CiteSeerX 10.1.1.332.3336, doi:10.1137/100802001
