1. Accelerated Proximal Gradient Descent

    Thu 25 April 2013

    In a previous post, I presented Proximal Gradient, a method for bypassing the \(O(1/\epsilon^2)\) convergence rate of Subgradient Descent. This method relied on assuming that the objective function could be expressed as the sum of two functions, \(g(x)\) and \(h(x)\), with …
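
    As a rough illustration of the composite setup this excerpt describes, here is a minimal Proximal Gradient sketch in Python. The names `grad_g` and `prox_h`, the fixed step size, and the soft-thresholding example are illustrative assumptions, not details taken from the post itself.

    ```python
    import numpy as np

    def proximal_gradient(grad_g, prox_h, x0, step_size, n_iters=100):
        # Minimize g(x) + h(x), where g is smooth and h has a cheap prox.
        x = x0
        for _ in range(n_iters):
            # Gradient step on the smooth part g, then prox step on h.
            x = prox_h(x - step_size * grad_g(x), step_size)
        return x

    # Example h: the L1 penalty lam * ||x||_1, whose prox is soft-thresholding
    # (the choice of h here is an assumption for illustration).
    def soft_threshold(v, t, lam=0.1):
        return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
    ```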

  2. Accelerated Gradient Descent

    Fri 12 April 2013

    In the mid-1980s, Yurii Nesterov hit the equivalent of an academic home run. At the same time, he established the Accelerated Gradient Method, proved that its convergence rate is superior to Gradient Descent's (\(O(1/\sqrt{\epsilon})\) iterations instead of \(O(1/\epsilon)\)), and then proved that no other first-order (that …
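
    For a sense of how the method achieves the rate quoted above, here is a minimal Python sketch of one common Nesterov-style variant; the momentum schedule \(t_{k+1} = (1 + \sqrt{1 + 4t_k^2})/2\) and the fixed step size are assumptions for illustration, not necessarily the exact scheme from the post.

    ```python
    import numpy as np

    def accelerated_gradient(grad_f, x0, step_size, n_iters=100):
        # Accelerated gradient descent with the Nesterov-style
        # momentum schedule t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2.
        x, y, t = x0, x0, 1.0
        for _ in range(n_iters):
            # Gradient step taken at the lookahead point y, not at x.
            x_next = y - step_size * grad_f(y)
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            # Extrapolation: push y past x_next along the direction of progress.
            y = x_next + ((t - 1.0) / t_next) * (x_next - x)
            x, t = x_next, t_next
        return x
    ```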