1. # The Big Table of Convergence Rates

Fri 01 August 2014

In the past 50+ years of convex optimization research, a great many algorithms have been developed, each with slight nuances to their assumptions, implementations, and guarantees. In this article, I'll give a shorthand comparison of these methods in terms of the number of iterations required to reach a desired accuracy …

Sat 26 July 2014

At first blush, ADMM and Proximal Gradient Descent (ProxGrad) appear to have very little in common. The convergence analyses for these two methods are unrelated, and the former operates on an Augmented Lagrangian while the latter directly minimizes the primal objective. In this post, we'll show that after a slight …

Sun 20 July 2014

When I originally wrote about the Alternating Direction Method of Multipliers algorithm, the community's understanding of its convergence properties was light to say the least. While it has long been known (See Boyd's excellent article, Appendix A) that ADMM will converge, it is only recently that the community has begun …

4. # Frank-Wolfe Algorithm

Sat 04 May 2013

In this post, we'll take a look at the Frank-Wolfe Algorithm also known as the Conditional Gradient Method, an algorithm particularly suited for solving problems with compact domains. Like the Proximal Gradient and Accelerated Proximal Gradient algorithms, Frank-Wolfe requires we exploit problem structure to quickly solve a mini-optimization problem. Our …

5. # Variational Inference

Sun 28 April 2013

Variational Inference and Monte Carlo Sampling are currently the two chief ways of doing approximate Bayesian inference. In the Bayesian setting, we typically have some observed variables $$x$$ and unobserved variables $$z$$, and our goal is to calculate $$P(z|x)$$. In all but the simplest cases, calculating \(P(z …