Gradient Support Pursuit (GraSP)

 

This algorithm can be used as an approximate solver for sparsity constrained optimization problems. The algorithm generalizes the CoSaMP algorithm used in Compressed Sensing, and can handle cost functions that are non-quadratic. Note that our implementation of GraSP is NOT optimized for speed or memory efficieny. Please contact me via email in case you encounter any bugs in the code. Download

Pseudo-Code of the Algorithm
\(k=0\)
\(\mathbf{x}^{\left(k\right)}=\mathbf{0}\)
Repeat
\(\mathbf{z}^{\left(k\right)}=\nabla f\left(\mathbf{x}^{\left(k\right)}\right)\)
\(\mathcal{T}_k=\text{supp}\left(\mathbf{x}^{\left(k\right)}\right)\cup\text{supp}\left(\mathbf{z}_{2s}^{\left(k\right)}\right)\)
(*) \(\mathbf{b}^{\left(k\right)}=\arg\min_{\mathbf{x}}\ f\left(\mathbf{x}\right)\quad\text{s.t. } \left.\mathbf{x}\right\vert_{\mathcal{T}_k^\mathrm{c}}=\mathbf{0}\)
\(\mathbf{x}^{\left(k+1\right)}=\mathbf{b}^{\left(k\right)}_s\)
\(k\longleftarrow k+1\)
Until halting condition holds
Return \(\mathbf{x}^\left(k\right)\)
Algorithm Modes

In addition to the original form of the algorithm described by the pseudo-code, we also implemented two other variants of the algorithm that substitute the inner optimization step by computationally simpler steps. The algorithm is implemented with the following modes.

  1. Restricted Minimization: In this mode the inner optimization is performed as described by the pseudo-code. This mode has the highest computational cost per iteration, though it is more stable than the other two methods. For the inner optimization step performed in this mode (i.e., (*) in the pseudo-code) we rely on Mark Schmidt's convex optimization packages, minFunc, and minConf. Make sure that these packages are accessible to GraSP by including their directories in the MATLAB path. Supplying a separate Hessian-vector multiplication function, or using a completely Hessian-free optimization technique to solve (*) can improve the speed in this mode.
  2. Restricted Gradient Descent: In this mode, the inner optimization is simply replaced by a gradient descent step for the restriction of the cost function to the coordinates in T. It is easy to see that in this mode the algorithm would be equivalent to a gradient descent with hard thresholding. This mode has the lowest computational complexity per iteration.
  3. Restricted Newton's Method: In this mode in each iteration of the algorithm the crude estimate b is obtained by taking a newton step for the restriction of the cost function to the coordinates in T, rather than minimizing the cost function over the vectors whose support is a subset of T. Computational cost per iteration in this mode is less than that in the full mode.
Bounded Iterates / \(\ell_2\)-regularized Iterate

Generally for non-quadratic cost functions one needs to explicitly bound the iterates or augment the cost function by an \(\ell_2\)-regularization term. The implementation of GraSP incorporates these two formulations as well.

How to Use GraSP in MATLAB?

To run the algorithm you can call

xhat = GraSP(F,s,n,options);
in MATLAB. The input arguments are described below:

F : This is a function handle that allows GraSP to access the cost function and its derivatives. Note that in general the function F should take two inputs and return three outputs as in [f, g, H] = F(x,T). In a nutshell, it will return the value (f), the gradient (g), and the Hessian (H) of the cost function at x when the cost function is restricted to the set of coordinates enumerated by T.  In the “full” and “gradient” modes of the GraSP algorithm, supplying the Hessian is optional as it is not required by the algorithm. However, in most cases having access to the Hessian improves the convergence speed of the inner (convex) optiomization step in the “full” mode.

s : A positive integer that determines the desired sparsity level.

n : The dimensionalty of the solution.

options : This is an optional structure with the following fields that can be used to configure GraSP.

.HvFunc :  This is a function handle for a user-supplied Hessian-vector multiplication that can be used by solver of (*). In our implementation, for a cost function \(g\left(\cdot\right)\), HvFunc(v,x) is supposed to evaluate the Hessian-vector poduct \(\nabla^2 g\left(\mathbf{x}\right)\mathbf{v}\). This field is optional.

.maxIter : The maximum number of iterations for GraSP. The default value is 100.

.tolF : GraSP will halt if the value of the cost function at the current iterate is less than tolF. The default value is 1e-3.

.tolG : GraSP will halt if restriction of the gradient to its \(3s\)-largest entries has an \(\ell_2\)-norm less than tolG. The default value is 1e-3.

.mu : This parameter allows \(\ell_2\)-regularization of the cost function. If mu is positive it determines the radius of the sphere of the feasible points in the constrained form. If mu is less than or equal to zero, it determines the penalty coefficient in the penalized form. The default value for this argument is zero.

.Method : This agument takes values 'F', 'G', or 'H' to select the modes 1,2, or 3 of the algorithm described above. The default value is 'F'.

.eta : If GraSP runs in the Restricted Gradient or Hessian modes (i.e., .Mode is set to 'G' or 'H'), then the associated step-size is determined by the positive number eta. The default value is one.

.debias : A logical variable that indicates whether or not the debiasing is performed over the estimated support set at the end of each iteration. The default value is false.