Gradient Projection for Sparse Reconstruction:
Application to Compressed Sensing and Other
Inverse Problems
Mário A. T. Figueiredo, Robert D. Nowak, Stephen J. Wright
Abstract—Many problems in signal processing and statistical
inference involve finding sparse solutions to under-determined,
or ill-conditioned, linear systems of equations. A standard
approach consists in minimizing an objective function which
includes a quadratic (squared ℓ₂) error term combined with
a sparseness-inducing (ℓ₁) regularization term. Basis pursuit, the
least absolute shrinkage and selection operator (LASSO), wavelet-
based deconvolution, and compressed sensing are a few well-
known examples of this approach. This paper proposes gradient
projection (GP) algorithms for the bound-constrained quadratic
programming (BCQP) formulation of these problems. We test
variants of this approach that select the line search parameters
in different ways, including techniques based on the Barzilai-
Borwein method. Computational experiments show that these GP
approaches perform well in a wide range of applications, often
being significantly faster (in terms of computation time) than
competing methods. Although the performance of GP methods
tends to degrade as the regularization term is de-emphasized,
we show how they can be embedded in a continuation scheme
to recover their efficient practical performance.
I. INTRODUCTION
A. Background
There has been considerable interest in solving the convex
unconstrained optimization problem
minₓ ½ ‖y − Ax‖₂² + τ ‖x‖₁,        (1)
where x ∈ Rⁿ, y ∈ Rᵏ, A is a k × n matrix, τ is a
nonnegative parameter, ‖v‖₂ denotes the Euclidean norm of
v, and ‖v‖₁ = Σᵢ |vᵢ| is the ℓ₁ norm of v. Problems of the
form (1) have become familiar over the past three decades,
particularly in statistical and signal processing contexts. From
a Bayesian perspective, (1) can be seen as a maximum a
posteriori criterion for estimating x from observations
y = Ax + n, (2)
where n is white Gaussian noise of variance σ², and the prior
on x is Laplacian (that is, log p(x) = −λ‖x‖₁ + K) [1],
[25], [54]. Problem (1) can also be viewed as a regularization
technique to overcome the ill-conditioned, or even singular,
nature of matrix A, when trying to infer x from noiseless
observations y = Ax or from noisy observations as in (2).

M. Figueiredo is with the Instituto de Telecomunicações and Department
of Electrical and Computer Engineering, Instituto Superior Técnico, 1049-001
Lisboa, Portugal. R. Nowak is with the Department of Electrical and
Computer Engineering, University of Wisconsin, Madison, WI 53706, USA.
S. Wright is with the Department of Computer Sciences, University of Wisconsin,
Madison, WI 53706, USA.
This work was partially supported by NSF grants CCF-0430504 and CNS-0540147,
and by Fundação para a Ciência e Tecnologia, POSC/FEDER, grant
POSC/EEA-CPS/61271/2004.
The presence of the ℓ₁ term encourages small components
of x to become exactly zero, thus promoting sparse solutions
[11], [54]. Because of this feature, (1) has been used for
more than three decades in several signal processing problems
where sparseness is sought; some early references are [12],
[37], [50], [53]. In the 1990s, seminal work on the use of
ℓ₁ sparseness-inducing penalties/log-priors appeared in the
literature: the now famous basis pursuit denoising (BPDN,
[11, Section 5]) criterion and the least absolute shrinkage and
selection operator (LASSO, [54]). For brief historical accounts
on the use of the ℓ₁ penalty in statistics and signal processing,
see [41], [55].
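To see concretely how the ℓ₁ term zeroes out small components, note that when A has orthonormal columns the minimization in (1) separates coordinate-wise and reduces to soft thresholding. The following NumPy sketch is purely illustrative (it is our own, not code from any of the packages discussed below, and the function name is ours):

import numpy as np

def soft_threshold(v, tau):
    # Componentwise solution of min_x 0.5*(x - v)**2 + tau*|x|:
    # entries with |v_i| <= tau are set exactly to zero.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

v = np.array([3.0, 0.4, -1.2, 0.05, -0.3])
print(soft_threshold(v, 0.5))   # small entries become exactly zero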
Problem (1) is closely related to the following convex
constrained optimization problems:
minₓ ‖x‖₁   subject to   ‖y − Ax‖₂² ≤ ε        (3)

and

minₓ ‖y − Ax‖₂²   subject to   ‖x‖₁ ≤ t,        (4)
where ε and t are nonnegative real parameters. Problem (3) is
a quadratically constrained linear program (QCLP) whereas
(4) is a quadratic program (QP). Convex analysis can be used
to show that a solution of (3) (for any ε such that this problem
is feasible) is either x = 0, or else is a minimizer of (1), for
some τ > 0. Similarly, a solution of (4) for any t ≥ 0 is also a
minimizer of (1) for some τ ≥ 0. These claims can be proved
using [49, Theorem 27.4].
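As an informal sketch of why a solution of (4) also minimizes (1) for some τ ≥ 0 (the rigorous argument is the cited convex-analysis result): if x* solves (4) with multiplier λ ≥ 0 for the constraint ‖x‖₁ ≤ t, then x* minimizes the Lagrangian ‖y − Ax‖₂² + λ(‖x‖₁ − t); dropping the constant −λt and dividing by 2 leaves exactly the objective in (1) with τ = λ/2.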
The LASSO approach to regression has the form (4), while
the basis pursuit criterion [11, (3.1)] has the form (3) with
ε = 0, i.e., a linear program (LP)
minₓ ‖x‖₁   subject to   y = Ax.        (5)
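For readers who want to experiment with (5) directly, it can be turned into a standard-form LP by splitting x = u − v with u, v ≥ 0, so that ‖x‖₁ = 1ᵀ(u + v) at a solution. The sketch below is purely illustrative and is not one of the methods studied in this paper; it hands the reformulated LP to SciPy's general-purpose linprog solver on a tiny random instance (all sizes and names are our own choices):

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
k, n = 10, 30
A = rng.standard_normal((k, n))
x_true = np.zeros(n)
x_true[[2, 7, 15]] = [1.5, -2.0, 0.7]        # a sparse signal
y = A @ x_true                                # noiseless observations

# Basis pursuit (5): min ||x||_1 s.t. Ax = y, with x = u - v, u, v >= 0.
c = np.ones(2 * n)                            # objective: sum(u) + sum(v)
A_eq = np.hstack([A, -A])                     # A u - A v = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print(np.round(x_hat, 3))                     # should match x_true on this small instance

Generic LP solvers like this one do not scale to the problem sizes considered later, which is part of the motivation for the specialized methods discussed in the remainder of the paper.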
Problem (1) also arises in wavelet-based image/signal reconstruction
and restoration (namely deconvolution); in those
problems, matrix A has the form A = RW, where R is (a matrix
representation of) the observation operator (for example,
convolution with a blur kernel or a tomographic projection),
W contains a wavelet basis or a redundant dictionary (that
is, multiplying by W corresponds to performing an inverse
wavelet transform), and x is the vector of representation
coefficients of the unknown image/signal [24], [25], [26].
We mention also image restoration problems under total
variation (TV) regularization [10], [47]. In the one-
dimensional (1D) case, a change of variables leads to the
formulation (1). In 2D, however, the techniques of this paper
cannot be applied directly.
Another intriguing new application for the optimization
problems above is compressed sensing¹ (CS) [6], [7], [8],
[9], [18]. Recent results show that a relatively small number
of random projections of a sparse signal can contain
most of its salient information. It follows that if a signal is
sparse or approximately sparse in some orthonormal basis,
then an accurate reconstruction can be obtained from random
projections, which suggests a potentially powerful alternative
to conventional Shannon-Nyquist sampling. In the noiseless
setting, accurate approximations can be obtained by finding
a sparse signal that matches the random projections of the
original signal. This problem can be cast as (5), where again
matrix A has the form A = RW, but in this case R represents
a low-rank randomized sensing matrix (e.g., a k × d matrix
of independent realizations of a random variable), while the
columns of W contain the basis over which the signal has
a sparse representation (e.g., a wavelet basis). Problem (1)
is a robust version of this reconstruction process, which is
resilient to errors and noisy data, and similar criteria have
been proposed and analyzed in [8], [32].
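As a concrete toy version of the CS setup just described (our own illustration, with arbitrary sizes; an orthonormal DCT basis stands in for W for simplicity), one can form A = RW and generate noisy random projections as in (2):

import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(1)
d, k = 256, 64                                   # signal length, number of projections

W = idct(np.eye(d), norm="ortho", axis=0)        # columns: orthonormal basis vectors
R = rng.standard_normal((k, d)) / np.sqrt(k)     # random sensing matrix
A = R @ W

x = np.zeros(d)
x[rng.choice(d, size=8, replace=False)] = rng.standard_normal(8)   # sparse coefficients
y = A @ x + 0.01 * rng.standard_normal(k)        # noisy projections, as in (2)

Recovering x from y then amounts to solving (1) with this A, or (5) in the noiseless case.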
B. Previous Algorithms
Several optimization algorithms and codes have been pro-
posed to solve the QCLP (3), the QP (4), the LP (5), and
the unconstrained (but nonsmooth) formulation (1). We review
that work here, identifying those contributions that are most
suitable for signal processing applications, which are the target
of this paper.
In the class of applications that motivates this paper, the
matrix A cannot be stored explicitly, and it is costly and
impractical to access significant portions of A and AᵀA. In
wavelet-based image reconstruction and some CS problems,
for which A = RW, explicit storage of A, R, or W is not
practical for problems of interesting scale. However, matrix-
vector products involving R and W can be done quite effi-
ciently. For example, if the columns of W contain a wavelet
basis, then any multiplication of the form Wv or Wᵀv can
be performed by a fast wavelet transform (see Section III-G,
for details). Similarly, if R represents a convolution, then
multiplications of the form Rv or Rᵀv can be performed
with the help of the fast Fourier transform (FFT) algorithm.
In some CS applications, if the dimension of y is not too large,
R can be explicitly stored; however, A is still not available
explicitly, because the large and dense nature of W makes it
highly impractical to compute and store RW.
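To make this matrix-free viewpoint concrete, the sketch below (our own illustration, not code from the paper or the packages cited) implements the products by A = RW and Aᵀ = WᵀRᵀ as functions, with R a circular convolution evaluated via the FFT and W an orthonormal transform (a DCT stands in for a wavelet basis); the full matrix A is never formed:

import numpy as np
from scipy.fft import dct, idct

n = 4096
h = np.exp(-0.5 * ((np.arange(n) - n // 2) ** 2) / 9.0)   # Gaussian blur kernel
h /= h.sum()
H = np.fft.fft(np.roll(h, -n // 2))                        # its frequency response

def W(v):                       # synthesis: coefficients -> signal
    return idct(v, norm="ortho")

def Wt(v):                      # analysis: signal -> coefficients (adjoint of W)
    return dct(v, norm="ortho")

def R(v):                       # circular convolution with the blur kernel
    return np.real(np.fft.ifft(H * np.fft.fft(v)))

def Rt(v):                      # adjoint: multiply by conj(H) in the Fourier domain
    return np.real(np.fft.ifft(np.conj(H) * np.fft.fft(v)))

def A(x):                       # A x   = R W x
    return R(W(x))

def At(r):                      # A^T r = W^T R^T r
    return Wt(Rt(r))

Each product costs O(n log n) operations, whereas forming A explicitly would require storing a dense n × n matrix.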
Homotopy algorithms that find the full path of solutions,
for all nonnegative values of the scalar parameters in the
various formulations (τ in (1), ε in (3), and t in (4)), have
been proposed in [22], [39], [46], and [57]. The formulation
(4) is addressed in [46], while [57] addresses (1) and (4).
¹ A comprehensive repository of CS literature and software can be found at
www.dsp.ece.rice.edu/cs/.
The method in [39] provides the solution path for (1), for
a range of values of τ. The least angle regression (LARS)
procedure described in [22] can be adapted to solve the
LASSO formulation (4). These are all essentially homotopy
methods that perform pivoting operations involving submatrices
of A or AᵀA at certain critical values of the corresponding
parameter (τ, t, or ε). These methods can be
implemented so that only the submatrix of A corresponding
to nonzero components of the current vector x need be known
explicitly, so that if x has few nonzeros, these methods may
be competitive even for problems of very large scale. (See
for example the SolveLasso function in the SparseLab
toolbox, available from sparselab.stanford.edu.) In
some signal processing applications, however, the number of
nonzero x components may be significant, and since these
methods require at least as many pivot operations as there
are nonzeros in the solution, they may be less competitive
on such problems. The interior-point (IP) approach in [58],
which solves a generalization of (4), also requires explicit
construction of AᵀA, though the approach could in principle be
modified to allow iterative solution of the linear system at each
primal-dual iteration.
Algorithms that require only matrix-vector products involving
A and Aᵀ have been proposed in a number of recent
works. In [11], the problems (5) and (1) are solved by first
reformulating them as “perturbed linear programs” (which
are linear programs with additional terms in the objective
which are squared norms of the unknowns), then applying a
standard primal-dual IP approach [60]. The linear equations
or least-squares problems that arise at each IP iteration are
then solved with iterative methods such as LSQR [48] or
conjugate gradients (CG). Each iteration of these methods
requires one multiplication each by A and Aᵀ. MATLAB
implementations of related approaches are available in the
SparseLab toolbox; see in particular the routines SolveBP
and pdco. For additional details see [51].
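This "products by A and Aᵀ only" interface is exactly what iterative solvers such as CG and LSQR expect. As a minimal sketch of the idea (using SciPy's LinearOperator and lsqr on a dense stand-in matrix; this illustrates the matrix-free interface such inner solvers rely on, not an ℓ₁ solver itself):

import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(2)
k, n = 500, 2000
M = rng.standard_normal((k, n)) / np.sqrt(k)     # stand-in for a fast operator A = RW

# The solver only ever calls these two products; A need not be stored explicitly
# when matvec/rmatvec are implemented with fast transforms instead of a dense M.
A_op = LinearOperator((k, n), matvec=lambda v: M @ v, rmatvec=lambda r: M.T @ r)

x_true = np.zeros(n)
x_true[rng.choice(n, size=20, replace=False)] = 1.0
y = A_op.matvec(x_true) + 1e-3 * rng.standard_normal(k)

# Damped least squares: min ||y - A x||_2^2 + damp^2 ||x||_2^2.
# Each LSQR iteration uses one product with A and one with A^T.
x_ls, *rest = lsqr(A_op, y, damp=1e-2, iter_lim=100)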
Another IP method was recently proposed to solve a
quadratic programming reformulation of (1), different from
the one used here. Each search step is computed using
preconditioned conjugate gradient (PCG) and requires
only products by A and Aᵀ [36]. The code, available at
www.stanford.edu/˜boyd/l1_ls/, is reported to be
faster than competing codes on the problems tested in [36].
The ℓ₁-magic suite of codes (which is available at
www.l1-magic.org) implements algorithms for several of
the formulations described in Section I-A. In particular, the
formulation (3) is solved by recasting it as a second-order
cone program (SOCP), then applying a primal log-barrier
approach. For each value of the log-barrier parameter, the
smooth unconstrained subproblem is solved using Newton’s
method with line search, where the Newton equations may be
solved using CG. (Background on this approach can be found
in [6], [9].) As in [11] and [36], each CG iteration requires
only multiplications by A and Aᵀ; these matrices need not
be known or stored explicitly.
Iterative shrinkage/thresholding (IST) algorithms can also
be used to handle (1) and only require matrix-vector multiplications
involving A and Aᵀ. Initially, IST was presented