K-SVD: An Algorithm for Designing of
Overcomplete Dictionaries for Sparse
Representation
Michal Aharon, Michael Elad, Alfred Bruckstein, and Yana Katz
Department of Computer Science
Technion—Israel Institute of Technology
Technion City, Haifa 32000, Israel
Tel. 972-4-8294925. Fax. 972-4-8293900
e-mail: michalo@cs.technion.ac.il
Abstract
In recent years there has been a growing interest in the study of sparse representation of signals.
Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse
linear combinations of these atoms. Applications that use sparse representation are many and include
compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field
concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given
dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from
a pre-specified set of linear transforms, or by adapting the dictionary to a set of training signals. Both
these techniques have been considered, but this topic is largely still open.
In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal
representations. Given a set of training signals, we seek the dictionary that leads to the best representation
for each member in this set, under strict sparsity constraints. We present a new method – the K-SVD
algorithm – generalizing the K-Means clustering process. K-SVD is an iterative method that alternates
between sparse coding of the examples based on the current dictionary, and a process of updating the
dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update
of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and
can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this
algorithm and demonstrate its results on both synthetic tests and applications involving real image data.
Keywords: K-Means, vector quantization, gain-shape VQ, codebook, K-SVD, training, dictionary, atom
decomposition, sparse representation, basis pursuit, matching pursuit, FOCUSS.
I. INTRODUCTION
A. Sparse Representation of Signals
Recent years have witnessed a growing interest in the search for sparse representations of signals. Using
an overcomplete dictionary matrix $D \in \mathbb{R}^{n \times K}$ that contains $K$ prototype signal-atoms for columns, $\{d_j\}_{j=1}^{K}$, a signal $y \in \mathbb{R}^{n}$ can be represented as a sparse linear combination of these atoms. The representation of $y$ may either be exact, $y = Dx$, or approximate, $y \approx Dx$, satisfying $\|y - Dx\|_p \le \epsilon$. The vector $x \in \mathbb{R}^{K}$ contains the representation coefficients of the signal $y$. In approximation methods, typical norms used for measuring the deviation are the $\ell^p$-norms for $p = 1, 2$, and $\infty$. In this work we shall concentrate on the case of $p = 2$.
If n < K and D is a full-rank matrix, an infinite number of solutions are available for the representation
problem, hence constraints on the solution must be set. The solution with the fewest number of nonzero
coefficients is certainly an appealing representation. This sparsest representation is the solution of either
$$(P_0)\qquad \min_{x} \|x\|_{0} \quad \text{subject to} \quad y = Dx, \tag{1}$$

or

$$(P_{0,\epsilon})\qquad \min_{x} \|x\|_{0} \quad \text{subject to} \quad \|y - Dx\|_{2} \le \epsilon, \tag{2}$$

where $\|\cdot\|_{0}$ is the $\ell^0$ norm, counting the nonzero entries of a vector.
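As a small illustration of the model behind (1) and (2) — our own sketch, not part of the original text — the following NumPy snippet builds a random overcomplete dictionary, synthesizes a signal from a few atoms, and reports the counting norm $\|x\|_0$ and the exact-representation residual:

```python
import numpy as np

rng = np.random.default_rng(0)

n, K, T = 20, 50, 3                       # signal dimension, dictionary size, sparsity
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)            # normalize the atoms (columns)

x = np.zeros(K)
x[rng.choice(K, T, replace=False)] = rng.standard_normal(T)   # T-sparse coefficients

y = D @ x                                 # exact representation y = Dx
print("||x||_0 =", np.count_nonzero(x))                        # -> 3
print("residual ||y - Dx||_2 =", np.linalg.norm(y - D @ x))    # -> 0.0
```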
Applications that can benefit from the sparsity and overcompleteness concepts (together or separately)
include compression, regularization in inverse problems, feature extraction, and more. Indeed, the success
of the JPEG2000 coding standard can be attributed to the sparsity of the wavelet coefficients of
natural images [1]. In denoising, wavelet methods and shift-invariant variations that exploit overcomplete
representation are among the most effective known algorithms for this task [2], [3], [4], [5]. Sparsity and
overcompleteness have been successfully used for dynamic range compression in images [6], separation
of texture and cartoon content in images [7], [8], inpainting [9], and more.
Extraction of the sparsest representation is a hard problem that has been extensively investigated in the
past few years. We review some of the most popular methods in Section II. In all those methods, there
is a preliminary assumption that the dictionary is known and fixed. In this work we address the issue of
designing the proper dictionary, in order to better fit the sparsity model imposed.
B. The Choice of the Dictionary
An overcomplete dictionary D that leads to sparse representations can either be chosen as a pre-
specified set of functions, or designed by adapting its content to fit a given set of signal examples.
Choosing a pre-specified transform matrix is appealing because it is simpler. Also, in many cases it
leads to simple and fast algorithms for the evaluation of the sparse representation. This is indeed the case
for overcomplete wavelets, curvelets, contourlets, steerable wavelet filters, short-time-Fourier transforms,
and more. Preference is typically given to tight frames that can easily be pseudo-inverted. The success
of such dictionaries in applications depends on how suitable they are to sparsely describe the signals in
question. Multiscale analysis with oriented basis functions and a shift-invariant property are guidelines
in such constructions.
In this paper we consider a different route for designing dictionaries D based on learning. Our goal
is to find the dictionary D that yields sparse representations for the training signals. We believe that
such dictionaries have the potential to outperform commonly used pre-determined dictionaries. With
ever-growing computational capabilities, computational cost may become secondary in importance to the
improved performance achievable by methods which adapt dictionaries for special classes of signals.
C. Our Paper’s Contribution and Structure
In this paper we present a novel algorithm for adapting dictionaries so as to represent signals sparsely.
Given a set of training signals $\{y_i\}_{i=1}^{N}$
, we seek the dictionary D that leads to the best possible
representations for each member in this set with strict sparsity constraints. We introduce the K-SVD
algorithm that addresses the above task, generalizing the K-Means algorithm. The K-SVD is an iterative
method that alternates between sparse coding of the examples based on the current dictionary, and an
update process for the dictionary atoms so as to better fit the data. The update of the dictionary columns is
done jointly with an update of the sparse representation coefficients related to it, resulting in accelerated
convergence. The K-SVD algorithm is flexible and can work with any pursuit method, thereby tailoring
the dictionary to the application in mind. In this work we present the K-SVD algorithm, analyze it,
discuss its relation to prior art, and prove its superior performance. We demonstrate the K-SVD results
in both synthetic tests and applications involving real image data.
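To make the joint update of a dictionary column and its associated coefficients concrete, here is a minimal NumPy sketch of a single rank-one (SVD-based) atom update in the spirit of the description above. The precise K-SVD update is specified later in the paper (Section IV); details here, such as the handling of unused atoms and the variable names, are our own illustrative assumptions.

```python
import numpy as np

def ksvd_atom_update(D, X, Y, k):
    """Sketch of one dictionary-column update.

    D : (n, K) current dictionary, X : (K, N) current sparse codes, Y : (n, N) data.
    Column k of D and the nonzero coefficients in row k of X are updated together,
    via a rank-one SVD of the representation error restricted to the signals
    that actually use atom k.
    """
    omega = np.flatnonzero(X[k, :])              # signals whose code uses atom k
    if omega.size == 0:
        return D, X                              # unused atom: left untouched here
    # error when atom k's contribution is removed, restricted to those signals
    E_k = Y[:, omega] - D @ X[:, omega] + np.outer(D[:, k], X[k, omega])
    U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
    D[:, k] = U[:, 0]                            # new atom: first left singular vector
    X[k, omega] = s[0] * Vt[0, :]                # jointly updated coefficients
    return D, X
```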
In Section II we survey pursuit algorithms that are later used by the K-SVD, together with some
recent theoretical results justifying their use for sparse coding. In Section III we refer to recent work
done in the field of sparse-representation dictionary design, and describe different algorithms that were
proposed for this task. In Section IV we describe our algorithm, its possible variations, and its relation to
previously proposed methods. The K-SVD results on synthetic data are presented in Section V, and some
preliminary applications involving real image data are given in Section VI. We conclude and discuss
possible future research directions in Section VII.
II. SPARSE CODING: PRIOR ART
Sparse coding is the process of computing the representation coefficients, x, based on the given signal
y and the dictionary D. This process, commonly referred to as “atom decomposition”, requires solving
(1) or (2), and this is typically done by a “pursuit algorithm” that finds an approximate solution. In
this section we briefly discuss several such algorithms, and their prospects for success. A more detailed
description of those methods can be found in [10]. Sparse coding is a necessary stage in the K-SVD
method we develop later in this paper, hence it is important to have a good overview of methods for
achieving it.
Exact determination of sparsest representations proves to be an NP-hard problem [11]. Thus,
approximate solutions are considered instead, and in the past decade or so several efficient pursuit
algorithms have been proposed. The simplest ones are the Matching Pursuit (MP) [12] and the Orthogonal
Matching Pursuit (OMP) algorithms [13], [14], [15], [16]. These are greedy algorithms that select the
dictionary atoms sequentially. These methods are very simple, involving the computation of inner products
between the signal and dictionary columns, and possibly deploying some least squares solvers. Both (1)
and (2) are easily addressed by changing the stopping rule of the algorithm.
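For concreteness, a minimal (unoptimized) NumPy sketch of OMP is given below. It is our own illustration of the greedy selection plus least-squares refit just described, assuming unit-norm dictionary columns; the stopping rule can be a fixed number of atoms (as in (1)) or a residual threshold (as in (2)).

```python
import numpy as np

def omp(D, y, max_atoms, eps=0.0):
    """Orthogonal Matching Pursuit sketch: D is (n, K) with unit-norm columns."""
    residual = y.astype(float).copy()
    support = []
    x = np.zeros(D.shape[1])
    while len(support) < max_atoms and np.linalg.norm(residual) > eps:
        # greedily pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # re-fit all selected coefficients jointly by least squares (the "orthogonal" step)
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
        x[:] = 0.0
        x[support] = coeffs
    return x
```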
A second well known pursuit approach is the Basis Pursuit (BP) [17]. It suggests a convexification of
the problems posed in (1) and (2), by replacing the $\ell^0$-norm with an $\ell^1$-norm. The Focal Under-determined System Solver (FOCUSS) is very similar, using the $\ell^p$-norm with $p \le 1$ as a replacement for the $\ell^0$-norm [18], [19], [20], [21]. Here, for $p < 1$ the similarity to the true sparsity measure is better, but the overall problem becomes non-convex, giving rise to local minima that may mislead the search for solutions. Lagrange multipliers are used to convert the constraint into a penalty term, and an iterative method is derived based on the idea of iteratively reweighted least squares, which handles the $\ell^p$-norm as a weighted $\ell^2$-norm.
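The re-weighting idea can be sketched in a few lines. The following is our own illustrative FOCUSS-style iteration (the values of p, the iteration count, and the tolerance are assumptions, not values from the paper): each pass solves a weighted minimum-norm problem in which small coefficients are progressively suppressed.

```python
import numpy as np

def focuss(D, y, p=0.8, n_iter=50, tol=1e-9):
    """FOCUSS-style IRLS sketch for  min ||x||_p  s.t.  y = Dx  (0 < p <= 1)."""
    x = np.linalg.pinv(D) @ y                    # minimum l2-norm starting point
    for _ in range(n_iter):
        w = np.abs(x) ** (1.0 - p / 2.0)         # current re-weighting of the entries
        x_new = w * (np.linalg.pinv(D * w) @ y)  # D * w scales column j of D by w[j]
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```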
Both the BP and the FOCUSS can be motivated based on Maximum A Posteriori (MAP) estimation,
and indeed several works used this reasoning directly [22], [23], [24], [25]. The MAP can be used to
estimate the coefficients as random variables, by maximizing the posterior $P(x \mid y, D) \propto P(y \mid D, x)\,P(x)$.
The prior distribution on the coefficient vector x is assumed to be a super-Gaussian iid distribution that
favors sparsity. For the Laplace distribution this approach is equivalent to BP [22].
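To spell out that equivalence in one line (our own summary of the cited argument): with Gaussian noise $y = Dx + v$, $v \sim \mathcal{N}(0, \sigma^2 I)$, and an i.i.d. Laplace prior $P(x) \propto e^{-\alpha \|x\|_1}$, maximizing the posterior becomes

$$\hat{x} = \arg\max_{x}\; P(y \mid D, x)\,P(x) = \arg\min_{x}\; \frac{1}{2\sigma^{2}}\,\|y - Dx\|_{2}^{2} + \alpha\,\|x\|_{1},$$

which is the penalized form of the $\ell^1$ problem addressed by BP.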
Extensive study of these algorithms in recent years has established that if the sought solution, x, is
sparse enough, these techniques recover it well in the exact case [16], [26], [27], [28], [29], [30]. Further
work considered the approximated versions and has shown stability in recovery of x [31], [32]. The recent
front of activity revisits those questions within a probabilistic setting, obtaining more realistic assessments
on pursuit algorithm performance and success [33], [34], [35]. The properties of the dictionary D set the
limits on the sparsity of the coefficient vector that consequently leads to its successful evaluation.
III. DESIGN OF DICTIONARIES: PRIOR ART
We now come to the main topic of the paper, the training of dictionaries based on a set of examples.
Given such a set $Y = \{y_i\}_{i=1}^{N}$, we assume that there exists a dictionary D that gave rise to the given signal examples via sparse combinations, i.e., we assume that there exists D such that solving $(P_0)$ for each example $y_k$ gives a sparse representation $x_k$. It is in this setting that we ask what the proper dictionary D is.
A. Generalizing the K-Means?
There is an intriguing relation between sparse representation and clustering (i.e., vector quantization).
This connection has previously been mentioned in several reports [36], [37], [38]. In clustering, a set of
descriptive vectors $\{d_k\}_{k=1}^{K}$ is learned, and each sample is represented by one of those vectors (the one closest to it, usually in the $\ell^2$ distance measure). We may think of this as an extreme sparse representation, where only one atom is allowed in the signal decomposition, and furthermore, the coefficient multiplying it must be 1. There is a variant of the vector quantization (VQ) coding method, called Gain-Shape VQ, where this coefficient is allowed to vary [39]. In contrast, in sparse representations as discussed in this paper, each example is represented as a linear combination of several vectors $\{d_k\}_{k=1}^{K}$. Thus, sparse representations can be viewed as a generalization of the clustering problem.
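As a small sketch of this hierarchy (our own illustration, assuming unit-norm atoms), the single-atom codes used in clustering and in Gain-Shape VQ can be written as degenerate cases of the sparse-coding step:

```python
import numpy as np

def vq_code(D, y):
    """K-Means / plain VQ assignment: one atom, coefficient fixed to 1."""
    j = int(np.argmin(np.linalg.norm(D - y[:, None], axis=0)))  # nearest atom in l2
    x = np.zeros(D.shape[1])
    x[j] = 1.0
    return x

def gain_shape_code(D, y):
    """Gain-Shape VQ: one atom, but with a free coefficient (unit-norm columns assumed)."""
    corr = D.T @ y
    j = int(np.argmax(np.abs(corr)))       # best single atom
    x = np.zeros(D.shape[1])
    x[j] = corr[j]                          # its optimal gain
    return x
```

A general sparse code (e.g., produced by a pursuit algorithm such as OMP above) simply allows several such atoms, each with a free coefficient.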