Research on Image Super-Resolution Methods Based on Sparse Representation


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

Image Super-Resolution via Sparse Representation

Jianchao Yang, Student Member, IEEE, John Wright, Member, IEEE, Thomas Huang, Life Fellow, IEEE, and Yi Ma, Senior Member, IEEE

Abstract—This paper presents a new approach to single-image

superresolution, based on sparse signal representation. Research

on image statistics suggests that image patches can be well-

represented as a sparse linear combination of elements from

an appropriately chosen over-complete dictionary. Inspired by

this observation, we seek a sparse representation for each patch

of the low-resolution input, and then use the coefﬁcients of this

representation to generate the high-resolution output. Theoretical

results from compressed sensing suggest that under mild condi-

tions, the sparse representation can be correctly recovered from

the downsampled signals. By jointly training two dictionaries for

the low- and high-resolution image patches, we can enforce the

similarity of sparse representations between the low resolution

and high resolution image patch pair with respect to their

own dictionaries. Therefore, the sparse representation of a low

resolution image patch can be applied with the high resolution

image patch dictionary to generate a high resolution image patch.

The learned dictionary pair is a more compact representation

of the patch pairs, compared to previous approaches, which

simply sample a large amount of image patch pairs [1], reducing

the computational cost substantially. The effectiveness of such

a sparsity prior is demonstrated for both general image super-

resolution and the special case of face hallucination. In both

cases, our algorithm generates high-resolution images that are

competitive or even superior in quality to images produced by

other similar SR methods. In addition, the local sparse modeling

of our approach is naturally robust to noise, and therefore the

proposed algorithm can handle super-resolution with noisy inputs

in a more uniﬁed framework.

Index Terms—Image super-resolution, sparse representation,

sparse coding, face hallucination, non-negative matrix factoriza-

tion.

I. INTRODUCTION

Super-resolution (SR) image reconstruction is currently a

very active area of research, as it offers the promise of

overcoming some of the inherent resolution limitations of

low-cost imaging sensors (e.g., cell phone or surveillance

cameras), allowing better utilization of the growing capability

of high-resolution displays (e.g. high-deﬁnition LCDs). Such

resolution-enhancing technology may also prove to be essen-

tial in medical imaging and satellite imaging where diagnosis

or analysis from low-quality images can be extremely difﬁcult.

Jianchao Yang and Thomas Huang are with the Beckman Institute, University of Illinois Urbana-Champaign, Urbana, IL 61801 USA (email: jyang29@ifp.uiuc.edu; huang@ifp.uiuc.edu). John Wright is with the Visual Computing Group, Microsoft Research Asia (email: jnwright@uiuc.edu). Yi Ma is with the Visual Computing Group, Microsoft Research Asia, as well as the Coordinated Science Laboratory, University of Illinois Urbana-Champaign, Urbana, IL 61801 USA (email: yima@uiuc.edu).

This work was supported in part by the U.S. Army Research Laboratory and the U.S. Army Research Office under grant number W911NF-09-1-0383. It was also supported by grants NSF IIS 08-49292, NSF ECCS 07-01676, and ONR N00014-09-1-0230.

Conventional approaches to generating a super-resolution image normally require as input multiple low-resolution images of the same scene, which are aligned with sub-pixel accuracy.

The SR task is cast as the inverse problem of recovering the

original high-resolution image by fusing the low-resolution

images, based on reasonable assumptions or prior knowledge

about the observation model that maps the high-resolution im-

age to the low-resolution ones. The fundamental reconstruction

constraint for SR is that the recovered image, after applying the

same generation model, should reproduce the observed low-

resolution images. However, SR image reconstruction is gen-

erally a severely ill-posed problem because of the insufﬁcient

number of low resolution images, ill-conditioned registration

and unknown blurring operators, and the solution from the

reconstruction constraint is not unique. Various regularization

methods have been proposed to further stabilize the inversion

of this ill-posed problem, such as [2], [3], [4].

However, the performance of these reconstruction-based

super-resolution algorithms degrades rapidly when the desired

magniﬁcation factor is large or the number of available input

images is small. In these cases, the result may be overly

smooth, lacking important high-frequency details [5]. Another

class of SR approach is based on interpolation [6], [7],

[8]. While simple interpolation methods such as Bilinear or

Bicubic interpolation tend to generate overly smooth images

with ringing and jagged artifacts, interpolation by exploiting

the natural image priors will generally produce more favorable

results. Dai et al. [7] represented the local image patches using

the background/foreground descriptors and reconstructed the

sharp discontinuity between the two. Sun et al. [8] explored

the gradient proﬁle prior for local image structures and ap-

plied it to super-resolution. Such approaches are effective in

preserving the edges in the zoomed image. However, they are

limited in modeling the visual complexity of the real images.

For natural images with ﬁne textures or smooth shading, these

approaches tend to produce watercolor-like artifacts.

A third category of SR approach is based on ma-

chine learning techniques, which attempt to capture the co-

occurrence prior between low-resolution and high-resolution

image patches. [9] proposed an example-based learning strat-

egy that applies to generic images where the low-resolution

to high-resolution prediction is learned via a Markov Random

Field (MRF) solved by belief propagation. [10] extends this

approach by using the Primal Sketch priors to enhance blurred

edges, ridges and corners. Nevertheless, the above methods

typically require enormous databases of millions of high-

resolution and low-resolution patch pairs, and are therefore

computationally intensive. [11] adopts the philosophy of Lo-

cally Linear Embedding (LLE) [12] from manifold learning,

assuming similarity between the two manifolds in the high-

resolution and the low-resolution patch spaces. Their algorithm

maps the local geometry of the low-resolution patch space to

the high-resolution one, generating high-resolution patch as

a linear combination of neighbors. Using this strategy, more

patch patterns can be represented using a smaller training

database. However, using a ﬁxed number K neighbors for

reconstruction often results in blurring effects, due to over- or

under-ﬁtting. In our previous work [1], we proposed a method

for adaptively choosing the most relevant reconstruction neigh-

bors based on sparse coding, avoiding over- or under-ﬁtting of

[11] and producing superior results. However, sparse coding

over a large sampled image patch database directly is too time-

consuming.

While the approaches mentioned above were proposed for

generic image super-resolution, speciﬁc image priors can be

incorporated when tailored to SR applications for speciﬁc

domains such as human faces. This face hallucination prob-

lem was addressed in the pioneering work of Baker and

Kanade [13]. However, the gradient pyramid-based prediction

introduced in [13] does not directly model the face prior, and

the pixels are predicted individually, causing discontinuities

and artifacts. Liu et al. [14] proposed a two-step statistical

approach integrating the global PCA model and a local patch

model. Although the algorithm yields good results, the holistic

PCA model tends to yield results like the mean face and the

probabilistic local patch model is complicated and compu-

tationally demanding. Wei Liu et al. [15] proposed a new

approach based on TensorPatches and residue compensation.

While this algorithm adds more details to the face, it also

introduces more artifacts.

This paper focuses on the problem of recovering the super-

resolution version of a given low-resolution image. Similar

to the aforementioned learning-based methods, we will rely

on patches from the input image. However, instead of work-

ing directly with the image patch pairs sampled from high-

and low-resolution images [1], we learn a compact repre-

sentation for these patch pairs to capture the co-occurrence

prior, signiﬁcantly improving the speed of the algorithm.

Our approach is motivated by recent results in sparse signal

representation, which suggest that the linear relationships

among high-resolution signals can be accurately recovered

from their low-dimensional projections [16], [17]. Although

the super-resolution problem is very ill-posed, making precise

recovery impossible, the image patch sparse representation

demonstrates both effectiveness and robustness in regularizing

the inverse problem.

a) Basic Ideas: To be more precise, let $D \in \mathbb{R}^{n \times K}$ be an overcomplete dictionary of K atoms (K > n), and suppose a signal $x \in \mathbb{R}^{n}$ can be represented as a sparse linear combination with respect to D. That is, the signal x can be written as $x = D\alpha_0$, where $\alpha_0 \in \mathbb{R}^{K}$ is a vector with very few ($\ll n$) nonzero entries. In practice, we might only observe a small set of measurements y of x:

$y \doteq Lx = LD\alpha_0$, (1)

where $L \in \mathbb{R}^{k \times n}$ with k < n is a projection matrix. In our super-resolution context, x is a high-resolution image (patch), while y is its low-resolution counterpart (or features extracted from it). If the dictionary D is overcomplete, the equation $x = D\alpha$ is underdetermined for the unknown coefficients α.

Fig. 1. Reconstruction of a raccoon face with magniﬁcation factor 2. Left:

result by our method. Right: the original image. There is little noticeable

difference visually even for such a complicated texture. The RMSE for the

reconstructed image is 5.92 (only the local patch model is employed).

The equation $y = LD\alpha$ is even more dramatically underdetermined. Nevertheless, under mild conditions, the sparsest solution $\alpha_0$ to this equation will be unique. Furthermore, if D satisfies an appropriate near-isometry condition, then for a wide variety of matrices L, any sufficiently sparse linear representation of a high-resolution image patch x in terms of D can be recovered (almost) perfectly from the low-resolution image patch [17], [18].
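The uniqueness claim can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: D and L are random Gaussian matrices (not the blurring and downsampling operator of the SR setting), the sizes are tiny, and the helper `sparsest_solution` is a hypothetical name that searches supports exhaustively, which is feasible only at this scale.

```python
import itertools
import numpy as np

def sparsest_solution(A, y, max_nonzeros, tol=1e-9):
    """Exhaustive search for the sparsest alpha with A @ alpha = y (up to tol).

    Supports are tried in order of increasing size, so the first exact fit
    found is a sparsest one. Returns None if no support of the allowed
    size reproduces y.
    """
    n_atoms = A.shape[1]
    for size in range(1, max_nonzeros + 1):
        for support in itertools.combinations(range(n_atoms), size):
            cols = list(support)
            coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
            if np.linalg.norm(A[:, cols] @ coef - y) < tol:
                alpha = np.zeros(n_atoms)
                alpha[cols] = coef
                return alpha
    return None

rng = np.random.default_rng(0)
n, K, k = 12, 20, 8                   # signal dim, number of atoms, measurements
D = rng.standard_normal((n, K))       # overcomplete dictionary (K > n), assumed random
L = rng.standard_normal((k, n))       # projection matrix (k < n), assumed random

alpha0 = np.zeros(K)
alpha0[[3, 11]] = [1.5, -2.0]         # a 2-sparse representation
x = D @ alpha0                        # "high-resolution" signal
y = L @ x                             # low-dimensional observation, as in (1)

alpha_hat = sparsest_solution(L @ D, y, max_nonzeros=2)
x_hat = D @ alpha_hat                 # signal rebuilt from the recovered code
print("recovered support:", np.flatnonzero(alpha_hat))
```

For generic random matrices of this size, any four columns of LD are linearly independent, so a 2-sparse representation is the unique sparsest one; practical methods replace the exhaustive search with greedy or convex solvers.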

Fig. 1 shows an example

that demonstrates the capabilities of our method derived from

this principle. The image of the raccoon face is blurred and

downsampled to half of its original size in both dimensions.

Then we zoom the low-resolution image to the original size

using the proposed method. Even for such a complicated

texture, sparse representation recovers a visually appealing

reconstruction of the original signal.

Recently sparse representation has been successfully applied

to many other related inverse problems in image processing,

such as denoising [19] and restoration [20], often improving on

the state-of-the-art. For example in [19], the authors use the

K-SVD algorithm [21] to learn an overcomplete dictionary

from natural image patches and successfully apply it to the

image denoising problem. In our setting, we do not directly

compute the sparse representation of the high-resolution patch.

Instead, we will work with two coupled dictionaries, $D_h$ for high-resolution patches, and $D_l$ for low-resolution ones. The sparse representation of a low-resolution patch in terms of $D_l$ will be directly used to recover the corresponding high-resolution patch from $D_h$. We obtain a locally consistent solution by allowing patches to overlap and demanding that the reconstructed high-resolution patches agree on the overlapped areas. In this paper, we try to learn the two overcomplete dictionaries in a probabilistic model similar to [22]. To enforce that the image patch pairs have the same sparse representations with respect to $D_h$ and $D_l$, we learn the two dictionaries

simultaneously by concatenating them with proper normal-

ization. The learned compact dictionaries will be applied to

both generic image super-resolution and face hallucination to

demonstrate their effectiveness.
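One way to read "concatenating them with proper normalization" is sketched below. It assumes, as an illustration only, that each block is scaled by the inverse square root of its dimension (N for high-resolution vectors, M for low-resolution ones); this passage does not spell out the scaling, and `stack_coupled` is a hypothetical helper. The point of the check is that a single reconstruction error on the stacked data equals the sum of the two per-dimension errors, so one shared code Z is fitted to both resolutions at once.

```python
import numpy as np

def stack_coupled(high, low):
    """Stack high- and low-resolution arrays row-wise, scaling each block by
    1/sqrt(its number of rows) so both contribute per-dimension averages."""
    n_high, n_low = high.shape[0], low.shape[0]
    return np.vstack([high / np.sqrt(n_high), low / np.sqrt(n_low)])

rng = np.random.default_rng(1)
N, M, K, m = 25, 9, 40, 6             # high dim, low dim, atoms, training pairs
X_h = rng.standard_normal((N, m))     # stand-in high-resolution patch vectors
Y_l = rng.standard_normal((M, m))     # stand-in low-resolution patch vectors
D_h = rng.standard_normal((N, K))
D_l = rng.standard_normal((M, K))
Z = rng.standard_normal((K, m))       # shared codes (dense here; only the identity is checked)

X_c = stack_coupled(X_h, Y_l)         # concatenated training data
D_c = stack_coupled(D_h, D_l)         # concatenated dictionary

joint_error = np.linalg.norm(X_c - D_c @ Z) ** 2
split_error = (np.linalg.norm(X_h - D_h @ Z) ** 2 / N
               + np.linalg.norm(Y_l - D_l @ Z) ** 2 / M)
print(joint_error, split_error)
```

Because the two errors coincide, any single-dictionary learning routine applied to the stacked data would, under this assumption, train both dictionaries against the same codes.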

Compared with the aforementioned learning-based methods, our algorithm requires only two compact learned dictionaries, instead of a large training patch database. The computation, mainly based on linear programming or convex optimization, is much more efficient and scalable, compared with [9], [10], [11]. The online recovery of the sparse representation uses the low-resolution dictionary only; the high-resolution dictionary is used to calculate the final high-resolution image. The computed sparse representation adaptively selects the most relevant patch bases in the dictionary to best represent each patch of the given low-resolution image. This leads to superior performance, both qualitatively and quantitatively, compared to the method described in [11], which uses a fixed number of nearest neighbors: our results have sharper edges and clearer textures. In addition, the sparse representation is robust to noise as suggested in [19], and thus our algorithm is more robust to noise in the test image, while most other methods cannot perform denoising and super-resolution simultaneously.

Even though the structured projection matrix defined by blurring and downsampling in our SR context does not guarantee exact recovery of $\alpha_0$, empirical experiments indeed demonstrate the effectiveness of such a sparse prior for our SR tasks.

b) Organization of the Paper: The remainder of this

paper is organized as follows. Section II details our formula-

tion and solution to the image super-resolution problem based

on sparse representation. Speciﬁcally, we study how to apply

sparse representation for both generic image super-resolution

and face hallucination. In Section III, we discuss how to

learn the two dictionaries for the high- and low-resolution

image patches respectively. Various experimental results in

Section IV demonstrate the efﬁcacy of sparsity as a prior for

regularizing image super-resolution.

c) Notations: X and Y denote the high- and low-resolution images respectively, and x and y denote the high- and low-resolution image patches respectively. We use bold uppercase D to denote the dictionary for sparse coding; specifically, we use $D_h$ and $D_l$ to denote the dictionaries for high- and low-resolution image patches respectively. Bold lowercase letters denote vectors. Plain uppercase letters denote regular matrices, e.g., S is used as a downsampling operation in matrix form. Plain lowercase letters are used as scalars.

II. IMAGE SUPER-RESOLUTION FROM SPARSITY

The single-image super-resolution problem asks: given a

low-resolution image Y , recover a higher-resolution image X

of the same scene. Two constraints are modeled in this work

to solve this ill-posed problem: 1) reconstruction constraint,

which requires that the recovered X should be consistent with

the input Y with respect to the image observation model;

and 2) sparsity prior, which assumes that the high resolution

patches can be sparsely represented in an appropriately chosen

overcomplete dictionary, and that their sparse representations

can be recovered from the low resolution observation.

1) Reconstruction constraint: The observed low-resolution

image Y is a blurred and downsampled version of the high

resolution image X:

Y = SHX (2)

Here, H represents a blurring ﬁlter, and S the downsampling

operator.
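The observation model (2) can be sketched in a few lines. The text leaves the blur H and the downsampling S unspecified at this point, so the 3×3 box filter with edge replication and the factor-2 decimation below are assumptions chosen for illustration; `blur`, `downsample` and `observe` are hypothetical names.

```python
import numpy as np

def blur(image, size=3):
    """H: box blur with edge replication (an assumed kernel)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image, dtype=float)
    rows, cols = image.shape
    # Sum the size x size shifted copies of the padded image, then average.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + rows, dx:dx + cols]
    return out / size ** 2

def downsample(image, factor=2):
    """S: keep every `factor`-th pixel in both dimensions."""
    return image[::factor, ::factor]

def observe(X, factor=2, size=3):
    """Y = S H X: blur the high-resolution image, then downsample it."""
    return downsample(blur(X, size), factor)

X = np.arange(64, dtype=float).reshape(8, 8)   # toy high-resolution image
Y = observe(X)                                 # low-resolution observation
print(X.shape, "->", Y.shape)
```

With these settings an 8×8 image yields a 4×4 observation, that is, 16 linear equations in 64 unknowns, which is the ill-posedness discussed in the text.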

Super-resolution remains extremely ill-posed, since for a

given low-resolution input Y , inﬁnitely many high-resolution

images X satisfy the above reconstruction constraint. We

further regularize the problem via the following prior on small

patches x of X:

2) Sparsity prior: The patches x of the high-resolution image X can be represented as a sparse linear combination in a dictionary $D_h$ trained from high-resolution patches sampled from training images:

$x \approx D_h \alpha$ for some $\alpha \in \mathbb{R}^{K}$ with $\|\alpha\|_0 \ll K$. (3)

The sparse representation α will be recovered by representing patches y of the input image Y, with respect to a low resolution dictionary $D_l$ co-trained with $D_h$. The dictionary training process will be discussed in Section III.

We apply our approach to both generic images and face

images. For generic image super-resolution, we divide the

problem into two steps. First, as suggested by the sparsity

prior (3), we ﬁnd the sparse representation for each local

patch, respecting spatial compatibility between neighbors.

Next, using the result from this local sparse representation,

we further regularize and reﬁne the entire image using the

reconstruction constraint (2). In this strategy, a local model

from the sparsity prior is used to recover lost high-frequency

content for local details. The global model from the reconstruction

constraint is then applied to remove possible artifacts from

the ﬁrst step and make the image more consistent and natural.

The face images differ from the generic images in that the face

images have more regular structure and thus reconstruction

constraints in the face subspace can be more effective. For

face image super-resolution, we reverse the above two steps

to make better use of the global face structure as a regularizer.

We ﬁrst ﬁnd a suitable subspace for human faces, and apply

the reconstruction constraints to recover a medium resolution

image. We then recover the local details using the sparsity

prior for image patches.
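For the generic-image case, the global refinement step described above can be pictured as pulling an initial estimate toward the set of images that satisfy Y = SHX. The sketch below is one possible realization, not necessarily the authors' exact procedure: it uses plain gradient descent on the squared residual, works on 1-D signals with explicit matrices, and assumes a 3-tap box blur and factor-2 decimation. All function names are hypothetical.

```python
import numpy as np

def blur_matrix(n):
    """H: 3-tap box blur on a length-n signal with edge replication (assumed)."""
    H = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            H[i, min(max(j, 0), n - 1)] += 1.0 / 3.0
    return H

def downsample_matrix(n, factor=2):
    """S: keep every `factor`-th sample."""
    return np.eye(n)[::factor]

def back_project(X0, Y, A, n_iter=200):
    """Gradient descent on ||A X - Y||^2 starting from X0, where A = S H."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / (largest singular value)^2
    X = X0.copy()
    for _ in range(n_iter):
        X = X + step * A.T @ (Y - A @ X)       # move along the back-projected residual
    return X

rng = np.random.default_rng(2)
n = 16
A = downsample_matrix(n) @ blur_matrix(n)      # A = S H
X_true = np.sin(np.linspace(0.0, 3.0, n))      # toy high-resolution signal
Y = A @ X_true                                 # observed low-resolution signal
X0 = X_true + 0.1 * rng.standard_normal(n)     # stand-in for the local-model output
X_ref = back_project(X0, Y, A)

print("residual before:", np.linalg.norm(A @ X0 - Y))
print("residual after: ", np.linalg.norm(A @ X_ref - Y))
```

Starting from X0 matters: the iteration only corrects the component of X0 that the observation can see, so details supplied by the local model that are consistent with Y are left untouched.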

The remainder of this section is organized as follows: in

Section II-A, we discuss super-resolution for generic images.

We will introduce the local model based on sparse represen-

tation and global model based on reconstruction constraints.

In Section II-B we discuss how to introduce the global face

structure into this framework to achieve more accurate and

visually appealing super-resolution for face images.

A. Generic Image Super-Resolution from Sparsity

1) Local model from sparse representation: Similar to

the patch-based methods mentioned previously, our algorithm

tries to infer the high-resolution image patch for each low-resolution image patch from the input. For this local model, we have two dictionaries $D_h$ and $D_l$, which are trained to

have the same sparse representations for each high-resolution

and low-resolution image patch pair. We subtract the mean

pixel value for each patch, so that the dictionary represents

image textures rather than absolute intensities. In the recovery

process, the mean value for each high-resolution image patch

is then predicted by its low-resolution version.

For each input low-resolution patch y, we find a sparse representation with respect to $D_l$. The corresponding high-resolution patch bases $D_h$ will be combined according to these coefficients to generate the output high-resolution patch x.
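The per-patch procedure (subtract the mean, code the low-resolution patch against $D_l$, apply the same coefficients to $D_h$, restore the mean) can be sketched as follows. Several parts are assumptions made for illustration: the sparse code is found with greedy orthogonal matching pursuit rather than the optimization formulated in the paper, no feature extraction is applied, and the toy dictionaries are coupled by setting $D_l$ to a block-averaged copy of a random $D_h$ instead of being co-trained. `omp` and `super_resolve_patch` are hypothetical names.

```python
import numpy as np

def omp(D, y, max_atoms, tol=1e-10):
    """Greedy orthogonal matching pursuit: a sparse alpha with D @ alpha close to y."""
    norms = np.linalg.norm(D, axis=0)
    alpha = np.zeros(D.shape[1])
    support = []
    residual = y.copy()
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        scores = np.abs(D.T @ residual) / norms    # normalized correlations
        if support:
            scores[support] = 0.0                  # never pick an atom twice
        support.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        alpha[support] = coef
    return alpha

def super_resolve_patch(y, D_l, D_h, max_atoms=3):
    """Code the mean-removed low-res patch in D_l, rebuild the high-res patch from D_h."""
    mean = y.mean()                                # mean is predicted from the low-res patch
    alpha = omp(D_l, y - mean, max_atoms)
    return D_h @ alpha + mean

rng = np.random.default_rng(3)
P = np.kron(np.eye(3), [[0.5, 0.5]])   # 1-D pairwise averaging, 6 -> 3 samples
L = np.kron(P, P)                      # 2x2 block averaging on flattened 6x6 patches
D_h = rng.standard_normal((36, 30))
D_h -= D_h.mean(axis=0)                # zero-mean atoms: textures, not intensities
D_l = L @ D_h                          # toy coupling (assumption, not co-training)

x_true = 2.5 * D_h[:, 7] + 0.4         # one atom plus a constant intensity offset
y = L @ x_true                         # low-resolution 3x3 patch, flattened
x_hat = super_resolve_patch(y, D_l, D_h)
print("max abs error:", np.max(np.abs(x_hat - x_true)))
```

The demo patch uses a single atom, for which the greedy choice is exact; with several active atoms and only nine low-resolution measurements, greedy recovery is not guaranteed. The coefficients are deliberately computed on the unnormalized $D_l$ so that the same α remains valid for $D_h$.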

The problem of finding the sparsest representation of y can be formulated as:

$\min \|\alpha\|_0 \quad \text{s.t.} \quad \|F D_l \alpha - F y\|_2^2 \le \epsilon$, (4)
