Image Super-Resolution Based on Adaptive Joint
Distribution Modeling
Hangfan Liu∗, Ruiqin Xiong∗, Qiang Song∗, Feng Wu†, Wen Gao∗
∗Institute of Digital Media, Peking University, Beijing 100871, China
†University of Science and Technology of China, Hefei 230026, China
Email: {liuhf,rqxiong,songqiang,wgao}@pku.edu.cn, fengwu@ustc.edu.cn
Abstract—This paper combines an adaptive reconstruction
based approach and a learning based technique into an effective
scheme for single image super-resolution. Unlike conventional
schemes that adopt pre-trained dictionaries to describe the relationship
between the high-resolution (HR) image and the low-resolution (LR)
observation, the proposed method attempts to learn the joint
distribution of highly-correlated patch couples from the input
image itself instead of an external dataset, so that the learnt
models are specially tailored for the current patches and thus
can better fit the image data of interest. To be specific, we first
apply spatially adaptive gradient sparsity regularization in the
reconstruction of the HR image using the contour information,
and then utilize the generated HR output to guide the joint
distribution learning to infer the relationship between the highly-
correlated HR and LR patches. In this way, we simultaneously
exploit the inter-scale correlation as well as the local and
non-local correlation of the image contents. Empirical results
show that the performance of the proposed method is highly
competitive with state-of-the-art schemes in terms of peak signal-
to-noise ratio (PSNR) and perceptual quality.
Index Terms—Super-resolution, content adaptive, joint distri-
bution modeling, non-local similarity, gradient sparsity
I. INTRODUCTION
As an important image processing task, image super-
resolution (SR) has continually attracted extensive research.
It is a highly ill-posed problem that aims to recover
high-resolution (HR) images from low-resolution (LR) obser-
vations. This paper focuses on the single image SR problem,
which requires reconstructing the HR image X from a single LR
input Y , which is generally formulated as Y = DHX + η,
where H is the blur kernel, D denotes the down-sampling
operator, and η is the additive noise. In order to tackle
single image SR problems, it is necessary to exploit the prior
knowledge of the original image X or seek extra information
so as to regularize the solution space.
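As a concrete illustration of this degradation model, the following sketch simulates Y = DHX + η with a Gaussian blur kernel H, integer-factor decimation D, and additive Gaussian noise η. The kernel size, scaling factor, and noise level here are illustrative assumptions, not values specified in the paper:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # 1-D Gaussian kernel, normalized to sum to one
    ax = np.arange(size) - size // 2
    k = np.exp(-ax**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(X, k):
    # Separable blur H: convolve rows, then columns (edge padding keeps size)
    pad = len(k) // 2
    Xp = np.pad(X, pad, mode='edge')
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode='valid'), 1, Xp)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='valid'), 0, tmp)

def degrade(X, k, scale=2, noise_sigma=0.01, seed=0):
    # Y = DHX + eta: blur, decimate by `scale`, add Gaussian noise
    rng = np.random.default_rng(seed)
    DHX = blur(X, k)[::scale, ::scale]
    return DHX + noise_sigma * rng.standard_normal(DHX.shape)

X = np.random.default_rng(1).random((32, 32))  # toy "HR" image
Y = degrade(X, gaussian_kernel(), scale=2)
print(Y.shape)  # (16, 16)
```

Recovering X from Y under this model is ill-posed because DH discards information, which is precisely why the priors discussed below are needed.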
A typical line of research is based on regularized re-
construction [1], [2]. Such works enforce regularization that
reflects prior knowledge about the latent HR images,
and meanwhile require the reconstructed HR image to be
consistent with the LR input. Earlier, simpler priors were
based on intuitive observations such as the local smoothness of
natural images; later, more sophisticated priors are generally
derived from statistical analysis of image databases. Image priors
that faithfully reflect the statistics of actual image contents
are of great importance to the performance of these schemes.
Later, learning-based SR methods emerged and achieved
state-of-the-art performance. They estimate high-frequency
components from dictionaries that encode the relationship
between HR and LR contents [3], [4]. Such dictionaries are
usually trained on external datasets of HR and LR image pairs,
and have various variants such as the Gaussian mixture model
[5]. Recently, the authors of SRCNN [6] proposed to use a
convolutional neural network (CNN) to learn the mapping
between LR and HR images.
Inspired by the success of recent regularization based and
learning based methods, this paper attempts to learn the LR to
HR image content mapping via data-driven joint distribution
modeling of LR and HR patch couples. Unlike many
approaches that use external data to learn the relationship
between LR and HR contents, we utilize highly correlated non-
local patches retrieved from the current image itself as data
samples to form the distribution. The pilot used to provide
corresponding HR data is generated by a gradient sparsity (GS)
based reconstruction procedure. The GS regularization utilizes
gradients to describe the directionality of variation between
adjacent pixels, and is adaptively applied to image contents
that are more likely to be highly sparse in gradient domain
using contour information, so that artifacts are suppressed
while edges are preserved.
II. ADAPTIVE GRADIENT SPARSITY REGULARIZATION
This section describes the first stage of the proposed
scheme, which generates a basic estimate of the high-
resolution (HR) image for the second stage via gradient
sparsity (GS) regularization:
X̃ = arg min_X Φ_GS(X) + µ‖DHX − Y‖₂²,    (1)

where Φ_GS(X) is the GS-based function regularizing the
solution space, ‖DHX − Y‖₂² is the data fidelity term, and
µ is the regularization parameter balancing the two competing
terms. In order to preserve sharp edges while suppressing
annoying distortions such as ringing artifacts, we exploit
sparsity of gradients in non-edge areas of the latent HR image.
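A minimal sketch of how an objective of the form (1) might be evaluated is given below, assuming Φ_GS is an ℓ1 penalty on forward-difference gradients restricted to non-edge pixels by a binary mask. The specific penalty form, the mask, µ, and the use of plain decimation for DH are all illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def gs_objective(X, Y, mask, mu=0.05, scale=2):
    # Phi_GS(X): l1 norm of forward-difference gradients, counted only at
    # pixels flagged as non-edge by `mask` (1 = non-edge, 0 = edge)
    gx = np.diff(X, axis=1, append=X[:, -1:])
    gy = np.diff(X, axis=0, append=X[-1:, :])
    phi_gs = np.sum(mask * (np.abs(gx) + np.abs(gy)))

    # Data fidelity ||DHX - Y||_2^2; here DH is reduced to plain decimation
    # purely to keep the sketch short
    dhx = X[::scale, ::scale]
    fidelity = np.sum((dhx - Y) ** 2)
    return phi_gs + mu * fidelity

X = np.zeros((8, 8)); X[:, 4:] = 1.0               # toy image with one step edge
Y = X[::2, ::2]                                    # consistent LR observation
edge_mask = np.ones_like(X); edge_mask[:, 3:5] = 0  # exclude the edge columns
print(gs_objective(X, Y, edge_mask))  # 0.0: the edge is not penalized
```

With the edge excluded from the mask, the step edge incurs no gradient penalty, which mirrors the stated goal of preserving sharp edges while penalizing gradients elsewhere; an all-ones mask would instead charge the edge its full ℓ1 cost.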
The sparsity of gradient data has been widely used in image
restoration tasks [7]. Since the statistical characteristics of
images are generally non-stationary, using spatially adaptive
gradient sparsity regularization is more effective than employ-
ing a global gradient model [7]. In this paper, we consider the
fact that image gradients are highly sparse in non-edge areas,
while those at edges usually have significant values and hence are
978-1-5386-0462-5/17/$31.00 © 2017 IEEE. VCIP 2017, Dec. 10–13, 2017, St. Petersburg, U.S.A.