cient learning when input and output are highly correlated.
Moreover, our initial learning rate is 10^4 times higher than that of SRCNN [6]. This is enabled by residual-learning and gradient clipping.
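To make the role of gradient clipping concrete, here is a minimal PyTorch-style sketch (not the authors' implementation): gradients are clamped to a bound that shrinks as the learning rate grows, so the effective update step stays controlled even at high learning rates. The function name and the constant theta are illustrative assumptions.

```python
import torch.nn as nn

def clip_gradients(model: nn.Module, theta: float, lr: float) -> None:
    """Clamp every parameter gradient to [-theta/lr, theta/lr].

    Scaling the bound by the current learning rate keeps the effective
    step size (lr * gradient) bounded by theta, which is what permits
    training with learning rates far higher than usual.
    """
    bound = theta / lr
    for p in model.parameters():
        if p.grad is not None:
            p.grad.clamp_(-bound, bound)

# Typical use in a training step (sketch):
#   loss.backward()
#   clip_gradients(model, theta=0.01, lr=current_learning_rate)
#   optimizer.step()
```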
Scale Factor We propose a single-model SR approach.
Scales are typically user-specified and can be arbitrary in-
cluding fractions. For example, one might need smooth
zoom-in in an image viewer or resizing to a specific dimen-
sion. Training and storing many scale-dependent models in
preparation for all possible scenarios is impractical. We find
a single convolutional network is sufficient for multi-scale-
factor super-resolution.
Contribution In summary, in this work, we propose a highly accurate SR method based on a very deep convolutional network. Very deep networks converge too slowly if small learning rates are used. Boosting the convergence rate with high learning rates leads to exploding gradients, and we resolve the issue with residual-learning and gradient clipping. In addition, we extend our work to cope with the multi-scale SR problem in a single network. Our method is relatively accurate and fast in comparison to state-of-the-art methods, as illustrated in Figure 1.
2. Related Work
SRCNN is a representative state-of-the-art method for deep-learning-based SR, so we analyze it and compare it with our proposed method.
2.1. Convolutional Network for Image Super-Resolution
Model SRCNN consists of three layers: patch extrac-
tion/representation, non-linear mapping and reconstruction.
Filters of spatial sizes 9 × 9, 1 × 1, and 5 × 5 were used for the three layers, respectively.
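For concreteness, that baseline can be written down in a few lines; this PyTorch-style sketch assumes the commonly used channel widths (64 and 32), which are not stated in this section, and a single-channel (luminance) input.

```python
import torch.nn as nn

# Three-layer SRCNN baseline: 9x9 patch extraction/representation,
# 1x1 non-linear mapping, 5x5 reconstruction.  Without padding, each
# convolution shrinks the output, so the result is smaller than the
# input (a difference from our network, noted later in this section).
srcnn = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=9), nn.ReLU(inplace=True),
    nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, kernel_size=5),
)
```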
In [6], Dong et al. attempted to prepare deeper models, but failed to observe superior performance after a week of training. In some cases, deeper models gave inferior performance. They concluded that deeper networks do not result in better performance (Figure 9 in [6]).
However, we argue that increasing depth significantly boosts performance. We successfully use 20 weight layers (3 × 3 for each layer). Our network is very deep (20 vs. 3 [6]) and information used for reconstruction (receptive field) is much larger (41 × 41 vs. 13 × 13).
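These receptive-field numbers follow directly from the kernel sizes: a stack of stride-1 convolutions starts from a single pixel and grows by (k - 1) for every k × k layer. A quick check:

```python
def receptive_field(kernel_sizes):
    # Each stride-1 k x k convolution widens the receptive field by k - 1.
    return 1 + sum(k - 1 for k in kernel_sizes)

print(receptive_field([3] * 20))   # 41 -> 41 x 41 (our 20-layer network)
print(receptive_field([9, 1, 5]))  # 13 -> 13 x 13 (SRCNN)
```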
Training For training, SRCNN directly models high-resolution images. A high-resolution image can be decomposed into low-frequency information (corresponding to the low-resolution image) and high-frequency information (the residual image, or image details). Input and output images share the same low-frequency information. This indicates that SRCNN serves two purposes: carrying the input to the end layer and reconstructing residuals. Carrying the input to the end is conceptually similar to what an auto-encoder does. Training time may be spent on learning this auto-encoder function, so the convergence rate for learning the other part (image details) is significantly decreased. In contrast, since our network models the residual image directly, we achieve much faster convergence with even better accuracy.
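A minimal sketch of this residual formulation, assuming a network `net` that maps the interpolated low-resolution input `ilr` to predicted details, a high-resolution target `hr` of the same size, and a mean-squared-error loss (the variable names and loss choice here are illustrative):

```python
import torch.nn.functional as F

def residual_loss(net, ilr, hr):
    # The target is only the high-frequency part; the low-frequency
    # content (ilr itself) never has to be carried through the network.
    residual = hr - ilr
    return F.mse_loss(net(ilr), residual)
```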
Scale As in most existing SR methods, SRCNN is trained for a single scale factor and is supposed to work only with the specified scale. Thus, if a new scale is needed, a new model has to be trained. To cope with multi-scale SR (possibly including fractional factors) this way, an individual single-scale SR system would have to be constructed for each scale of interest.

However, preparing many individual machines for all possible scenarios is inefficient and impractical. In this work, we design and train a single network to handle the multi-scale SR problem efficiently. This turns out to work very well: our single machine compares favorably to a single-scale expert on each sub-task. For three scale factors (×2, ×3, ×4), we can reduce the number of parameters three-fold.
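One way to realize such a single multi-scale machine is to mix scale factors while building training pairs, so every mini-batch can contain examples of all scales. The following sketch is an assumption about the data pipeline, not the authors' exact recipe; it expects 4-D tensors (batch, channel, height, width).

```python
import random
import torch.nn.functional as F

SCALES = (2, 3, 4)  # the three factors discussed above

def make_training_pair(hr):
    """Downscale an HR patch by a random factor, then bicubic-upsample
    back to the original size, so one network sees all scales."""
    s = random.choice(SCALES)
    h, w = hr.shape[-2:]
    lr = F.interpolate(hr, size=(h // s, w // s),
                       mode="bicubic", align_corners=False)
    ilr = F.interpolate(lr, size=(h, w),
                        mode="bicubic", align_corners=False)
    return ilr, hr - ilr  # network input and residual target
```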
In addition to the aforementioned issues, there are some minor differences. Our output image has the same size as the input image because we zero-pad at every layer during training, whereas the output from SRCNN is smaller than the input. Finally, we simply use the same learning rate for all layers, while SRCNN uses different learning rates for different layers in order to achieve stable convergence.
3. Proposed Method
3.1. Proposed Network
For SR image reconstruction, we use a very deep convolutional network inspired by Simonyan and Zisserman [19]. The configuration is outlined in Figure 2. We use d layers, where all layers except the first and the last are of the same type: 64 filters of size 3 × 3 × 64, where each filter operates on a 3 × 3 spatial region across 64 channels (feature maps). The first layer operates on the input image. The last layer, used for image reconstruction, consists of a single filter of size 3 × 3 × 64.
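A minimal PyTorch-style sketch of this configuration; the ReLU nonlinearities and the single-channel (luminance) input are standard choices assumed here, and padding=1 keeps the feature-map size fixed at every layer, matching the zero-padding mentioned in Section 2.1.

```python
import torch.nn as nn

def make_network(d=20):
    """d-layer network: first layer maps the image to 64 feature maps,
    d - 2 middle layers apply 64 filters of size 3 x 3 x 64, and the
    last layer reconstructs a single-channel output."""
    layers = [nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
    for _ in range(d - 2):
        layers += [nn.Conv2d(64, 64, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers += [nn.Conv2d(64, 1, kernel_size=3, padding=1)]
    return nn.Sequential(*layers)
```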
The network takes an interpolated low-resolution image
(to the desired size) as input and predicts image details.
Modelling image details is often used in super-resolution methods [21, 22, 15, 3], and we find that CNN-based methods can benefit from this domain-specific knowledge.
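At test time, the predicted details are simply added back to the interpolated input. A hedged sketch with illustrative names (`net`, `super_resolve`); clamping to [0, 1] assumes images are stored in that range.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def super_resolve(net, lr_img, scale):
    """Bicubic-interpolate to the target size, predict the residual,
    and add it back to obtain the final high-resolution estimate."""
    h, w = lr_img.shape[-2:]
    ilr = F.interpolate(lr_img, size=(h * scale, w * scale),
                        mode="bicubic", align_corners=False)
    return (ilr + net(ilr)).clamp(0.0, 1.0)
```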
In this work, we demonstrate that explicitly modelling
image details (residuals) has several advantages. These are
further discussed later in Section 4.2.
One problem with using a very deep network to predict dense outputs is that the size of the feature map gets reduced every time convolution operations are applied. For example, when an input of size (n+1) × (n+1) is applied to a network with receptive field size n × n, the output image is 1 × 1.