adaptation instance normalization (Adaptive Instance Normalization) resource - CSDN文库

Points required: 47 · 39 views · Uploaded 2018-08-15 13:51:32 · 1 comment · 7.91 MB PDF

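Adaptive Instance Normalization (AdaIN) is a technique used in deep neural networks for image style transfer. It was proposed by Xun Huang and Serge Belongie and is described in detail in their 2017 paper "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization". The key property of AdaIN is that it lets a content image take on the style of another image in real time, inside a single neural network.

To understand AdaIN, one first needs to understand instance normalization (IN). IN normalizes each sample on its own: statistics are computed within a single image, not across a batch of images as in batch normalization (BN). Concretely, IN computes the mean and variance of each channel of each sample's feature maps over the spatial dimensions and uses them to normalize that channel to zero mean and unit variance. The paper interprets this as a form of style normalization, because feature statistics carry the style information of an image. IN by itself, however, cannot adapt to different image styles. AdaIN adds that ability by setting the mean and variance of the content features to the mean and variance of the style features.

In deep learning, image style can be understood as patterns or features of an image that certain layers of a deep neural network abstract out. Traditional style transfer methods, such as the work of Gatys et al., use an iterative optimization process. That process is too slow for practical use. To address this, researchers trained feed-forward networks that stylize an image in a single forward pass. Such networks are usually tied to a fixed set of styles and cannot handle arbitrary new ones.

AdaIN takes a different approach. It combines the features of the content image with the features of the style image by adjusting the statistics (mean and variance) of the content features so that they align with those of the style features. This alignment is performed by a dedicated layer, the AdaIN layer. With this layer, a single feed-forward network can transfer arbitrary styles without training a separate model for each style. AdaIN also gives users flexible controls, including the content-style trade-off, style interpolation, and color and spatial controls, all through the same network.

In short, AdaIN enables real-time arbitrary style transfer while remaining flexible and general. This makes it valuable in deep learning and computer vision, particularly for image processing and image generation: the style of one image can be applied to another while preserving its content, which gives digital art a new tool.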
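The user controls mentioned above can be sketched in feature space. The following is a minimal NumPy illustration, not the authors' implementation: the function names `content_style_tradeoff` and `interpolate_styles`, the `eps` constant, and the weight normalization are assumptions made for this example, and the encoder and decoder networks that would surround these operations are omitted.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Shift and scale each content channel to the style channel's mean and std.

    content, style: feature maps of shape (C, H, W); spatial sizes may differ.
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = np.sqrt(content.var(axis=(1, 2), keepdims=True) + eps)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = np.sqrt(style.var(axis=(1, 2), keepdims=True) + eps)
    return s_std * (content - c_mean) / c_std + s_mean

def content_style_tradeoff(content, style, alpha):
    """alpha = 0 returns the content features, alpha = 1 the full AdaIN output."""
    return (1.0 - alpha) * content + alpha * adain(content, style)

def interpolate_styles(content, styles, weights):
    """Blend the AdaIN outputs of several styles (weights are normalized to sum to 1)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * adain(content, s) for w, s in zip(weights, styles))

# Random arrays stand in for encoder features (hypothetical data).
rng = np.random.default_rng(0)
content = rng.normal(size=(3, 8, 8))
style_a = rng.normal(2.0, 0.5, size=(3, 5, 5))
style_b = rng.normal(-1.0, 3.0, size=(3, 7, 7))

half = content_style_tradeoff(content, style_a, alpha=0.5)
blend = interpolate_styles(content, [style_a, style_b], weights=[3, 1])
```

In a full system the blended features would be passed to a learned decoder to produce the stylized image. Here they only show that both controls are plain linear combinations of feature maps.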


Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization

Xun Huang Serge Belongie

Department of Computer Science & Cornell Tech, Cornell University

{xh258,sjb344}@cornell.edu

Abstract

Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style transfer. However, their framework requires a slow iterative optimization process, which limits its practical application. Fast approximations with feed-forward neural networks have been proposed to speed up neural style transfer. Unfortunately, the speed improvement comes at a cost: the network is usually tied to a fixed set of styles and cannot adapt to arbitrary new styles. In this paper, we present a simple yet effective approach that for the first time enables arbitrary style transfer in real-time. At the heart of our method is a novel adaptive instance normalization (AdaIN) layer that aligns the mean and variance of the content features with those of the style features. Our method achieves speed comparable to the fastest existing approach, without the restriction to a pre-defined set of styles. In addition, our approach allows flexible user controls such as content-style trade-off, style interpolation, color & spatial controls, all using a single feed-forward neural network.

1. Introduction

The seminal work of Gatys et al. [16] showed that deep neural networks (DNNs) encode not only the content but also the style information of an image. Moreover, the image style and content are somewhat separable: it is possible to change the style of an image while preserving its content. The style transfer method of [16] is flexible enough to combine content and style of arbitrary images. However, it relies on an optimization process that is prohibitively slow.

Significant effort has been devoted to accelerating neural style transfer. [24, 51, 31] attempted to train feed-forward neural networks that perform stylization with a single forward pass. A major limitation of most feed-forward methods is that each network is restricted to a single style. There are some recent works addressing this problem, but they are either still limited to a finite set of styles [11, 32, 55, 5], or much slower than the single-style transfer methods [6].

In this work, we present the first neural style transfer algorithm that resolves this fundamental flexibility-speed dilemma. Our approach can transfer arbitrary new styles in real-time, combining the flexibility of the optimization-based framework [16] and the speed similar to the fastest feed-forward approaches [24, 52]. Our method is inspired by the instance normalization (IN) [52, 11] layer, which is surprisingly effective in feed-forward style transfer. To explain the success of instance normalization, we propose a new interpretation that instance normalization performs style normalization by normalizing feature statistics, which have been found to carry the style information of an image [16, 30, 33]. Motivated by our interpretation, we introduce a simple extension to IN, namely adaptive instance normalization (AdaIN). Given a content input and a style input, AdaIN simply adjusts the mean and variance of the content input to match those of the style input. Through experiments, we find AdaIN effectively combines the content of the former and the style of the latter by transferring feature statistics. A decoder network is then learned to generate the final stylized image by inverting the AdaIN output back to the image space. Our method is nearly three orders of magnitude faster than [16], without sacrificing the flexibility of transferring inputs to arbitrary new styles. Furthermore, our approach provides abundant user controls at runtime, without any modification to the training process.
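The statistic matching described above can be written in a few lines. This is a minimal NumPy sketch under stated assumptions: features are laid out as (N, C, H, W), statistics are taken per sample and per channel over the spatial axes, and the small constant `eps` is added for numerical stability. The random arrays stand in for encoder features; the decoder is not shown.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Align the per-channel mean and std of `content` with those of `style`.

    Both inputs have shape (N, C, H, W); their spatial sizes may differ.
    """
    # Per-sample, per-channel statistics over the spatial axes.
    c_mean = content.mean(axis=(2, 3), keepdims=True)
    c_std = np.sqrt(content.var(axis=(2, 3), keepdims=True) + eps)
    s_mean = style.mean(axis=(2, 3), keepdims=True)
    s_std = np.sqrt(style.var(axis=(2, 3), keepdims=True) + eps)
    # Normalize the content features, then rescale and shift with the style statistics.
    return s_std * (content - c_mean) / c_std + s_mean

rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(1, 4, 8, 8))   # hypothetical content features
style = rng.normal(3.0, 2.0, size=(1, 4, 6, 6))     # hypothetical style features
out = adain(content, style)

# The output keeps the content's spatial layout but carries the style's statistics.
print(out.shape)
print(np.abs(out.mean(axis=(2, 3)) - style.mean(axis=(2, 3))).max())
```

Note that the operation has no learned parameters: the scale and shift come entirely from the style input, which is what allows a style unseen during training to be applied.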

2. Related Work

Style transfer. The problem of style transfer has its origin from non-photo-realistic rendering [28], and is closely related to texture synthesis and transfer [13, 12, 14]. Some early approaches include histogram matching on linear filter responses [19] and non-parametric sampling [12, 15]. These methods typically rely on low-level statistics and often fail to capture semantic structures. Gatys et al. [16] for the first time demonstrated impressive style transfer results by matching feature statistics in convolutional layers of a DNN. Recently, several improvements to [16] have been proposed. Li and Wand [30] introduced a framework based on Markov random field (MRF) in the deep feature space to enforce local patterns. Gatys et al. [17] proposed ways to control the color preservation, the spatial location, and the scale of style transfer. Ruder et al. [45] improved the quality of video style transfer by imposing temporal constraints.

arXiv:1703.06868v2 [cs.CV] 30 Jul 2017

The framework of Gatys et al. [16] is based on a slow optimization process that iteratively updates the image to minimize a content loss and a style loss computed by a loss network. It can take minutes to converge even with modern GPUs. On-device processing in mobile applications is therefore too slow to be practical. A common workaround is to replace the optimization process with a feed-forward neural network that is trained to minimize the same objective [24, 51, 31]. These feed-forward style transfer approaches are about three orders of magnitude faster than the optimization-based alternative, opening the door to real-time applications. Wang et al. [53] enhanced the granularity of feed-forward style transfer with a multi-resolution architecture. Ulyanov et al. [52] proposed ways to improve the quality and diversity of the generated samples. However, the above feed-forward methods are limited in the sense that each network is tied to a fixed style. To address this problem, Dumoulin et al. [11] introduced a single network that is able to encode 32 styles and their interpolations. Concurrent to our work, Li et al. [32] proposed a feed-forward architecture that can synthesize up to 300 textures and transfer 16 styles. Still, the two methods above cannot adapt to arbitrary styles that are not observed during training.

Very recently, Chen and Schmidt [6] introduced a feed-forward method that can transfer arbitrary styles thanks to a style swap layer. Given feature activations of the content and style images, the style swap layer replaces the content features with the closest-matching style features in a patch-by-patch manner. Nevertheless, their style swap layer creates a new computational bottleneck: more than 95% of the computation is spent on the style swap for 512 × 512 input images. Our approach also permits arbitrary style transfer, while being 1-2 orders of magnitude faster than [6].
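To make the cost of patch matching concrete, here is a deliberately simplified sketch of the style swap idea. It is not the method of [6]: it assumes 1 × 1 patches (one feature vector per spatial location) and cosine similarity as the matching score, and the function name `style_swap_1x1` is hypothetical.

```python
import numpy as np

def style_swap_1x1(content, style, eps=1e-8):
    """Replace every content feature vector with its closest style feature vector.

    content: (C, Hc, Wc), style: (C, Hs, Ws). Returns an array of shape (C, Hc, Wc).
    """
    c, hc, wc = content.shape
    c_vecs = content.reshape(c, -1).T            # (Hc*Wc, C)
    s_vecs = style.reshape(c, -1).T              # (Hs*Ws, C)
    # Normalizing only the style vectors is enough: scaling a content vector
    # does not change which style vector maximizes the similarity.
    s_unit = s_vecs / (np.linalg.norm(s_vecs, axis=1, keepdims=True) + eps)
    scores = c_vecs @ s_unit.T                   # (Hc*Wc, Hs*Ws) similarity table
    nearest = scores.argmax(axis=1)              # best style location per content location
    return s_vecs[nearest].T.reshape(c, hc, wc)

rng = np.random.default_rng(0)
content = rng.normal(size=(4, 5, 5))             # hypothetical content features
style = rng.normal(size=(4, 3, 3))               # hypothetical style features
swapped = style_swap_1x1(content, style)
print(swapped.shape)
```

The similarity table has one entry per pair of content and style locations, so its size grows with the product of the two feature map areas. AdaIN, by contrast, only needs two statistics per channel.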

Another central problem in style transfer is which style loss function to use. The original framework of Gatys et al. [16] matches styles by matching the second-order statistics between feature activations, captured by the Gram matrix. Other effective loss functions have been proposed, such as MRF loss [30], adversarial loss [31], histogram loss [54], CORAL loss [41], MMD loss [33], and distance between channel-wise mean and variance [33]. Note that all the above loss functions aim to match some feature statistics between the style image and the synthesized image.
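Two of the statistics named above can be sketched directly. The snippet below is an illustration under assumptions, not the exact losses of the cited works: the Gram matrix is normalized by the number of spatial positions, both losses use squared differences, and the mean/variance loss compares standard deviations. The final lines show that both losses ignore where in the image a feature occurs, which is why they describe style and not layout.

```python
import numpy as np

def gram_matrix(feat):
    """feat: (C, H, W) -> (C, C) matrix of channel co-activations."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (h * w)      # dividing by H*W is an assumed normalization

def gram_style_loss(a, b):
    """Mean squared difference between the two Gram matrices."""
    return np.mean((gram_matrix(a) - gram_matrix(b)) ** 2)

def mean_std_style_loss(a, b):
    """Squared distance between channel-wise means and standard deviations."""
    d_mean = a.mean(axis=(1, 2)) - b.mean(axis=(1, 2))
    d_std = a.std(axis=(1, 2)) - b.std(axis=(1, 2))
    return np.sum(d_mean ** 2) + np.sum(d_std ** 2)

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 6, 6))                      # hypothetical feature map
perm = rng.permutation(36)
shuffled = feat.reshape(4, -1)[:, perm].reshape(4, 6, 6)

# Shuffling spatial positions leaves both statistics (numerically) unchanged.
print(gram_style_loss(feat, shuffled), mean_std_style_loss(feat, shuffled))
```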

Deep generative image modeling. There are several alternative frameworks for image generation, including variational auto-encoders [27], auto-regressive models [40], and generative adversarial networks (GANs) [18]. Remarkably, GANs have achieved the most impressive visual quality. Various improvements to the GAN framework have been proposed, such as conditional generation [43, 23], multi-stage processing [9, 20], and better training objectives [46, 1]. GANs have also been applied to style transfer [31] and cross-domain image generation [50, 3, 23, 38, 37, 25].

3. Background

3.1. Batch Normalization

The seminal work of Ioffe and Szegedy [22] introduced a batch normalization (BN) layer that significantly eases the training of feed-forward networks by normalizing feature statistics. BN layers were originally designed to accelerate training of discriminative networks, but have also been found effective in generative image modeling [42]. Given an input batch $x \in \mathbb{R}^{N \times C \times H \times W}$, BN normalizes the mean and standard deviation for each individual feature channel:

$$\mathrm{BN}(x) = \gamma \left( \frac{x - \mu(x)}{\sigma(x)} \right) + \beta \tag{1}$$

where $\gamma, \beta \in \mathbb{R}^{C}$ are affine parameters learned from data; $\mu(x), \sigma(x) \in \mathbb{R}^{C}$ are the mean and standard deviation, computed across batch size and spatial dimensions independently for each feature channel:

$$\mu_c(x) = \frac{1}{NHW} \sum_{n=1}^{N} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{nchw} \tag{2}$$

$$\sigma_c(x) = \sqrt{\frac{1}{NHW} \sum_{n=1}^{N} \sum_{h=1}^{H} \sum_{w=1}^{W} \left( x_{nchw} - \mu_c(x) \right)^2 + \epsilon} \tag{3}$$
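Equations (1) to (3) translate almost line by line into array code. The following is a minimal NumPy sketch of training-mode BN, assuming an (N, C, H, W) layout and an `eps` value of 1e-5, which the text does not specify.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode BN for x of shape (N, C, H, W); gamma and beta have shape (C,)."""
    # Eq. (2): one mean per channel, taken over batch and spatial axes.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    # Eq. (3): one standard deviation per channel, with eps inside the square root.
    sigma = np.sqrt(x.var(axis=(0, 2, 3), keepdims=True) + eps)
    # Eq. (1): normalize, then apply the learned per-channel scale and shift.
    return gamma.reshape(1, -1, 1, 1) * (x - mu) / sigma + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(8, 3, 4, 4))      # hypothetical activations
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=(0, 2, 3)), y.std(axis=(0, 2, 3)))
```

With `gamma` set to ones and `beta` to zeros, each channel of the output has approximately zero mean and unit standard deviation across the whole batch.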

BN uses mini-batch statistics during training and replaces them with population statistics during inference, introducing a discrepancy between training and inference. Batch renormalization [21] was recently proposed to address this issue by gradually using population statistics during training. As another interesting application of BN, Li et al. [34] found that BN can alleviate domain shifts by recomputing population statistics in the target domain. Recently, several alternative normalization schemes have been proposed to extend BN's effectiveness to recurrent architectures [35, 2, 47, 8, 29, 44].
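The gap between training and inference statistics can be seen in a small experiment. The sketch below assumes the running estimates are maintained as an exponential moving average with a momentum of 0.1, which is a common convention in deep learning frameworks but is not stated in the text. The class name `BatchNormStats` is hypothetical and the affine parameters are left out.

```python
import numpy as np

class BatchNormStats:
    """BN without affine parameters that tracks running (population) estimates."""

    def __init__(self, channels, momentum=0.1, eps=1e-5):
        self.running_mean = np.zeros(channels)
        self.running_var = np.ones(channels)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x, training):
        if training:
            # Use this mini-batch's statistics and update the running estimates.
            mean = x.mean(axis=(0, 2, 3))
            var = x.var(axis=(0, 2, 3))
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Inference: use the accumulated estimates instead of batch statistics.
            mean, var = self.running_mean, self.running_var
        shape = (1, -1, 1, 1)
        return (x - mean.reshape(shape)) / np.sqrt(var.reshape(shape) + self.eps)

rng = np.random.default_rng(0)
bn = BatchNormStats(channels=3)
batch = rng.normal(5.0, 1.0, size=(8, 3, 4, 4))   # hypothetical activations
train_out = bn(batch, training=True)    # normalized with the batch's own statistics
eval_out = bn(batch, training=False)    # normalized with the running estimates
print(train_out.mean(), eval_out.mean())
```

After a single update the running mean is still far from the batch mean, so the same input is normalized very differently in the two modes. Recomputing the running estimates on data from a new domain, as in [34], amounts to calling the layer in training mode on that data.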

3.2. Instance Normalization

In the original feed-forward stylization method [51], the style transfer network contains a BN layer after each convolutional layer. Surprisingly, Ulyanov et al. [52] found that significant improvement could be achieved simply by replacing BN layers with IN layers:

$$\mathrm{IN}(x) = \gamma \left( \frac{x - \mu(x)}{\sigma(x)} \right) + \beta \tag{4}$$

Different from BN layers, here $\mu(x)$ and $\sigma(x)$ are computed across spatial dimensions independently for each channel and each sample:

$$\mu_{nc}(x) = \frac{1}{HW} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{nchw} \tag{5}$$
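In code, the only change from BN is the set of axes the statistics are taken over. The sketch below is a minimal NumPy illustration. Eq. (5) defines only the mean, so the standard deviation here is assumed to be the per-sample analogue of Eq. (3), and the `eps` value is likewise an assumption.

```python
import numpy as np

def instance_norm(x, gamma, beta, eps=1e-5):
    """IN for x of shape (N, C, H, W); gamma and beta have shape (C,)."""
    # Eq. (5): one mean per sample and channel, taken over spatial axes only.
    mu = x.mean(axis=(2, 3), keepdims=True)                    # shape (N, C, 1, 1)
    # Assumed per-sample counterpart of Eq. (3).
    sigma = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)
    # Eq. (4): normalize, then apply the learned per-channel scale and shift.
    return gamma.reshape(1, -1, 1, 1) * (x - mu) / sigma + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
# Two hypothetical samples with very different contrast.
x = np.concatenate([rng.normal(0.0, 1.0, size=(1, 3, 4, 4)),
                    rng.normal(10.0, 5.0, size=(1, 3, 4, 4))])
gamma, beta = np.ones(3), np.zeros(3)
y = instance_norm(x, gamma, beta)

# Each sample is normalized on its own, so the batch composition does not matter.
print(np.allclose(y[:1], instance_norm(x[:1], gamma, beta)))
```

Because no statistic is shared across the batch, IN behaves identically during training and inference.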

