Image2StyleGAN++: How to Edit the Embedded Images?

Rameen Abdal (KAUST), rameen.abdal@kaust.edu.sa

Yipeng Qin (Cardiff University), qiny16@cardiff.ac.uk

Peter Wonka (KAUST), pwonka@gmail.com

Figure 1: (a) and (b): input images; (c): the “two-face” generated by naively copying the left half from (a) and the right half from (b); (d): the “two-face” generated by our Image2StyleGAN++ framework.

Abstract

We propose Image2StyleGAN++, a flexible image editing framework with many applications. Our framework extends the recent Image2StyleGAN [1] in three ways. First, we introduce noise optimization as a complement to the W+ latent space embedding. Our noise optimization can restore high frequency features in images and thus significantly improves the quality of reconstructed images, e.g. a big increase of PSNR from 20 dB to 45 dB. Second, we extend the global W+ latent space embedding to enable local embeddings. Third, we combine embedding with activation tensor manipulation to perform high quality local edits along with global semantic edits on images. Such edits motivate various high quality image editing applications, e.g. image reconstruction, image inpainting, image crossover, local style transfer, image editing using scribbles, and attribute level feature transfer. Examples of the edited images are shown across the paper for visual inspection.

1. Introduction

Recent GANs [19, 6] demonstrated that synthetic images can be generated with very high quality. This motivates research into embedding algorithms that embed a given photograph into a GAN latent space. Such embedding algorithms can be used to analyze the limitations of GANs [5], do image inpainting [8, 39, 38, 36], local image editing [40, 17], global image transformations such as image morphing and expression transfer [1], and few-shot video generation [35, 34].

In this paper, we propose to extend a very recent embedding algorithm, Image2StyleGAN [1]. In particular, we would like to improve this previous algorithm in three aspects. First, we noticed that the embedding quality can be further improved by including Noise space optimization in the embedding framework. The key insight here is that Noise space optimization is only stable if it is done sequentially with the W+ space optimization, not jointly. Second, we would like to improve the capabilities of the embedding algorithm to increase the local control over the embedding. One way to improve local control is to include masks with undefined content in the embedding algorithm. The goal of the embedding algorithm should be to find a plausible embedding for everything outside the mask, while filling in reasonable semantic content in the masked pixels. Similarly, we would like to provide the option of approximate embeddings, where the specified pixel colors are only a guide for the embedding. In this way, we aim to achieve high quality embeddings that can be controlled by user scribbles. In the third technical part of the paper, we investigate the combination of the embedding algorithm and direct manipulations of the activation maps (called activation tensors in our paper).
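The ordering of the two optimization stages can be sketched on a toy problem. The following is only an illustration under strong assumptions: `toy_generator`, `embed_sequentially`, the scalar latent `w`, and the fixed `basis` are hypothetical stand-ins, not StyleGAN and not the paper's loss. Because this toy problem is convex, it does not reproduce the instability of joint optimization; it only shows the latent code being fitted first and the noise map second.

```python
def mse(a, b):
    """Mean squared error between two equally sized lists of pixel values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_generator(w, noise, basis):
    # Hypothetical stand-in for a generator: a coarse component scaled by the
    # latent w, plus a per-pixel additive noise map.
    return [w * b + n for b, n in zip(basis, noise)]


def embed_sequentially(target, basis, steps=200, lr=0.1):
    """Fit w first with the noise frozen, then fit the noise with w frozen."""
    n_pix = len(target)
    w = 0.0
    noise = [0.0] * n_pix

    # Stage 1: gradient descent on the latent w only.
    for _ in range(steps):
        out = toy_generator(w, noise, basis)
        grad_w = sum(2.0 * (o - t) * b for o, t, b in zip(out, target, basis)) / n_pix
        w -= lr * grad_w
    loss_after_w = mse(toy_generator(w, noise, basis), target)

    # Stage 2: gradient descent on the noise map only.
    for _ in range(steps):
        out = toy_generator(w, noise, basis)
        grad_n = [2.0 * (o - t) / n_pix for o, t in zip(out, target)]
        noise = [n - lr * g for n, g in zip(noise, grad_n)]
    loss_after_noise = mse(toy_generator(w, noise, basis), target)

    return w, noise, loss_after_w, loss_after_noise
```

In this sketch the first stage can only capture what the coarse component explains, and the second stage lets the noise map absorb the remaining per-pixel residual, which loosely mirrors the role the text assigns to the noise variables.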

Our main contributions are:

1. We propose Noise space optimization to restore the high frequency features in an image that cannot be reproduced by other latent space optimization of GANs. The resulting reconstructions are very faithful, reaching up to 45 dB PSNR compared to about 20 dB for the previous best results.

2. We propose an extended embedding algorithm into the W+ space of StyleGAN that allows for local modifications such as missing regions and locally approximate embeddings.

3. We investigate the combination of embedding and activation tensor manipulation to perform high quality local edits along with global semantic edits on images.

4. We apply our novel framework to multiple image editing and manipulation applications. The results show that the method can be successfully used to develop state-of-the-art image editing software.
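To give a sense of scale for the 20 dB and 45 dB figures, the following minimal sketch computes PSNR from the mean squared error using the standard definition, PSNR = 10 log10(peak^2 / MSE). It assumes 8-bit images with a peak value of 255; the function names and the flat-list image representation are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(reconstruction):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def rmse_for_psnr(target_db, peak=255.0):
    """Root-mean-square pixel error that corresponds to a given PSNR in dB."""
    return peak / (10.0 ** (target_db / 20.0))
```

Under these assumptions, 20 dB corresponds to a root-mean-square error of 25.5 gray levels per pixel, whereas 45 dB corresponds to roughly 1.4 gray levels.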

2. Related Work

Generative Adversarial Networks (GANs) [14, 29] are one of the most popular generative models and have been successfully applied to many computer vision applications, e.g. object detection [23], texture synthesis [22, 37, 31], image-to-image translation [16, 42, 28, 25] and video generation [33, 32, 35, 34]. Backing these applications are the massive improvements on GANs in terms of architecture [19, 6, 28, 16], loss function design [26, 2], and regularization [27, 15]. On the bright side, such improvements significantly boost the quality of the synthesized images. To date, the two highest quality GANs are StyleGAN [19] and BigGAN [6]. Between them, StyleGAN produces excellent results for unconditional image synthesis tasks, especially on face images; BigGAN produces the best results for conditional image synthesis tasks (e.g. ImageNet [9]). On the dark side, these improvements make the training of GANs so expensive that nowadays it is almost a privilege of wealthy institutions to compete for the best performance. As a result, methods built on pre-trained generators have very recently started to attract attention. In the following, we would like to discuss previous work on two such approaches: embedding images into a GAN latent space and the manipulation of GAN activation tensors.

Latent Space Embedding. The embedding of an image into the latent space is a longstanding topic in both machine learning and computer vision. In general, the embedding can be implemented in two ways: i) passing the input image through an encoder neural network (e.g. the Variational Auto-Encoder [21]); ii) optimizing a random initial latent code to match the input image [41, 7]. Between them, the first approach dominated for a long time. Although it has an inherent problem generalizing beyond the training dataset, it produces higher quality results than the naive latent code optimization methods [41, 7]. Recently, however, Abdal et al. [1] obtained excellent embedding results by optimizing the latent codes in an enhanced W+ latent space instead of the initial Z latent space. Their method suggests a new direction for various image editing applications and makes the second approach interesting again.

Activation Tensor Manipulation. With fixed neural network weights, the expressive power of a generator can be fully utilized by manipulating its activation tensors. Based on this observation, Bau et al. investigated what a GAN can and cannot generate by locating and manipulating relevant neurons in the activation tensors [4, 5]. Built on the understanding of how an object is “drawn” by the generator, they further designed a semantic image editing system that can add, remove or change the appearance of an object in an input image [3]. Concurrently, Frühstück et al. [11] investigated the potential of activation tensor manipulation in image blending. Observing that boundary artifacts can be eliminated by cropping and combining activation tensors at early layers of a generator, they proposed an algorithm to create large-scale texture maps of hundreds of megapixels by combining outputs of GANs trained on a lower resolution.
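As a rough illustration of what combining activation tensors spatially can look like, the sketch below blends two tensors location by location under a mask. It is not the algorithm of [11] or of this paper; `blend_activations`, `left_half_mask`, and the nested-list layout `[channels][height][width]` are hypothetical choices made for this example.

```python
def blend_activations(act_a, act_b, mask):
    """Blend two activation tensors of shape [channels][height][width].

    mask[y][x] = 1 takes the value from act_a, 0 takes it from act_b, and
    fractional values interpolate linearly between the two.
    """
    return [
        [
            [m * a + (1.0 - m) * b for a, b, m in zip(row_a, row_b, row_m)]
            for row_a, row_b, row_m in zip(chan_a, chan_b, mask)
        ]
        for chan_a, chan_b in zip(act_a, act_b)
    ]


def left_half_mask(height, width):
    """Mask that selects the left half of the columns."""
    return [
        [1.0 if x < width // 2 else 0.0 for x in range(width)]
        for _ in range(height)
    ]
```

Applying such a blend at an early, low-resolution layer and letting the remaining layers run on the combined tensor is the general idea the paragraph above describes; which layer and mask to use is left open here.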

3. Overview

Our paper is structured as follows. First, we describe an extended version of the Image2StyleGAN [1] embedding algorithm (See Sec. 4). We propose two novel modifications: 1) to enable local edits, we integrate various spatial masks into the optimization framework. Spatial masks enable embeddings of incomplete images with missing values and embeddings of images with approximate color values such as user scribbles. In addition to spatial masks, we explore layer masks that restrict the embedding to a set of selected layers. The early layers of StyleGAN [19] encode content and the later layers control the style of the image. By restricting embeddings to a subset of layers we can better control which attributes of a given image are extracted.
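The two kinds of masks can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: plain mean squared error stands in for whatever loss the embedding actually uses, the `valid` map marks pixels whose content is defined (value 1) as opposed to missing (value 0), and the latent code is represented as one list of values per generator layer. All function and parameter names are hypothetical.

```python
def masked_mse(generated, target, valid):
    """Mean squared error over pixels where valid is 1; other pixels are ignored.

    All three arguments are 2D lists of the same shape.
    """
    total = 0.0
    count = 0
    for g_row, t_row, v_row in zip(generated, target, valid):
        for g, t, v in zip(g_row, t_row, v_row):
            if v:
                total += (g - t) ** 2
                count += 1
    return total / count if count else 0.0


def masked_layer_step(w_plus, gradient, layer_mask, lr=0.01):
    """One gradient step on a per-layer latent code.

    Only layers whose layer_mask entry is 1 are updated; the others are copied
    unchanged, which restricts the embedding to the selected layers.
    """
    return [
        [w - lr * g for w, g in zip(w_layer, g_layer)] if keep else list(w_layer)
        for w_layer, g_layer, keep in zip(w_plus, gradient, layer_mask)
    ]
```

With a spatial mask of this kind, pixels with missing values place no constraint on the embedding, so the generator is free to fill them in; with a layer mask, selecting only early or only late layers determines whether content or style is taken from the given image.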

2) to further improve the embedding quality, we optimize for an additional group of variables n that control additive noise maps. These noise maps encode high frequency details and enable embedding with very high reconstruction quality.

Second, we explore multiple operations to directly manipulate activation tensors (See Sec. 5). We mainly explore
