Score-based models have achieved state-of-the-art performance on many downstream tasks and applications. These tasks include, among others, image generation [17, 18, 19, 20, 21, 22] (Yes, better than GANs!), audio synthesis [23, 24, 25], shape generation [26], and music generation [27]. Moreover, score-based models have connections to normalizing flow models, therefore allowing exact likelihood computation and representation learning. Additionally, modeling and estimating scores facilitates inverse problem solving, with applications such as image inpainting [17, 20], image colorization [20], compressive sensing, and medical image reconstruction (e.g., CT, MRI) [28].
1024 x 1024 samples generated from score-based models [20]
This post aims to show you the motivation and intuition of score-based generative modeling, as well as its basic concepts, properties and applications.
The score function, score-based models, and score matching
Suppose we are given a dataset $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$, where each point is drawn independently from an underlying data distribution $p(\mathbf{x})$. Given this dataset, the goal of generative modeling is to fit a model to the data distribution such that we can synthesize new data points at will by sampling from the distribution.

In order to build such a generative model, we first need a way to represent a probability distribution. One such way, as in likelihood-based models, is to directly model the probability density function (p.d.f.) or probability mass function (p.m.f.). Let $f_\theta(\mathbf{x}) \in \mathbb{R}$ be a real-valued function parameterized by a learnable parameter $\theta$. We can define a p.d.f. via

$$p_\theta(\mathbf{x}) = \frac{e^{-f_\theta(\mathbf{x})}}{Z_\theta}, \tag{1}$$

where $Z_\theta > 0$ is a normalizing constant dependent on $\theta$, such that $\int p_\theta(\mathbf{x}) \, \mathrm{d}\mathbf{x} = 1$. Here the function $f_\theta(\mathbf{x})$ is often called an unnormalized probabilistic model, or energy-based model [7].
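To make this concrete, the energy function $f_\theta(\mathbf{x})$ can be essentially any neural network that maps a data point to a single real number. Below is a minimal PyTorch sketch of such an unnormalized model; the module name, layer sizes, and activation are illustrative assumptions, not taken from any particular paper.

```python
import torch
import torch.nn as nn

class EnergyModel(nn.Module):
    """Unnormalized probabilistic model: maps x to a scalar energy f_theta(x)."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # f_theta(x): one real-valued energy per input point.
        return self.net(x).squeeze(-1)

# The corresponding density is exp(-f_theta(x)) / Z_theta, but Z_theta is an
# integral over the whole input space and is intractable for a general network.
```

No architectural restrictions are needed here, which is exactly what makes energy-based parameterizations flexible; the price is the unknown normalizing constant $Z_\theta$.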
We can train $p_\theta(\mathbf{x})$ by maximizing the log-likelihood of the data

$$\max_\theta \sum_{i=1}^N \log p_\theta(\mathbf{x}_i). \tag{2}$$

However, equation (2) requires $p_\theta(\mathbf{x})$ to be a normalized probability density function. This is undesirable because in order to compute $p_\theta(\mathbf{x})$, we must evaluate the normalizing constant $Z_\theta$, a typically intractable quantity for any general $f_\theta(\mathbf{x})$. Thus, to make maximum likelihood training feasible, likelihood-based models must either restrict their model architectures (e.g., causal convolutions in autoregressive models, invertible networks in normalizing flow models) to make $Z_\theta$ tractable, or approximate the normalizing constant (e.g., variational inference in VAEs, or MCMC sampling used in contrastive divergence [29]), which may be computationally expensive.
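To see where the intractability enters, note that $\log p_\theta(\mathbf{x}) = -f_\theta(\mathbf{x}) - \log Z_\theta$: the first term is a cheap forward pass, while the second requires integrating over the entire input space. The sketch below (continuing the hypothetical `EnergyModel` above) brute-forces $\log Z_\theta$ on a 1-D grid, which is only feasible in very low dimension; the helper names are assumptions for illustration.

```python
import torch

def naive_log_Z(model, grid: torch.Tensor) -> torch.Tensor:
    """Brute-force log Z_theta (log of the integral of exp(-f_theta(x)) dx) on a 1-D grid."""
    with torch.no_grad():
        f = model(grid.unsqueeze(-1))          # energies at each grid point
        dx = grid[1] - grid[0]
        return torch.logsumexp(-f, dim=0) + torch.log(dx)

def nll(model, x: torch.Tensor, log_Z: torch.Tensor) -> torch.Tensor:
    """Average negative log-likelihood: -log p_theta(x) = f_theta(x) + log Z_theta."""
    return (model(x) + log_Z).mean()

# Usage (1-D toy only):
# model = EnergyModel(dim=1)
# log_Z = naive_log_Z(model, torch.linspace(-10.0, 10.0, 2001))
# loss = nll(model, x_batch, log_Z)            # x_batch has shape (batch, 1)
```

A grid of this kind needs a number of points exponential in the data dimension, which is why maximum likelihood cannot be applied directly to high-dimensional energy-based models.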
By modeling the score function instead of the density function, we can sidestep the difficulty of intractable normalizing constants. The score function of a distribution $p(\mathbf{x})$ is defined as

$$\nabla_\mathbf{x} \log p(\mathbf{x}),$$

and a model for the score function is called a score-based model [17], which we denote as $\mathbf{s}_\theta(\mathbf{x})$. The score-based model is learned such that $\mathbf{s}_\theta(\mathbf{x}) \approx \nabla_\mathbf{x} \log p(\mathbf{x})$, and can be parameterized without worrying about the normalizing constant. For example, we can easily parameterize a score-based model with the energy-based model defined in equation (1), via

$$\mathbf{s}_\theta(\mathbf{x}) = \nabla_\mathbf{x} \log p_\theta(\mathbf{x}) = -\nabla_\mathbf{x} f_\theta(\mathbf{x}) - \underbrace{\nabla_\mathbf{x} \log Z_\theta}_{= 0} = -\nabla_\mathbf{x} f_\theta(\mathbf{x}).$$
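In code, this amounts to differentiating the energy with respect to the input. The sketch below (again using the hypothetical `EnergyModel`) obtains the score by automatic differentiation; note that $\log Z_\theta$ never appears because it does not depend on $\mathbf{x}$.

```python
import torch

def score(model, x: torch.Tensor) -> torch.Tensor:
    """s_theta(x) = grad_x log p_theta(x) = -grad_x f_theta(x); Z_theta drops out."""
    x = x.detach().requires_grad_(True)
    energy = model(x).sum()    # summing over the batch keeps per-example gradients
    return -torch.autograd.grad(energy, x, create_graph=True)[0]

# s = score(model, x_batch)   # same shape as x_batch; no normalizing constant needed
```

Because the gradient of $\log Z_\theta$ with respect to $\mathbf{x}$ is zero, any scalar-output network yields a valid score parameterization through its negative input gradient; no invertibility or autoregressive structure is required.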