A Gentle Tutorial of the EM Algorithm
and its Application to Parameter
Estimation for Gaussian Mixture and
Hidden Markov Models
Jeff A. Bilmes (bilmes@cs.berkeley.edu)
International Computer Science Institute
Berkeley, CA 94704
and
Computer Science Division
Department of Electrical Engineering and Computer Science
U.C. Berkeley
TR-97-021
April 1998
Abstract
We describe the maximum-likelihood parameter estimation problem and how the Expectation-
Maximization (EM) algorithm can be used for its solution. We first describe the abstract
form of the EM algorithm as it is often given in the literature. We then develop the EM pa-
rameter estimation procedure for two applications: 1) finding the parameters of a mixture of
Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e.,
the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models.
We derive the update equations in fairly explicit detail but we do not prove any conver-
gence properties. We try to emphasize intuition rather than mathematical rigor.
1 Maximum-likelihood
Recall the definition of the maximum-likelihood estimation problem. We have a density function $p(\mathbf{x}|\Theta)$ that is governed by the set of parameters $\Theta$ (e.g., $p$ might be a set of Gaussians and $\Theta$ could be the means and covariances). We also have a data set of size $N$, supposedly drawn from this distribution, i.e., $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$. That is, we assume that these data vectors are independent and identically distributed (i.i.d.) with distribution $p$. Therefore, the resulting density for the samples is
$$p(\mathcal{X}|\Theta) = \prod_{i=1}^{N} p(\mathbf{x}_i|\Theta) = \mathcal{L}(\Theta|\mathcal{X}).$$
This function $\mathcal{L}(\Theta|\mathcal{X})$ is called the likelihood of the parameters given the data, or just the likelihood function. The likelihood is thought of as a function of the parameters $\Theta$ where the data $\mathcal{X}$ is fixed. In the maximum-likelihood problem, our goal is to find the $\Theta$ that maximizes $\mathcal{L}$. That is, we wish to find $\Theta^*$ where
$$\Theta^* = \operatorname*{argmax}_{\Theta}\; \mathcal{L}(\Theta|\mathcal{X}).$$
Often we maximize $\log \mathcal{L}(\Theta|\mathcal{X})$ instead because it is analytically easier.
Depending on the form of $p(\mathbf{x}|\Theta)$, this problem can be easy or hard. For example, if $p(\mathbf{x}|\Theta)$ is simply a single Gaussian distribution where $\Theta = (\mu, \sigma^2)$, then we can set the derivative of $\log \mathcal{L}(\Theta|\mathcal{X})$ to zero and solve directly for $\mu$ and $\sigma^2$ (this, in fact, results in the standard formulas for the mean and variance of a data set). For many problems, however, it is not possible to find such analytical expressions, and we must resort to more elaborate techniques.
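To make the easy case concrete, carrying out this derivative calculation for a single one-dimensional Gaussian yields the familiar closed-form estimates
$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \hat{\mu})^2,$$
i.e., the sample mean and the (biased) maximum-likelihood sample variance of the data set.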
2 Basic EM
The EM algorithm is one such elaborate technique. The EM algorithm [ALR77, RW84, GJ95, JJ94,
Bis95, Wu83] is a general method of finding the maximum-likelihood estimate of the parameters of
an underlying distribution from a given data set when the data is incomplete or has missing values.
There are two main applications of the EM algorithm. The first occurs when the data indeed
has missing values, due to problems with or limitations of the observation process. The second
occurs when optimizing the likelihood function is analytically intractable but when the likelihood
function can be simplified by assuming the existence of and values for additional but missing (or
hidden) parameters. The latter application is more common in the computational pattern recognition
community.
As before, we assume that data $\mathcal{X}$ is observed and is generated by some distribution. We call $\mathcal{X}$ the incomplete data. We assume that a complete data set $\mathcal{Z} = (\mathcal{X}, \mathcal{Y})$ exists and also assume (or specify) a joint density function:
$$p(\mathbf{z}|\Theta) = p(\mathbf{x}, \mathbf{y}|\Theta) = p(\mathbf{y}|\mathbf{x}, \Theta)\, p(\mathbf{x}|\Theta).$$
Where does this joint density come from? Often it “arises” from the marginal density function $p(\mathbf{x}|\Theta)$ and the assumption of hidden variables and parameter value guesses (e.g., our two examples, mixture densities and Baum-Welch). In other cases (e.g., missing data values in samples of a distribution), we must assume a joint relationship between the missing and observed values.
With this new density function, we can define a new likelihood function, $\mathcal{L}(\Theta|\mathcal{Z}) = \mathcal{L}(\Theta|\mathcal{X}, \mathcal{Y}) = p(\mathcal{X}, \mathcal{Y}|\Theta)$, called the complete-data likelihood. Note that this function is in fact a random variable since the missing information $\mathcal{Y}$ is unknown, random, and presumably governed by an underlying distribution. That is, we can think of $\mathcal{L}(\Theta|\mathcal{X}, \mathcal{Y}) = h_{\mathcal{X},\Theta}(\mathcal{Y})$ for some function $h_{\mathcal{X},\Theta}(\cdot)$ where $\mathcal{X}$ and $\Theta$ are constant and $\mathcal{Y}$ is a random variable. The original likelihood $\mathcal{L}(\Theta|\mathcal{X})$ is referred to as the incomplete-data likelihood function.
The EM algorithm first finds the expected value of the complete-data log-likelihood $\log p(\mathcal{X}, \mathcal{Y}|\Theta)$ with respect to the unknown data $\mathcal{Y}$, given the observed data $\mathcal{X}$ and the current parameter estimates. That is, we define:
$$Q(\Theta, \Theta^{(i-1)}) = E\left[\log p(\mathcal{X}, \mathcal{Y}|\Theta) \,\middle|\, \mathcal{X}, \Theta^{(i-1)}\right] \qquad (1)$$
where $\Theta^{(i-1)}$ are the current parameter estimates that we used to evaluate the expectation and $\Theta$ are the new parameters that we optimize to increase $Q$.
This expression probably requires some explanation. The key thing to understand is that $\mathcal{X}$ and $\Theta^{(i-1)}$ are constants, $\Theta$ is a normal variable that we wish to adjust, and $\mathcal{Y}$ is a random variable governed by the distribution $f(\mathbf{y}|\mathcal{X}, \Theta^{(i-1)})$. The right side of Equation 1 can therefore be re-written as:
$$E\left[\log p(\mathcal{X}, \mathcal{Y}|\Theta) \,\middle|\, \mathcal{X}, \Theta^{(i-1)}\right] = \int_{\mathbf{y} \in \Upsilon} \log p(\mathcal{X}, \mathbf{y}|\Theta)\, f(\mathbf{y}|\mathcal{X}, \Theta^{(i-1)})\, d\mathbf{y} \qquad (2)$$
Note that $f(\mathbf{y}|\mathcal{X}, \Theta^{(i-1)})$ is the marginal distribution of the unobserved data and is dependent on both the observed data $\mathcal{X}$ and on the current parameters, and $\Upsilon$ is the space of values $\mathbf{y}$ can take on. In the best of cases, this marginal distribution is a simple analytical expression of the assumed parameters $\Theta^{(i-1)}$ and perhaps the data. In the worst of cases, this density might be very hard to obtain. Sometimes, in fact, the density actually used is $f(\mathbf{y}, \mathcal{X}|\Theta^{(i-1)}) = f(\mathbf{y}|\mathcal{X}, \Theta^{(i-1)})\, f(\mathcal{X}|\Theta^{(i-1)})$, but this doesn't affect subsequent steps since the extra factor, $f(\mathcal{X}|\Theta^{(i-1)})$, is not dependent on $\Theta$.
As an analogy, suppose we have a function $h(\theta, \mathbf{y})$ of two variables. Consider $h(\theta, Y)$ where $\theta$ is a constant and $Y$ is a random variable governed by some distribution $f_Y(\mathbf{y})$. Then $q(\theta) = E_Y[h(\theta, Y)] = \int_{\mathbf{y}} h(\theta, \mathbf{y})\, f_Y(\mathbf{y})\, d\mathbf{y}$ is now a deterministic function of $\theta$ that could be maximized if desired.
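Since this analogy is the conceptual heart of the E-step, a minimal numerical sketch may help; the particular choice of $h$ and the small discrete distribution for $Y$ below are arbitrary illustrations, not anything from the derivation above.

import numpy as np

# h(theta, y) is a function of two variables; averaging over a random Y
# leaves a deterministic function q(theta) that can be maximized directly.
def h(theta, y):
    return -(theta - y) ** 2  # an arbitrary illustrative choice

y_values = np.array([0.0, 1.0, 2.0])  # support of Y
f_Y = np.array([0.5, 0.3, 0.2])       # distribution of Y (sums to 1)

def q(theta):
    # q(theta) = E_Y[h(theta, Y)] = sum over y of h(theta, y) * f_Y(y)
    return np.sum(h(theta, y_values) * f_Y)

# q is an ordinary function of theta; here we maximize it by a grid scan.
thetas = np.linspace(-1.0, 3.0, 401)
best = thetas[np.argmax([q(t) for t in thetas])]
print(best)  # 0.7, the mean of Y, as expected for this choice of h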
The evaluation of this expectation is called the E-step of the algorithm. Notice the meaning of the two arguments in the function $Q(\Theta, \Theta^{(i-1)})$. The first argument $\Theta$ corresponds to the parameters that ultimately will be optimized in an attempt to maximize the likelihood. The second argument $\Theta^{(i-1)}$ corresponds to the parameters that we use to evaluate the expectation.
The second step (the M-step) of the EM algorithm is to maximize the expectation we computed in the first step. That is, we find:
$$\Theta^{(i)} = \operatorname*{argmax}_{\Theta}\; Q(\Theta, \Theta^{(i-1)}).$$
These two steps are repeated as necessary. Each iteration is guaranteed to increase the log-
likelihood and the algorithm is guaranteed to converge to a local maximum of the likelihood func-
tion. There are many rate-of-convergence papers (e.g., [ALR77, RW84, Wu83, JX96, XJ96]) but
we will not discuss them here.
Recall that $E[h(\theta, Y)|X = \mathbf{x}] = \int_{\mathbf{y}} h(\theta, \mathbf{y})\, f_{Y|X}(\mathbf{y}|\mathbf{x})\, d\mathbf{y}$. In the following discussion, we drop the subscripts from different density functions since argument usage should disambiguate different ones.
A modified form of the M-step is, instead of maximizing $Q(\Theta, \Theta^{(i-1)})$, to find some $\Theta^{(i)}$ such that $Q(\Theta^{(i)}, \Theta^{(i-1)}) > Q(\Theta^{(i-1)}, \Theta^{(i-1)})$. This form of the algorithm is called Generalized EM (GEM) and is also guaranteed to converge.
As presented above, it’s not clear how exactly to “code up” the algorithm. This is the way,
however, that the algorithm is presented in its most general form. The details of the steps required
to compute the given quantities are very dependent on the particular application so they are not
discussed when the algorithm is presented in this abstract form.
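Even so, the overall control flow can be sketched. The following skeleton assumes the application supplies an e_step that computes whatever statistics define $Q(\cdot, \Theta^{(i-1)})$ and an m_step that maximizes it; these function names, their interface, and the convergence test are illustrative assumptions, not part of the algorithm's general statement.

import numpy as np

def em(X, theta0, e_step, m_step, tol=1e-6, max_iter=200):
    # Generic EM loop: the application-specific work lives in e_step and
    # m_step. e_step(X, theta) returns (stats, log_likelihood), where stats
    # are the expected sufficient statistics defining Q(., theta);
    # m_step(X, stats) returns the theta maximizing that Q.
    theta = theta0
    prev_ll = -np.inf
    for _ in range(max_iter):
        stats, ll = e_step(X, theta)   # E-step under the current estimate
        theta = m_step(X, stats)       # M-step: maximize the expected
                                       # complete-data log-likelihood
        # EM never decreases the incomplete-data log-likelihood, so a
        # small improvement is a reasonable (local) stopping criterion.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return theta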
3 Finding Maximum Likelihood Mixture Densities Parameters via EM
The mixture-density parameter estimation problem is probably one of the most widely used applications of the EM algorithm in the computational pattern recognition community. In this case, we assume the following probabilistic model:
$$p(\mathbf{x}|\Theta) = \sum_{i=1}^{M} \alpha_i\, p_i(\mathbf{x}|\theta_i)$$
where the parameters are $\Theta = (\alpha_1, \ldots, \alpha_M, \theta_1, \ldots, \theta_M)$ such that $\sum_{i=1}^{M} \alpha_i = 1$ and each $p_i$ is a density function parameterized by $\theta_i$. In other words, we assume we have $M$ component densities mixed together with $M$ mixing coefficients $\alpha_i$.
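The generative reading of this model, and of the hidden labels introduced just below, can be sketched in a few lines; the one-dimensional Gaussian components and all variable names here are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

alphas = np.array([0.6, 0.4])  # mixing coefficients (sum to 1)
means  = np.array([0.0, 5.0])  # illustrative Gaussian component parameters
sigmas = np.array([1.0, 0.5])

N = 1000
# Each sample first picks a component y_i with probability alpha_j ...
y = rng.choice(len(alphas), size=N, p=alphas)
# ... then draws x_i from the chosen component density p_{y_i}.
x = rng.normal(means[y], sigmas[y])
# In the estimation problem we observe only x; the labels y are hidden.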
The incomplete-data log-likelihood expression for this density from the data $\mathcal{X}$ is given by:
$$\log \mathcal{L}(\Theta|\mathcal{X}) = \log \prod_{i=1}^{N} p(\mathbf{x}_i|\Theta) = \sum_{i=1}^{N} \log\left(\sum_{j=1}^{M} \alpha_j\, p_j(\mathbf{x}_i|\theta_j)\right)$$
which is difficult to optimize because it contains the log of the sum. If we consider $\mathcal{X}$ as incomplete, however, and posit the existence of unobserved data items $\mathcal{Y} = \{y_i\}_{i=1}^{N}$ whose values inform us which component density “generated” each data item, the likelihood expression is significantly simplified. That is, we assume that $y_i \in \{1, \ldots, M\}$ for each $i$, and $y_i = k$ if the $i$th sample was generated by the $k$th mixture component. If we know the values of $\mathcal{Y}$, the log-likelihood becomes:
$$\log \mathcal{L}(\Theta|\mathcal{X}, \mathcal{Y}) = \sum_{i=1}^{N} \log\left(P(\mathbf{x}_i|y_i)\, P(y_i)\right) = \sum_{i=1}^{N} \log\left(\alpha_{y_i}\, p_{y_i}(\mathbf{x}_i|\theta_{y_i})\right)$$
which, given a particular form of the component densities, can be optimized using a variety of techniques.
The problem, of course, is that we do not know the values of $\mathcal{Y}$. If we assume $\mathcal{Y}$ is a random vector, however, we can proceed.
We first must derive an expression for the distribution of the unobserved data. Let's first guess at parameters for the mixture density, i.e., we guess that $\Theta^g = (\alpha_1^g, \ldots, \alpha_M^g, \theta_1^g, \ldots, \theta_M^g)$ are the appropriate parameters for the likelihood $\mathcal{L}(\Theta^g|\mathcal{X}, \mathcal{Y})$. Given $\Theta^g$, we can easily compute $p_j(\mathbf{x}_i|\theta_j^g)$ for each $i$ and $j$. In addition, the mixing parameters $\alpha_j$ can be thought of as prior probabilities of each mixture component, that is, $\alpha_j = p(\text{component } j)$. Therefore, using Bayes's rule, we can compute:
$$p(y_i = j \mid \mathbf{x}_i, \Theta^g) = \frac{\alpha_j^g\, p_j(\mathbf{x}_i|\theta_j^g)}{p(\mathbf{x}_i|\Theta^g)} = \frac{\alpha_j^g\, p_j(\mathbf{x}_i|\theta_j^g)}{\sum_{k=1}^{M} \alpha_k^g\, p_k(\mathbf{x}_i|\theta_k^g)}.$$
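This posterior is exactly what the E-step of the mixture EM computes in practice. A minimal sketch for Gaussian components follows; the function name and the use of scipy's multivariate normal density are illustrative choices, not prescribed by the derivation.

import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, alphas, means, covs):
    # Computes p(y_i = j | x_i, Theta^g) for every sample i and component j.
    # X: (N, d) data; alphas: (M,) mixing coefficients;
    # means, covs: per-component Gaussian mean vectors and covariances.
    N, M = X.shape[0], len(alphas)
    joint = np.empty((N, M))
    for j in range(M):
        # Numerator of Bayes's rule: alpha_j * p_j(x_i | theta_j^g)
        joint[:, j] = alphas[j] * multivariate_normal.pdf(X, means[j], covs[j])
    # Denominator: the mixture density p(x_i | Theta^g); rows then sum to 1.
    return joint / joint.sum(axis=1, keepdims=True)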