VYSOKÉ UČENÍ TECHNICKÉ V BRNĚ
BRNO UNIVERSITY OF TECHNOLOGY

FAKULTA INFORMAČNÍCH TECHNOLOGIÍ
ÚSTAV POČÍTAČOVÉ GRAFIKY A MULTIMÉDIÍ
FACULTY OF INFORMATION TECHNOLOGY
DEPARTMENT OF COMPUTER GRAPHICS AND MULTIMEDIA

STATISTICAL LANGUAGE MODELS BASED ON NEURAL NETWORKS

DISERTAČNÍ PRÁCE
PHD THESIS

AUTOR PRÁCE    Ing. TOMÁŠ MIKOLOV
AUTHOR

BRNO 2012
VYSOKÉ UČENÍ TECHNICKÉ V BRNĚ
BRNO UNIVERSITY OF TECHNOLOGY

FAKULTA INFORMAČNÍCH TECHNOLOGIÍ
ÚSTAV POČÍTAČOVÉ GRAFIKY A MULTIMÉDIÍ
FACULTY OF INFORMATION TECHNOLOGY
DEPARTMENT OF COMPUTER GRAPHICS AND MULTIMEDIA

STATISTICKÉ JAZYKOVÉ MODELY ZALOŽENÉ NA NEURONOVÝCH SÍTÍCH
STATISTICAL LANGUAGE MODELS BASED ON NEURAL NETWORKS

DISERTAČNÍ PRÁCE
PHD THESIS

AUTOR PRÁCE    Ing. TOMÁŠ MIKOLOV
AUTHOR

VEDOUCÍ PRÁCE    Doc. Dr. Ing. JAN ČERNOCKÝ
SUPERVISOR

BRNO 2012
Abstrakt
Statistické jazykové modely jsou důležitou součástí mnoha úspěšných aplikací, mezi něž patří například automatické rozpoznávání řeči a strojový překlad (příkladem je známá aplikace Google Translate). Tradiční techniky pro odhad těchto modelů jsou založeny na tzv. N-gramech. Navzdory známým nedostatkům těchto technik a obrovskému úsilí výzkumných skupin napříč mnoha oblastmi (rozpoznávání řeči, automatický překlad, neuroscience, umělá inteligence, zpracování přirozeného jazyka, komprese dat, psychologie atd.), N-gramy v podstatě zůstaly nejúspěšnější technikou. Cílem této práce je prezentace několika architektur jazykových modelů založených na neuronových sítích. Ačkoliv jsou tyto modely výpočetně náročnější než N-gramové modely, s technikami vyvinutými v této práci je možné jejich efektivní použití v reálných aplikacích. Dosažené snížení počtu chyb při rozpoznávání řeči oproti nejlepším N-gramovým modelům dosahuje 20%. Model založený na rekurentní neuronové síti dosahuje nejlepších publikovaných výsledků na velmi známé datové sadě (Penn Treebank).
Abstract
Statistical language models are a crucial part of many successful applications, such as automatic speech recognition and statistical machine translation (for example, the well-known Google Translate). Traditional techniques for estimating these models are based on N-gram counts. Despite the known weaknesses of N-grams and the huge effort of research communities across many fields (speech recognition, machine translation, neuroscience, artificial intelligence, natural language processing, data compression, psychology, etc.), N-grams have remained essentially the state of the art. The goal of this thesis is to present several architectures of language models based on artificial neural networks. Although these models are computationally more expensive than N-gram models, the techniques presented here make it possible to apply them efficiently in state-of-the-art systems. The achieved reductions in the word error rate of speech recognition systems reach up to 20% relative to a state-of-the-art N-gram model. The presented recurrent neural network based model achieves the best published performance on the well-known Penn Treebank setup.
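As background to the N-gram counting mentioned above, the basic maximum-likelihood estimate can be sketched in a few lines. This is an illustrative example only (not code from the thesis); it estimates bigram probabilities as P(w2 | w1) = count(w1, w2) / count(w1):

```python
from collections import Counter

def train_bigram_lm(tokens):
    """Maximum-likelihood bigram model: P(w2 | w1) = count(w1, w2) / count(w1)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

corpus = "the cat sat on the mat".split()
probs = train_bigram_lm(corpus)
# "the" occurs twice and is followed once by "cat", so P("cat" | "the") = 0.5
```

A real N-gram model additionally needs smoothing (e.g. Kneser-Ney) to assign non-zero probability to unseen word sequences, which is exactly one of the weaknesses the thesis discusses.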
Klíčová slova
jazykový model, neuronová síť, rekurentní, maximální entropie, rozpoznávání řeči, komprese dat, umělá inteligence
Keywords
language model, neural network, recurrent, maximum entropy, speech recognition, data
compression, artificial intelligence
Citace
Tomáš Mikolov: Statistical Language Models Based on Neural Networks, disertační práce, Brno, FIT VUT v Brně, 2012
Statistical Language Models Based on Neural Networks
Declaration
I declare that I wrote this doctoral thesis independently under the supervision of Doc. Dr. Ing. Jan Černocký. I have cited all the literature and publications from which I drew. Some experiments were carried out in cooperation with other members of the Speech@FIT group or with students from Johns Hopkins University; this is always stated explicitly in the thesis.

. . . . . . . . . . . . . . . . . . . . . . .
Tomáš Mikolov
May 2012
Acknowledgements
I would like to thank my supervisor Jan Černocký for allowing me to explore new approaches to standard problems, for his support and constructive criticism of my work, and for his ability to quickly organize everything related to my studies. I am grateful to Lukáš Burget for the advice he gave me about speech recognition systems, for long discussions about many technical details, and for his open-minded approach to research. I would also like to thank all members of the Speech@FIT group for their cooperation, especially Stefan Kombrink, Oldřich Plchot, Martin Karafiát, Ondřej Glembek and Jiří Kopecký.
It was a great experience for me to visit Johns Hopkins University during my studies, and I am grateful to Frederick Jelinek and Sanjeev Khudanpur for granting me this opportunity. I always enjoyed discussions with Sanjeev, who was my mentor during my stay there. I also collaborated with other students at JHU, especially Puyang Xu, Scott Novotney and Anoop Deoras. With Anoop, we were able to push the state of the art on several standard tasks to new limits, which was the most exciting part for me.
As my thesis builds on the work of Yoshua Bengio, I was glad that I could spend several months in his machine learning lab at the University of Montreal. I always enjoyed reading Yoshua's papers, and it was a pleasure to discuss my ideas with him in person.
© Tomáš Mikolov, 2012.
This thesis was created as a school work at Brno University of Technology, Faculty of Information Technology. It is protected by copyright law, and its use without permission granted by the author is illegal, except for cases defined by law.
Contents
1 Introduction 4
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Claims of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 Overview of Statistical Language Modeling 9
2.1 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 Perplexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.2 Word Error Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 N-gram Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Advanced Language Modeling Techniques . . . . . . . . . . . . . . . . . . . 17
2.3.1 Cache Language Models . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.2 Class Based Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.3 Structured Language Models . . . . . . . . . . . . . . . . . . . . . . 20
2.3.4 Decision Trees and Random Forest Language Models . . . . . . . . . 22
2.3.5 Maximum Entropy Language Models . . . . . . . . . . . . . . . . . . 22
2.3.6 Neural Network Based Language Models . . . . . . . . . . . . . . . . 23
2.4 Introduction to Data Sets and Experimental Setups . . . . . . . . . . . . . 24
3 Neural Network Language Models 26
3.1 Feedforward Neural Network Based Language Model . . . . . . . . . . . . . 27
3.2 Recurrent Neural Network Based Language Model . . . . . . . . . . . . . . 28
3.3 Learning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3.1 Backpropagation Through Time . . . . . . . . . . . . . . . . . . . . 33
3.3.2 Practical Advice for Training . . . . . . . . . . . . . . . . . . . . . . 35
3.4 Extensions of NNLMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37