neuralnetworkmethodsfornaturallanguageprocessing资源-CSDN文库

nlp

5星 · 超过95%的资源需积分: 9 75 浏览量 2017-11-25 20:42:45 上传评论 1 收藏 1.83MB PDF 举报

资源推荐

资源详情

资源评论

Neural Network Methods for

Natural Language Processing

Yoav Goldberg

Bar Ilan University

SYNTHESIS LECTURES ON HUMAN LANGUAGE TECHNOLOGIES #37

cLaypoolMo

rgan publishers

Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 e Challenges of Natural Language Processing. . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Neural Networks and Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.3 Deep Learning in NLP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3.1 Success Stories. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.4 Coverage and Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5 What’s not Covered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.6 A Note on Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.7 Mathematical Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

PART I Supervised Classiﬁcation and Feed-forward

Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Learning Basics and Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.1 Supervised Learning and Parameterized Functions . . . . . . . . . . . . . . . . . . . . . . 13

2.2 Train, Test, and Validation Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.3 Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3.1 Binary Classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3.2 Log-linear Binary Classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.3.3 Multi-class Classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.4 Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.5 One-Hot and Dense Vector Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.6 Log-linear Multi-class Classiﬁcation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.7 Training as Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.7.1 Loss Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.7.2 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.8 Gradient-based Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.8.1 Stochastic Gradient Descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.8.2 Worked-out Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

2.8.3 Beyond SGD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

From Linear Models to Multi-layer Perceptrons . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.1 Limitations of Linear Models: e XOR Problem . . . . . . . . . . . . . . . . . . . . . . . 37

3.2 Nonlinear Input Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3.3 Kernel Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3.4 Trainable Mapping Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

Feed-forward Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.1 A Brain-inspired Metaphor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.2 In Mathematical Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.3 Representation Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4.4 Common Nonlinearities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.5 Loss Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4.6 Regularization and Dropout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4.7 Similarity and Distance Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

4.8 Embedding Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Neural Network Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

5.1 e Computation Graph Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

5.1.1 Forward Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

5.1.2 Backward Computation (Derivatives, Backprop) . . . . . . . . . . . . . . . . . . . 54

5.1.3 Software. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5.1.4 Implementation Recipe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

5.1.5 Network Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

5.2 Practicalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

5.2.1 Choice of Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5.2.2 Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5.2.3 Restarts and Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5.2.4 Vanishing and Exploding Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

5.2.5 Saturation and Dead Neurons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

5.2.6 Shuﬄing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

5.2.7 Learning Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

5.2.8 Minibatches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

剩余281页未读，继续阅读

评论收藏

内容反馈

weixin_41744508

2018-06-18

还没看，之前下载的丢了
obeserver

2018-09-26

非常好的东西
Benz

2018-09-25

确实是好东西

skobe_kuang

粉丝: 0
资源: 1

neural network methods for natural language processing

最新资源

neural network methods for natural language processing

Neural network methods for natural language processing

Neural Network Methods for Natural Language Processing

Neural Network Methods for Natural Language Pr.pdf

高清版Neural Network Methods in Natural Language Processing April 17, 2017

Neural Network Methods for Natural Language Processing (part1/5)

Neural Network Methods for Natural Language Process

naural network methods for natural language processing

Advances in natural language processing

neural networks for natural language processing slides

natural language processing with python

Neural Network Methods in Natural Language Processing

(Yoav Goldberg)Neural Network Methods for Natural Language Processing适合nlp进阶

Neural Network Methods in Natural Language Processing epub

Neural_Network_Methods_in_Natural_Language_Processing

Hands-On-python-natural-language-processing-master.zip

Jacob Eisenstein -natural language processing notes

evaluate-news-article-with-natural-language-processing

natural-language-processing-exercises:自然语言处理（NLP）练习

Introduction to chinese natural language processing

A Primer on Neural Network Models for Natural Language Processing

Simple-natural-language-processing:作业1

最新资源