自然语言处理NLP中文分词之NLP

共1个文件

pdf：1个

需积分: 5 158 浏览量 2024-03-31 17:47:51 上传评论收藏 3.07MB ZIP 举报

资源推荐

资源详情

资源评论

收起资源包目录

自然语言处理NLP中文分词之NLP_BOOK.zip （1个子文件）

eisenstein-nlp-notes.pdf 3.2MB

Natural Language Processing

Jacob Eisenstein

November 13, 2018

Contents

Contents 1

Preface i

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i

How to use this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

1 Introduction 1

1.1 Natural language processing and its neighbors . . . . . . . . . . . . . . . . . 1

1.2 Three themes in natural language processing . . . . . . . . . . . . . . . . . . 6

1.2.1 Learning and knowledge . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.2.2 Search and learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.2.3 Relational, compositional, and distributional perspectives . . . . . . 9

I Learning 11

2 Linear text classiﬁcation 13

2.1 The bag of words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2 Na

ıve Bayes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.2.1 Types and tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.2.2 Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.2.4 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.2.5 Setting hyperparameters . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.3 Discriminative learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.1 Perceptron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.3.2 Averaged perceptron . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.4 Loss functions and large-margin classiﬁcation . . . . . . . . . . . . . . . . . 27

2.4.1 Online large margin classiﬁcation . . . . . . . . . . . . . . . . . . . . 30

2.4.2 *Derivation of the online support vector machine . . . . . . . . . . . 32

2.5 Logistic regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2 CONTENTS

2.5.1 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.5.2 Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.6 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.6.1 Batch optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

2.6.2 Online optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.7 *Additional topics in classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . 41

2.7.1 Feature selection by regularization . . . . . . . . . . . . . . . . . . . . 41

2.7.2 Other views of logistic regression . . . . . . . . . . . . . . . . . . . . . 41

2.8 Summary of learning algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 43

3 Nonlinear classiﬁcation 47

3.1 Feedforward neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.2 Designing neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.2.1 Activation functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.2.2 Network structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.2.3 Outputs and loss functions . . . . . . . . . . . . . . . . . . . . . . . . 52

3.2.4 Inputs and lookup layers . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.3 Learning neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.3.1 Backpropagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

3.3.2 Regularization and dropout . . . . . . . . . . . . . . . . . . . . . . . . 57

3.3.3 *Learning theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.3.4 Tricks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

3.4 Convolutional neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . 62

4 Linguistic applications of classiﬁcation 69

4.1 Sentiment and opinion analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4.1.1 Related problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

4.1.2 Alternative approaches to sentiment analysis . . . . . . . . . . . . . . 72

4.2 Word sense disambiguation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

4.2.1 How many word senses? . . . . . . . . . . . . . . . . . . . . . . . . . 74

4.2.2 Word sense disambiguation as classiﬁcation . . . . . . . . . . . . . . 75

4.3 Design decisions for text classiﬁcation . . . . . . . . . . . . . . . . . . . . . . 76

4.3.1 What is a word? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

4.3.2 How many words? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4.3.3 Count or binary? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.4 Evaluating classiﬁers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.4.1 Precision, recall, and F -MEASURE . . . . . . . . . . . . . . . . . . . . 81

4.4.2 Threshold-free metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

4.4.3 Classiﬁer comparison and statistical signiﬁcance . . . . . . . . . . . . 84

4.4.4 *Multiple comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

4.5 Building datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Jacob Eisenstein. Draft of November 13, 2018.

CONTENTS 3

4.5.1 Metadata as labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

4.5.2 Labeling data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

5 Learning without supervision 95

5.1 Unsupervised learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

5.1.1 K-means clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

5.1.2 Expectation-Maximization (EM) . . . . . . . . . . . . . . . . . . . . . 98

5.1.3 EM as an optimization algorithm . . . . . . . . . . . . . . . . . . . . . 102

5.1.4 How many clusters? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

5.2 Applications of expectation-maximization . . . . . . . . . . . . . . . . . . . . 104

5.2.1 Word sense induction . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

5.2.2 Semi-supervised learning . . . . . . . . . . . . . . . . . . . . . . . . . 105

5.2.3 Multi-component modeling . . . . . . . . . . . . . . . . . . . . . . . . 106

5.3 Semi-supervised learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

5.3.1 Multi-view learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

5.3.2 Graph-based algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 109

5.4 Domain adaptation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

5.4.1 Supervised domain adaptation . . . . . . . . . . . . . . . . . . . . . . 111

5.4.2 Unsupervised domain adaptation . . . . . . . . . . . . . . . . . . . . 112

5.5 *Other approaches to learning with latent variables . . . . . . . . . . . . . . 114

5.5.1 Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

5.5.2 Spectral learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

II Sequences and trees 123

6 Language models 125

6.1 N-gram language models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

6.2 Smoothing and discounting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

6.2.1 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

6.2.2 Discounting and backoff . . . . . . . . . . . . . . . . . . . . . . . . . . 130

6.2.3 *Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

6.2.4 *Kneser-Ney smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . 133

6.3 Recurrent neural network language models . . . . . . . . . . . . . . . . . . . 133

6.3.1 Backpropagation through time . . . . . . . . . . . . . . . . . . . . . . 136

6.3.2 Hyperparameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

6.3.3 Gated recurrent neural networks . . . . . . . . . . . . . . . . . . . . . 137

6.4 Evaluating language models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

6.4.1 Held-out likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

6.4.2 Perplexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

6.5 Out-of-vocabulary words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

Under contract with MIT Press, shared under CC-BY-NC-ND license.

评论收藏

内容反馈

程序员蜗牛

粉丝: 1107
资源: 75

自然语言处理NLP中文分词之NLP_BOOK.zip

自然语言处理NLP中文分词之IT词库.zip

自然语言处理NLP中文分词之.logo图片.zip

自然语言处理NLP中文分词之公司名字词库.zip

自然语言处理NLP中文分词之中文分词词库整理.zip

自然语言处理NLP中文分词之地名词库.zip

自然语言处理NLP中文分词之paper.zip

自然语言处理NLP中文分词之中文谣言数据.zip

自然语言处理NLP中文分词之中文缩写库.zip

自然语言处理NLP中文分词之医学词库.zip

自然语言处理NLP中文分词之财经词库.zip

自然语言处理NLP中文分词之职业词库.zip

自然语言处理NLP中文分词之成语词库.zip

自然语言处理NLP中文分词之食物词库.zip

自然语言处理NLP中文分词之停用词.zip

自然语言处理NLP中文分词之动物词库.zip

自然语言处理NLP中文分词之法律词库.zip

自然语言处理NLP中文分词之拆字词库.zip

自然语言处理NLP中文分词之古诗词库.zip

自然语言处理NLP中文分词之诗词短句词库.zip

相关实用应用程序（Windows可用）

李飞飞自传 我看见的世界 The World I see

ChatGPT使用总结：150个ChatGPT提示词模板（完整版）

chromedriver-win64.zip

第十九届研电赛-技术论文模板

全国计算机二级WPSoffice精选350道选择题题库（含答案）.pdf

哈尔滨工业大学-ChatGPT调研报告-2023.3.6-94页.pdf

智联招聘：2024年大学生就业力调研报告.pdf

4个亲测好用的ChatGPT4渠道

1.txt

最新资源

李飞飞自传我看见的世界 The World I see