It covers all aspects of natural language processing (NLP) and speech processing - CSDN文库 resource

Natural Language Processing

Uploaded 2024-03-08 17:49:44, 20.42 MB, PDF, 99 views

Speech and Language Processing

An Introduction to Natural Language Processing,

Computational Linguistics, and Speech Recognition

Third Edition draft

Daniel Jurafsky

Stanford University

James H. Martin

University of Colorado at Boulder

Draft of February 3, 2024. Comments and typos welcome!

Summary of Contents

I Fundamental Algorithms for NLP

1 Introduction

2 Regular Expressions, Text Normalization, Edit Distance

3 N-gram Language Models

4 Naive Bayes, Text Classification, and Sentiment

5 Logistic Regression

6 Vector Semantics and Embeddings

7 Neural Networks and Neural Language Models

8 Sequence Labeling for Parts of Speech and Named Entities

9 RNNs and LSTMs

10 Transformers and Large Language Models

11 Fine-Tuning and Masked Language Models

12 Prompting, In-Context Learning, and Instruct Tuning

II NLP Applications

13 Machine Translation

14 Question Answering and Information Retrieval

15 Chatbots & Dialogue Systems

16 Automatic Speech Recognition and Text-to-Speech

III Annotating Linguistic Structure

17 Context-Free Grammars and Constituency Parsing

18 Dependency Parsing

19 Information Extraction: Relations, Events, and Time

20 Semantic Role Labeling

21 Lexicons for Sentiment, Affect, and Connotation

22 Coreference Resolution and Entity Linking

23 Discourse Coherence

Bibliography

Subject Index

Contents

I Fundamental Algorithms for NLP

1 Introduction

2 Regular Expressions, Text Normalization, Edit Distance

2.1 Regular Expressions

2.2 Words

2.3 Corpora

2.4 Simple Unix Tools for Word Tokenization

2.5 Word Tokenization

2.6 Word Normalization, Lemmatization and Stemming

2.7 Sentence Segmentation

2.8 Minimum Edit Distance

2.9 Summary

Bibliographical and Historical Notes

Exercises

3 N-gram Language Models

3.1 N-Grams

3.2 Evaluating Language Models: Training and Test Sets

3.3 Evaluating Language Models: Perplexity

3.4 Sampling sentences from a language model

3.5 Generalization and Zeros

3.6 Smoothing

3.7 Huge Language Models and Stupid Backoff

3.8 Advanced: Kneser-Ney Smoothing

3.9 Advanced: Perplexity's Relation to Entropy

3.10 Summary

Bibliographical and Historical Notes

Exercises

4 Naive Bayes, Text Classification, and Sentiment

4.1 Naive Bayes Classifiers

4.2 Training the Naive Bayes Classifier

4.3 Worked example

4.4 Optimizing for Sentiment Analysis

4.5 Naive Bayes for other text classification tasks

4.6 Naive Bayes as a Language Model

4.7 Evaluation: Precision, Recall, F-measure

4.8 Test sets and Cross-validation

4.9 Statistical Significance Testing

4.10 Avoiding Harms in Classification

4.11 Summary

Bibliographical and Historical Notes

Exercises

5 Logistic Regression

5.1 The sigmoid function

5.2 Classification with Logistic Regression

5.3 Multinomial logistic regression

5.4 Learning in Logistic Regression

5.5 The cross-entropy loss function

5.6 Gradient Descent

5.7 Regularization

5.8 Learning in Multinomial Logistic Regression

5.9 Interpreting models

5.10 Advanced: Deriving the Gradient Equation

5.11 Summary

Bibliographical and Historical Notes

Exercises

6 Vector Semantics and Embeddings

6.1 Lexical Semantics

6.2 Vector Semantics

6.3 Words and Vectors

6.4 Cosine for measuring similarity

6.5 TF-IDF: Weighing terms in the vector

6.6 Pointwise Mutual Information (PMI)

6.7 Applications of the tf-idf or PPMI vector models

6.8 Word2vec

6.9 Visualizing Embeddings

6.10 Semantic properties of embeddings

6.11 Bias and Embeddings

6.12 Evaluating Vector Models

6.13 Summary

Bibliographical and Historical Notes

Exercises

7 Neural Networks and Neural Language Models

7.1 Units

7.2 The XOR problem

7.3 Feedforward Neural Networks

7.4 Feedforward networks for NLP: Classification

7.5 Training Neural Nets

7.6 Feedforward Neural Language Modeling

7.7 Training the neural language model

7.8 Summary

Bibliographical and Historical Notes

8 Sequence Labeling for Parts of Speech and Named Entities

8.1 (Mostly) English Word Classes

8.2 Part-of-Speech Tagging

8.3 Named Entities and Named Entity Tagging

8.4 HMM Part-of-Speech Tagging

8.5 Conditional Random Fields (CRFs)

8.6 Evaluation of Named Entity Recognition

8.7 Further Details

8.8 Summary

Bibliographical and Historical Notes

Exercises

9 RNNs and LSTMs

9.1 Recurrent Neural Networks

9.2 RNNs as Language Models

9.3 RNNs for other NLP tasks

9.4 Stacked and Bidirectional RNN architectures

9.5 The LSTM

9.6 Summary: Common RNN NLP Architectures

9.7 The Encoder-Decoder Model with RNNs

9.8 Attention

9.9 Summary

Bibliographical and Historical Notes

10 Transformers and Large Language Models

10.1 The Transformer: A Self-Attention Network

10.2 Multihead Attention

10.3 Transformer Blocks

10.4 The Residual Stream view of the Transformer Block

10.5 The input: embeddings for token and position

10.6 The Language Modeling Head

10.7 Large Language Models with Transformers

10.8 Large Language Models: Generation by Sampling

10.9 Large Language Models: Training Transformers

10.10 Potential Harms from Language Models

10.11 Summary

Bibliographical and Historical Notes

11 Fine-Tuning and Masked Language Models

11.1 Bidirectional Transformer Encoders

11.2 Training Bidirectional Encoders

11.3 Contextual Embeddings

11.4 Fine-Tuning Language Models

11.5 Advanced: Span-based Masking

11.6 Summary

Bibliographical and Historical Notes

12 Prompting, In-Context Learning, and Instruct Tuning

II NLP Applications

13 Machine Translation

13.1 Language Divergences and Typology

13.2 Machine Translation using Encoder-Decoder

13.3 Details of the Encoder-Decoder Model

13.4 Decoding in MT: Beam Search

13.5 Translating in low-resource situations

13.6 MT Evaluation

13.7 Bias and Ethical Issues

13.8 Summary

Bibliographical and Historical Notes

Exercises

14 Question Answering and Information Retrieval
