CS 224D: Deep Learning for NLP
Course Instructor: Richard Socher
Lecture Notes: Part I
Authors: Francois Chaubard, Rohit Mundra, Richard Socher
Spring 2016
Keyphrases: Natural Language Processing. Word Vectors. Singular Value Decomposition. Skip-gram. Continuous Bag of Words (CBOW). Negative Sampling.
This set of notes begins by introducing the concept of Natural
Language Processing (NLP) and the problems NLP faces today. We
then move forward to discuss the concept of representing words as
numeric vectors. Lastly, we discuss popular approaches to designing
word vectors.
1 Introduction to Natural Language Processing
We begin with a general discussion of what NLP is. The goal of NLP
is to design algorithms that allow computers to "understand" natural
language in order to perform some task. Example tasks come in
varying levels of difficulty:
Easy
• Spell Checking
• Keyword Search
• Finding Synonyms
Medium
• Parsing information from websites, documents, etc.
Hard
• Machine Translation (e.g. Translate Chinese text to English)
• Semantic Analysis (What is the meaning of a query statement?)
• Coreference (e.g. What does "he" or "it" refer to in a given document?)
• Question Answering (e.g. answering Jeopardy! questions)
The first and arguably most important common denominator
across all NLP tasks is how we represent words as input to any and
all of our models. Much of the earlier NLP work, which we will not
cover, treats words as atomic symbols. To perform well on most NLP
tasks, we first need some notion of similarity and difference
between words.
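To see why atomic symbols fall short, consider encoding each word as a one-hot vector, which is the vector-space equivalent of treating words as atomic symbols. The sketch below (a minimal illustration with a hypothetical three-word vocabulary, not taken from the notes) shows that under this encoding every pair of distinct words is orthogonal, so a natural similarity measure like the dot product carries no notion of relatedness:

import numpy as np

# Hypothetical toy vocabulary of size |V| = 3.
vocab = ["hotel", "motel", "cat"]
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # A |V|-dimensional vector with a single 1 at the word's index.
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0
    return v

# Dot products between distinct one-hot vectors are always zero:
# "hotel" is as dissimilar to "motel" as it is to "cat".
print(one_hot("hotel") @ one_hot("motel"))  # 0.0
print(one_hot("hotel") @ one_hot("cat"))    # 0.0

The word vectors developed in the remainder of these notes address exactly this shortcoming by encoding similarity directly in the vectors themselves.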