DRAFT
Speech and Language Processing: An introduction to natural language processing,
computational linguistics, and speech recognition. Daniel Jurafsky & James H. Martin.
Copyright © 2006, All rights reserved. Draft of June 25, 2007. Do not cite without
permission.
1 INTRODUCTION
Dave Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry Dave, I’m afraid I can’t do that.
Stanley Kubrick and Arthur C. Clarke,
screenplay of 2001: A Space Odyssey
This book is about a new interdisciplinary field variously called computer speech
and language processing or human language technology or natural language pro-
cessing or computational linguistics. The goal of this new field is to get computers
to perform useful tasks involving human language, tasks like enabling human-machine
communication, improving human-human communication, or simply doing useful pro-
cessing of text or speech.
One example of such a useful task is a conversational agent. The HAL 9000 computer
in Stanley Kubrick’s film 2001: A Space Odyssey is one of the most recognizable
characters in twentieth-century cinema. HAL is an artificial agent capable of such ad-
vanced language-processing behavior as speaking and understanding English, and at a
crucial moment in the plot, even reading lips. It is now clear that HAL’s creator Arthur
C. Clarke was a little optimistic in predicting when an artificial agent such as HAL
would be available. But just how far off was he? What would it take to create at least
the language-related parts of HAL? We call programs like HAL that converse with hu-
mans via natural language conversational agents or dialogue systems. In this text we
study the various components that make up modern conversational agents, including
language input (automatic speech recognition and natural language understand-
ing) and language output (natural language generation and speech synthesis).
Let’s turn to another useful language-related task: making available to non-English-speaking
readers the vast amount of scientific information on the Web in English,
or translating for English speakers the hundreds of millions of Web pages written
in other languages like Chinese. The goal of machine translation is to automatically
translate a document from one language to another. Machine translation is far from
a solved problem; we will cover the algorithms currently used in the field, as well as
important component tasks.
Many other language processing tasks are also related to the Web. Another such
task is Web-based question answering. This is a generalization of simple web search,
where instead of just typing keywords a user might ask complete questions, ranging
from easy to hard, like the following:
• What does “divergent” mean?
• What year was Abraham Lincoln born?
• How many states were in the United States that year?
• How much Chinese silk was exported to England by the end of the 18th century?
• What do scientists think about the ethics of human cloning?
Some of these, such as definition questions, or simple factoid questions like dates
and locations, can already be answered by search engines. But answering more com-
plicated questions might require extracting information that is embedded in other text
on a Web page, or doing inference (drawing conclusions based on known facts), or
synthesizing and summarizing information from multiple sources or web pages. In this
text we study the various components that make up modern understanding systems of
this kind, including information extraction, word sense disambiguation, and so on.
Although the subfields and problems we’ve described above are all very far from
completely solved, these are all very active research areas and many technologies are
already available commercially. In the rest of this chapter we briefly summarize the
kinds of knowledge that are necessary for these tasks (and others like spell correction,
grammar checking, and so on), as well as the mathematical models that will be intro-
duced throughout the book.
1.1 KNOWLEDGE IN SPEECH AND LANGUAGE PROCESSING
What distinguishes language processing applications from other data processing sys-
tems is their use of knowledge of language. Consider the Unix wc program, which is
used to count the total number of bytes, words, and lines in a text file. When used to
count bytes and lines, wc is an ordinary data processing application. However, when it
is used to count the words in a file it requires knowledge about what it means to be a
word, and thus becomes a language processing system.
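The contrast between counting bytes and counting words can be sketched in a few lines of Python. This is only an illustration, not the actual wc implementation: the function names wc_counts and linguistic_tokens are hypothetical, and the regular expression is one assumed definition of "word" among many possible ones.

```python
import re

def wc_counts(text: str) -> tuple[int, int, int]:
    """Return (lines, words, bytes), roughly in the spirit of wc."""
    data = text.encode("utf-8")
    lines = data.count(b"\n")   # lines are counted as newline characters
    words = len(text.split())   # a "word" is any run of non-whitespace
    return lines, words, len(data)

def linguistic_tokens(text: str) -> list[str]:
    """Assumed alternative notion of a word: keep contractions together
    and split punctuation off as separate tokens."""
    return re.findall(r"\w+(?:['’]\w+)*|[^\w\s]", text)

sample = "I'm sorry Dave, I'm afraid I can't do that.\n"
print(wc_counts(sample))
print(linguistic_tokens(sample))
```

The whitespace definition treats "Dave," as a single word, while the second function separates the comma. Neither is correct in general; which one is appropriate depends on what the application takes a word to be.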
Of course, wc is an extremely simple system with an extremely limited and im-
poverished knowledge of language. Sophisticated conversational agents like HAL,
or machine translation systems, or robust question-answering systems, require much
broader and deeper knowledge of language. To get a feeling for the scope and kind of
required knowledge, consider some of what HAL would need to know to engage in the
dialogue that begins this chapter, or for a question answering system to answer one of
the questions above.
HAL must be able to recognize words from an audio signal and to generate an
audio signal from a sequence of words. The tasks of speech recognition and speech
synthesis require knowledge about phonetics and phonology: how words are
pronounced in terms of sequences of sounds, and how each of these sounds is realized
acoustically.
Note also that unlike Star Trek’s Commander Data, HAL is capable of producing
contractions like I’m and can’t. Producing and recognizing these and other variations
of individual words (e.g., recognizing that doors is plural) requires knowledge about
morphology, the way words break down into component parts that carry meanings like
singular versus plural.
Moving beyond individual words, HAL must use structural knowledge to properly
string together the words that constitute its response. For example, HAL must know
that the following sequence of words will not make sense to Dave, despite the fact that
it contains precisely the same set of words as the original.
I’m I do, sorry that afraid Dave I’m can’t.
The knowledge needed to order and group words together comes under the heading of
syntax.
Now consider a question answering system dealing with the following question:
• How much Chinese silk was exported to Western Europe by the end of the 18th
century?
In order to answer this question we need to know something about lexical semantics,
the meaning of all the words (export, or silk), as well as compositional semantics
(what exactly constitutes Western Europe as opposed to Eastern or Southern Europe,
and what end means when combined with the 18th century). We also need to know
something about the relationship of the words to the syntactic structure. For example
we need to know that by the end of the 18th century is a temporal end-point, and not a
description of the agent, as the by-phrase is in the following sentence:
• How much Chinese silk was exported to Western Europe by southern merchants?
We also need the kind of knowledge that lets HAL determine that Dave’s utterance
is a request for action, as opposed to a simple statement about the world or a question
about the door, as in the following variations of his original statement.
REQUEST: HAL, open the pod bay door.
STATEMENT: HAL, the pod bay door is open.
INFORMATION QUESTION: HAL, is the pod bay door open?
Next, despite its bad behavior, HAL knows enough to be polite to Dave. It could,
for example, have simply replied No or No, I won’t open the door. Instead, it first
embellishes its response with the phrases I’m sorry and I’m afraid, and then only indirectly
signals its refusal by saying I can’t, rather than the more direct (and truthful) I
won’t.¹
This knowledge about the kind of actions that speakers intend by their use of
sentences is pragmatic or dialogue knowledge.
Another kind of pragmatic or discourse knowledge is required to answer the ques-
tion
• How many states were in the United States that year?
What year is that year? In order to interpret words like that year a question answering
system needs to examine the earlier questions that were asked; in this case the
previous question talked about the year that Lincoln was born. Thus this task of coref-
erence resolution makes use of knowledge about how words like that or pronouns like
it or she refer to previous parts of the discourse.
¹ For those unfamiliar with HAL, it is neither sorry nor afraid, nor is it incapable of opening the door. It
has simply decided in a fit of paranoia to kill its crew.
To summarize, engaging in complex language behavior requires various kinds of
knowledge of language:
• Phonetics and Phonology — knowledge about linguistic sounds
• Morphology — knowledge of the meaningful components of words
• Syntax — knowledge of the structural relationships between words
• Semantics — knowledge of meaning
• Pragmatics — knowledge of the relationship of meaning to the goals and inten-
tions of the speaker.
• Discourse — knowledge about linguistic units larger than a single utterance
1.2 AMBIGUITY
A perhaps surprising fact about these categories of linguistic knowledge is that most
tasks in speech and language processing can be viewed as resolving ambiguity at one
of these levels. We say some input is ambiguous if there are multiple alternative lin-
guistic structures that can be built for it. Consider the spoken sentence I made her duck.
Here are five different meanings this sentence could have (see if you can think of some
more), each of which exemplifies an ambiguity at some level:
(1.1) I cooked waterfowl for her.
(1.2) I cooked waterfowl belonging to her.
(1.3) I created the (plaster?) duck she owns.
(1.4) I caused her to quickly lower her head or body.
(1.5) I waved my magic wand and turned her into undifferentiated waterfowl.
These different meanings are caused by a number of ambiguities. First, the words duck
and her are morphologically or syntactically ambiguous in their part-of-speech. Duck
can be a verb or a noun, while her can be a dative pronoun or a possessive pronoun.
Second, the word make is semantically ambiguous; it can mean create or cook. Third,
the verb make is syntactically ambiguous in a different way. Make can be transitive,
that is, taking a single direct object (1.2), or it can be ditransitive, that is, taking two
objects (1.5), meaning that the first object (her) got made into the second object (duck).
Finally, make can take a direct object and a verb (1.4), meaning that the object (her) got
caused to perform the verbal action (duck). Furthermore, in a spoken sentence, there
is an even deeper kind of ambiguity; the first word could have been eye or the second
word maid.
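To make the part-of-speech ambiguity concrete, the following sketch enumerates every combination of candidate tags for the written sentence. The lexicon, the tag names, and the function tag_sequences are all hypothetical and chosen only for illustration; a real system would use a much larger lexicon and tag set.

```python
from itertools import product

# Hypothetical toy lexicon; the tag labels are illustrative only.
LEXICON = {
    "I": ["PRONOUN"],
    "made": ["VERB"],
    "her": ["DATIVE_PRONOUN", "POSSESSIVE_PRONOUN"],
    "duck": ["NOUN", "VERB"],
}

def tag_sequences(words):
    """Enumerate every combination of candidate tags for the words."""
    options = [LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]

for reading in tag_sequences(["I", "made", "her", "duck"]):
    print(reading)
```

Two ambiguous words with two tags each already give four candidate taggings, and not all of them correspond to a sensible reading. Choosing among such candidates is the disambiguation problem.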
We will often introduce the models and algorithms we present throughout the book
as ways to resolve or disambiguate these ambiguities. For example deciding whether
duck is a verb or a noun can be solved by part-of-speech tagging. Deciding whether
make means “create” or “cook” can be solved by word sense disambiguation. Reso-
lution of part-of-speech and word sense ambiguities are two important kinds of lexical
disambiguation. A wide variety of tasks can be framed as lexical disambiguation
problems. For example, a text-to-speech synthesis system reading the word lead needs
to decide whether it should be pronounced as in lead pipe or as in lead me on. By
contrast, deciding whether her and duck are part of the same entity (as in (1.2))
or are different entities (as in (1.1) or (1.4)) is an example of syntactic disambiguation and can
be addressed by probabilistic parsing. Ambiguities that don’t arise in this particu-
lar example (like whether a given sentence is a statement or a question) will also be
resolved, for example by speech act interpretation.
1.3 MODELS AND ALGORITHMS
One of the key insights of the last 50 years of research in language processing is that
the various kinds of knowledge described in the last sections can be captured through
the use of a small number of formal models, or theories. Fortunately, these models and
theories are all drawn from the standard toolkits of computer science, mathematics, and
linguistics and should be generally familiar to those trained in those fields. Among the
most important models are state machines, rule systems, logic, probabilistic models,
and vector-space models. These models, in turn, lend themselves to a small number
of algorithms, among the most important of which are state space search algorithms
such as dynamic programming, and machine learning algorithms such as classifiers
and EM and other learning algorithms.
In their simplest formulation, state machines are formal models that consist of
states, transitions among states, and an input representation. Some of the variations
of this basic model that we will consider are deterministic and non-deterministic
finite-state automata and finite-state transducers.
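As a minimal sketch of the definition above, a deterministic finite-state automaton can be written as a start state, a set of accepting states, and a transition table consulted once per input symbol. The class name DFA and the toy language it recognizes here are assumptions made for illustration, not something specified in this section.

```python
class DFA:
    """Deterministic finite-state automaton over a sequence of symbols."""

    def __init__(self, start, accepting, transitions):
        self.start = start
        self.accepting = set(accepting)
        self.transitions = dict(transitions)  # (state, symbol) -> next state

    def accepts(self, symbols):
        state = self.start
        for symbol in symbols:
            key = (state, symbol)
            if key not in self.transitions:
                return False  # no transition defined for this input: reject
            state = self.transitions[key]
        return state in self.accepting

# Hypothetical toy language: "b", then two or more "a"s, then "!".
toy = DFA(
    start=0,
    accepting={4},
    transitions={
        (0, "b"): 1,
        (1, "a"): 2,
        (2, "a"): 3,
        (3, "a"): 3,  # loop: any number of additional "a"s
        (3, "!"): 4,
    },
)

print(toy.accepts("baaa!"))
print(toy.accepts("ba!"))
```

A non-deterministic automaton would allow several next states for the same (state, symbol) pair, so the table would map to sets of states instead of a single state.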
Closely related to these models are their declarative counterparts: formal rule sys-
tems. Among the more important ones we will consider are regular grammars and
regular relations, context-free grammars, feature-augmented grammars, as well
as probabilistic variants of them all. State machines and formal rule systems are the
main tools used when dealing with knowledge of phonology, morphology, and syntax.
The third model that plays a critical role in capturing knowledge of language is
logic. We will discuss first order logic, also known as the predicate calculus, as well
as such related formalisms as lambda-calculus, feature-structures, and semantic primi-
tives. These logical representations have traditionally been used for modeling seman-
tics and pragmatics, although more recent work has focused on more robust techniques
drawn from non-logical lexical semantics.
Probabilistic models are crucial for capturing every kind of linguistic knowledge.
Each of the other models (state machines, formal rule systems, and logic) can be aug-
mented with probabilities. For example the state machine can be augmented with
probabilities to become the weighted automaton or Markov model. We will spend
a significant amount of time on hidden Markov models or HMMs, which are used
everywhere in the field, in part-of-speech tagging, speech recognition, dialogue under-
standing, text-to-speech, and machine translation. The key advantage of probabilistic
models is their ability to solve the many kinds of ambiguity problems that we discussed
earlier; almost any speech and language processing problem can be recast as:
“given N choices for some ambiguous input, choose the most probable one”.
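In code, this recasting amounts to taking an argmax over scored candidates. The sketch below assumes the probabilities are already available from some model; the function name most_probable and the numbers are invented purely for illustration and do not come from any real data.

```python
def most_probable(candidates: dict[str, float]) -> str:
    """Return the candidate with the highest probability."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=candidates.get)

# Invented probabilities for the part of speech of "duck" in some context.
duck_tags = {"NOUN": 0.7, "VERB": 0.3}
print(most_probable(duck_tags))
```

The hard part in practice is estimating the probabilities, not choosing the maximum.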
Finally, vector-space models, based on linear algebra, underlie information retrieval
and many treatments of word meanings.
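One common way to use a vector-space model, sketched here under simplifying assumptions, is to represent each text as a vector of word counts and compare two texts by the cosine of the angle between their vectors. The helper names bag_of_words and cosine are hypothetical, and whitespace splitting stands in for real tokenization.

```python
import math
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Map a text to a sparse vector of lowercase word counts."""
    return Counter(text.lower().split())

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_u * norm_v)

query = bag_of_words("open the pod bay doors")
doc = bag_of_words("the pod bay doors are open")
print(cosine(query, doc))
```

Texts that share many words score close to 1, and texts with no words in common score 0.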
Processing language using any of these models typically involves a search through