Speech and Language Processing, 2nd edition (resource on CSDN Wenku)

4 stars · points required: 11 · 21 views · uploaded 2010-01-27 11:56:50 · 13.69 MB PDF


DRAFT

Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition. Daniel Jurafsky & James H. Martin.

permission.

INTRODUCTION

Dave Bowman: Open the pod bay doors, HAL.

HAL: I’m sorry Dave, I’m afraid I can’t do that.

Stanley Kubrick and Arthur C. Clarke,

screenplay of 2001: A Space Odyssey

This book is about a new interdisciplinary field variously called computer speech and language processing or human language technology or natural language processing or computational linguistics. The goal of this new field is to get computers to perform useful tasks involving human language, tasks like enabling human-machine communication, improving human-human communication, or simply doing useful processing of text or speech.

One example of such a useful task is a conversational agent. The HAL 9000 computer in Stanley Kubrick’s film 2001: A Space Odyssey is one of the most recognizable characters in twentieth-century cinema. HAL is an artificial agent capable of such advanced language-processing behavior as speaking and understanding English, and at a crucial moment in the plot, even reading lips. It is now clear that HAL’s creator Arthur C. Clarke was a little optimistic in predicting when an artificial agent such as HAL would be available. But just how far off was he? What would it take to create at least the language-related parts of HAL? We call programs like HAL that converse with humans via natural language conversational agents or dialogue systems. In this text we study the various components that make up modern conversational agents, including language input (automatic speech recognition and natural language understanding) and language output (natural language generation and speech synthesis).

Let’s turn to another useful language-related task, that of making available to non-English-speaking readers the vast amount of scientific information on the Web in English, or of translating for English speakers the hundreds of millions of Web pages written in other languages like Chinese. The goal of machine translation is to automatically translate a document from one language to another. Machine translation is far from a solved problem; we will cover the algorithms currently used in the field, as well as important component tasks.

Many other language processing tasks are also related to the Web. Another such task is Web-based question answering. This is a generalization of simple Web search, where instead of just typing keywords a user might ask complete questions, ranging from easy to hard, like the following:


• What does “divergent” mean?

• What year was Abraham Lincoln born?

• How many states were in the United States that year?

• How much Chinese silk was exported to England by the end of the 18th century?

• What do scientists think about the ethics of human cloning?

Some of these, such as definition questions, or simple factoid questions like dates and locations, can already be answered by search engines. But answering more complicated questions might require extracting information that is embedded in other text on a Web page, or doing inference (drawing conclusions based on known facts), or synthesizing and summarizing information from multiple sources or Web pages. In this text we study the various components that make up modern understanding systems of this kind, including information extraction, word sense disambiguation, and so on.

Although the subfields and problems we’ve described above are all very far from completely solved, these are all very active research areas and many technologies are already available commercially. In the rest of this chapter we briefly summarize the kinds of knowledge that are necessary for these tasks (and others like spell correction, grammar checking, and so on), as well as the mathematical models that will be introduced throughout the book.

1.1 KNOWLEDGE IN SPEECH AND LANGUAGE PROCESSING

What distinguishes language processing applications from other data processing systems is their use of knowledge of language. Consider the Unix wc program, which is used to count the total number of bytes, words, and lines in a text file. When used to count bytes and lines, wc is an ordinary data processing application. However, when it is used to count the words in a file it requires knowledge about what it means to be a word, and thus becomes a language processing system.
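To make the point about wc concrete, here is a minimal Python sketch that counts the "words" in HAL's reply under two different definitions of a word. The function names, the sample string, and the regular expression are illustrative assumptions, not part of wc or of the original text; the first function only mimics wc's whitespace-delimited notion of a word.

```python
import re

def count_whitespace_words(text: str) -> int:
    """Count maximal runs of non-whitespace characters, as wc -w does."""
    return len(text.split())

def count_alphabetic_words(text: str) -> int:
    """Count alphabetic tokens, allowing internal apostrophes and hyphens.

    This definition is an assumption chosen for illustration; a real system
    would need a more careful notion of what a word is.
    """
    return len(re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text))

# Hypothetical input: HAL's reply with a double hyphen standing in for a dash.
sample = "I'm sorry Dave -- I'm afraid I can't do that."

print(count_whitespace_words(sample))   # the lone "--" counts as a word here
print(count_alphabetic_words(sample))   # the "--" is skipped here
```

The two counts differ because the program has to decide whether a stray "--" is a word, which is already a small piece of knowledge about language.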

Of course, wc is an extremely simple system with an extremely limited and impoverished knowledge of language. Sophisticated conversational agents like HAL, or machine translation systems, or robust question-answering systems, require much broader and deeper knowledge of language. To get a feeling for the scope and kind of required knowledge, consider some of what HAL would need to know to engage in the dialogue that begins this chapter, or what a question-answering system would need to know to answer one of the questions above.

HAL must be able to recognize words from an audio signal and to generate an audio signal from a sequence of words. These tasks of speech recognition and speech synthesis require knowledge about phonetics and phonology: how words are pronounced in terms of sequences of sounds, and how each of these sounds is realized acoustically.

Note also that unlike Star Trek’s Commander Data, HAL is capable of producing contractions like I’m and can’t. Producing and recognizing these and other variations of individual words (e.g., recognizing that doors is plural) requires knowledge about morphology, the way words break down into component parts that carry meanings like singular versus plural.


Moving beyond individual words, HAL must use structural knowledge to properly string together the words that constitute its response. For example, HAL must know that the following sequence of words will not make sense to Dave, despite the fact that it contains precisely the same set of words as the original.

I’m I do, sorry that afraid Dave I’m can’t.

The knowledge needed to order and group words together comes under the heading of syntax.

Now consider a question answering system dealing with the following question:

• How much Chinese silk was exported to Western Europe by the end of the 18th century?

In order to answer this question we need to know something about lexical semantics, the meaning of all the words (export or silk), as well as compositional semantics (what exactly constitutes Western Europe as opposed to Eastern or Southern Europe, what does end mean when combined with the 18th century). We also need to know something about the relationship of the words to the syntactic structure. For example, we need to know that by the end of the 18th century is a temporal end-point, and not a description of the agent, as the by-phrase is in the following sentence:

• How much Chinese silk was exported to Western Europe by southern merchants?

We also need the kind of knowledge that lets HAL determine that Dave’s utterance is a request for action, as opposed to a simple statement about the world or a question about the door, as in the following variations of his original statement:

REQUEST: HAL, open the pod bay door.

STATEMENT: HAL, the pod bay door is open.

INFORMATION QUESTION: HAL, is the pod bay door open?

Next, despite its bad behavior, HAL knows enough to be polite to Dave. It could, for example, have simply replied No or No, I won’t open the door. Instead, it first embellishes its response with the phrases I’m sorry and I’m afraid, and then only indirectly signals its refusal by saying I can’t, rather than the more direct (and truthful) I won’t.

This knowledge about the kind of actions that speakers intend by their use of sentences is pragmatic or dialogue knowledge.

Another kind of pragmatic or discourse knowledge is required to answer the question:

• How many states were in the United States that year?

What year is that year? In order to interpret words like that year, a question-answering system needs to examine the earlier questions that were asked; in this case the previous question talked about the year that Lincoln was born. Thus this task of coreference resolution makes use of knowledge about how words like that or pronouns like it or she refer to previous parts of the discourse.

(Footnote to HAL’s reply above: For those unfamiliar with HAL, it is neither sorry nor afraid, nor is it incapable of opening the door. It has simply decided in a fit of paranoia to kill its crew.)

To summarize, engaging in complex language behavior requires various kinds of knowledge of language:

• Phonetics and Phonology — knowledge about linguistic sounds

• Morphology — knowledge of the meaningful components of words

• Syntax — knowledge of the structural relationships between words

• Semantics — knowledge of meaning

• Pragmatics — knowledge of the relationship of meaning to the goals and inten-

tions of the speaker.

• Discourse — knowledge about linguistic units larger than a single utterance

1.2 AMBIGUITY

A perhaps surprising fact about these categories of linguistic knowledge is that most tasks in speech and language processing can be viewed as resolving ambiguity at one of these levels. We say some input is ambiguous if there are multiple alternative linguistic structures that can be built for it. Consider the spoken sentence I made her duck. Here are five different meanings this sentence could have (see if you can think of some more), each of which exemplifies an ambiguity at some level:

(1.1) I cooked waterfowl for her.

(1.2) I cooked waterfowl belonging to her.

(1.3) I created the (plaster?) duck she owns.

(1.4) I caused her to quickly lower her head or body.

(1.5) I waved my magic wand and turned her into undifferentiated waterfowl.

These different meanings are caused by a number of ambiguities. First, the words duck and her are morphologically or syntactically ambiguous in their part-of-speech. Duck can be a verb or a noun, while her can be a dative pronoun or a possessive pronoun. Second, the word make is semantically ambiguous; it can mean create or cook. Third, the verb make is syntactically ambiguous in a different way. Make can be transitive, that is, taking a single direct object (1.2), or it can be ditransitive, that is, taking two objects (1.5), meaning that the first object (her) got made into the second object (duck). Finally, make can take a direct object and a verb (1.4), meaning that the object (her) got caused to perform the verbal action (duck). Furthermore, in a spoken sentence, there is an even deeper kind of ambiguity; the first word could have been eye or the second word maid.

We will often introduce the models and algorithms we present throughout the book as ways to resolve or disambiguate these ambiguities. For example, deciding whether duck is a verb or a noun can be solved by part-of-speech tagging. Deciding whether make means “create” or “cook” can be solved by word sense disambiguation. Resolution of part-of-speech and word sense ambiguities are two important kinds of lexical disambiguation. A wide variety of tasks can be framed as lexical disambiguation problems. For example, a text-to-speech synthesis system reading the word lead needs to decide whether it should be pronounced as in lead pipe or as in lead me on. By contrast, deciding whether her and duck are part of the same entity (as in (1.2)) or are different entities (as in (1.1) or (1.4)) is an example of syntactic disambiguation and can be addressed by probabilistic parsing. Ambiguities that don’t arise in this particular example (like whether a given sentence is a statement or a question) will also be resolved, for example by speech act interpretation.

1.3 MODELS AND ALGORITHMS

One of the key insights of the last 50 years of research in language processing is that the various kinds of knowledge described in the last sections can be captured through the use of a small number of formal models, or theories. Fortunately, these models and theories are all drawn from the standard toolkits of computer science, mathematics, and linguistics and should be generally familiar to those trained in those fields. Among the most important models are state machines, rule systems, logic, probabilistic models, and vector-space models. These models, in turn, lend themselves to a small number of algorithms, among the most important of which are state space search algorithms such as dynamic programming, and machine learning algorithms such as classifiers, Expectation-Maximization (EM), and other learning algorithms.

In their simplest formulation, state machines are formal models that consist of states, transitions among states, and an input representation. Some of the variations of this basic model that we will consider are deterministic and non-deterministic finite-state automata and finite-state transducers.
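As a rough illustration of "states, transitions among states, and an input representation", the following Python sketch runs a deterministic finite-state automaton over a string. The function name accepts, the table TRANSITIONS, and the toy language it recognizes (a b, then two or more a's, then an exclamation mark) are assumptions made for this example only.

```python
from typing import Dict, Iterable, Set, Tuple

def accepts(
    transitions: Dict[Tuple[int, str], int],
    start: int,
    accepting: Set[int],
    symbols: Iterable[str],
) -> bool:
    """Run a deterministic finite-state automaton over a sequence of symbols."""
    state = start
    for symbol in symbols:
        key = (state, symbol)
        if key not in transitions:
            # No transition is defined for this symbol: reject the input.
            return False
        state = transitions[key]
    # Accept only if the input ends in an accepting state.
    return state in accepting

# Hypothetical automaton: state 0 is the start state, state 4 is accepting.
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,  # self-loop: any number of further a's
    (3, "!"): 4,
}

print(accepts(TRANSITIONS, 0, {4}, "baaa!"))  # True
print(accepts(TRANSITIONS, 0, {4}, "ba!"))    # False
```

Because each (state, symbol) pair maps to at most one next state, the machine is deterministic; a non-deterministic variant would map each pair to a set of possible next states.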

Closely related to these models are their declarative counterparts: formal rule systems. Among the more important ones we will consider are regular grammars and regular relations, context-free grammars, feature-augmented grammars, as well as probabilistic variants of them all. State machines and formal rule systems are the main tools used when dealing with knowledge of phonology, morphology, and syntax.

The third model that plays a critical role in capturing knowledge of language is logic. We will discuss first-order logic, also known as the predicate calculus, as well as such related formalisms as lambda-calculus, feature-structures, and semantic primitives. These logical representations have traditionally been used for modeling semantics and pragmatics, although more recent work has focused on more robust techniques drawn from non-logical lexical semantics.

Probabilistic models are crucial for capturing every kind of linguistic knowledge. Each of the other models (state machines, formal rule systems, and logic) can be augmented with probabilities. For example, the state machine can be augmented with probabilities to become the weighted automaton or Markov model. We will spend a significant amount of time on hidden Markov models or HMMs, which are used everywhere in the field, in part-of-speech tagging, speech recognition, dialogue understanding, text-to-speech, and machine translation. The key advantage of probabilistic models is their ability to solve the many kinds of ambiguity problems that we discussed earlier; almost any speech and language processing problem can be recast as: “given N choices for some ambiguous input, choose the most probable one”.
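The "choose the most probable one" recipe can be sketched in a few lines of Python. The function name most_probable and the probabilities below are made up for illustration; in a real system the numbers would come from a trained model such as an HMM, not from a hand-written table.

```python
from typing import Dict

def most_probable(candidates: Dict[str, float]) -> str:
    """Return the candidate analysis with the highest probability."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return max(candidates, key=candidates.get)

# Hypothetical probabilities for the part of speech of "duck" in
# "I made her duck"; the values are invented for this example.
duck_tag_probs = {"noun": 0.7, "verb": 0.3}

print(most_probable(duck_tag_probs))  # noun
```

The hard part in practice is estimating the probabilities and searching efficiently when the number of candidates is very large; the selection step itself is just an argmax.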

Finally, vector-space models, based on linear algebra, underlie information retrieval and many treatments of word meanings.
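One common way to use a vector-space model for retrieval is to represent each text as a vector of word counts and compare vectors by the cosine of the angle between them. The sketch below assumes that setup; the helper names bag_of_words and cosine_similarity and the example strings are illustrative, and whitespace tokenization is a deliberate simplification.

```python
import math
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Represent a text as a sparse vector of lowercase word counts."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical query and documents.
query = bag_of_words("chinese silk exports")
doc_silk = bag_of_words("silk exports from china")
doc_lincoln = bag_of_words("abraham lincoln was born")

print(cosine_similarity(query, doc_silk))
print(cosine_similarity(query, doc_lincoln))
```

The document that shares words with the query scores higher than the one that shares none, which is the basic intuition behind ranking documents in this kind of model.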

Processing language using any of these models typically involves a search through

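Comments

yning
2014-09-11
Too bad it is not the latest edition.

yangyinfei
2012-02-09
It starts from Chapter 1. There is no table of contents, but the PDF does include an index. Also, Chapter 1 is freely available and can be downloaded anywhere online. It is the English edition and still pretty good. Worth downloading.

wuguize
2014-05-20
This is not the second edition, only a draft, and pages are missing.

wsh00086
2014-11-16
What a letdown, it happens to be missing Chapter 18, the one I need...

windresser
2014-09-21
Not the officially published version, only a manuscript, but I will keep it as a backup for now.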