propose one way how to address this limitation.
Specifically, we rely on graph convolutional
networks (GCNs) (Duvenaud et al., 2015; Kipf
and Welling, 2017; Kearnes et al., 2016), a recent
class of multilayer neural networks operating on
graphs. For every node in the graph (in our case
a word in a sentence), a GCN encodes relevant in-
formation about its neighborhood as a real-valued
feature vector. GCNs have been studied largely in
the context of undirected unlabeled graphs. We in-
troduce a version of GCNs for modeling syntactic
dependency structures and generally applicable to
labeled directed graphs.
A one-layer GCN encodes only information about immediate neighbors, and K layers are needed to encode K-order neighborhoods (i.e., information about nodes at most K hops away). This
contrasts with recurrent and recursive neural net-
works (Elman, 1990; Socher et al., 2013) which, at
least in theory, can capture statistical dependencies
across unbounded paths in a tree or in a sequence.
However, as we will further discuss in Section 3.3,
this is not a serious limitation when GCNs are used
in combination with encoders based on recurrent
networks (LSTMs). When we stack GCNs on top
of LSTM layers, we obtain a substantial improve-
ment over an already state-of-the-art LSTM SRL
model, resulting in the best reported scores on the
standard benchmark (CoNLL-2009), both for En-
glish and Chinese.¹
¹ The code is available at https://github.com/diegma/neural-dep-srl.
Interestingly, again unlike recursive neural net-
works, GCNs do not constrain the graph to be
a tree. We believe that there are many applica-
tions in NLP, where GCN-based encoders of sen-
tences or even documents can be used to incor-
porate knowledge about linguistic structures (e.g.,
representations of syntax, semantics or discourse).
For example, GCNs can take as input combined
syntactic-semantic graphs (e.g., the entire graph
from Figure 1) and be used within downstream
tasks such as machine translation or question an-
swering. However, we leave this for future work
and here solely focus on SRL.
The contributions of this paper can be summa-
rized as follows:
• we are the first to show that GCNs are effec-
tive for NLP;
• we propose a generalization of GCNs suited to encoding syntactic information at word level;
• we propose a GCN-based SRL model and
obtain state-of-the-art results on English and
Chinese portions of the CoNLL-2009 dataset;
• we show that bidirectional LSTMs and
syntax-based GCNs have complementary
modeling power.
2 Graph Convolutional Networks
In this section we describe GCNs of Kipf and
Welling (2017). Please refer to Gilmer et al.
(2017) for a comprehensive overview of GCN ver-
sions.
GCNs are neural networks operating on graphs
and inducing features of nodes (i.e., real-valued
vectors / embeddings) based on properties of their
neighborhoods. In Kipf and Welling (2017), they
were shown to be very effective for the node clas-
sification task: the classifier was estimated jointly
with a GCN, so that the induced node features
were informative for the node classification prob-
lem. Depending on how many layers of convolu-
tion are used, GCNs can capture information only
about immediate neighbors (with one layer of con-
volution) or any nodes at most K hops away (if
K layers are stacked on top of each other).
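To make the relation between the number of stacked layers and the receptive field concrete, the following is a minimal NumPy sketch (ours, for illustration only, not part of the released code): on a 5-node path graph with self-loops, K rounds of plain neighborhood aggregation leave node 0 with nonzero signal from exactly the nodes at most K hops away. Weights and nonlinearities of an actual GCN layer are omitted so that only connectivity is tracked.

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3 - 4, undirected, with self-loops added.
n = 5
A = np.eye(n)
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[u, v] = A[v, u] = 1.0

# One-hot inputs: initially each node only "knows" about itself.
H = np.eye(n)

for K in (1, 2, 3):
    H_K = np.linalg.matrix_power(A, K) @ H   # K rounds of neighborhood aggregation
    reachable = np.nonzero(H_K[0])[0]
    print(f"K={K}: node 0 has received information from nodes {reachable}")
# K=1 -> [0 1], K=2 -> [0 1 2], K=3 -> [0 1 2 3]
```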
More formally, consider an undirected graph
G = (V, E), where V (|V| = n) and E are
sets of nodes and edges, respectively. Kipf and
Welling (2017) assume that edges contain all the
self-loops, i.e., (v, v) ∈ E for any v. We can define a matrix $X \in \mathbb{R}^{m \times n}$, with each of its columns $x_v \in \mathbb{R}^{m}$ ($v \in V$) encoding node features. The
vectors can either encode genuine features (e.g.,
this vector can encode the title of a paper if citation
graphs are considered) or be a one-hot vector. The
node representation, encoding information about
its immediate neighbors, is computed as
$$h_v = \mathrm{ReLU}\Big(\sum_{u \in N(v)} \big(W x_u + b\big)\Big), \qquad (1)$$
where $W \in \mathbb{R}^{m \times m}$ and $b \in \mathbb{R}^{m}$ are a weight matrix and a bias, respectively; $N(v)$ is the set of neighbors of $v$; ReLU is the rectified linear unit activation function.² Note that $v \in N(v)$ (because of self-loops), so the input feature representation of $v$ (i.e., $x_v$) affects its induced representation $h_v$.
² We dropped normalization factors used in Kipf and Welling (2017), as they are not used in our syntactic GCNs.
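For concreteness, below is a minimal NumPy sketch of the layer in Eq. (1). This is our illustrative code, not the released implementation; the function and variable names are ours, and, as in footnote 2, no normalization factors are applied.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gcn_layer(X, neighbors, W, b):
    """One layer as in Eq. (1): h_v = ReLU( sum_{u in N(v)} (W x_u + b) ).

    X         : (m, n) matrix whose columns x_v are the node features
    neighbors : dict mapping each node v to the list N(v), with v included (self-loop)
    W, b      : (m, m) weight matrix and (m,) bias
    """
    H = np.zeros_like(X)
    for v in range(X.shape[1]):
        # Sum the transformed features of all neighbors (including v itself),
        # then apply the nonlinearity.
        H[:, v] = relu(sum(W @ X[:, u] + b for u in neighbors[v]))
    return H

# Toy usage: 3 nodes on a path, 4-dimensional features.
rng = np.random.default_rng(0)
m, n = 4, 3
X = rng.normal(size=(m, n))
W = 0.1 * rng.normal(size=(m, m))
b = np.zeros(m)
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}  # self-loops included
H = gcn_layer(X, neighbors, W, b)
print(H.shape)  # (4, 3)
```

Applying this function K times (feeding each output back in as the new feature matrix, typically with separate parameters per layer) yields the K-hop receptive field discussed above.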