In this way, we can still recover a rich class of convolutional filter functions by stacking multiple
such layers, but we are not limited to the explicit parameterization given by, e.g., the Chebyshev
polynomials. We intuitively expect that such a model can alleviate the problem of overfitting on
local neighborhood structures for graphs with very wide node degree distributions, such as social
networks, citation networks, knowledge graphs and many other real-world graph datasets. Additionally,
for a fixed computational budget, this layer-wise linear formulation allows us to build deeper
models, a practice that is known to improve modeling capacity on a number of domains (He et al.,
2016).
In this linear formulation of a GCN we further approximate $\lambda_{\max} \approx 2$, as we can expect that neural
network parameters will adapt to this change in scale during training. Under these approximations
Eq. 5 simplifies to:
$$ g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 \left(L - I_N\right) x = \theta'_0 x - \theta'_1 D^{-\frac{1}{2}} A D^{-\frac{1}{2}} x \,, \qquad (6) $$
with two free parameters $\theta'_0$ and $\theta'_1$. The filter parameters can be shared over the whole graph.
Successive application of filters of this form then effectively convolves the $k^{\text{th}}$-order neighborhood of
a node, where $k$ is the number of successive filtering operations or convolutional layers in the neural
network model.
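The second equality in Eq. 6 follows directly from the definition of the symmetric normalized Laplacian $L = I_N - D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ used earlier in the paper; as a brief sanity check:
$$ \theta'_1 \left(L - I_N\right) x = \theta'_1 \left(I_N - D^{-\frac{1}{2}} A D^{-\frac{1}{2}} - I_N\right) x = -\theta'_1\, D^{-\frac{1}{2}} A D^{-\frac{1}{2}}\, x \,. $$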
In practice, it can be beneficial to constrain the number of parameters further to address overfitting
and to minimize the number of operations (such as matrix multiplications) per layer. This leaves us
with the following expression:
$$ g_{\theta} \star x \approx \theta \left(I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}\right) x \,, \qquad (7) $$
with a single parameter $\theta = \theta'_0 = -\theta'_1$. Note that $I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ now has eigenvalues in
the range [0, 2]. Repeated application of this operator can therefore lead to numerical instabilities
and exploding/vanishing gradients when used in a deep neural network model. To alleviate this
problem, we introduce the following renormalization trick: $I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \rightarrow \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$, with
$\tilde{A} = A + I_N$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$.
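As an illustration, the following is a minimal NumPy/SciPy sketch of this renormalization; the helper name normalize_adjacency and the choice of scipy.sparse matrices are our own, not part of the paper.

```python
import numpy as np
import scipy.sparse as sp

def normalize_adjacency(adj):
    """Renormalization trick: return D~^{-1/2} (A + I_N) D~^{-1/2} as a sparse matrix.

    adj: scipy.sparse adjacency matrix of shape (N, N), binary or weighted.
    """
    adj_tilde = adj + sp.eye(adj.shape[0])                 # A~ = A + I_N
    deg_tilde = np.asarray(adj_tilde.sum(axis=1)).ravel()  # D~_ii = sum_j A~_ij
    d_inv_sqrt = sp.diags(np.power(deg_tilde, -0.5))       # D~^{-1/2}
    return (d_inv_sqrt @ adj_tilde @ d_inv_sqrt).tocsr()   # D~^{-1/2} A~ D~^{-1/2}
```

Since the result depends only on the graph, it can be computed once and reused by every layer.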
We can generalize this definition to a signal $X \in \mathbb{R}^{N \times C}$ with $C$ input channels (i.e. a $C$-dimensional
feature vector for every node) and $F$ filters or feature maps as follows:
$$ Z = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X \Theta \,, \qquad (8) $$
where $\Theta \in \mathbb{R}^{C \times F}$ is now a matrix of filter parameters and $Z \in \mathbb{R}^{N \times F}$ is the convolved signal
matrix. This filtering operation has complexity $\mathcal{O}(|\mathcal{E}| F C)$, as $\tilde{A} X$ can be efficiently implemented
as a product of a sparse matrix with a dense matrix.
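For concreteness, a minimal sketch of the layer in Eq. 8, assuming a_hat is the sparse renormalized adjacency from the sketch above and that x and theta are dense NumPy arrays (the names are illustrative):

```python
def gcn_filter(a_hat, x, theta):
    """Eq. 8: Z = D~^{-1/2} A~ D~^{-1/2} X Theta.

    a_hat: sparse (N, N) renormalized adjacency, x: dense (N, C) node features,
    theta: dense (C, F) filter parameters; returns the dense (N, F) convolved signal.
    """
    return (a_hat @ x) @ theta  # sparse-dense product, then dense projection onto F feature maps
```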
3 SEMI-SUPERVISED NODE CLASSIFICATION
Having introduced a simple, yet flexible model f(X, A) for efficient information propagation on
graphs, we can return to the problem of semi-supervised node classification. As outlined in the
introduction, we can relax certain assumptions typically made in graph-based semi-supervised
learning by conditioning our model f(X, A) both on the data X and on the adjacency matrix A of
the underlying graph structure. We expect this setting to be especially powerful in scenarios where
the adjacency matrix contains information not present in the data X, such as citation links between
documents in a citation network or relations in a knowledge graph. The overall model, a multi-layer
GCN for semi-supervised learning, is schematically depicted in Figure 1.
3.1 EXAMPLE
In the following, we consider a two-layer GCN for semi-supervised node classification on a graph
with a symmetric adjacency matrix A (binary or weighted). We first calculate
$\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ in
a pre-processing step. Our forward model then takes the simple form:
$$ Z = f(X, A) = \operatorname{softmax}\!\left( \hat{A}\, \operatorname{ReLU}\!\left( \hat{A} X W^{(0)} \right) W^{(1)} \right) . \qquad (9) $$
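A minimal sketch of this forward pass under the same assumptions as above (w0 and w1 stand in for W^{(0)} and W^{(1)}; the training procedure, i.e. gradient descent on a cross-entropy loss over the labeled nodes, is not shown):

```python
import numpy as np

def two_layer_gcn(a_hat, x, w0, w1):
    """Forward pass of Eq. 9: Z = softmax(A_hat ReLU(A_hat X W0) W1)."""
    h = np.maximum((a_hat @ x) @ w0, 0.0)                # first GCN layer + ReLU
    logits = (a_hat @ h) @ w1                            # second GCN layer
    logits = logits - logits.max(axis=1, keepdims=True)  # stabilize the softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)          # row-wise softmax over classes
```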