Batch Virtual Adversarial Training for
Graph Convolutional Networks
Zhijie Deng, Yinpeng Dong and Jun Zhu
Dept. of Comp. Sci. & Tech., State Key Lab of Intell. Tech. & Sys., TNList Lab,
Center for Bio-Inspired Computing Research, Tsinghua University, Beijing, 100084, China
Abstract
We present batch virtual adversarial training (BVAT), a novel regularization method for graph convolutional networks (GCNs). BVAT addresses a shortcoming of GCNs: they do not consider the smoothness of the model's output distribution against local perturbations around the input. We propose two algorithms, sample-based BVAT and optimization-based BVAT, which promote the smoothness of the model on graph-structured data by either finding virtual adversarial perturbations for a subset of nodes far from each other or generating virtual adversarial perturbations for all nodes through an optimization process. Extensive experiments on three citation network datasets (Cora, Citeseer and Pubmed) and a knowledge graph dataset (Nell) validate the effectiveness of the proposed method, which establishes state-of-the-art results in semi-supervised node classification tasks.
1 Introduction
Recent models for graph-structured data (Kipf and Welling 2017; Hamilton, Ying, and Leskovec 2017; Veličković et al. 2018) demonstrate remarkable performance in semi-supervised node classification tasks. These methods essentially adopt different aggregators to gather feature information from the neighborhood of a node. The aggregators take the connectivity patterns and node features into consideration and enable information propagation through edges (e.g., gradient information can be distributed from labeled nodes to unlabeled nodes). By aggregating the features of a node with those of its nearby neighbors, the models promote the smoothness between nodes in a neighborhood, which is helpful for semi-supervised node classification under the assumption that connected nodes in the graph are likely to have similar representations (Kipf and Welling 2017). However, these methods only consider the smoothness between nodes in a neighborhood, not the smoothness of the output distribution. Several studies have confirmed that smoothing the output distribution of neural networks (i.e., encouraging the networks to produce similar outputs) against local perturbations around the input can improve generalization performance in supervised and especially semi-supervised learning (Wager, Wang, and Liang 2013; Sajjadi, Javanmardi, and Tasdizen 2016; Laine and Aila 2017; Miyato et al. 2017; Luo et al. 2018). Moreover, encouraging the smoothness of the output distribution is crucial for aggregator-based graph models, since the receptive field (e.g., Fig. 1a) of a single node grows exponentially with respect to the number of aggregators in a model (Chen and Zhu 2017), and neural networks tend to be non-smooth in such a high-dimensional input space (Goodfellow, Shlens, and Szegedy 2015; Peck et al. 2017). Therefore, it is necessary for aggregator-based graph models to encourage the smoothness of their output distribution in addition to the smoothness in the neighborhood.
In this work, we focus on graph convolutional networks (GCNs) (Kipf and Welling 2017), a typical and effective instance of aggregator-based graph learning models; the proposed algorithms are generally applicable to other models such as graph attention networks (Veličković et al. 2018), GraphSAGE (Hamilton, Ying, and Leskovec 2017), etc. The GCN model has fixed aggregators based on the adjacency matrix of the input graph and is a special form of Laplacian smoothing (Li, Han, and Wu 2018). Consider an undirected graph¹ $G = (\mathcal{V}, \mathcal{E})$ with $N = |\mathcal{V}|$ nodes and $|\mathcal{E}|$ edges; GCNs are built upon the symmetric sparse adjacency matrix $A \in \mathbb{R}^{N \times N}$. The input feature matrix of all nodes in $\mathcal{V}$ is denoted as $X \in \mathbb{R}^{N \times D}$, where $D$ is the feature dimension of each node. GCNs use a normalized version of $A$ as the propagation matrix $P$:

$$\tilde{A} = A + I, \quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}, \quad P = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}.$$
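As a concrete illustration (not part of the paper), the normalized propagation matrix can be computed with dense NumPy arrays as follows; the 4-node toy adjacency matrix is an assumed example, and a real implementation would use sparse matrices:

```python
import numpy as np

# Toy symmetric adjacency matrix of a 4-node undirected graph (assumed example).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

A_tilde = A + np.eye(A.shape[0])            # add self-loops: Ã = A + I
deg = A_tilde.sum(axis=1)                   # diagonal of D̃: D̃_ii = Σ_j Ã_ij
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))    # D̃^{-1/2}
P = D_inv_sqrt @ A_tilde @ D_inv_sqrt       # P = D̃^{-1/2} Ã D̃^{-1/2}

# P is symmetric because A is symmetric and scaling is applied on both sides.
assert np.allclose(P, P.T)
```

Because node 0 has degree 2 after adding its self-loop, the diagonal entry `P[0, 0]` equals `1/2`.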
In a $K$-layer GCN, each layer aggregates and processes information from itself and its nearby neighbors with the following propagation rule:

$$H^{(0)} = X, \quad H^{(l+1)} = \sigma(P H^{(l)} W^{(l)}), \tag{1}$$

where $H^{(l)}$ is the activation matrix of the $l$-th layer, $W^{(l)}$ is the trainable transformation matrix, and $\sigma$ denotes an activation function. In particular, the prediction of a node $u$ depends on the input features of the nodes in its receptive field (as shown in Fig. 1a), whose dimension grows exponentially with respect to the number of layers. The GCN model tends to be non-smooth in such a high-dimensional input space, so it is necessary to add regularization terms to promote the smoothness
¹GCNs represent a directed graph as an undirected bipartite graph with additional nodes that represent edges in the original graph.
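The layer-wise propagation rule above can be sketched in NumPy as follows; the ReLU activation, the random weights, and the identity propagation matrix are assumed purely for illustration (a real GCN would use the normalized P and end with a softmax classifier):

```python
import numpy as np

def gcn_forward(P, X, weights):
    """Forward pass of a K-layer GCN: H^(0) = X, H^(l+1) = σ(P H^(l) W^(l)).

    σ = ReLU is an assumed choice here; the final layer of a classifier
    is typically followed by a softmax instead."""
    H = X
    for W in weights:
        H = np.maximum(P @ H @ W, 0.0)  # σ(P H^(l) W^(l)) with σ = ReLU
    return H

# Hypothetical shapes: N = 4 nodes, D = 3 input features, 2 output units.
rng = np.random.default_rng(0)
P = np.eye(4)                        # identity propagation, just to exercise shapes
X = rng.standard_normal((4, 3))
weights = [rng.standard_normal((3, 5)),   # W^(0): D → hidden
           rng.standard_normal((5, 2))]   # W^(1): hidden → output
H = gcn_forward(P, X, weights)       # H has shape (N, 2)
```

Each layer multiplies by P once, so a K-layer model mixes features from nodes up to K hops away, which is why the receptive field grows with depth.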
arXiv:1902.09192v1 [cs.LG] 25 Feb 2019