Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning

Qimai Li¹, Zhichao Han¹², Xiao-Ming Wu¹*

¹The Hong Kong Polytechnic University  ²ETH Zurich
Abstract

Many interesting problems in machine learning are being revisited with new deep learning tools. For graph-based semi-supervised learning, a recent important development is graph convolutional networks (GCNs), which nicely integrate local vertex features and graph topology in the convolutional layers. Although the GCN model compares favorably with other state-of-the-art methods, its mechanisms are not clear and it still requires a considerable amount of labeled data for validation and model selection.

In this paper, we develop deeper insights into the GCN model and address its fundamental limits. First, we show that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also brings potential concerns of over-smoothing with many convolutional layers. Second, to overcome the limits of the GCN model with shallow architectures, we propose both co-training and self-training approaches to train GCNs. Our approaches significantly improve GCNs in learning with very few labels, and exempt them from requiring additional labels for validation. Extensive experiments on benchmarks have verified our theory and proposals.
1 Introduction

The breakthroughs in deep learning have led to a paradigm shift in artificial intelligence and machine learning. On the one hand, numerous old problems have been revisited with deep neural networks, and huge progress has been made on many tasks that previously seemed out of reach, such as machine translation and computer vision. On the other hand, new techniques such as geometric deep learning (Bronstein et al. 2017) are being developed to generalize deep neural models to new or non-traditional domains.

It is well known that training a deep neural model typically requires a large amount of labeled data, which cannot be satisfied in many scenarios due to the high cost of labeling training data. To reduce the amount of data needed for training, a recent surge of research interest has focused on few-shot learning (Lake, Salakhutdinov, and Tenenbaum 2015; Rezende et al. 2016) – learning a classification model with very few examples from each class. Closely related to few-shot learning is semi-supervised learning, where a large amount of unlabeled data can be utilized for training together with a typically small amount of labeled data.

Much research has shown that leveraging unlabeled data in training can significantly improve learning accuracy if used properly (Zhu and Goldberg 2009). The key issue is to maximize the effective utilization of the structural and feature information of unlabeled data. Due to the powerful feature extraction capability and recent success of deep neural networks, there have been some successful attempts to revisit semi-supervised learning with neural-network-based models, including the ladder network (Rasmus et al. 2015), semi-supervised embedding (Weston et al. 2008), Planetoid (Yang, Cohen, and Salakhutdinov 2016), and graph convolutional networks (Kipf and Welling 2017).

* Corresponding author.
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
The recently developed graph convolutional neural networks (GCNNs) (Defferrard, Bresson, and Vandergheynst 2016) are a successful attempt to generalize the powerful convolutional neural networks (CNNs), which deal with Euclidean data, to modeling graph-structured data. In their pilot work (Kipf and Welling 2017), Kipf and Welling proposed a simplified type of GCNN, called graph convolutional networks (GCNs), and applied it to semi-supervised classification. The GCN model naturally integrates the connectivity patterns and feature attributes of graph-structured data, and significantly outperforms many state-of-the-art methods on some benchmarks. Nevertheless, it suffers from problems similar to those faced by other neural-network-based models. The working mechanisms of the GCN model for semi-supervised learning are not clear, and the training of GCNs still requires a considerable amount of labeled data for parameter tuning and model selection, which defeats the purpose of semi-supervised learning.
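As a concrete illustration (not part of the original text), the GCN layer of Kipf and Welling (2017) propagates features by the rule H⁽ˡ⁺¹⁾ = σ(D̃⁻¹ᐟ²ÃD̃⁻¹ᐟ²H⁽ˡ⁾W⁽ˡ⁾), where Ã = A + I adds self-loops and D̃ is its degree matrix. A minimal NumPy sketch, with a toy path graph and random weights standing in for learned parameters:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D~^-1/2 (A+I) D~^-1/2 H W).

    A: adjacency matrix (n x n), H: node features (n x d),
    W: weight matrix (d x d') -- random here, learned in practice.
    """
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # D~^-1/2
    S = D_inv_sqrt @ A_hat @ D_inv_sqrt      # symmetrically normalized operator
    return np.maximum(S @ H @ W, 0.0)        # ReLU nonlinearity

# toy 4-node path graph with 2-dimensional vertex features
np.random.seed(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.randn(4, 2)
W = np.random.randn(2, 2)
H1 = gcn_layer(A, H, W)
print(H1.shape)  # (4, 2): one feature row per vertex
```

Each output row mixes a vertex's own features with those of its neighbors, which is exactly the smoothing behavior analyzed in this paper.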
In this paper, we demystify the GCN model for semi-supervised learning. In particular, we show that the graph convolution of the GCN model is simply a special form of Laplacian smoothing, which mixes the features of a vertex and its nearby neighbors. The smoothing operation makes the features of vertices in the same cluster similar, thus greatly easing the classification task, which is the key reason why GCNs work so well. However, it also brings potential concerns of over-smoothing. If a GCN is deep, with many convolutional layers, the output features may be over-smoothed and vertices from different clusters may become indistinguishable. The mixing happens quickly on small
arXiv:1801.07606v1 [cs.LG] 22 Jan 2018
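The over-smoothing effect described above can be demonstrated numerically (this toy experiment is illustrative, not taken from the paper): repeatedly applying the normalized propagation operator S = D̃⁻¹ᐟ²ÃD̃⁻¹ᐟ² — dropping the weights and nonlinearity for clarity — drives all feature rows toward a common direction, so vertices become indistinguishable.

```python
import numpy as np

# toy connected graph: a 5-node cycle (regular, so rows converge to equality)
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

A_hat = A + np.eye(n)                               # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
S = D_inv_sqrt @ A_hat @ D_inv_sqrt                 # Laplacian smoothing operator

np.random.seed(1)
H = np.random.randn(n, 3)                           # random vertex features
for _ in range(50):                                 # 50 rounds of smoothing
    H = S @ H                                       # = 50 linear GCN layers

# after many applications, all rows of H are nearly identical
spread = np.abs(H - H.mean(axis=0)).max()
print(spread < 1e-6)  # True
```

With only one or two applications, the rows still differ and carry cluster information; the paper's point is that the sweet spot is a shallow architecture, and deeper stacks risk exactly this collapse.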