dimensional vectors. In the field of graph analysis, traditional machine learning approaches usually rely on hand-engineered features and are limited by their inflexibility and high cost. Following the idea of representation learning and the success of word embedding [11], DeepWalk [12], which is regarded as the first graph embedding method based on representation learning, applies the SkipGram model [11] to generated random walks. Similar approaches such as node2vec [13], LINE [14] and TADW [15] also achieved breakthroughs. However, these methods suffer from two severe drawbacks [16]. First, no parameters are shared between nodes in the encoder, which leads to computational inefficiency, since the number of parameters grows linearly with the number of nodes. Second, the direct embedding methods lack the ability to generalize, which means they cannot deal with dynamic graphs or generalize to new graphs.
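To make the random-walk idea concrete, the following is a minimal Python sketch of a DeepWalk-style pipeline: truncated random walks are treated as sentences and the nodes as words, then fed to a SkipGram model (here gensim's Word2Vec with sg=1). The adjacency-list format, walk_length, num_walks and embedding hyperparameters are illustrative assumptions, not the settings used in [12].

```python
import random
from gensim.models import Word2Vec  # provides a SkipGram implementation

def random_walk(adj, start, walk_length):
    """Generate one truncated random walk starting from `start`."""
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return [str(v) for v in walk]  # Word2Vec expects string tokens

def deepwalk_embeddings(adj, num_walks=10, walk_length=40, dim=128):
    """DeepWalk-style embeddings: walks as sentences, nodes as words.

    adj: dict mapping each node to a list of its neighbors.
    """
    walks = [random_walk(adj, v, walk_length)
             for _ in range(num_walks) for v in adj]
    # sg=1 selects the SkipGram objective, as in DeepWalk
    model = Word2Vec(walks, vector_size=dim, window=5, sg=1, min_count=0)
    return {v: model.wv[str(v)] for v in adj}
```

Because walks starting from a node mostly visit its neighborhood, nodes with similar neighborhoods end up with similar embeddings.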
Based on CNNs and graph embedding, graph neural networks (GNNs) are proposed to collectively aggregate information from the graph structure. Thus, they can model inputs and/or outputs consisting of elements and their dependencies. Furthermore, graph neural networks can simultaneously model the diffusion process on the graph with an RNN kernel.
In the following, we explain the fundamental reasons why graph neural networks are worth investigating. Firstly, standard neural networks such as CNNs and RNNs cannot handle graph input properly, because they stack node features in a specific order, whereas there is no natural order of nodes in a graph. To present a graph completely to such models, we would have to traverse all possible node orders as inputs, which is highly redundant to compute. To solve this problem, GNNs propagate on each node separately, ignoring the input order of nodes. In other words, the output of GNNs is invariant to the input order of nodes. Secondly, an edge in a graph represents dependency information between two nodes. In standard neural networks, this dependency information is merely regarded as a node feature. GNNs, however, can propagate guided by the graph structure instead of using it as part of the features. Generally, GNNs update the hidden state of each node by a weighted sum of the states of its neighborhood (a minimal sketch is given after this paragraph).
Thirdly, reasoning is an important research topic for high-level artificial intelligence, and the reasoning process in the human brain is largely based on graphs extracted from daily experience. Standard neural networks have shown the ability to generate synthetic images and documents by learning the distribution of data, but they still cannot learn a reasoning graph from large experimental data. GNNs, however, explore generating graphs from non-structural data such as scene pictures and story documents, which can make them a powerful neural model for further high-level AI. Recently, it has been shown that even an untrained GNN with a simple architecture can perform well [17].
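As a minimal illustration of the weighted-sum update described above, the NumPy sketch below performs one generic propagation step. The degree normalization and ReLU nonlinearity are illustrative choices; individual GNN models define this step differently.

```python
import numpy as np

def propagate(A, H, W):
    """One generic GNN propagation step: each node's new state is a
    degree-normalized weighted sum of its neighbors' states.

    A: (n, n) adjacency matrix; H: (n, d_in) node states;
    W: (d_in, d_out) learned weight matrix.
    """
    deg = A.sum(axis=1, keepdims=True)
    A_norm = A / np.maximum(deg, 1)     # average over each neighborhood
    # Relabeling nodes permutes the rows of A_norm @ H consistently,
    # so the output is invariant to the input order of nodes.
    return np.maximum(A_norm @ H @ W, 0.0)  # ReLU nonlinearity
```

Stacking several such steps lets information propagate along paths in the graph, which is how GNNs capture multi-hop dependencies.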
There exist several comprehensive reviews on graph neural networks. [18] gives a formal definition of early graph neural network approaches, and [19] demonstrates the approximation properties and computational capabilities of graph neural networks. [20] proposes a unified framework, MoNet, to generalize CNN architectures to non-Euclidean domains (graphs and manifolds); the framework can generalize several spectral methods on graphs [2], [21] as well as some models on manifolds [22], [23]. [24] provides a thorough review of geometric deep learning, presenting its problems, difficulties, solutions, applications and future directions. [20] and [24] focus on generalizing convolutions to graphs or manifolds; in this paper, however, we focus only on problems defined on graphs, and we also investigate other mechanisms used in graph neural networks, such as the gate mechanism, the attention mechanism and skip connections. [25] proposes the message passing neural network (MPNN), which can generalize several graph neural network and graph convolutional network approaches. It presents the definition of the message passing neural network and demonstrates its application to quantum chemistry.
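For concreteness, the MPNN propagation of [25] can be summarized as follows, where $h_v^t$ is the hidden state of node $v$ at step $t$, $e_{vw}$ is the edge feature, $M_t$ and $U_t$ are the learned message and update functions, and $R$ is the readout function producing a graph-level output after $T$ steps:
\begin{equation*}
m_v^{t+1} = \sum_{w \in \mathcal{N}(v)} M_t\big(h_v^t, h_w^t, e_{vw}\big), \qquad
h_v^{t+1} = U_t\big(h_v^t, m_v^{t+1}\big), \qquad
\hat{y} = R\big(\{h_v^T \mid v \in G\}\big).
\end{equation*}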
[26] proposes the non-local neural network (NLNN), which unifies several “self-attention”-style methods; however, the model is not explicitly defined on graphs in the original paper.
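In its generic form, the non-local operation of [26] computes the response at position $i$ as a normalized weighted sum over all positions $j$, which is what makes it a unifying template for self-attention variants:
\begin{equation*}
y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j),
\end{equation*}
where $f$ computes a scalar affinity between positions $i$ and $j$, $g$ transforms the input representation, and $\mathcal{C}(x)$ is a normalization factor.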
Focusing on their specific application domains, [25] and [26] only give examples of how to generalize other models using their frameworks, and they do not provide a review of other graph neural network models. [27] proposes the graph network (GN) framework, which has a strong capability to generalize other models, and whose relational inductive biases promote combinatorial generalization, which is thought to be a top priority for AI. However, [27] is part position paper, part review and part unification, and it gives only a rough classification of the applications. In this paper, we provide a thorough review of different graph neural network models as well as a systematic taxonomy of their applications.
To summarize, this paper presents an extensive survey
of graph neural networks with the following contributions.
• We provide a detailed review of existing graph neural network models. We introduce the original model, its variants and several general frameworks. We examine various models in this area and provide a unified representation of the propagation steps in different models. Using our representation, one can easily distinguish between models by recognizing their corresponding aggregators and updaters.
• We systematically categorize the applications, dividing them into structural scenarios, non-structural scenarios and other scenarios. We present several major applications and their corresponding methods in each scenario.
• We propose four open problems for future research. Graph neural networks suffer from over-smoothing and scaling problems, and there are still no effective methods for dealing with dynamic graphs or for modeling non-structural sensory data. We provide a thorough analysis of each problem and propose future research directions.
The rest of this survey is organized as follows. In Sec. 2, we introduce various models in the graph neural network family. We first introduce the original framework and its limitations, and then present its variants that try to