
dimensional vectors. In the field of graph analysis, traditional machine learning approaches usually rely on hand-engineered features and are limited by their inflexibility and high cost. Following the idea of representation learning and the success of word embedding [11], DeepWalk [12], regarded as the first graph embedding method based on representation learning, applies the SkipGram model [11] to generated random walks. Similar approaches such as node2vec [13], LINE [14] and TADW [15] also achieved breakthroughs. However, these methods suffer from two severe drawbacks [16]. First, no parameters are shared between nodes in the encoder, which leads to computational inefficiency, since the number of parameters grows linearly with the number of nodes. Second, these direct embedding methods lack the ability to generalize, which means they cannot deal with dynamic graphs or generalize to new graphs.
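As an illustration, the random-walk generation underlying DeepWalk can be sketched as follows. This is a minimal, hypothetical implementation: the adjacency-list format, function name and parameter defaults are our own choices, not those of [12].

```python
import random

def random_walks(adj, walk_length=5, walks_per_node=2, seed=0):
    """Generate truncated random walks (as in DeepWalk) from every node.

    `adj` maps each node to a list of its neighbors. The returned walks
    are node sequences that a SkipGram model would treat as "sentences".
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break  # dead end: truncate the walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# A toy undirected graph as an adjacency list.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(graph)
```

The walks would then be fed to a SkipGram model, so that nodes co-occurring in walks obtain similar embeddings.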

Based on CNNs and graph embedding, graph neural networks (GNNs) are proposed to collectively aggregate information from the graph structure. Thus they can model input and/or output consisting of elements and their dependencies. Further, graph neural networks can simultaneously model the diffusion process on the graph with an RNN kernel.

In the following part, we explain the fundamental reasons why graph neural networks are worth investigating. Firstly, standard neural networks like CNNs and RNNs cannot handle graph input properly because they stack the features of nodes in a specific order, whereas there is no natural order of nodes in a graph. To present a graph completely to models like CNNs and RNNs, we would have to traverse all possible node orders as input, which is highly redundant computationally. To solve this problem, GNNs propagate on each node separately, ignoring the input order of nodes. In other words, the output of GNNs is invariant to the input order of nodes. Secondly, an edge in a graph represents the dependency between two nodes. In standard neural networks, this dependency information is regarded merely as a feature of the nodes. GNNs, however, can propagate guided by the graph structure instead of using it as part of the features. Generally, GNNs update the hidden states of nodes by a weighted sum of the states of their neighborhood. Thirdly, reasoning is a very important research topic for high-level artificial intelligence, and the reasoning process in the human brain is largely based on graphs extracted from daily experience. Standard neural networks have shown the ability to generate synthetic images and documents by learning the distribution of data, but they still cannot learn a reasoning graph from large experimental data. GNNs, in contrast, explore generating graphs from non-structural data such as scene pictures and story documents, which can make them a powerful neural model for further high-level AI. Recently, it has also been shown that an untrained GNN with a simple architecture can perform well [17].
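The order-invariance argument above can be made concrete with a small sketch. This is a generic, hypothetical propagation step, not any particular published model: a shared weight matrix plus a weighted sum over an unordered neighbor set makes the node states permute exactly as the input nodes do.

```python
import numpy as np

def propagate(adj, h, w):
    """One generic GNN propagation step: each node's new state is a
    weighted sum (here: mean) of its neighbors' states, transformed by
    a weight matrix `w` shared across all nodes."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)  # avoid divide-by-zero
    return np.tanh((adj / deg) @ h @ w)

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
h = rng.normal(size=(3, 4))   # node feature matrix
w = rng.normal(size=(4, 4))   # shared weights

# Relabel (permute) the nodes: the output is permuted identically,
# so no ordering information leaks into the representations.
p = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], dtype=float)
out = propagate(adj, h, w)
out_perm = propagate(p @ adj @ p.T, p @ h, w)
```

Because the weights are shared, the parameter count is independent of the number of nodes, in contrast to the direct embedding methods discussed earlier.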

There exist several comprehensive reviews on graph neural networks. [18] gives a formal definition of early graph neural network approaches, and [19] demonstrates the approximation properties and computational capabilities of graph neural networks. [20] proposed a unified framework, MoNet, to generalize CNN architectures to non-Euclidean domains (graphs and manifolds); the framework can generalize several spectral methods on graphs [2], [21] as well as some models on manifolds [22], [23]. [24] provides a thorough review of geometric deep learning, presenting its problems, difficulties, solutions, applications and future directions. [20] and [24] focus on generalizing convolutions to graphs or manifolds; in this paper, however, we focus only on problems defined on graphs, and we also investigate other mechanisms used in graph neural networks, such as the gate mechanism, the attention mechanism and skip connections. [25] proposed the message passing neural network (MPNN), which can generalize several graph neural network and graph convolutional network approaches. It presents the definition of the message passing neural network and demonstrates its application to quantum chemistry. [26] proposed the non-local neural network (NLNN), which unifies several “self-attention”-style methods; however, the model is not explicitly defined on graphs in the original paper. Focusing on specific application domains, [25] and [26] only give examples of how to generalize other models using their frameworks, and they do not provide a review of other graph neural network models. [27] proposed the graph network (GN) framework, which has a strong capability to generalize other models, and whose relational inductive biases promote combinatorial generalization, which is thought to be a top priority for AI. However, [27] is part position paper, part review and part unification, and it gives only a rough classification of the applications. In this paper, we provide a thorough review of different graph neural network models as well as a systematic taxonomy of the applications.
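To make the message passing abstraction of [25] concrete, here is a minimal sketch of one propagation step. The function names and the toy message/update functions are our own illustrative choices; in an actual MPNN the message and update functions are learned.

```python
import numpy as np

def mpnn_step(edges, h, message_fn, update_fn):
    """One message passing step in the MPNN abstraction: messages from
    neighbors are summed, then a shared update function produces the
    new node states."""
    n, _ = h.shape
    m = np.zeros_like(h)
    for (v, w) in edges:          # edge (v, w): node w sends a message to v
        m[v] += message_fn(h[v], h[w])
    return np.array([update_fn(h[v], m[v]) for v in range(n)])

h = np.eye(3)                     # toy one-hot node states on a path 0-1-2
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
out = mpnn_step(edges, h,
                message_fn=lambda hv, hw: hw,   # message = neighbor's state
                update_fn=lambda hv, mv: hv + mv)
```

After one step, each node's state combines its own one-hot vector with those of its neighbors, so node 1 (adjacent to both 0 and 2) already "sees" the whole path.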

To summarize, this paper presents an extensive survey of graph neural networks with the following contributions.

• We provide a detailed review of existing graph neural network models. We introduce the original model, its variants and several general frameworks. We examine various models in this area and provide a unified representation of the propagation steps in different models. Using our representation, one can easily distinguish between different models by recognizing the corresponding aggregators and updaters.

• We systematically categorize the applications, dividing them into structural scenarios, non-structural scenarios and other scenarios. We present several major applications and their corresponding methods in each scenario.

• We propose four open problems for future research. Graph neural networks suffer from over-smoothing and scaling problems, and there are still no effective methods for dealing with dynamic graphs or for modeling non-structural sensory data. We provide a thorough analysis of each problem and propose future research directions.

The rest of this survey is organized as follows. In Sec. 2, we introduce various models in the graph neural network family. We first introduce the original framework and its limitations, and then present its variants that try to