IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. XX, NO. Y, MONTH, YEAR 3
Traditional machine learning applications cope with graph structured data by using a preprocessing phase that
maps the graph structured information to a simpler representation, e.g., vectors of reals. In other words, the
preprocessing step first “squashes” the graph structured data into a vector of reals and then deals with the preprocessed
data using a list-based data processing technique. However, important information, e.g., the topological dependency
of information on node n, may be lost during the preprocessing stage, and the final result may depend, in an
unpredictable manner, on the details of the preprocessing algorithm. More recently, various approaches
[14], [15] have attempted to preserve the graph structured nature of the data for as long as required before processing
the data. In these approaches, the idea is to encode the underlying graph structured data using the topological
relationships among the nodes of the graph, thereby incorporating the graph structured information in the data processing
step. Recursive neural networks [14], [16], [17] and Markov chains [15], [18], [19] belong to this set of techniques
and are commonly applied both to graph and node focused problems. Our method extends these two approaches
in that it can deal directly with graph structured information.
Existing recursive neural networks are neural network models whose input domain consists of directed acyclic
graphs [14], [16], [17]. The method estimates the parameters w of a function ϕ_w, which maps a graph to a vector
of reals. The approach can also be used for node focused applications, but in this case, the graph must undergo
a preprocessing phase [20]. Similarly, using a preprocessing phase, it is possible to handle certain types of cyclic
graphs [21]. Recursive neural networks have been applied to several problems including logical term classification [22],
chemical compound classification [23], logo recognition [2], [24], web page scoring [25], and face localization [26].
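The recursive processing scheme described above can be sketched as follows. This is a minimal illustration, not the architecture of [14], [16], [17]: the state size, weight matrices (Wx, Wh, Wo), and the toy DAG are all hypothetical choices, and children's states are combined by a simple sum before a tanh transition.

```python
import numpy as np

# Minimal sketch of a recursive network over a DAG: each node's state is
# computed bottom-up from its label and its children's states by a shared
# transition function, and the root state is mapped to a scalar output.
# All names and dimensions here are illustrative assumptions.

rng = np.random.default_rng(0)
STATE, LABEL = 4, 3
Wx = rng.normal(scale=0.5, size=(STATE, LABEL))  # node label -> state
Wh = rng.normal(scale=0.5, size=(STATE, STATE))  # summed child states -> state
Wo = rng.normal(scale=0.5, size=(1, STATE))      # root state -> scalar output

def state(node, labels, children, memo=None):
    """Compute a node's state recursively over the DAG (memoized)."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    h_children = sum((state(c, labels, children, memo) for c in children[node]),
                    np.zeros(STATE))
    h = np.tanh(Wx @ labels[node] + Wh @ h_children)
    memo[node] = h
    return h

# Toy DAG: 0 -> {1, 2}, 1 -> {2}; node 2 is shared (a DAG, not a tree).
labels = {n: rng.normal(size=LABEL) for n in range(3)}
children = {0: [1, 2], 1: [2], 2: []}
output = (Wo @ state(0, labels, children)).item()  # graph-level prediction
```

Memoization matters here: because node 2 has two parents, a naive tree recursion would recompute its state, whereas a DAG traversal computes each state once.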
Recursive neural networks are also related to support vector machines [27], [28], [29], which adopt special
kernels to operate on graph structured data. For example, the diffusion kernel [30] is based on a heat equation; the
kernels proposed in [31], [32] exploit the vectors produced by a graph random walker; and those designed in [33],
[34], [35] count the number of common substructures of two trees. In fact, recursive neural
networks, like support vector machine methods, automatically encode the input graph into an internal representation.
However, in recursive neural networks the internal encoding is learned, while in support vector machines it is
designed by the user.
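As a concrete instance of a designed encoding, a diffusion kernel in the spirit of [30] can be sketched as K = exp(βH), where H = A − D is the negative graph Laplacian and exp is the matrix exponential. The three-node path graph and β = 0.5 below are illustrative assumptions, not examples from the text.

```python
import numpy as np

# Sketch of a diffusion (heat) kernel on a graph: K = exp(beta * H),
# with H = A - D the negative Laplacian. Since H is symmetric, the
# matrix exponential is computed via eigendecomposition.

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)      # adjacency of a 3-node path
H = A - np.diag(A.sum(axis=1))              # negative Laplacian, H = A - D
beta = 0.5                                  # diffusion (heat) parameter

w, V = np.linalg.eigh(H)                    # real spectrum of symmetric H
K = V @ np.diag(np.exp(beta * w)) @ V.T     # matrix exponential exp(beta*H)
```

The resulting K is symmetric and positive definite, and entries decay with graph distance: adjacent nodes are more similar under K than nodes two hops apart.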
On the other hand, Markov chain models can emulate processes where the causal connections among events are
represented by graphs. Recently, random walk theory, which addresses a particular class of Markov chain models,
has been applied with some success to the realization of web page ranking algorithms [15], [18]. Internet search
engines use ranking algorithms to measure the relative “importance” of web pages. Such measurements are generally
exploited, along with other page features, by “horizontal” search engines, e.g., Google [15], or by personalized search
engines (“vertical” search engines, see, e.g., [19]) to sort the universal resource locators (URLs) returned on user
queries². Some attempts have been made to extend these models with learning capabilities such that a parametric
model representing the behavior of the system can be estimated from training examples [19], [37], [38]. Such
models are able to generalize the results to score all the web pages in the collection.
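The random-walk ranking idea can be sketched with a PageRank-style power iteration: a surfer follows out-links with probability d and jumps to a uniformly random page otherwise, and the scores are the walk's stationary distribution. The damping value d = 0.85 and the toy three-page web graph are conventional illustrative choices, not details from [15], [18].

```python
import numpy as np

# Sketch of a random-walk page ranking: iterate the surfer's distribution
# until it converges to the stationary distribution of the damped walk.

def pagerank(out_links, d=0.85, tol=1e-10):
    n = len(out_links)
    score = np.full(n, 1.0 / n)              # start from the uniform distribution
    while True:
        new = np.full(n, (1.0 - d) / n)      # random-jump mass
        for page, links in out_links.items():
            if links:                        # follow out-links uniformly
                new[links] += d * score[page] / len(links)
            else:                            # dangling page: spread mass uniformly
                new += d * score[page] / n
        if np.abs(new - score).sum() < tol:
            return new
        score = new

# Toy web graph: page 0 links to 1 and 2, page 1 to 2, page 2 back to 0.
ranks = pagerank({0: [1, 2], 1: [2], 2: [0]})
```

On this toy graph, page 2 ranks highest: it collects link mass from both other pages, while page 1 receives only half of page 0's.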
² The relative importance measure of a web page is also used to serve other goals, e.g., to improve the efficiency of crawlers [36].
May 24, 2007 DRAFT