
Diffusion-Convolutional Neural Networks

James Atwood and Don Towsley

College of Information and Computer Science

University of Massachusetts

Amherst, MA, 01003

{jatwood|towsley}@cs.umass.edu

Abstract

We present diffusion-convolutional neural networks (DCNNs), a new model for graph-structured data. Through the introduction of a diffusion-convolution operation, we show how diffusion-based representations can be learned from graph-structured data and used as an effective basis for node classification. DCNNs have several attractive qualities, including a latent representation for graphical data that is invariant under isomorphism, as well as polynomial-time prediction and learning that can be represented as tensor operations and efficiently implemented on the GPU. Through several experiments with real structured datasets, we demonstrate that DCNNs are able to outperform probabilistic relational models and kernel-on-graph methods at relational node classification tasks.

1 Introduction

Working with structured data is challenging. On one hand, finding the right way to express and exploit structure in data can lead to improvements in predictive performance; on the other, finding such a representation may be difficult, and adding structure to a model can dramatically increase the complexity of prediction and learning.

The goal of this work is to design a flexible model for a general class of structured data that offers improvements in predictive performance while avoiding an increase in complexity. To accomplish this, we extend convolutional neural networks (CNNs) to general graph-structured data by introducing a 'diffusion-convolution' operation. Briefly, rather than scanning a 'square' of parameters across a grid-structured input like the standard convolution operation, the diffusion-convolution operation builds a latent representation by scanning a diffusion process across each node in a graph-structured input.

This model is motivated by the idea that a representation that encapsulates graph diffusion can provide a better basis for prediction than a graph itself. Graph diffusion can be represented as a matrix power series, providing a straightforward mechanism for including contextual information about entities that can be computed in polynomial time and efficiently implemented on the GPU.

In this paper, we present diffusion-convolutional neural networks (DCNNs) and explore their performance at various classification tasks on graphical data. Many techniques include structural information in classification tasks, such as probabilistic relational models and kernel methods; DCNNs offer a complementary approach that provides a significant improvement in predictive performance at node classification tasks.

As a model class, DCNNs offer several advantages:

• Accuracy: In our experiments, DCNNs significantly outperform alternative methods for node classification tasks and offer comparable performance to baseline methods for graph classification tasks.

29th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

arXiv:1511.02136v6 [cs.LG] 8 Jul 2016

[Figure 1: DCNN model definition for node, graph, and edge classification tasks. Panels: (a) Node classification; (b) Graph classification. Each panel carries an (H × F) label.]

• Flexibility: DCNNs provide a flexible representation of graphical data that encodes node features, edge features, and purely structural information with little preprocessing. DCNNs can be used for a variety of classification tasks with graphical data, including node classification, edge classification, and whole-graph classification.

• Speed: Prediction from a DCNN can be expressed as a series of polynomial-time tensor operations, allowing the model to be implemented efficiently on a GPU using existing libraries.

The remainder of this paper is organized as follows. In Section 2, we present a formal definition of the model, including descriptions of prediction and learning procedures. This is followed by several experiments in Section 3 that explore the performance of DCNNs at node and graph classification tasks. We briefly describe the limitations of the model in Section 4, then, in Section 5, we present related work and discuss the relationship between DCNNs and other methods. Finally, conclusions and future work are presented in Section 6.

2 Model

Consider a situation where we have a set of $T$ graphs $\mathcal{G} = \{G_t \mid t \in 1 \ldots T\}$. Each graph $G_t = (V_t, E_t)$ is composed of vertices $V_t$ and edges $E_t$. The vertices are collectively described by an $N_t \times F$ design matrix $X_t$ of features, where $N_t$ is the number of nodes in $G_t$, and the edges $E_t$ are encoded by an $N_t \times N_t$ adjacency matrix $A_t$, from which we can compute a degree-normalized transition matrix $P_t$ that gives the probability of jumping from node $i$ to node $j$ in one step. No constraints are placed on the form of $G_t$; the graph can be weighted or unweighted, directed or undirected. Either the nodes, edges, or graphs have labels $Y$ associated with them, with the dimensionality of $Y$ differing in each case.
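As a rough illustration of the degree-normalized transition matrix described above, the following NumPy sketch row-normalizes an adjacency matrix. The function name `transition_matrix` and the handling of zero-degree nodes (their rows are left as zeros) are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def transition_matrix(A):
    """Row-normalize an adjacency matrix so that entry (i, j) is the
    probability of jumping from node i to node j in one step."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1, keepdims=True)
    # Assumption: nodes with no outgoing edges keep an all-zero row
    # instead of receiving a self-loop.
    return A / np.where(degree > 0, degree, 1.0)

# A small undirected, unweighted star graph: node 0 is linked to nodes 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
P = transition_matrix(A)
```

The same function accepts weighted or directed adjacency matrices, since it only relies on row sums.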

We are interested in learning to predict $Y$; that is, to predict a label for each of the nodes in each graph, or a label for each of the edges in each graph, or a label for each graph itself. In each case, we have access to some labeled entities (be they nodes, graphs, or edges), and our task is to predict the values of the remaining unlabeled entities.

This setting is capable of representing several well-studied machine learning tasks. If $T = 1$ (i.e., there is only one input graph) and the labels $Y$ are associated with the nodes or edges, this reduces to the problem of semisupervised classification; if there are no edges present in the input graph, this reduces further to standard supervised classification. If $T > 1$ and the labels $Y$ are associated with each graph, then this represents the problem of supervised graph classification.

DCNNs were designed to perform any task that can be represented within this formulation. A DCNN takes $\mathcal{G}$ as input and returns either a hard prediction for $Y$ or a conditional distribution $P(Y|X)$. Each entity of interest (be it a node, a graph, or an edge) is transformed to a diffusion-convolutional representation, which is an $H \times F$ real matrix defined by $H$ hops of graph diffusion over $F$ features, and it is defined by an $H \times F$ real-valued weight tensor $W^c$ and a nonlinear differentiable function $f$ that computes the activations. So, for node classification tasks, the diffusion-convolutional representation of graph $t$, $Z_t$, will be an $N_t \times H \times F$ tensor, as illustrated in Figure 1a; for graph or edge classification tasks, $Z_t$ will be an $H \times F$ matrix or an $M_t \times H \times F$ tensor respectively (with $M_t$ the number of edges in $G_t$), as illustrated in Figures 1b and 1c.

Footnote 1: Without loss of generality, we assume that the features are real-valued.

The term 'diffusion-convolution' is meant to evoke the ideas of feature learning, parameter tying, and invariance that are characteristic of convolutional neural networks. The core operation of a DCNN is a mapping from nodes and their features to the results of a diffusion process that begins at that node. In contrast with standard CNNs, DCNN parameters are tied according to search depth rather than their position in a grid. The diffusion-convolutional representation is invariant with respect to node index rather than position; in other words, the diffusion-convolutional activations of two isomorphic input graphs will be the same. Unlike standard CNNs, DCNNs have no pooling operation.

Node Classification

Consider a node classification task where a label $Y$ is predicted for each input node in a graph. If we let $P_t^*$ be an $N_t \times H \times N_t$ tensor containing the power series of $P_t$, the diffusion-convolutional activation $Z_{tijk}$ for node $i$, hop $j$, and feature $k$ of graph $t$ is given by

$$Z_{tijk} = f\left( W^c_{jk} \cdot \sum_{l=1}^{N_t} P^*_{tijl} X_{tlk} \right) \qquad (1)$$

The activations can be expressed more concisely using tensor notation as

$$Z_t = f\left( W^c \odot P_t^* X_t \right) \qquad (2)$$

where the $\odot$ operator represents element-wise multiplication; see Figure 1a. The model only entails $O(H \times F)$ parameters, making the size of the latent diffusion-convolutional representation independent of the size of the input.
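The node-level activation in Equations (1) and (2) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the power series is taken to be the hops $P^0, \ldots, P^{H-1}$, the nonlinearity $f$ is `tanh`, and the helper names (`transition_matrix`, `power_series`, `node_activations`) are hypothetical.

```python
import numpy as np

def transition_matrix(A):
    """Row-normalize an adjacency matrix (zero-degree rows stay zero)."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1, keepdims=True)
    return A / np.where(degree > 0, degree, 1.0)

def power_series(P, H):
    """Stack P^0, ..., P^(H-1) into an (N, H, N) tensor P*.

    Assumption: the hop index starts at the identity (zero hops)."""
    powers = [np.eye(P.shape[0])]
    for _ in range(H - 1):
        powers.append(powers[-1] @ P)
    # After stacking, P_star[i, j, l] == (P^j)[i, l].
    return np.stack(powers, axis=1)

def node_activations(P_star, X, W_c, f=np.tanh):
    """Z = f(W_c ⊙ P* X), with shape (N, H, F)."""
    # Sum over the neighbor index l, as in Equation (1).
    diffused = np.einsum("ijl,lk->ijk", P_star, X)
    # W_c has shape (H, F) and is shared by every node.
    return f(W_c[np.newaxis, :, :] * diffused)

# A path graph on four nodes with three random features per node.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W_c = rng.normal(size=(2, 3))
P_star = power_series(transition_matrix(A), H=2)
Z = node_activations(P_star, X, W_c)
print(Z.shape)  # (4, 2, 3)
```

Note that `W_c` holds only `H * F` values regardless of the number of nodes, which matches the parameter count stated above.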

The model is completed by a dense layer that connects $Z$ to $Y$. A hard prediction for $Y$, denoted $\hat{Y}$, can be obtained by taking the maximum activation, and a conditional probability distribution $P(Y|X)$ can be found by applying the softmax function:

$$\hat{Y} = \arg\max\left( f\left( W^d \odot Z \right) \right) \qquad (3)$$

$$P(Y|X) = \mathrm{softmax}\left( f\left( W^d \odot Z \right) \right) \qquad (4)$$

This keeps the same form in the following extensions.
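One possible reading of the prediction step is sketched below. It assumes, purely for illustration, that the dense layer is an ordinary fully connected layer applied to each node's flattened $H \times F$ activation and that the outer nonlinearity is the identity; the names `softmax`, `predict`, and `W_d` are hypothetical and the paper's exact formulation may differ.

```python
import numpy as np

def softmax(logits):
    """Numerically stable row-wise softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def predict(Z, W_d):
    """Map activations Z of shape (N, H, F) to class predictions.

    Assumption: W_d has shape (H * F, C) and acts on flattened activations."""
    logits = Z.reshape(Z.shape[0], -1) @ W_d
    probs = softmax(logits)           # conditional distribution per node
    y_hat = probs.argmax(axis=1)      # hard prediction per node
    return y_hat, probs

rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 2, 3))        # 4 nodes, H = 2 hops, F = 3 features
W_d = rng.normal(size=(2 * 3, 5))     # 5 hypothetical classes
y_hat, probs = predict(Z, W_d)
```

Because softmax is monotonic, the hard prediction is the same whether the maximum is taken over the logits or over the probabilities.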

Graph Classification

DCNNs can be extended to graph classification by simply taking the mean activation over the nodes

$$Z_t = f\left( W^c \odot 1_{N_t}^T P_t^* X_t / N_t \right) \qquad (5)$$

where $1_{N_t}$ is an $N_t \times 1$ vector of ones, as illustrated in Figure 1b.
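The graph-level activation of Equation (5) can be sketched in the same style. The example also relabels the nodes with a random permutation and checks that the graph-level activation is unchanged, which illustrates the invariance to node index discussed earlier. The helper names, the choice of `tanh` for $f$, and the use of hops $P^0, \ldots, P^{H-1}$ are assumptions for this sketch.

```python
import numpy as np

def transition_matrix(A):
    """Row-normalize an adjacency matrix (zero-degree rows stay zero)."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1, keepdims=True)
    return A / np.where(degree > 0, degree, 1.0)

def power_series(P, H):
    """Stack P^0, ..., P^(H-1) into an (N, H, N) tensor."""
    powers = [np.eye(P.shape[0])]
    for _ in range(H - 1):
        powers.append(powers[-1] @ P)
    return np.stack(powers, axis=1)

def graph_activation(A, X, W_c, f=np.tanh):
    """Z = f(W_c ⊙ 1^T P* X / N), an (H, F) matrix for the whole graph."""
    N = A.shape[0]
    H = W_c.shape[0]
    P_star = power_series(transition_matrix(A), H)
    # Summing over the start node i implements the 1^T product;
    # dividing by N turns the sum into a mean over nodes.
    mean_diffused = np.einsum("ijl,lk->jk", P_star, X) / N
    return f(W_c * mean_diffused)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
X = rng.normal(size=(4, 3))
W_c = rng.normal(size=(3, 3))         # H = 3 hops, F = 3 features

# Relabel the nodes: permute rows and columns of A and the rows of X.
perm = rng.permutation(4)
A_perm = A[np.ix_(perm, perm)]
X_perm = X[perm]

Z = graph_activation(A, X, W_c)
Z_perm = graph_activation(A_perm, X_perm, W_c)
```

Here `Z` and `Z_perm` agree up to floating-point error, since relabeling the nodes only reorders the terms of the sums.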

Edge Classification and Edge Features

Edge features and labels can be included by converting each edge to a node that is connected to the nodes at the tail and head of the edge. This graph can be constructed efficiently by augmenting the adjacency matrix with the incidence matrix:

$$A'_t = \begin{pmatrix} A_t & B_t^T \\ B_t & 0 \end{pmatrix} \qquad (6)$$

$A'_t$ can then be used to compute $P'_t$ and used in place of $P_t$ to classify nodes and edges.
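The construction of the augmented adjacency matrix can be sketched as below. The sketch assumes an undirected, unweighted graph given as an edge list and an unsigned $M \times N$ incidence matrix `B` (one row per edge, with ones at its two endpoints); the function name `augment_with_edges` is hypothetical.

```python
import numpy as np

def augment_with_edges(A, edges):
    """Build the (N + M) x (N + M) block matrix [[A, B^T], [B, 0]].

    Assumption: B is an unsigned incidence matrix of shape (M, N)."""
    A = np.asarray(A, dtype=float)
    N, M = A.shape[0], len(edges)
    B = np.zeros((M, N))
    for e, (tail, head) in enumerate(edges):
        B[e, tail] = 1.0   # edge-node e is linked to its tail node
        B[e, head] = 1.0   # and to its head node
    # Edge-nodes are not linked to each other, hence the zero block.
    return np.block([[A, B.T],
                     [B, np.zeros((M, M))]])

# A path graph 0 - 1 - 2 with two edges.
edges = [(0, 1), (1, 2)]
A = np.zeros((3, 3))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

A_aug = augment_with_edges(A, edges)   # rows 3 and 4 are the edge-nodes
```

Row-normalizing `A_aug` then gives a transition matrix over nodes and edge-nodes together, so edge features can be stored as extra rows of the feature matrix.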

Purely Structural DCNNs

DCNNs can be applied to input graphs with no features by associating a 'bias feature' with value 1.0 with each node. Richer structure can be encoded by adding additional structural node features such as PageRank or clustering coefficient, although this does introduce some hand-engineering and pre-processing.

Learning

DCNNs are learned via stochastic minibatch gradient descent on backpropagated error. At each epoch, node indices are randomly grouped into several batches. The error of each batch is computed by taking slices of the graph definition power series and propagating the input forward to predict the output, then setting the weights by gradient descent on the backpropagated error. We also make use of windowed early stopping; training stops if the validation error of a given epoch is greater than the average of the last few epochs.
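The batching and stopping rules described above can be sketched as follows. The window length of 5 is an assumption (the text only says "the last few epochs"), as are the function names `minibatches` and `should_stop`; the gradient update itself is omitted.

```python
import numpy as np

def minibatches(num_nodes, batch_size, rng):
    """Randomly group node indices into batches for one epoch."""
    order = rng.permutation(num_nodes)
    return [order[i:i + batch_size] for i in range(0, num_nodes, batch_size)]

def should_stop(val_errors, window=5):
    """Windowed early stopping.

    val_errors lists the validation error of every epoch so far, most
    recent last. Training stops when the latest error exceeds the mean
    of the `window` epochs that preceded it."""
    if len(val_errors) <= window:
        return False  # not enough history yet
    previous = val_errors[-1 - window:-1]
    return val_errors[-1] > sum(previous) / window

batches = minibatches(10, 4, np.random.default_rng(0))
improving = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
worsening = [0.5, 0.5, 0.5, 0.5, 0.5, 0.9]
```

With these inputs, `should_stop(improving)` is false and `should_stop(worsening)` is true. Each batch of node indices would be used to slice the first axis of the power-series tensor before the forward pass.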

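Footnote 2 (on the claim that the diffusion-convolutional activations of two isomorphic input graphs are the same): A proof is given in the appendix.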
