ically the kernels are designed to exploit random walks [10,
15, 5, 28], shortest paths [4], cyclic patterns [12], subtrees [22,
21, 19, 24] and subgraphs [25, 17, 26]. Another class of graph
kernels, e.g., the diffusion kernel [16], deals with the similarity between nodes of a single graph. However, our focus in
this paper is on kernels between different graphs, which we
discuss in more detail below.
Random Walk Kernels: The similarity of two graphs G_i, G_j ∈ D can be quantified by counting labeled walks that are common to both of them. The random walk kernel [10], one of the first graph kernels, is based on this idea.
The kernels in [15, 5] are also based on random walks over
labeled graphs. Computing the pair-wise kernel values has worst-case O(n^6) complexity, where n denotes the number of nodes in G_i and G_j. A more efficient version of the random walk kernel was proposed in [28], reducing the complexity to O(n^3) per pair of graphs. One potential problem with these kernels is that artificially high kernel values may be obtained by repeatedly visiting the same nodes and edges
[18]. We refer to [29] for a recent overview of random walk
based graph kernels.
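To make the walk-counting idea concrete, the computation can be sketched on the direct product graph, whose nodes are label-matched node pairs and whose walks correspond one-to-one to common labeled walks in the two graphs. The sketch below (function names are ours) truncates the geometric series at a fixed walk length rather than computing the matrix inverse used by the O(n^3) method of [28]:

```python
def product_adjacency(A1, lab1, A2, lab2):
    """Adjacency matrix of the direct product graph: nodes are pairs (u, v)
    with matching labels; (u1, v1) ~ (u2, v2) iff u1 ~ u2 and v1 ~ v2."""
    pairs = [(u, v) for u in range(len(A1)) for v in range(len(A2))
             if lab1[u] == lab2[v]]
    return [[1 if A1[u1][u2] and A2[v1][v2] else 0 for (u2, v2) in pairs]
            for (u1, v1) in pairs]

def rw_kernel(A1, lab1, A2, lab2, lam=0.1, max_len=10):
    """Truncated geometric random walk kernel: sum over l of lam^l times the
    number of common labeled walks of length l (walks may revisit nodes)."""
    Ax = product_adjacency(A1, lab1, A2, lab2)
    n = len(Ax)
    vec = [1.0] * n          # walk counts of length 0 ending at each node
    total, weight = float(n), 1.0
    for _ in range(max_len):
        # one step of walk counting: vec <- Ax * vec
        vec = [sum(Ax[i][j] * vec[j] for j in range(n)) for i in range(n)]
        weight *= lam
        total += weight * sum(vec)
    return total
```

For two identical triangles with a single node label, the product graph is 4-regular on 9 nodes, so the truncated series rapidly approaches the closed-form value 9/(1 − 4λ) = 15 at λ = 0.1.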
Shortest Path Kernels: The shortest-path graph kernel [4] first computes the shortest-path graph S = (V_S, E_S) for each graph G = (V, E) ∈ D. Here V_S = V, and a weighted edge (v_a, v_b) exists in E_S if v_a and v_b are connected by a path in G, with the edge weight representing the shortest-path length between v_a and v_b (infinity if they are not reachable). Given the shortest-path graphs S_i and S_j for two input graphs G_i and G_j, the kernel is defined as the sum over all pairs of edges from S_i and S_j, using any suitable positive definite kernel on the edges. The all-pairs shortest-path graphs can be computed in O(n^3) time, and the kernel can then be computed in O(n^4) time, since S_i and S_j each have O(n^2) edges. Other variants of the shortest path kernel include equal-length shortest paths, k shortest paths, k shortest disjoint paths, and so on [4].
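A minimal version of this pipeline, using Floyd-Warshall for the all-pairs distances and a delta kernel on path lengths as the edge kernel (node labels omitted for brevity; function names are ours):

```python
from itertools import combinations

INF = float("inf")

def floyd_warshall(A):
    """All-pairs shortest-path lengths for an unweighted adjacency matrix."""
    n = len(A)
    D = [[0 if i == j else (1 if A[i][j] else INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

def sp_kernel(A1, A2):
    """Shortest-path kernel with a delta edge kernel: counts pairs of
    shortest-path-graph edges (one per graph) of equal finite length."""
    def edge_lengths(A):
        D = floyd_warshall(A)
        return [D[i][j] for i, j in combinations(range(len(A)), 2)
                if D[i][j] < INF]
    e1, e2 = edge_lengths(A1), edge_lengths(A2)
    return sum(1 for a in e1 for b in e2 if a == b)
```

The O(n^4) cost of the kernel evaluation is visible in the final double loop over the O(n^2) edges of each shortest-path graph.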
Cyclic Pattern Kernels: The cyclic pattern kernel [12] is
based on counting the number of common cycles that occur in both graphs. Since there is no known polynomial-time
algorithm to find all the cycles in a graph, sampling and
time-bounded enumeration of cycles are used to measure
the similarity of the graphs.
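One way to realize the time-bounded variant is to enumerate simple cycles only up to a length bound, canonicalize each cycle's label sequence (over rotation and reflection), and intersect the resulting pattern sets. The sketch below is a simplification of this idea; the function names and the set-intersection kernel are ours, not the exact construction of [12]:

```python
def simple_cycles_upto(adj, max_len):
    """Enumerate simple cycles of length 3..max_len as node tuples, each
    reported once (smallest node first, smaller traversal direction kept)."""
    cycles = set()
    def dfs(start, path, seen):
        for w in adj[path[-1]]:
            if w == start and len(path) >= 3:
                rev = (start,) + tuple(reversed(path[1:]))
                cycles.add(min(tuple(path), rev))
            elif w not in seen and w > start and len(path) < max_len:
                dfs(start, path + [w], seen | {w})
    for s in adj:
        dfs(s, [s], {s})
    return cycles

def cyclic_pattern_kernel(adj1, lab1, adj2, lab2, max_len=6):
    """Counts cyclic label patterns common to both graphs."""
    def patterns(adj, lab):
        pats = set()
        for cyc in simple_cycles_upto(adj, max_len):
            labs = [lab[v] for v in cyc]
            k = len(labs)
            # canonical form: lexicographic minimum over all rotations
            # of the label sequence and of its reversal
            variants = [tuple(labs[i:] + labs[:i]) for i in range(k)]
            rlabs = labs[::-1]
            variants += [tuple(rlabs[i:] + rlabs[:i]) for i in range(k)]
            pats.add(min(variants))
        return pats
    return len(patterns(adj1, lab1) & patterns(adj2, lab2))
```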
Subtree Kernels: Subtree kernels are based on common
subtrees in the graphs [22]. The main idea is to consider
pairs of nodes from G_i and G_j and see if they share common tree-like neighborhoods, i.e., to count the pairs of identical subtrees of height h rooted at vertices v_a ∈ G_i and v_b ∈ G_j. The kernel is defined as the sum over all pairs of vertices of a suitably defined vertex pair kernel. The complexity of this approach is O(n^2 h 4^d), where d denotes the maximum degree. Another subtree kernel was proposed in [21], based on
a path representation of the trees obtained via a depth-first
search on the input graphs. The kernel function is computed
on these paths (e.g., the ratio of the longest common path).
The recently proposed Weisfeiler-Lehman kernel [24] is a fast subtree kernel that scales up to large, labeled graphs. It uses the Weisfeiler-Lehman isomorphism test, which performs iterative multiset-label determination, label compression, and relabeling steps. The isomorphism test terminates after a pre-specified number of iterations h. If the resulting sets of node labels are not identical, the two graphs are declared non-isomorphic; otherwise the test declares them isomorphic (though it may fail to distinguish some non-isomorphic pairs). The WL graph kernel counts the matching multiset labels for the two graphs G_i and G_j in each iteration of the WL isomorphism test. The WL kernel has O(mh) complexity, where m is the number of edges in the graphs.
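The iteration can be sketched compactly. In this sketch (ours, not the implementation of [24]), label compression is done implicitly by using the (label, sorted neighbor labels) tuples themselves as hashable compressed labels:

```python
from collections import Counter

def wl_features(adj, labels, h):
    """Multiset of node labels accumulated over h WL refinement iterations."""
    labels = dict(labels)
    feats = Counter(labels.values())
    for _ in range(h):
        # multiset-label determination + compression: a node's new label is
        # its old label together with the sorted labels of its neighbors
        labels = {v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
                  for v in adj}
        feats.update(labels.values())
    return feats

def wl_kernel(adj1, lab1, adj2, lab2, h=2):
    """WL subtree kernel: dot product of the accumulated label count vectors."""
    f1, f2 = wl_features(adj1, lab1, h), wl_features(adj2, lab2, h)
    return sum(c * f2[lab] for lab, c in f1.items())
```

Each iteration touches every edge once when building the neighbor multisets, which is where the O(mh) total cost comes from.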
Graphlet and Subgraph Kernels: Similar graphs should
have similar subgraphs. Graphlet kernels measure the simi-
larity of two graphs by the dot product of count vectors of all possible connected subgraphs of order k (i.e., the graphlets, also called k-minors) [25, 17]. For any k (usually set to 3, 4, or 5), there are 2^(k(k−1)/2) possible graphlets of size k, but many of them are isomorphic. Usually, to avoid the dependence on graph size, the count vector is normalized into a probability vector, and the graphlet kernel is re-defined as the dot product of the normalized count vectors for the two graphs. Exhaustive enumeration of all graphlets has complexity O(n^k). For a graph with bounded degree d, the connected graphlets can be enumerated in O(nd^(k−1)) [25].
Frequent subgraph mining can also be used to define a
kernel between two graphs [26]. Let F = {s_1, . . . , s_p} denote the set of p frequent and discriminative subgraph patterns mined from D. Each graph G_i ∈ D is then represented as a binary feature vector in {0, 1}^p, where feature j is set to 1 if and only if s_j is isomorphic to a subgraph in G_i. The kernel between G_i and G_j can be defined over their binary feature vectors. CORK [26] implements this approach; it uses gSpan [31] to mine the subgraphs, and selects near-optimal features (subgraphs) from that set that are most discriminative for classification.
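Once the pattern set F and the subgraph-isomorphism checks are available (e.g., from a miner such as gSpan), the kernel itself is just a linear kernel on indicator vectors. A sketch, assuming the per-graph occurrence sets are precomputed (names are ours):

```python
def subgraph_feature_kernel(patterns, occurs_i, occurs_j):
    """Linear kernel on binary subgraph-indicator vectors: the number of
    mined patterns that occur in both graphs. occurs_i / occurs_j are the
    sets of patterns found (by subgraph isomorphism) in each graph."""
    x_i = [1 if s in occurs_i else 0 for s in patterns]
    x_j = [1 if s in occurs_j else 0 for s in patterns]
    return sum(a * b for a, b in zip(x_i, x_j))
```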
In our experiments in Section 4, we compare with the following graph kernel methods: fast geometric Random-walk (RW) kernel [28], Shortest-path (SP) kernel [4], Graphlet (GK) kernel [25], Ramon-Gärtner (RG) subtree kernel [22],
and Weisfeiler-Lehman (WL) subtree kernel [24]. We also
compare with CORK [26].
3. GRAPH ATTRIBUTES FOR CLASSIFICATION
As we have seen above, while many sophisticated graph kernels have been proposed, efficiency and scalability remain challenges for large graph datasets. Our basic idea is to
compute several topological and label attributes for each
graph in the dataset, and to use the derived feature-vector
attributes for classification. Like most of the graph kernel
work, we use a Support Vector Machine (SVM) as the classi-
fier of choice. The graph attributes we use are listed below.
Figure 1: A triangle with its three triples
1. Average degree: The degree of a node is defined as
the number of its neighboring edges. Average degree
is the average value of the degree of all nodes in the
graph, i.e., d̄(G) = Σ_{i=1}^{n} d(u_i)/n, where d(u_i) denotes the degree of node u_i.
2. Average clustering coefficient: For a node u, the
clustering coefficient c(u) represents the likelihood that
any two neighbors of u are connected. More formally,
the clustering coefficient of a node u is defined as: