
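In the research paper "Polynomial-Time Algorithm for Finding Densest Subgraphs in Uncertain Graphs", author Zhaonian Zou examines the problem of finding the densest subgraph in an uncertain graph. Graph database research has produced many algorithms for finding dense subgraphs in a given graph, based on definitions such as cliques, quasi-cliques, k-cores, and k-truss. These definitions do not carry over to uncertain graphs. The paper therefore introduces the expected density of an uncertain graph, a density measure that takes the existence probabilities of edges into account. Based on expected density, the author formalizes the problem: given an uncertain graph 𝒢 = (V, E, P) and a vertex set R ⊆ V, find an induced subgraph 𝒢′ = (V′, E′, P′) of 𝒢 such that R ⊆ V′ and the expected density of 𝒢′ is maximum. The author shows that the optimal solution can be found in O(nm log(n²/m)) time using maximum flow techniques, where n = |V| is the number of vertices and m = |E| is the number of edges.

Unlike existing uncertain graph models, the model used in the paper is very general and does not assume that edges exist independently of one another. Any probability mass function over the exact graphs that the uncertain graph can take is admissible, provided it satisfies a marginal constraint that the paper states later. Separately, the input set R constrains the output: the densest subgraph returned must contain the given vertices.

The introduction presents dense subgraph discovery as a fundamental problem in graph database research. It notes that existing work on managing and mining uncertain graphs assumes mutually independent edges, and that densest subgraph algorithms designed for exact graphs do not account for uncertainty, so this more general setting needs a new algorithm.

The results have both theoretical and practical value. On the theoretical side, the notions of expected density and marginal constraint extend the understanding of the densest subgraph problem on uncertain graphs, and the paper gives a polynomial-time algorithm for it. On the practical side, the algorithm applies wherever uncertain data must be handled, for example social network analysis, bioinformatics, and supply chain network optimization.

On the algorithmic side, the author relies on minimum cuts computed with maximum flow techniques. A flow network is constructed for the uncertain graph, and existing efficient maximum flow algorithms then yield the subgraph of maximum expected density in polynomial time.

The paper is classified under G.2.2 [Discrete Mathematics]: Graph Theory - Graph algorithms and H.2.8 [Database Management]: Database Applications - Data mining. Its keywords are uncertain graph, marginal constraint, expected density, and parametric maximum flow. The first page also carries the ACM notice governing copying, redistribution, and citation of the work.

In summary, Zhaonian Zou's paper contributes a new density notion and a polynomial-time algorithm for finding densest subgraphs in uncertain graphs, which is relevant to graph databases and related data mining techniques.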


Polynomial-Time Algorithm for Finding Densest Subgraphs in Uncertain Graphs

Zhaonian Zou

School of Computer Science and Technology

Harbin Institute of Technology, China

znzou@hit.edu.cn

ABSTRACT

This paper studies the problem of finding the densest subgraph in an uncertain graph. Due to uncertainty in graphs, the traditional definitions of dense subgraphs are not applicable to uncertain graphs. In this paper, we introduce the expected density of an uncertain graph. Based on the expected density, we formalize the problem that, given an uncertain graph 𝒢 = (V, E, P) and a set of vertices R ⊆ V, finds an induced subgraph 𝒢′ = (V′, E′, P′) of 𝒢 of the maximum expected density such that R ⊆ V′. We show that the optimal solution can be found in O(nm log(n²/m)) time using maximum flow techniques, where n = |V| and m = |E|. Moreover, unlike the existing models of uncertain graphs, the model used in this paper is very general, which doesn't assume the existence of edges is mutually independent.

Categories and Subject Descriptors

G.2.2 [Discrete Mathematics]: Graph Theory - Graph algorithms; H.2.8 [Database Management]: Database Applications - Data mining

General Terms

Measurement, algorithms, performance

Keywords

Uncertain graph, marginal constraint, expected density, parametric maximum flow

1. INTRODUCTION

Dense subgraph discovery is a fundamental problem in the research on graph databases. In the literature, a number of algorithms have been proposed for finding dense subgraphs in a given graph, where a variety of definitions of dense subgraphs have been used, e.g., cliques [23], quasi-cliques [1], k-cores [10], k-truss [24], and so on. In this paper, we consider the density measure that assesses the ratio of the number of edges to the number of vertices [12]. More precisely, given a graph G = (V, E), the density of G is defined by ρ(G) = |E|/|V|. This definition of density of graph G equivalently measures the average degree of G because 2|E|/|V| is equal to the average degree of G. Based on this density measure, many studies have been carried out on the problem of finding a subgraph or an induced subgraph of the maximum density in a given graph [4, 8, 9, 12].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.

MLG'13, August 11, 2013, Chicago, Illinois, USA.

ACM. ACM 978-1-4503-2322-2 ...$15.00.

Figure 1: Uncertain graph 𝒢.

Recently, uncertainty has been recognized to be intrinsic in large graph databases due to errors of measurements, delayed updates of data, and data integration. Managing and mining uncertain graph data has attracted a lot of research attention [14, 15, 16, 17, 18, 20, 21, 25, 26, 27]. In our prior work [27], we define an uncertain graph by a triple 𝒢 = (V, E, P), where each edge e ∈ E has a probability of P(e) to exist in practice. Due to uncertainty, the traditional definition of density ρ(G) of a graph G doesn't make sense on an uncertain graph. Consider the uncertain graph 𝒢 = (V, E, P) in Figure 1, where the real number on each edge e is P(e). If we think of 𝒢 as an exact graph, the density of 𝒢 is 8/5. However, since two of its edges exist with very low probability, the density of 𝒢 should actually be much lower than 8/5, and be close to 6/5.

In this paper, we first formalize the problem of finding the densest subgraph in an uncertain graph. According to our uncertain graph model, an uncertain graph 𝒢 = (V, E, P) exists as an exact graph G = (V, E′) in practice, where each edge e ∈ E exists in E′ with probability P(e). More formally, we say that 𝒢 implicates G. Let Ω(𝒢) be the set of exact graphs implicated by 𝒢. The uncertain graph 𝒢 essentially represents a probability mass function p over Ω(𝒢), where p(G) is equal to the probability of 𝒢 implicating G for all G ∈ Ω(𝒢). For each exact graph G = (V, E′) ∈ Ω(𝒢), the density of G is ρ(G) = |E′|/|V|. Therefore, we evaluate the density of 𝒢 by the expected value of the density of an exact graph G chosen at random from Ω(𝒢) according to the probability mass function p. This measure is called the expected density of 𝒢. Hence, the densest subgraph problem on uncertain graphs can be formalized as follows. Given an uncertain graph 𝒢 = (V, E, P) and a set R ⊆ V, find an induced subgraph 𝒢[V′] of 𝒢 of the maximum expected density such that R ⊆ V′. The input R of the problem is a constraint on the output induced subgraph. If R = ∅, the output is an induced subgraph of 𝒢 of the maximum expected density.
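
The problem as formalized above can be made concrete with a small brute-force sketch. It is only an illustration, not the paper's algorithm: it enumerates every exact graph and every vertex set, so it takes exponential time. It also assumes independent edges purely to have one concrete probability mass function to enumerate. The data in EDGES is a hypothetical toy graph (not the graph of Figure 1), and the function names are made up for this example.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical toy data (NOT Figure 1): edge -> existence probability.
EDGES = {
    ("a", "b"): Fraction(9, 10),
    ("b", "c"): Fraction(8, 10),
    ("a", "c"): Fraction(7, 10),
    ("c", "d"): Fraction(1, 10),
}


def expected_density(vertex_set, edges):
    """Expected density of the subgraph induced by vertex_set.

    Follows the definition directly: sum p(G) * |E'| / |V'| over all exact
    graphs G. ASSUMPTION: edges exist independently, which fixes p(G).
    """
    inside = [p for (u, v), p in edges.items()
              if u in vertex_set and v in vertex_set]
    total = Fraction(0)
    for world in product((False, True), repeat=len(inside)):
        prob = Fraction(1)
        for present, p in zip(world, inside):
            prob *= p if present else 1 - p
        total += prob * Fraction(sum(world), len(vertex_set))
    return total


def densest_containing(all_vertices, edges, required):
    """Try every nonempty vertex set V' that contains `required`."""
    optional = sorted(set(all_vertices) - set(required))
    best_set, best_density = None, Fraction(-1)
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            candidate = set(required) | set(extra)
            if not candidate:
                continue
            density = expected_density(candidate, edges)
            if density > best_density:
                best_set, best_density = candidate, density
    return best_set, best_density
```

On this toy data, an empty R yields the vertex set {a, b, c} with expected density 4/5, while requiring vertex d forces the whole vertex set with expected density 5/8.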

It is worth noting that the model of uncertain graphs proposed in this paper is quite general. Unlike the existing work on managing and mining uncertain graphs [14, 15, 16, 17, 18, 20, 21, 25, 26, 27], we don't assume that the existence of edges of an uncertain graph is mutually independent. In fact, any probability mass function over Ω(𝒢) that satisfies the marginal constraint given later can be used in our work.

Besides its theoretical importance, the problem of finding induced subgraphs of the maximum expected density in an uncertain graph also has many practical applications. For example, the densest subgraphs have been used as interesting regions of annotated biological networks, in which valuable cross-genome patterns can be found [5]. In fact, due to the inherent uncertainty of high-throughput biological experiments, biological networks are uncertain graphs [6, 13]. Therefore, it is of practical significance for biologists to find the densest subgraphs in uncertain biological networks to get more reliable patterns.

The densest subgraph problem has also been applied to community detection in large networks [7]. Indeed, a substantial number of networks such as social networks are uncertain graphs due to the volatile nature of relationships [2]. Therefore, it is very important for analysts to find subgraphs of the maximum expected density in uncertain social networks to get more reliable communities.

The traditional densest subgraph problem defined on exact graphs has attracted considerable research attention. Goldberg [12] proposed an algorithm that requires O(log n) maximum flow computations to find a subgraph of the maximum density. Charikar [9] developed a simple greedy algorithm that finds a subgraph of density within a factor 2 of the optimum. Most recently, Bahmani et al. [7] studied the problem in a data stream model, and presented algorithms that find a subgraph of density within a factor 2(1 + ε) of the optimum by making O(log_{1+ε} n) passes over the input graph stream, where ε > 0. For the variant of the problem with size constraint, Anderson et al. [4] gave a 3-approximation algorithm for the problem of finding a densest subgraph induced by at least k vertices. Bhaskara et al. [8] studied the problem of finding a densest subgraph induced by exactly k vertices, and showed that the problem can be approximated within a ratio of O(n^{1/4}) in n^{O(log n)} time.
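
To give a feel for the greedy approach attributed to Charikar [9] above, here is a minimal sketch of the peeling idea as it is commonly described: repeatedly delete a vertex of minimum degree and remember the densest intermediate subgraph. It works on an exact, unweighted graph and is not taken from the paper. The function name and the example graph are hypothetical, and the input is assumed to have no self-loops or duplicate edges.

```python
from fractions import Fraction


def greedy_densest_subgraph(edges):
    """Peel minimum-degree vertices; return (vertex set, density |E|/|V|)
    of the densest subgraph seen along the way."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    if not adjacency:
        return set(), Fraction(0)

    remaining_edges = len(edges)
    best_set = set(adjacency)
    best_density = Fraction(remaining_edges, len(adjacency))
    while len(adjacency) > 1:
        # Remove one vertex of minimum degree together with its edges.
        victim = min(adjacency, key=lambda x: len(adjacency[x]))
        remaining_edges -= len(adjacency[victim])
        for neighbour in adjacency.pop(victim):
            adjacency[neighbour].discard(victim)
        density = Fraction(remaining_edges, len(adjacency))
        if density > best_density:
            best_set, best_density = set(adjacency), density
    return best_set, best_density


# Hypothetical example: a 4-clique on {1, 2, 3, 4} with a tail 4-5-6.
EXAMPLE = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
```

On EXAMPLE the peeling removes vertices 6 and 5 first and reports the 4-clique with density 3/2. The greedy result is only guaranteed to be within a factor 2 of the optimum, although in this small case it happens to be optimal.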

Although the existing algorithms for the densest subgraph problem on exact graphs guarantee good approximation ratios, they can't be used on uncertain graphs. Semantically, none of these algorithms considers uncertainty, so their outputs cannot be interpreted with respect to uncertainty. In addition, while some algorithms find densest subgraphs that satisfy size constraints, they can't find a subgraph of the maximum expected density that contains a set R of specified vertices.

In this paper, we first study the special case of the problem in which the input R is an empty set. That is, the output of the problem is an induced subgraph of the input uncertain graph 𝒢 = (V, E, P) of the maximum expected density. We show that this problem is equivalent to the problem of finding the densest induced subgraph in a weighted exact graph. Thus, it can be solved in O(nm log(n²/m)) time [11], where n = |V| and m = |E|.
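
One way to see why a weighted exact graph can stand in for the uncertain graph is linearity of expectation. The derivation below is a sketch, not the paper's proof. It assumes that the marginal constraint (stated later in the paper, outside this section) requires every edge e to be present with total probability P(e) under the probability mass function p. The indicator notation is introduced here only for illustration.

```latex
\begin{align*}
\mathbb{E}_{G \sim p}\bigl[\rho(G)\bigr]
  &= \sum_{G = (V, E') \in \Omega(\mathcal{G})} p(G)\,\frac{|E'|}{|V|} \\
  &= \frac{1}{|V|} \sum_{e \in E} \;
     \sum_{G = (V, E') \in \Omega(\mathcal{G})} p(G)\,\mathbf{1}[e \in E'] \\
  &= \frac{1}{|V|} \sum_{e \in E} P(e).
\end{align*}
```

Under that assumption, the expected density equals the density of an exact graph on V in which edge e carries weight P(e), and independence of edges is never used. The same argument applies to an induced subgraph by restricting the sums to V′ and to the edges with both endpoints in V′.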

We next study the problem when the input R is not an empty set. Let λ ≥ 0 be a real value that we guess for the maximum expected density of an induced subgraph that contains R. We reduce the densest subgraph problem to the problem of searching for the desired value of λ. Interestingly, we can tell whether λ is too big or too small by computing a minimum cut of a flow network constructed with respect to λ. We show that, starting from an arbitrary guessed value of λ, we can find the desired value of λ by carrying out at most n + 1 minimum cut computations, each of which can be solved by maximum flow techniques. When the desired value of λ is found, we can construct an induced subgraph of the maximum expected density that contains R from the minimum cut of the flow network constructed with respect to the desired value of λ. Note that the computation shared by the series of minimum cut computations can be saved by parametric maximum flow techniques. Thus, the densest induced subgraph that contains R can be found in O(nm log(n²/m)) time by carrying out the parametric maximum flow algorithm [11] implemented using dynamic trees [22], where n = |V| and m = |E|.
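
The guess-and-correct search over λ can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure. The minimum cut step is replaced by a brute-force search over vertex sets, which takes exponential time. The update rule is a Dinkelbach-style iteration, which may differ from the rule the paper uses. The expected density of an induced subgraph is taken to be the sum of its edge probabilities divided by its number of vertices. All names and the toy data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy data (NOT Figure 1): edge -> existence probability.
EDGES = {
    ("a", "b"): Fraction(9, 10),
    ("b", "c"): Fraction(8, 10),
    ("a", "c"): Fraction(7, 10),
    ("c", "d"): Fraction(1, 10),
}


def weight_inside(vertex_set, edges):
    """Sum of P(e) over edges with both endpoints in vertex_set."""
    return sum((p for (u, v), p in edges.items()
                if u in vertex_set and v in vertex_set), Fraction(0))


def best_for_guess(all_vertices, edges, required, lam):
    """Brute-force stand-in for the minimum cut: maximize
    weight(V') - lam * |V'| over nonempty V' containing `required`."""
    optional = sorted(set(all_vertices) - set(required))
    best_set, best_value = None, None
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            candidate = set(required) | set(extra)
            if not candidate:
                continue
            value = weight_inside(candidate, edges) - lam * len(candidate)
            if best_value is None or value > best_value:
                best_set, best_value = candidate, value
    return best_set, best_value


def search_lambda(all_vertices, edges, required, lam=Fraction(0)):
    """Return (vertex set, optimal density, number of rounds)."""
    rounds = 0
    while True:
        rounds += 1
        candidate, value = best_for_guess(all_vertices, edges, required, lam)
        if value == 0:
            return candidate, lam, rounds  # the guess equals the optimum
        # value > 0: guess too small; value < 0: guess too big.
        # In both cases the density of `candidate` is the next guess.
        lam = weight_inside(candidate, edges) / len(candidate)
```

With R = {d}, starting from λ = 0 the search stops after two rounds at density 5/8. Starting from the overestimate λ = 2 it needs three rounds: the first detects that the guess is too big and resets it. Exact fractions are used so that the test for a zero objective is reliable.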

The rest of the paper is organized as follows. Section 2 defines the densest subgraph problem on uncertain graphs. Section 3 presents a method for finding the densest induced subgraph of the input uncertain graph 𝒢 when the input set R is empty. Section 4 gives an algorithm for finding the densest induced subgraph containing R when the input R is not empty. Finally, the paper is concluded in Section 5.

2. PROBLEM STATEMENT

In this section, we introduce a model of uncertain graphs, define the expected density of an uncertain graph, and give a formal statement of the densest subgraph problem on uncertain graphs. We also introduce some helpful notation.

2.1 Uncertain Graphs

An uncertain graph is a triple 𝒢 = (V, E, P) in which V is a set of vertices, E is a set of edges, and P is a function from E to (0, 1] that associates each edge e ∈ E with a quantity P(e) ∈ (0, 1], which represents the probability of e existing in practice. If P(e) = 1, edge e certainly exists.

Because it is uncertain whether an edge e with P(e) < 1 exists in practice, an uncertain graph 𝒢 = (V, E, P) actually exists as an exact graph G = (V, E′) which satisfies {e | e ∈ E, P(e) = 1} ⊆ E′ ⊆ E, that is, (1) all the edges e with P(e) = 1 exist, and (2) some of the edges e with P(e) < 1 may be absent. Following the terminology in [27], we say that the uncertain graph 𝒢 = (V, E, P) implicates the exact graph G = (V, E′), denoted by 𝒢 ⇒ G. Let Ω(𝒢) be the set of exact graphs implicated by 𝒢. One can readily verify that |Ω(𝒢)| = 2^{|{e | e ∈ E, P(e) < 1}|}.

Given an uncertain graph 𝒢 = (V, E, P), if the existence of edges of 𝒢 is mutually independent, the probability that 𝒢 implicates an exact graph G = (V, E′) is given by

\[
\Pr[\mathcal{G} \Rightarrow G] = \prod_{e \in E'} P(e) \cdot \prod_{e \in E \setminus E'} \bigl(1 - P(e)\bigr).
\]

Therefore, the function p(x) = Pr[𝒢 ⇒ x] is a probability mass function over Ω(𝒢). For a proof, please refer to [27].
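
The independent-edge case can be checked numerically on a small instance. The sketch below enumerates the implicated exact graphs of a hypothetical three-edge uncertain graph and evaluates the product formula. The function names and the data are assumptions made for this example. It covers only the independence case, whereas the paper's model is more general.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical uncertain graph: edge -> existence probability.
EDGES = {
    ("a", "b"): Fraction(1),       # certain edge, present in every world
    ("b", "c"): Fraction(7, 10),
    ("a", "c"): Fraction(1, 5),
}


def implicated_graphs(edges):
    """Yield every edge set E' with {e : P(e) = 1} <= E' <= E."""
    certain = {e for e, p in edges.items() if p == 1}
    uncertain = [e for e, p in edges.items() if p < 1]
    for k in range(len(uncertain) + 1):
        for chosen in combinations(uncertain, k):
            yield certain | set(chosen)


def implication_probability(edges, present):
    """Product formula for Pr[uncertain graph => exact graph whose edge
    set is `present`], ASSUMING edges exist independently."""
    prob = Fraction(1)
    for e, p in edges.items():
        prob *= p if e in present else 1 - p
    return prob


WORLDS = list(implicated_graphs(EDGES))
TOTAL_MASS = sum(implication_probability(EDGES, w) for w in WORLDS)
```

Here WORLDS has 2² = 4 elements because two edges are uncertain, and TOTAL_MASS equals 1, which matches the statement that p is a probability mass function over the implicated graphs.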
