expected density of G. Hence, the densest subgraph problem
on uncertain graphs can be formalized as follows. Given an
uncertain graph G = (V, E, P ) and a set R ⊆ V , fi nd an
induced subgraph G[V
′
] of G of the maximum expected den-
sity such that R ⊆ V
′
. The input R of the problem is a
constraint on the outpu t induced subgraph. If R = ∅, the
output is an induced subgraph of G of the maximum density.
It is worth noting that the model of uncertain graphs pro-
posed in this paper is quite general. Unlike the existing work
on managing and mining uncertain graphs [14, 15, 16, 17,
18, 20, 21, 25, 26, 27], we don’t assume that the existence
of edges of an uncertain graph is mutually independent. In
fact, any probability mass fun ct ion over Ω(G) that satisfies
the marginal constraint given later can be used in our work.
Except the theoretical importance, the problem of finding
induced subgraphs of the maximum expected density from
an uncertain graph also has many practical applications. For
example, the densest su bgraphs have been used as interest-
ing regions of annotated biological networks, in which valu-
able cross genome patterns can be found [5]. In fact, due to
the inherent uncertainty of high- throughput biological ex-
periments, biological networks are uncertain graphs [6, 13].
Therefore, it is of practical significance for biologists to find
the den sest subgraphs from uncertain biological networks to
get more reliable patterns.
The densest subgraph problem has also been applied in
community detection in large network s [7]. Indeed, a sub-
stantial numb er of networks such as social networks are un-
certain graphs due to the volatile nature of relationships [2].
Therefore, it is very important for analysts to find subgraphs
of the maximum expected density from uncertain social net-
works to get more reliable communities.
The traditional densest subgraph problem defined on ex-
act graphs has attracted considerable research attentions.
Goldberg [12] proposed an algorithm th at requires O(log n)
maximum flow computations to find a subgraph of the max-
imum density. Charikar [9] developed a simple greedy algo-
rithm that finds a subgraph of d ensity within a factor 2 of
the optimum. Most recently, Bahmani et al. [7] studied the
problem in a d ata stream model, an d presented algorithms
that find a subgraph of den sity within a factor 2(1+ǫ) of the
optimum by making O(log
1+ǫ
n) passes over the input graph
stream, where ǫ > 0. For the variant of the problem with
size constraint, Anderson et al. [4] gave a 3-approximation
algorithm for the problem of finding an d ensest subgraph in-
duced by at least k vertices. Bhaskara et al. [8] studied the
problem of finding an densest subgraph induced by exactly k
vertices, and showed that the problem can be approximated
within a ratio of O(n
1/4
) in n
O(log n)
time.
Although the existing algorithms for the densest subgraph
problem on exact graphs guarantee good approximation ra-
tios, they can’t b e used on uncertain graphs. From the as-
pect of semantics, all t hese algorithms don’t consider uncer-
tainties, so the outputs of the algorithms are unab le to be
explained with respect to uncertainties. In addition, while
some algorithms find densest subgraphs that satisfy size con-
straints, they can’t find a subgraph of the maximum expect-
ed density that consists of a set R of specified vertices.
In this paper, we first study the special case of the problem
in which the input R is an empty set. That is, the outp ut of
the problem is an induced subgraph of the input uncertain
graph G = (V, E, P ) of the maximum expected density. We
show that this problem is equivalent to the problem of find-
ing t he densest induced subgraph in a weighted exact graph.
Thus, it can be solved in O(nm log (n
2
/m)) time [11], where
n = |V | and m = |E|.
We next stu dy the problem when the input R is not an
empty set. The method is very interesting. Let λ ≥ 0 be a
real value that we guessed for the maximum expected den-
sity of an induced subgraph that contains R. We reduce the
densest subgraph problem to the problem of searching the
desired value of λ. Interestingly, we can tell whether λ is
too big or too small by computing a minimum cut of a flow
network constructed with respect to λ. We show that, s-
tarting from an arbitrary guessed value of λ, we can find the
desired value of λ by carrying out at most n + 1 minimum
cut computations, which in turn can be solved by maximum
flow techniques. When the desired value of λ is found, we
can construct an induced subgraph of the maximum expect-
ed density that contains R from th e minimum cut of the
flow network constructed with respect to the desired value
of λ. Note that the comput ation shared by the series of min-
imum cut computations can be saved by parametric maxi-
mum flow techniques. Thus, the densest induced subgraph
that contains R can be found in O(nm log(n
2
/m)) time by
carrying out th e parametric maximum flow algorithm [11]
implemented using dynamic trees [22], where n = |V | and
m = |E|.
The rest of the paper is organized as follows. Section 2
defines the densest subgraph problem on uncertain graph s.
Section 3 presents a method for fi nding the densest induced
subgraph of the input uncertain graph G when the input set
R is empty. Section 4 gives an algorithm for finding the
densest induced subgraph containing R when the input R is
not empty. Finally, the paper is concluded in Section 5.
2. PROBLEM STATEMENT
In this section, we introduce a model of uncertain graph-
s, define the expected density of an uncertain graph, and
give a formal statement of the densest subgraph problem on
uncertain graphs. We also introduce some helpful notation.
2.1 Uncertain Graphs
An uncertain graph is a triple G = (V, E, P ) in which V is
a set of vertices, E is a set of edges, and P is a function from
E to (0, 1] that associate each edge e ∈ E with a quantity
P (e) ∈ (0, 1] which represents the probability of e existing
in practice. If P (e) = 1, edge e certainly exists.
Because it is uncertain whether an edge e with P (e) < 1
exists in practice, an uncertain graph G = (V, E, P ) actually
exists as an exact graph G = (V, E
′
) which satisfies that
{e|e ∈ E, P (e) = 1} ⊆ E
′
⊆ E, that is, (1) all the edges e
with P (e) = 1 exist, an d (2) some of the edges e with P (e) <
1 may be absent. Following up the terminology in [27], we
say that th e uncertain graph G = (V, E, P ) impl icates the
exact graph G = (V, E
′
), denoted by G ⇒ G. Let Ω(G) be
the set of exact graphs implicated by G. One can readily
verify that |Ω(G)| = 2
|{e|e∈E,P (e)<1}|
.
Given an uncertain graph G = (V, E, P ) , if the existence
of edges of G is mutually independent, the probability that
G implicates an ex act graph G = (V, E
′
) is given by
Pr[G ⇒ G] =
Y
e∈E
′
P (e) ·
Y
e∈E\E
′
(1 − P (e)) .
Therefore, the function p(x) = Pr [G ⇒ x] is a probability
mass function over Ω(G). For a proof, please refer to [27].