Our parallel implementation targets graph families that are representative of real-world,
large-scale networks [8, 25, 13, 53, 52]. Real-world graphs are typically characterized by a
low diameter, heavy-tailed degree distributions modeled by power laws, and self-similarity.
They are often very large, with the number of vertices and edges ranging from several
hundreds of thousands to billions. On current workstations, it is not possible to do exact in-core
computations on these graphs due to the limited physical memory. In such cases, parallel
computing techniques can be applied to obtain exact solutions for memory and compute-
intensive graph problems quickly. For instance, recent experimental studies on Breadth-First
Search for large-scale graphs show that a parallel in-core implementation is two orders of
magnitude faster than an optimized external memory implementation [5, 2]. In this paper,
we present an efficient parallel implementation for the single source shortest paths problem
that can handle scale-free instances in the order of billions of edges. In addition, we con-
duct an experimental study of performance on several other graph families, and this work
is our submission to the 9th DIMACS Implementation Challenge [19] on Shortest Paths.
Sequential algorithms for the single source shortest path problem with non-negative edge
weights (NSSP) have been studied extensively, both theoretically [23, 21, 26, 27, 56, 58, 36, 33, 48]
and experimentally [22, 31, 30, 16, 61, 32]. Let m and n denote the number of edges and
vertices in the graph, respectively. Nearly all NSSP algorithms are based on the classical
algorithm of Dijkstra [23]. Using Fibonacci heaps [26], Dijkstra's algorithm can be
implemented in O(m + n log n) time. Thorup [58] presents an O(m + n) RAM algorithm for
undirected graphs that differs significantly from Dijkstra's approach. Instead of
visiting vertices in the order of increasing distance, it traverses a component tree. Meyer
[49] and Goldberg [32] propose simple algorithms with linear average time for uniformly
distributed edge weights.
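To make the baseline concrete, the following is a minimal sketch of Dijkstra's algorithm using a binary heap (Python's heapq); Fibonacci heaps achieve the O(m + n log n) bound stated above, while a binary heap gives O(m log n) but illustrates the same structure. The adjacency-list representation and function name here are our own illustrative choices, not from the cited implementations.

```python
import heapq

def dijkstra(adj, source):
    """adj: {u: [(v, w), ...]} with non-negative edge weights w."""
    dist = {u: float("inf") for u in adj}
    dist[source] = 0
    pq = [(0, source)]  # (tentative distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale queue entry; vertex already settled
        for v, w in adj[u]:
            if d + w < dist[v]:  # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist
```

Note that vertices are extracted in order of increasing tentative distance, which is exactly the sequential dependence that the component-tree and bucket-based approaches discussed above try to break.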
Parallel algorithms for solving NSSP are reviewed in detail by Meyer and Sanders [48, 51].
There are no known PRAM algorithms that run in sub-linear time and O(m + n log n)
work. Parallel priority queues [24, 12] for implementing Dijkstra’s algorithm have been
developed, but these linear-work algorithms have a worst-case time bound of Ω(n), as they
only perform edge relaxations in parallel. Several matrix-multiplication based algorithms
[37, 29], proposed for the parallel All-Pairs Shortest Paths (APSP), involve running time
and efficiency trade-offs. Parallel approximate NSSP algorithms [43, 17, 57] based on the
randomized Breadth-First Search algorithm of Ullman and Yannakakis [60] run in sub-linear
time. However, it is not known how to use the Ullman-Yannakakis randomized approach for
exact NSSP computations in sub-linear time.
Meyer and Sanders give the ∆-stepping [51] NSSP algorithm that divides Dijkstra's algorithm
into a number of phases, each of which can be executed in parallel. For random graphs
with uniformly distributed edge weights, this algorithm runs in sub-linear time with linear
average case work. Several theoretical improvements [50, 46, 47] are given for ∆-stepping
(for instance, finding shortcut edges, adaptive bucket-splitting), but it is unlikely that they
would be faster than the simple ∆-stepping algorithm in practice, as the improvements in-
volve sophisticated data structures that are hard to implement efficiently. On a random
d-regular graph instance (2^19 vertices and d = 3), Meyer and Sanders report a speedup of
2
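A sequential sketch of ∆-stepping may help fix ideas: vertices are kept in buckets of width ∆ by tentative distance; the current bucket is processed in phases that relax only light edges (weight ≤ ∆), and heavy edges are relaxed once after the bucket empties. In the parallel algorithm, all relaxations within a phase proceed concurrently; this single-threaded version (with names and representation of our choosing) only shows the bucket logic.

```python
import math

def delta_stepping(adj, source, delta):
    """adj: {u: [(v, w), ...]} with non-negative edge weights w."""
    dist = {u: math.inf for u in adj}
    dist[source] = 0.0
    buckets = {0: {source}}  # bucket index -> set of vertices

    def relax(v, d):
        # Move v to the bucket matching its improved tentative distance.
        if d < dist[v]:
            if dist[v] != math.inf:
                buckets.get(int(dist[v] // delta), set()).discard(v)
            dist[v] = d
            buckets.setdefault(int(d // delta), set()).add(v)

    while True:
        nonempty = [j for j, b in buckets.items() if b]
        if not nonempty:
            break
        i = min(nonempty)
        settled = set()
        # Phase loop: relax light edges until bucket i stops refilling.
        while buckets.get(i):
            frontier = buckets.pop(i)
            settled |= frontier
            for u in frontier:
                for v, w in adj[u]:
                    if w <= delta:
                        relax(v, dist[u] + w)
        # Relax heavy edges once per vertex settled from this bucket.
        for u in settled:
            for v, w in adj[u]:
                if w > delta:
                    relax(v, dist[u] + w)
    return dist
```

The parameter ∆ trades work for parallelism: ∆ = minimum edge weight recovers Dijkstra-like behavior, while large ∆ approaches Bellman-Ford; the bucket-splitting refinements cited above adapt this width at runtime.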