MojoNation, and Freenet. Napster uses central directory
servers to locate files. Gnutella provides a similar, but dis-
tributed service using scoped broadcast queries, limiting
scalability. MojoNation [12] uses an online economic model to
encourage sharing of resources. Freenet [13] is a file-sharing
network designed to resist censorship. Neither Gnutella nor Freenet guarantees that files can be located, even in a functioning network.
The second generation of P2P systems comprises structured P2P overlay networks, including Tapestry [1], [2], Chord [8], Pastry [7], and CAN [6]. These overlays implement a basic key-based routing (KBR) interface that supports deterministic routing of messages to a live node responsible for the destination key. They can also support higher level interfaces such as
a distributed hash table (DHT) or a DOLR layer [3]. These sys-
tems scale well and guarantee that queries find existing objects
under nonfailure conditions.
One property that differentiates these systems is that CAN and Chord do not take network distances into account when constructing their routing overlays; thus, a given overlay hop may span the diameter of the network. Both protocols route
on the shortest overlay hops available and use runtime heuris-
tics to assist. In contrast, Tapestry and Pastry construct locally
optimal routing tables from initialization and maintain them in
order to reduce routing stretch.
While some systems fix the number and location of object
replicas by providing a DHT interface, Tapestry allows ap-
plications to place objects according to their needs. Tapestry
“publishes” location pointers throughout the network to facili-
tate efficient routing to those objects with low network stretch.
This technique makes Tapestry locality-aware [14]: queries for
nearby objects are generally satisfied in time proportional to the
distance between the query source and a nearby object replica.
Both Pastry and Tapestry share similarities with the work of Plaxton et al. [15] for a static network. Others [16], [17]
explore distributed object location schemes with provably
low search overhead, but they require precomputation, and so
are not suitable for dynamic networks. Recent works include
systems such as Kademlia [9], which uses XOR for overlay
routing, and Viceroy [10], which provides logarithmic hops
through nodes with constant degree routing tables. SkipNet
[11] uses a multidimensional skip-list data structure to support
overlay routing, maintaining both a DNS-based namespace for
operational locality and a randomized namespace for network
locality. Other overlay proposals [18], [19] attain lower bounds
on local routing state. Finally, proposals such as Brocade [20]
differentiate between local and interdomain routing to reduce
wide-area traffic and routing latency.
A new generation of applications has been proposed on top of these P2P systems, validating them as novel application infrastructures. Several systems provide application-level multicast: CAN-MC [21] (CAN), Scribe [22] (Pastry), and Bayeux [5]
(Tapestry). In addition, several decentralized file systems have
been proposed: CFS [23] (Chord), Mnemosyne [24] (Chord,
Tapestry), OceanStore [4] (Tapestry), and PAST [25] (Pastry).
Structured P2P overlays also support novel applications (e.g.,
attack resistant networks [26], network indirection layers [27],
and similarity searching [28]).
III. TAPESTRY ALGORITHMS
This section details Tapestry’s algorithms for routing and ob-
ject location and describes how network integrity is maintained
under dynamic network conditions.
A. DOLR Networking API
Tapestry provides a datagram-like communications interface,
with additional mechanisms for manipulating the locations of
objects. Before describing the API, we start with a couple of
definitions.
Tapestry nodes participate in the overlay and are assigned
nodeIDs uniformly at random from a large identifier space.
More than one node may be hosted by one physical host.
Application-specific endpoints are assigned globally unique
identifiers (GUIDs), selected from the same identifier space.
Tapestry currently uses an identifier space of 160-bit values
with a globally defined radix (e.g., hexadecimal, yielding
40-digit identifiers). Tapestry assumes nodeIDs and GUIDs
are roughly evenly distributed in the namespace, which can
be achieved by using a secure hashing algorithm like SHA-1
[29]. We say that node N has nodeID N_id, and an object O has GUID O_G.
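As a concrete illustration of this identifier scheme, the following is a minimal sketch (in Java) of deriving a 160-bit identifier by hashing a name with SHA-1 and printing it with the hexadecimal radix as 40 digits; the class name, method name, and zero-padding convention are assumptions made for illustration, not part of Tapestry.

import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch: hash an arbitrary name into a 160-bit identifier and
// render it with a globally defined radix (here hexadecimal, 40 digits).
public class TapestryIdSketch {
    static String toGuid(String name) throws NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(name.getBytes(StandardCharsets.UTF_8)); // 20 bytes = 160 bits
        String hex = new BigInteger(1, digest).toString(16);                // unsigned hex string
        // Left-pad so every identifier is exactly 40 hex digits long (an assumed convention).
        return "0".repeat(40 - hex.length()) + hex;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toGuid("some-object-name")); // a 40-digit hexadecimal GUID
    }
}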
Since the efficiency of Tapestry generally improves with net-
work size, it is advantageous for multiple applications to share a
single large Tapestry overlay network. To enable application co-
existence, every message contains an application-specific identifier A_id, which is used to select a process, or application, for message delivery at the destination [similar to the role of a port in transmission control protocol/Internet protocol (TCP/IP)], or an upcall handler where appropriate.
Given the above definitions, we state the four-part DOLR networking API as follows; a code sketch of these calls is given after the list.
1) PublishObject(O_G, A_id): Publish, or make available, object O on the local node. This call is best effort, and receives no confirmation.
2) UnpublishObject(O_G, A_id): Best-effort attempt to remove location mappings for O.
3) RouteToObject(O_G, A_id): Routes message to location of an object with GUID O_G.
4) RouteToNode(N, A_id, Exact): Route message to application A_id on node N. "Exact" specifies whether destination ID needs to be matched exactly to deliver payload.
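The following is a minimal sketch of this four-part interface as a Java interface declaration; the type choices (String identifiers, byte[] payloads) and the exact signatures are assumptions for illustration and are not drawn from any particular Tapestry implementation.

// Hypothetical rendering of the DOLR API described above.
public interface Dolr {
    // 1) Best-effort announcement that the object with GUID objectGuid is
    //    available on the local node; no confirmation is returned.
    void publishObject(String objectGuid, String appId);

    // 2) Best-effort attempt to remove the location mappings for the object.
    void unpublishObject(String objectGuid, String appId);

    // 3) Route a message toward a location (replica) of the object with this GUID.
    void routeToObject(String objectGuid, String appId, byte[] message);

    // 4) Route a message to application appId on the given node; 'exact' specifies
    //    whether the destination ID must be matched exactly to deliver the payload.
    void routeToNode(String nodeId, String appId, boolean exact, byte[] message);
}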
B. Routing and Object Location
Tapestry dynamically maps each identifier G to a unique live node called the identifier's root or G_R. If a node exists with N_id = G, then this node is the root of G. To deliver messages, each node maintains a routing table consisting of nodeIDs and IP addresses of the nodes with which it communicates. We refer to these nodes as neighbors of the local node. When routing toward G_R, messages are forwarded across neighbor links to nodes whose nodeIDs are progressively closer (i.e., matching larger prefixes) to G in the ID space.
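To make the prefix-progress rule concrete, the following is a small sketch that compares identifiers digit by digit and picks the neighbor whose nodeID shares the longest prefix with the destination; the helper names, the tie-breaking rule, and the toy four-digit IDs are illustrative assumptions rather than Tapestry's actual neighbor-map lookup.

import java.util.List;

// Illustrative sketch of choosing a next hop that makes prefix progress
// toward a destination identifier G.
public class PrefixRoutingSketch {
    // Number of leading digits two identifiers share.
    static int sharedPrefixLen(String a, String b) {
        int n = 0;
        while (n < a.length() && n < b.length() && a.charAt(n) == b.charAt(n)) n++;
        return n;
    }

    // Return the neighbor sharing the longest prefix with dest, provided it makes
    // strictly more progress than the local node; null means no neighbor improves
    // the match (the local node may itself be the destination's root).
    static String nextHop(String localId, String dest, List<String> neighbors) {
        int best = sharedPrefixLen(localId, dest);
        String next = null;
        for (String neighbor : neighbors) {
            int len = sharedPrefixLen(neighbor, dest);
            if (len > best) { best = len; next = neighbor; }
        }
        return next;
    }

    public static void main(String[] args) {
        // Toy 4-digit IDs for readability; real Tapestry IDs are 40 hex digits.
        String hop = nextHop("5230", "42AD", List.of("400F", "42A7", "AC78"));
        System.out.println(hop); // prints 42A7, which shares the prefix "42A" with 42AD
    }
}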
1) Routing Mesh: Tapestry uses local tables at each node, called neighbor maps, to route overlay messages to the destination ID digit by digit (e.g., 4*** ⇒ 42** ⇒ 42A* ⇒ 42AD, where *'s represent wildcards). This approach is similar