
The P2P resource pool explores a different dimension in which 1)
multiple instances of different applications can run
simultaneously and may overlap on the resources they use; and 2)
a peer may be helping an application instance of which it is not
a member. The first point illustrates the aggregated power of all
potential resources, and the second reflects and extends the very
collaborative principle upon which the P2P premise is founded.
For instance, all existing application-level multicasting (ALM)
algorithms assume that the only resources available are those in
the ALM session. In a collaborative environment, many other
stand-by resources could be included to reach a better solution.
For example, Microsoft Research has five branches across the
globe and many thousands of geographically distributed machines.
At any given hour, however, the number of active video-conference
sessions is likely to be only a handful, and each session may
have a small number of participants (say, fewer than 20).
The availability of a P2P resource pool offers new optimization
possibilities. As shown in Figure 1, when an otherwise idle but
suitable helping peer is identified, it can be integrated into a
topology with better performance. The figure shows an actual
output of the algorithm used in this paper.
Figure 1. (a) An optimal plan for an ALM. (b) An even better
plan using helper nodes in the resource pool. Circles are the
original members of the session, and the square is an available
peer with a large degree.
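The effect of a high-degree helper can be illustrated with a toy model. The sketch below is not the planning algorithm of this paper; it assumes unit-cost hops, a per-node bound on the number of children, and a simple greedy breadth-first construction. The function name greedy_tree_height and the degree values are hypothetical, chosen only for illustration.

```python
from collections import deque


def greedy_tree_height(root_degree, other_degrees):
    """Height in hops of a greedily built, degree-bounded multicast tree.

    Each degree is the maximum number of children a node can serve.
    Nodes are attached breadth-first, highest degree first, so that
    well-provisioned nodes end up close to the root.
    """
    pending = deque(sorted(other_degrees, reverse=True))
    frontier = deque([(root_degree, 0)])  # (free child slots, depth)
    height = 0
    while pending and frontier:
        slots, depth = frontier.popleft()
        for _ in range(slots):
            if not pending:
                break
            degree = pending.popleft()
            height = max(height, depth + 1)
            frontier.append((degree, depth + 1))
    if pending:
        raise ValueError("degree bounds too small to connect every node")
    return height


# Assumed scenario: a root plus seven session members, each able to
# forward to at most two children.
members = [2] * 7
without_helper = greedy_tree_height(2, members)

# Same session, plus one idle helper from the pool with degree 8.
with_helper = greedy_tree_height(2, members + [8])
```

In this toy scenario the helper sits directly below the root and fans out to most of the session, which shortens the tree. A real planner would also weigh link latencies and bandwidth, which this sketch ignores.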
The P2P resource pool has seen its first incarnation in PlanetLab
[24], a wide-area P2P testbed. On PlanetLab, researchers upload
experiments onto the machines comprising the testbed, where they
run concurrently with others. To date, its scale is still limited
(220 nodes as of the writing of this paper), and thus the need to
organize the pool in a more scalable and self-organizing fashion
is not yet pressing. This is one of the problems this paper
attempts to address.
2.2 Resource pool and its alternatives
We wish to give a more concrete definition of the resource pool by
contrasting it with another interesting alternative: the job pool.
This is necessary because, from a high-level perspective, both are
venues for matchmaking between jobs and resources, and neither is
perfect: depending on the application scenario, each has its own
strengths and weaknesses.
Informally speaking, a job pool is a collection of jobs where an
idle resource looks for suitable work to perform, whereas a
resource pool is precisely the opposite: a task manager goes into
a resource pool to discover and acquire the helping hands it
needs to accomplish a given mission. Of course, in a distributed
environment, there can be legitimate combinations of both.
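The two directions of matchmaking can be contrasted in a few lines. This is a minimal sketch under stated assumptions: the class names JobPool and ResourcePool, their methods, and the use of node degree as the only resource attribute are all hypothetical and serve only to show who initiates the match in each model.

```python
class JobPool:
    """Job pool: idle resources come looking for work."""

    def __init__(self, jobs):
        self._jobs = list(jobs)

    def grab(self):
        # An idle machine pulls the next work unit, if any remain.
        return self._jobs.pop(0) if self._jobs else None


class ResourcePool:
    """Resource pool: task managers come looking for helpers."""

    def __init__(self):
        self._idle = {}  # peer name -> degree (assumed attribute)

    def join(self, peer, degree):
        # An idle peer advertises itself and what it can offer.
        self._idle[peer] = degree

    def acquire(self, min_degree, count):
        # A task manager reserves up to `count` suitable peers.
        suitable = sorted(p for p, d in self._idle.items() if d >= min_degree)
        chosen = suitable[:count]
        for peer in chosen:
            del self._idle[peer]  # reserved peers leave the idle set
        return chosen


# Job pool: the resource initiates the match.
jobs = JobPool(["unit-1", "unit-2"])
first_unit = jobs.grab()

# Resource pool: the task manager initiates the match.
pool = ResourcePool()
pool.join("peer-a", degree=2)
pool.join("peer-b", degree=8)
pool.join("peer-c", degree=6)
helpers = pool.acquire(min_degree=4, count=1)
```

Note that acquire must know the current state of every idle peer, while grab needs no knowledge of the workers at all; this asymmetry is what makes accounting the central problem for a resource pool.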
A perfect example of a P2P job pool is SETI@home [18] (along with
many others of the same flavor). There is a well-maintained
central site and, typically, the application can be easily
parceled out for distribution. Machines register themselves to
grab a piece of work and then crunch away at it whenever they
please. This is a very economical model and requires very little
management at the central site. Provided that the jobs are of
coarse granularity, a centralized architecture works extremely
well. It has been reported that SETI@home has aggregated
computing power far exceeding some of the most powerful
supercomputers in the world. The limitation is also obvious.
Although advanced variations are conceivable, because of the
unpredictability of when and what resources will become
available, applications are restricted mostly to those
conventionally known as “embarrassingly parallel”. It is also
possible to implement the job pool as a distributed architecture,
but a centralized architecture is far easier to use.
In contrast, a node joins a resource pool in the hope that its power,
when otherwise idle, can be of some use. The economic incentive
can be stronger, especially in the context of P2P: tasks of arbitrary
type (beyond number crunching) will tap into the power of other
participants in the pool at some suitable point. The added
flexibility is particularly useful for applications such as
application-level multicasting sessions with some level of QoS
guarantee, because planning the topology of the tree is itself a
complex piece of work. On the other hand, this flexibility brings
several requirements. Foremost is an accurate accounting of what
is going on in the resource pool, which each job needs in order
to quickly query the available candidates and subsequently
reserve resources. The implication is that, given a pool that may
hold millions of resources, a client-server architecture in which
each client updates one central entity about its status is no
longer a scalable, let alone robust, alternative.
To summarize, a job pool is best for scenarios where the task can
be well partitioned. A resource pool can ideally accommodate
tasks of arbitrary type. However, it needs, at a minimum, a
scalable way of monitoring and aggregating system information so
that resource reservations can be carried out at the discretion
of the task managers that are responsible for individual incoming
jobs.
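One generic way to avoid a single central collector is to aggregate status reports along a tree, so that no node ever receives more than a bounded number of reports. The sketch below illustrates that idea only; it is not necessarily the mechanism this paper develops. The report fields idle and max_degree, the fan-out of 10, and the synthetic reports are assumptions made for the example.

```python
def merge(a, b):
    """Combine two status summaries into one."""
    return {
        "idle": a["idle"] + b["idle"],
        "max_degree": max(a["max_degree"], b["max_degree"]),
    }


def aggregate(reports, fanout):
    """Reduce per-peer reports level by level.

    Each interior node merges at most `fanout` child summaries, so the
    load per node stays bounded however large the pool grows.
    Returns the global summary and the number of tree levels used.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not reports:
        raise ValueError("need at least one report")
    level = list(reports)
    rounds = 0
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            summary = group[0]
            for report in group[1:]:
                summary = merge(summary, report)
            next_level.append(summary)
        level = next_level
        rounds += 1
    return level[0], rounds


# Synthetic pool of 1000 peers: every fourth peer is idle, and degrees
# cycle through 2..8 (both patterns are made up for the example).
reports = [
    {"idle": 1 if i % 4 == 0 else 0, "max_degree": i % 7 + 2}
    for i in range(1000)
]
summary, rounds = aggregate(reports, fanout=10)
```

With a fan-out of 10, a thousand reports collapse in three levels, and the number of levels grows only logarithmically with the pool size, in contrast to a central entity that must absorb every update itself.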
The principle of the resource pool is what motivates the work in
the Grid space, in particular the Condor-G line of work [14][8].
For instance, the Grid Resource Registration Protocol (GRRP) is
used by an entity, typically representing a cluster of machines,
to notify other entities that it is part of the pool. The Grid
Resource Information Protocol (GRIP), on the other hand, is the
primitive for constructing an aggregated resource directory
service through which tasks can query for potential candidates.
The Condor-G agent can use such infrastructure to submit jobs and
monitor their progress.
There have been many discussions about the convergence of P2P
and Grid [13][22]. We believe that there are indeed many
synergies between the two in the space of resource pool
organization. In particular, we argue that self-organization is
what the large body of excellent P2P work can bring to the Grid.
We will offer a more elaborate discussion at the conclusion of
Section 3.
3. BUILDING P2P RESOURCE POOL
The foundation of our resource pool proposal is the so-called
structured peer-to-peer system, and in particular the distributed
hash table (DHT). A DHT offers a way to pool together a
potentially unlimited amount of resources. But the capacity to pool