Operating R. Stockton Gaines
Systems Editor
Time, Clocks, and the
Ordering of Events in
a Distributed System
Leslie Lamport
Massachusetts Computer Associates, Inc.
The concept of one event happening before another
in a distributed system is examined, and is shown to
define a partial ordering of the events. A distributed
algorithm is given for synchronizing a system of logical
clocks which can be used to totally order the events.
The use of the total ordering is illustrated with a
method for solving synchronization problems. The
algorithm is then specialized for synchronizing physical
clocks, and a bound is derived on how far out of
synchrony the clocks can become.
Key Words and Phrases: distributed systems,
computer networks, clock synchronization, multiprocess
systems
CR Categories: 4.32, 5.29
Introduction
The concept of time is fundamental to our way of
thinking. It is derived from the more basic concept of
the order in which events occur. We say that something
happened at 3:15 if it occurred
after
our clock read 3:15
and
before
it read 3:16. The concept of the temporal
ordering of events pervades our thinking about systems.
For example, in an airline reservation system we specify
that a request for a reservation should be granted if it is
made
before
the flight is filled. However, we will see that
this concept must be carefully reexamined when consid-
ering events in a distributed system.
General permission to make fair use in teaching or research of all
or part of this material is granted to individual readers and to nonprofit
libraries
acting for them provided that ACM's copyright notice is given
and that reference is made to the publication, to its date of issue, and
to the fact that reprinting privileges were granted by permission of the
Association for Computing Machinery. To otherwise reprint a figure,
table, other substantial excerpt, or the entire work requires specific
permission as does republication, or systematic or multiple reproduc-
tion.
This work was supported by the Advanced Research Projects
Agency of the Department of Defense and Rome Air Development
Center. It was monitored by Rome Air Development Center under
contract number F 30602-76-C-0094.
Author's address: Computer Science Laboratory, SRI Interna-
tional, 333 Ravenswood Ave., Menlo Park CA 94025.
© 1978 ACM 0001-0782/78/0700-0558 $00.75
558
A distributed system consists of a collection of distinct
processes which are spatially separated, and which com-
municate with one another by exchanging messages. A
network of interconnected computers, such as the ARPA
net, is a distributed system. A single computer can also
be viewed as a distributed system in which the central
control unit, the memory units, and the input-output
channels are separate processes. A system is distributed
if the message transmission delay is not negligible com-
pared to the time between events in a single process.
We will concern ourselves primarily with systems of
spatially separated computers. However, many of our
remarks will apply more generally. In particular, a mul-
tiprocessing system on a single computer involves prob-
lems similar to those of a distributed system because of
the unpredictable order in which certain events can
occur.
In a distributed system, it is sometimes impossible to
say that one of two events occurred first. The relation
"happened before" is therefore only a partial ordering
of the events in the system. We have found that problems
often arise because people are not fully aware of this fact
and its implications.
In this paper, we discuss the partial ordering defined
by the "happened before" relation, and give a distributed
algorithm for extending it to a consistent total ordering
of all the events. This algorithm can provide a useful
mechanism for implementing a distributed system. We
illustrate its use with a simple method for solving syn-
chronization problems. Unexpected, anomalous behav-
ior can occur if the ordering obtained by this algorithm
differs from that perceived by the user. This can be
avoided by introducing real, physical clocks. We describe
a simple method for synchronizing these clocks, and
derive an upper bound on how far out of synchrony they
can drift.
The Partial Ordering
Most people would probably say that an event a
happened before an event b if a happened at an earlier
time than b. They might justify this definition in terms
of physical theories of time. However, if a system is to
meet a specification correctly, then that specification
must be given in terms of events observable within the
system. If the specification is in terms of physical time,
then the system must contain real clocks. Even if it does
contain real clocks, there is still the problem that such
clocks are not perfectly accurate and do not keep precise
physical time. We will therefore define the "happened
before" relation without using physical clocks.
We begin by defining our system more precisely. We
assume that the system is composed of a collection of
processes. Each process consists of a sequence of events.
Depending upon the application, the execution of a
subprogram on a computer could be one event, or the
execution of a single machine instruction could be one
Communications July 1978
of Volume 21
the ACM Number 7
评论0
最新资源