3
Delay Models
in
Data
Networks
3.1
INTRODUCTION
One of the most important perfonnance measures
of
a data network
is
the average delay
required to deliver a packet from origin to destination. Furthennore, delay considerations
strongly influence the choice and perfonnance
of
network algorithms, such as routing and
flow control. For these reasons, it
is
important to understand the nature and mechanism
of
delay, and the manner in which it depends on the characteristics
of
the network.
Queueing theory
is
the primary methodological framework for analyzing network
delay. Its use often requires simplifying assumptions since, unfortunately, more real-
istic assumptions make meaningful analysis extremely difficult. For this reason, it
is
sometimes impossible to obtain accurate quantitative delay predictions on the basis
of
queueing models. Nevertheless, these models often provide a basis for adequate delay
approximations, as well
as
valuable qualitative results and worthwhile insights.
In
what follows, we will mostly focus on packet delay within the communication
subnet
(i.e., the network layer). This delay
is
the sum
of
delays on each subnet link
traversed by the packet. Each link delay in tum consists of four components.
149
150
Delay Models
in
Data Networks Chap. 3
1.
The processinR delay between the time the packet
is
correctly received at the head
node
of
the link and the time the packet
is
assigned to an outgoing link queue
for transmission. (In some systems, we must add to this delay some additional
processing time at the DLC and physical layers.)
2.
The queueinR delay between the time the packet
is
assigned to a queue for trans-
mission and the time it starts being transmitted. During this time, the packet waits
while other packets in the transmission queue are transmitted.
3. The
transmission delay between the times that the first and last bits
of
the packet
are transmitted.
4. The
propagation delay between the time the last bit
is
transmitted
at
the head
node
of
the link and the time the last bit
is
received at the tail node. This is
proportional to the physical distance between transmitter and receiver; it can be
relatively substantial, particularly for a satellite link or a very high speed link.
This accounting neglects the possibility that a packet may require retransmission
on a link due to transmission errors or various other causes. For most links in practice,
other than multiaccess links to be considered in Chapter 4, retransmissions are rare and
will be neglected. The propagation delay depends on the physical characteristics
of
the
link and is independent
of
the traffic carried by the link. The processing delay
is
also
independent
of
the amount
of
traffic handled by the corresponding node
if
computation
power
is
not a limiting resource. This will be assumed in our discussion. Otherwise,
a separate processing queue must be introduced prior to the transmission queues. Most
of
our subsequent analysis focuses
on
the queueing and transmission delays.
We
first
consider a single transmission line and analyze some classical queueing models.
We
then
take up the network case and discuss the type
of
approximations involved
in
deriving
analytical delay models.
While
our
primary emphasis
is
on packet-switched network models, some
of
the
models developed are useful
in
a circuit-switched network context. Indeed, queueing
theory was developed extensively in response to the need for perfonnance models
in
telephony.
3.1.1 Multiplexing of Traffic
on
a Communication Link
The communication link considered
is
viewed
as
a bit pipe over which a given number
of
bits per second can be transmitted. This number
is
called the transmission capacity
of
the link.
It
depends on both the physical channel and the interface (e.g., modems), and
is
simply the rate at which the interface accepts bits. The link capacity may serve several
traffic streams
(e.g., virtual circuits or groups
of
virtual circuits) multiplexed on the link.
The manner
of
allocation
of
capacity among these traffic streams has a profound effect
on packet delay.
In the most common scheme,
statistical multiplexinR, the packets
of
all traffic
streams are merged into a single queue and transmitted on a first-come first-serve basis. A
variation
of
this scheme, which has roughly the same average delay per packet, maintains
Sec.
3.1
Introduction
151
a separate queue for each traffic stream and serves the queues in sequence one packet
at a time. However,
if
the queue
of
a traffic stream
is
empty, the next traffic stream
is
served and no communication resource
is
wasted. Since the entire transmission capacity
C (bits/sec)
is
allocated to a single packet at a time, it takes L / C seconds to transmit a
packet that
is
L bits long.
In time-division (TOM) and frequency-division multiplexing (FOM) with m traffic
streams, the link capacity
is
essentially subdivided into m
portions-one
per traffic
stream.
In
FOM, the channel bandwidth W
is
subdivided into m channels each with
bandwidth
W
/m
(actually slightly less because
of
the need for guard bands between
channels). The transmission capacity of each channel
is
roughly C
/m,
where C
is
the capacity that would be obtained
if
the entire bandwidth were allocated to a single
channel. The transmission time of a packet that
is
L bits long
is
Lm/C,
or m times
larger than
in
the corresponding statistical multiplexing scheme.
In
TOM, allocation
is
done by dividing the time axis into slots
of
fixed length (e.g., one bit or one byte long,
or perhaps one packet long for fixed length packets). Again, conceptually, we may view
the communication link
as
consisting
of
m separate links with capacity C
/m.
In the case
where the slots are short relative to packet length, we may again regard the transmission
time
of
a packet L bits long
as
Lm/C.
In
the case where the slots are
of
packet length,
the transmission time
of
an L bit packet
is
L/C,
but there
is
a wait
of
(m
-
1)
packet
transmission times between packets
of
the same stream.
One
of
the themes that will emerge from our queueing analysis
is
that statistical
multiplexing has smaller average delay per packet than either TOM or FOM. This
is
particularly true when the traffic streams multiplexed have a relatively low duty cycle.
The main reason for the poor delay performance
of
TOM and FOM
is
that communication
resources are wasted when allocated to a traffic stream with a momentarily empty queue,
while other traffic streams have packets waiting in their queue. For a traffic analogy,
consider an m-lane highway and two cases.
In
one case, cars are not allowed to cross
over to other lanes (this corresponds to TOM or FOM), while in the other case, cars can
change lanes (this corresponds roughly to statistical multiplexing). Restricting crossover
increases travel time for the same reason that the delay characteristics
of
TOM or FOM
are poor: namely, some system resources (highway lanes or communication channels)
may not be utilized, while others are momentarily stressed.
Under certain circumstances, TOM or FOM may have an advantage. Suppose that
each traffic stream has a "regular" character (i.e., all packets arrive sufficiently apart
so that no packet has to wait while the preceding packet
is
transmitted.)
If
these traffic
streams are merged into a single queue, it can be shown that the average delay per packet
will decrease, but the variance of waiting time in queue will generally become positive
(for an illustration, see Prob. 3.7). Therefore, if maintaining a small variability of delay
is more important than decreasing delay, it may be preferable to use TOM or FOM.
Another advantage
of
TOM and FOM
is
that there
is
no
need to include identification
of
the traffic stream on each packet, thereby saving some overhead and simplifying packet
processing at the nodes. Note also that when overhead
is
negligible, one can afford to
make packets very small, thereby reducing delay through pipelining
(cf. Fig. 2.37).
152
Delay Models
in
Data Networks
Chap. 3
3.2 QUEUEING MODELS-LITTLE'S THEOREM
We
consider queueing systems where customers arrive at random times to obtain service.
In the context
of
a data network, customers represent packets assigned
to
a communication
link for transmission. Service time corresponds to the packet transmission time and
is
equal to
LIC,
where L
is
the packet length in bits and C
is
the link transmission capacity
in
bits/sec. In this chapter it is convenient
to
ignore the layer 2 distinction between
packets and frames; thus packet lengths are taken to include frame headers and trailers.
In a somewhat different context (which we will not emphasize very much), customers
represent ongoing conversations (or virtual circuits) between points in a network and
service time corresponds to the duration
of
a conversation.
In
a related context, customers
represent active calls in a telephone or circuit switched network and again service time
corresponds to the duration
of
the call.
We
shall be typically interested in estimating quantities such as:
1.
The average number
of
customers
in
the system (i.e., the "typical" number
of
customers either waiting
in
queue or undergoing service)
2. The average delay per customer
(i.e., the "typical" time a customer spends waiting
in queue plus the service time).
These quantities will be estimated in terms
of
known information such as:
1.
The customer arrival rate (i.e., the "typical" number
of
customers entering the
system per unit time)
2. The customer service rate
(i.e., the "typical" number
of
customers the system serves
per unit time when it
is
constantly busy)
In
many cases the customer arrival and service rates are not sufficient
to
determine
the delay characteristics
of
the system. For example, if customers tend to arrive in
groups, the average customer delay will tend to be larger than when their arrival times
are regularly spaced apart. Thus to predict average delay, we will typically need more
detailed (statistical) information about the customer interarrival and service times. In
this section, however,
we
will largely ignore the availability
of
such information and see
how far we can go without
it.
3.2.1
Little's Theorem
We
proceed to clarify the meaning
of
the terms "average" and "typical" that we used
somewhat liberally above in connection with the number
of
customers in the system, the
customer delay, and so on. In doing so we will derive an important result known
as
Little's Theorem.
Suppose that we observe a sample history
of
the system from time t = 0 to
the indefinite future and
we
record the values
of
various quantities
of
interest
as
time
Sec. 3.2 Queueing Models-Little's Theorem
153
(3.1)
progresses.
In
particular, let
N(t)
= Number
of
customers in the system at time t
aCt) = Number
of
customers who arrived in the interval [0,
t]
T;
= Time spent
in
the system by the i
th
arriving customer
Our intuitive notion
of
the "typical" number
of
customers
in
the system observed up to
time
t
is
Nt = t
N(T)
dT
. t
io
which we call the time average
of
N(T)
up to time t. Naturally,
Nt
changes with the
time
t, but in many systems
of
interest, Nt tends to a steady-state N as t increases, that
is,
N
= lim
Nt
In
this case, we call N the steady-state time average (or simply time average) of N(T).
It
is
also natural to view
At
=
a(t)
t
as the time GI'erage arrival rate over the interval [0.
tj.
The steady-state arrival rate
is
defined
as
A=
lim
At
t-x
(assuming that the limit exists). The time average
of
the customer delay up to time t
is
similarly defined as
,\,0(1)
T
T -
6;=0
I
t -
net)
that is, the average time spent in the system per customer up to time
t.
The steady-state
time average customer delay
is
defined as
T
= lim T
t
(assuming that the limit exists).
It
turns out that the quantities
N,
A,
and T above are related by a simple formula
that makes it possible to determine one given the other. This result, known as Little's
Theorem, has the form
N=AT
Little's Theorem expresses the natural idea that crowded systems (large
N)
are associated
with long customer delays (large
T)
and reversely. For example, on a rainy day, traffic
on a rush hour moves slower than average (large
T),
while the streets are more crowded
(large
N).
Similarly, a fast-food restaurant (small
T)
needs a smaller waiting room
(small
N)
than a regular restaurant for the same customer arrival rate.