edly write (and read) that data item.
Later, if one of the other CPUs attempts to access
the data item, it will incur a cache miss, this time
because the first CPU invalidated the item in order
to write to it. This type of cache miss is termed
a “communication miss”, since it is usually due to
several CPUs using the data items to communicate
(for example, a lock is a data item that is used to
communicate among CPUs using a mutual-exclusion
algorithm).
Clearly, much care must be taken to ensure that
all CPUs maintain a coherent view of the data. With
all this fetching, invalidating, and writing, it is easy
to imagine data being lost or (perhaps worse) differ-
ent CPUs having conflicting values for the same data
item in their respective caches. These problems are
prevented by “cache-coherency protocols”, described
in the next section.
2 Cache-Coherence Protocols
Cache-coherency protocols manage cache-line states
so as to prevent inconsistent or lost data. These
protocols can be quite complex, with many tens of
states,
2
but for our purposes we need only concern
ourselves with the four-state MESI cache-coherence
protocol.
2.1 MESI States
MESI stands for “modified”, “exclusive”, “shared”,
and “invalid”, the four states a given cache line can
take on using this protocol. Caches using this proto-
col therefore maintain a two-bit state “tag” on each
cache line in addition to that line’s physical address
and data.
A line in the “modified” state has been subject to
a recent memory store from the corresponding CPU,
and the corresponding memory is guaranteed not to
appear in any other CPU’s cache. Cache lines in the
“modified” state can thus be said to be “owned” by
2
See Culler et al. [CSG99] pages 670 and 671 for the nine-
state and 26-state diagrams for SGI Origin2000 and Sequent
(now IBM) NUMA-Q, respectively. Both diagrams are signif-
icantly simpler than real life.
the CPU. Because this cache holds the only up-to-
date copy of the data, this cache is ultimately respon-
sible for either writing it back to memory or handing
it off to some other cache, and must do so before
reusing this line to hold other data.
The “exclusive” state is very similar to the “modi-
fied” state, the single exception being that the cache
line has not yet been modified by the correspond-
ing CPU, which in turn means that the copy of the
cache line’s data that resides in memory is up-to-
date. However, since the CPU can store to this line
at any time, without consulting other CPUs, a line
in the “exclusive” state can still be said to be owned
by the corresponding CPU. That said, because the
corresponding value in memory is up to date, this
cache can discard this data without writing it back
to memory or handing it off to some other CPU.
A line in the “shared” state might be replicated in
at least one other CPU’s cache, so that this CPU is
not permitted to store to the line without first con-
sulting with other CPUs. As with the “exclusive”
state, because the corresponding value in memory is
up to date, this cache can discard this data without
writing it back to memory or handing it off to some
other CPU.
A line in the “invalid” state is empty, in other
words, it holds no data. When new data enters the
cache, it is placed into a cache line that was in the
“invalid” state if possible. This approach is preferred
because replacing a line in any other state could re-
sult in an expensive cache miss should the replaced
line be referenced in the future.
Since all CPUs must maintain a coherent view
of the data carried in the cache lines, the cache-
coherence protocol provides messages that coordinate
the movement of cache lines through the system.
2.2 MESI Protocol Messages
Many of the transitions described in the previous sec-
tion require communication among the CPUs. If the
CPUs are on a single shared bus, the following mes-
sages suffice:
Read: The “read” message contains the physical ad-
dress of the cache line to be read.
3