This book is devoted to the theory of probabilistic information measures and
their application to coding theorems for information sources and noisy channels.
The eventual goal is a general development of Shannon's mathematical
theory of communication, but much of the space is devoted to the tools and
methods required to prove the Shannon coding theorems. These tools form an
area common to ergodic theory and information theory and comprise several
quantitative notions of the information in random variables, random processes,
and dynamical systems. Examples are entropy, mutual information, conditional
entropy, conditional information, and discrimination or relative entropy, along
with the limiting normalized versions of these quantities such as entropy rate
and information rate. Much of the book is concerned with their properties,
especially the long-term asymptotic behavior of sample information and expected
information.
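
For concreteness, standard discrete-alphabet forms of several of the measures
just listed can be sketched as follows; the notation ($p$ and $q$ probability
mass functions on a finite alphabet $A$, and $\{X_n\}$ a random process) is
assumed here for illustration and need not match the conventions adopted later
in the book:
\begin{align*}
H(X) &= -\sum_{a \in A} p(a) \log p(a) && \text{(entropy)}\\
I(X;Y) &= H(X) + H(Y) - H(X,Y) && \text{(mutual information)}\\
H(X \mid Y) &= H(X,Y) - H(Y) && \text{(conditional entropy)}\\
D(p \,\|\, q) &= \sum_{a \in A} p(a) \log \frac{p(a)}{q(a)} && \text{(relative entropy)}\\
\bar{H} &= \lim_{n \to \infty} \frac{1}{n}\, H(X_0, \ldots, X_{n-1}) && \text{(entropy rate)}
\end{align*}
For a stationary process the limit defining the entropy rate exists, and the
normalized information rate is obtained from mutual information in the same way.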