Information and Entropy (MIT 6.050J)
Preface
These notes, and the related freshman-level course, are about information. Although you may have
a general idea of what information is, you may not realize that the information you deal with can be
quantified. That’s right, you can often measure the amount of information and use general principles about
how information behaves. We will consider applications to computation and to communications, and we will
also look at general laws in other fields of science and engineering.
One of these general laws is the Second Law of Thermodynamics. Although thermodynamics, a branch of
physics, deals with physical systems, the Second Law is approached here as an example of information pro-
cessing in natural and engineered systems. The Second Law as traditionally stated in thermodynamics deals
with a physical quantity known as “entropy.” Everybody has heard of entropy, but few really understand
it. Only recently has entropy been widely accepted as a form of information.
The Second Law is surely one of science’s most glorious achievements, but as usually taught, through
physical systems and models such as ideal gases, it is difficult to appreciate at an elementary level. On the
other hand, the forms of the Second Law that apply to computation and communications (where information
is processed) are more easily understood, especially today as the information revolution is getting under way.
These notes and the course based on them are intended for first-year university students. Although they
were developed at the Massachusetts Institute of Technology, a university that specializes in science and
engineering, and one at which a full year of calculus is required, calculus is not used much at all in these
notes. Most examples are from discrete systems, not continuous systems, so that sums and differences can
be used instead of integrals and derivatives. In other words, we can use algebra instead of calculus. This
course should be accessible to university students who are not studying engineering or science, and even to
well prepared high-school students. In fact, when combined with a similar one about energy, this course
could provide excellent science background for liberal-arts students.
The analogy between information and energy is interesting. In both high-school and first-year college
physics courses, students learn that there is a physical quantity known as energy which is conserved (it cannot
be created or destroyed), but which can appear in various forms (potential, kinetic, electric, chemical, etc.).
Energy can exist in one region of space or another, can flow from one place to another, can be stored for
later use, and can be converted from one form to another. In many ways the industrial revolution was all
about harnessing energy for useful purposes. But most important, energy is conserved—despite its being
moved, stored, and converted, at the end of the day there is still exactly the same total amount of energy.
This conservation of energy principle is sometimes known as the First Law of Thermodynamics. It has
proven to be so important and fundamental that whenever a “leak” was found, the theory was rescued by
defining a new form of energy. One example of this occurred in 1905 when Albert Einstein recognized that
mass is a form of energy, as expressed by his famous formula E = mc². That understanding later enabled
the development of devices (atomic bombs and nuclear power plants) that convert energy from its form as
mass to other forms.
But what about information? If entropy is really a form of information, there should be a theory that
covers both and describes how information can be turned into entropy or vice versa. Such a theory is not
yet well developed, for several historical reasons. Yet it is exactly what is needed to simplify the teaching
and understanding of fundamental concepts.
These notes present such a unified view of information, in which entropy is one kind of information,
but in which there are other kinds as well. Like energy, information can reside in one place or another,
it can be transmitted through space, and it can be stored for later use. But unlike energy, information is
not conserved: the Second Law states that entropy never decreases as time goes on—it generally increases,
but may in special cases remain constant. Also, information is inherently subjective, because it deals with
what you know and what you don’t know (entropy, as one form of information, is also subjective—this point
makes some physicists uneasy). These two facts make the theory of information different from theories that
deal with conserved quantities such as energy—different and also interesting.
The unified framework presented here has never before been developed specifically for freshmen. It is
not entirely consistent with conventional thinking in various disciplines, which were developed separately. In
fact, we may not yet have it right. One consequence of teaching at an elementary level is that unanswered
questions close at hand are disturbing, and their resolution demands new research into the fundamentals.
Trying to explain things rigorously but simply often requires new organizing principles and new approaches.
In the present case, the new approach is to start with information and work from there to entropy, and the
new organizing principle is the unified theory of information.
This will be an exciting journey. Welcome aboard!
Chapter 1
Bits
Information is measured in bits, just as length is measured in meters and time is measured in seconds. Of
course knowing the amount of information, in bits, is not the same as knowing the information itself, what
it means, or what it implies. In these notes we will not consider the content or meaning of information, just
the quantity.
Different scales of length are needed in different circumstances. Sometimes we want to measure length
in kilometers, sometimes in inches, and sometimes in Ångströms. Similarly, other scales for information
besides bits are sometimes used; in the context of physical systems information is often measured in Joules
per Kelvin.
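As an aside not found in the notes themselves, the connection between these two scales is a single multiplicative constant: one bit of information corresponds to Boltzmann's constant times the natural logarithm of 2, in joules per kelvin. A minimal sketch in Python:

```python
import math

# Boltzmann's constant in joules per kelvin (exact SI value).
K_B = 1.380649e-23

def bits_to_joules_per_kelvin(bits):
    """Entropy in J/K equivalent to a given amount of information in bits:
    S = k_B * ln(2) * (number of bits)."""
    return K_B * math.log(2) * bits

print(bits_to_joules_per_kelvin(1))  # roughly 9.57e-24 J/K per bit
```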
How is information quantified? Consider a situation or experiment that could have any of several possible
outcomes. Examples might be flipping a coin (2 outcomes, heads or tails) or selecting a card from a deck of
playing cards (52 possible outcomes). How compactly could one person (by convention often named Alice)
tell another person (Bob) the outcome of such an experiment or observation?
First consider the case of the two outcomes of flipping a coin, and let us suppose they are equally likely.
If Alice wants to tell Bob the result of the coin toss, she could use several possible techniques, but they
are all equivalent, in terms of the amount of information conveyed, to saying either “heads” or “tails” or to
saying 0 or 1. We say that the information so conveyed is one bit.
If Alice flipped two coins, she could say which of the four possible outcomes actually happened, by saying
0 or 1 twice. Similarly, the result of an experiment with eight equally likely outcomes could be conveyed with
three bits, and more generally 2ⁿ outcomes with n bits. Thus the amount of information is the logarithm
(to the base 2) of the number of equally likely outcomes.
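This rule is easy to check numerically. The following sketch (an illustration added here, not part of the notes) computes the amount of information for the examples mentioned above:

```python
import math

def bits_of_information(num_equally_likely_outcomes):
    """Information, in bits, conveyed by naming one of several
    equally likely outcomes: the base-2 logarithm of their number."""
    return math.log2(num_equally_likely_outcomes)

print(bits_of_information(2))   # coin flip: 1.0 bit
print(bits_of_information(8))   # eight equally likely outcomes: 3.0 bits
print(bits_of_information(52))  # card drawn from a deck: about 5.7 bits
```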
Note that conveying information requires two phases. First is the “setup” phase, in which Alice and Bob
agree on what they will communicate about, and exactly what each sequence of bits means. This common
understanding is called the code. For example, to convey the suit of a card chosen from a deck, their code
might be that 00 means clubs, 01 diamonds, 10 hearts, and 11 spades. Agreeing on the code may be done
before any observations have been made, so there is not yet any information to be sent. The setup phase can
include informing the recipient that there is new information. Then, there is the “outcome” phase, where
actual sequences of 0 and 1 representing the outcomes are sent. These sequences are the data. Using the
agreed-upon code, Alice draws the card, and tells Bob the suit by sending two bits of data. She could do so
repeatedly for multiple experiments, using the same code.
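The two phases can be sketched directly in code. The dictionary below is simply the suit code from the example written out in Python; it is an illustration added here, not part of the notes:

```python
# Setup phase: Alice and Bob agree on the code before any card is drawn.
SUIT_CODE = {"clubs": "00", "diamonds": "01", "hearts": "10", "spades": "11"}
DECODE = {bits: suit for suit, bits in SUIT_CODE.items()}

def alice_encode(suit):
    """Outcome phase: Alice sends two bits of data describing the suit."""
    return SUIT_CODE[suit]

def bob_decode(bits):
    """Bob recovers the suit from the data using the agreed-upon code."""
    return DECODE[bits]

data = alice_encode("hearts")
print(data, bob_decode(data))  # 10 hearts
```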
After Bob knows that a card is drawn but before receiving Alice’s message, he is uncertain about the suit.
His uncertainty, or lack of information, can be expressed in bits. Upon hearing the result, his uncertainty is
reduced by the information he receives. Bob’s uncertainty rises during the setup phase and then is reduced
during the outcome phase.
Note some important things about information, some of which are illustrated in this example:
• Information can be learned through observation, experiment, or measurement
• Information is subjective, or “observer-dependent.” What Alice knows is different from what Bob
knows (if information were not subjective, there would be no need to communicate it)
• A person’s uncertainty can be increased upon learning that there is an observation about which infor-
mation may be available, and then can be reduced by receiving that information
• Information can be lost, either through loss of the data itself, or through loss of the code
• The physical form of information is localized in space and time. As a consequence,
– Information can be sent from one place to another
– Information can be stored and then retrieved later
1.1 The Boolean Bit
As we have seen, information can be communicated by sequences of 0 and 1 values. By using only 0 and
1, we can deal with data from many different types of sources, and not be concerned with what the data
means. We are thereby using abstract, not specific, values. This approach lets us ignore many messy details
associated with specific information processing and transmission systems.
Bits are simple, having only two possible values. The mathematics used to denote and manipulate single
bits is not difficult. It is known as Boolean algebra, after the mathematician George Boole (1815–1864).
In some ways Boolean algebra is similar to the algebra of integers or real numbers which is taught in high
school, but in other ways it is different.
Algebra is a branch of mathematics that deals with variables that have certain possible values, and with
functions which, when presented with one or more variables, return a result which again has certain possible
values. In the case of Boolean algebra, the possible values are 0 and 1.
First consider Boolean functions of a single variable that return a single value. There are exactly four of
them. One, called the identity, simply returns its argument. Another, called not (or negation, inversion, or
complement) changes 0 into 1 and vice versa. The other two simply return either 0 or 1 regardless of the
argument. Here is a table showing these four functions:
Argument   IDENTITY   NOT   ZERO   ONE
   0           0        1     0      0
   1           1        0     0      1
Table 1.1: Boolean functions of a single variable
Note that Boolean algebra is simpler than algebra dealing with integers or real numbers, each of which
has infinitely many functions of a single variable.
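For readers who like to experiment, the four functions of Table 1.1 can be written out and tabulated in a few lines of Python; this is an illustration added here, with the Python integers 0 and 1 standing in for the Boolean values (convenient for printing, even though the two should not be confused in general):

```python
# The four Boolean functions of a single variable, as in Table 1.1.
IDENTITY = lambda x: x
NOT      = lambda x: 1 - x
ZERO     = lambda x: 0
ONE      = lambda x: 1

for name, f in [("IDENTITY", IDENTITY), ("NOT", NOT), ("ZERO", ZERO), ("ONE", ONE)]:
    print(f"{name:8s}  f(0)={f(0)}  f(1)={f(1)}")
```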
Next, consider Boolean functions with two input variables A and B and one output value C. How many
are there? Each of the two arguments can take on either of two values, so there are four possible input
patterns (00, 01, 10, and 11). Think of each Boolean function of two variables as a string of Boolean values
0 and 1 of length equal to the number of possible input patterns, i.e., 4. There are exactly 16 (2⁴) different
ways of composing such strings, and hence exactly 16 different Boolean functions of two variables. Of these
16, two simply ignore the input, four assign the output to be either A or B or their complement, and the
other ten depend on both arguments. The most often used are AND, OR, XOR (exclusive or), NAND (not
and), and NOR (not or), shown in Table 1.2. (In a similar way, because there are 8 possible input patterns
for three-input Boolean functions, there are 2⁸ or 256 different Boolean functions of three variables.)

Argument   AND   NAND   OR   NOR   XOR
   00       0      1     0     1     0
   01       0      1     1     0     1
   10       0      1     1     0     1
   11       1      0     1     0     0

Table 1.2: Five of the 16 possible Boolean functions of two variables
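The counting argument for two-variable functions can be verified by brute force. The sketch below (an illustration added here, not part of the notes) enumerates all 16 functions and sorts them into the three groups just described:

```python
from itertools import product

INPUT_PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# A two-variable Boolean function is determined by its four outputs on
# the patterns 00, 01, 10, 11, so there are 2**4 = 16 such functions.
ignore_inputs = depend_on_one = depend_on_both = 0
for outputs in product((0, 1), repeat=4):
    f = dict(zip(INPUT_PATTERNS, outputs))
    uses_a = any(f[(0, b)] != f[(1, b)] for b in (0, 1))
    uses_b = any(f[(a, 0)] != f[(a, 1)] for a in (0, 1))
    if not uses_a and not uses_b:
        ignore_inputs += 1      # the two constant functions
    elif uses_a and uses_b:
        depend_on_both += 1     # AND, OR, XOR, NAND, NOR, and five others
    else:
        depend_on_one += 1      # A, B, and their complements

print(ignore_inputs, depend_on_one, depend_on_both)  # 2 4 10
```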
It is tempting to think of the Boolean values 0 and 1 as the integers 0 and 1. Then AND would correspond
to multiplication and OR to addition, sort of. However, familiar results from ordinary algebra simply do not
hold for Boolean algebra, so such analogies are dangerous. It is important to distinguish the integers 0 and
1 from the Boolean values 0 and 1; they are not the same.
There is a standard notation used in Boolean algebra. (This notation is sometimes confusing, but other
notations that are less confusing are awkward in practice.) The AND function is represented the same way
as multiplication, by writing two Boolean values next to each other or with a dot in between: A AND B is
written AB or A · B. The OR function is written using the plus sign: A + B means A OR B. Negation,
or the NOT function, is denoted by a bar over the symbol or expression, so NOT A is Ā. Finally, the
exclusive-or function XOR is represented by a circle with a plus sign inside, A ⊕ B.
NOT     Ā
AND     A · B
NAND    A · B   (with a bar over the whole expression)
OR      A + B
NOR     A + B   (with a bar over the whole expression)
XOR     A ⊕ B
Table 1.3: Boolean logic symbols
There are other possible notations for Boolean algebra. The one used here is the most common. Sometimes
AND, OR, and NOT are represented in the form AND(A, B), OR(A, B), and NOT(A). Sometimes infix
notation is used, where A ∧ B denotes A · B, A ∨ B denotes A + B, and ∼A denotes Ā. Boolean algebra
is also useful in mathematical logic, where the notation A ∧ B for A · B, A ∨ B for A + B, and ¬A for Ā is
commonly used.
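In code, the same operations are often written with bitwise operators. The sketch below (an illustration added here, with Python's integers 0 and 1 standing in for the Boolean values) reproduces Table 1.2 using that notation:

```python
def AND(a, b):  return a & b          # written AB or A · B in the text
def OR(a, b):   return a | b          # written A + B
def NOT(a):     return 1 - a          # written with a bar over the symbol
def XOR(a, b):  return a ^ b          # written A ⊕ B
def NAND(a, b): return NOT(AND(a, b))
def NOR(a, b):  return NOT(OR(a, b))

# Reproduce Table 1.2 row by row.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, AND(a, b), NAND(a, b), OR(a, b), NOR(a, b), XOR(a, b))
```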
Several general properties of Boolean functions are useful. These can be proven by simply demonstrating
that they hold for all possible input values. For example, a function is said to be reversible if, knowing
the output, the input can be determined. Two of the four functions of a single variable are reversible in this
sense (and in fact are self-inverse). Clearly none of the functions of two (or more) inputs can by themselves
be reversible, since there are more input variables than output variables. However, some combinations of
two or more such functions can be reversible if the resulting combination has the same number of outputs
as inputs; for example, it is easily demonstrated that the exclusive-or function A ⊕ B is reversible when
augmented by the function that returns the first argument—that is to say, more precisely, the function of
two variables that has two outputs, one A ⊕ B and the other A, is reversible.
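This claim is small enough to check exhaustively. A minimal sketch (an illustration added here, not part of the notes):

```python
from itertools import product

# The two-input, two-output function (A, B) -> (A XOR B, A).
mapping = {(a, b): (a ^ b, a) for a, b in product((0, 1), repeat=2)}

# Reversible means distinct inputs give distinct outputs: all four
# output pairs must be different, so the mapping can be inverted.
assert len(set(mapping.values())) == 4

# Explicit inversion: A is the second output, and B = (A XOR B) XOR A.
for (a, b), (x, y) in mapping.items():
    assert (y, x ^ y) == (a, b)

print("(A XOR B, A) is reversible")
```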
For functions of two variables, there are many properties to consider. For example, a function of two
variables A and B is said to be commutative if its value is unchanged when A and B are interchanged, i.e.,
if f(A, B) = f(B, A). Thus the function AND is commutative because A · B = B · A. Some of the other 15
functions are also commutative. Some other properties of Boolean functions are illustrated in the identities
in Table 1.4.
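Exactly how many of the 16 functions are commutative can again be settled by brute force. Commutativity only constrains the outputs on the patterns 01 and 10 to agree, leaving the other two outputs free, so one expects 2³ = 8 such functions; the sketch below (an illustration added here, not part of the notes) confirms this:

```python
from itertools import product

INPUT_PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]

commutative = 0
for outputs in product((0, 1), repeat=4):
    f = dict(zip(INPUT_PATTERNS, outputs))
    # Commutative: swapping the arguments never changes the output.
    if all(f[(a, b)] == f[(b, a)] for a, b in INPUT_PATTERNS):
        commutative += 1

print(commutative)  # 8 of the 16 two-variable functions are commutative
```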