1 Multivariate Gaussians
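A vector-valued random variable $x \in \mathbb{R}^n$ is said to have a multivariate normal (or Gaussian) distribution with mean $\mu \in \mathbb{R}^n$ and covariance matrix $\Sigma \in \mathbb{S}^n_{++}$ if
\[
p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right). \tag{1}
\]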
We write this as $x \sim \mathcal{N}(\mu, \Sigma)$. Here, recall from the section notes on linear algebra that $\mathbb{S}^n_{++}$ refers to the space of symmetric positive definite $n \times n$ matrices.⁵
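As a rough illustration of the density in (1), the following sketch evaluates it with NumPy. The function names `gaussian_log_density` and `gaussian_density` and the example values of `mu` and `Sigma` are assumptions made for this example, not part of the notes. Working through a Cholesky factor instead of forming $\Sigma^{-1}$ explicitly is one common choice, not the only one.

```python
import numpy as np

def gaussian_log_density(x, mu, Sigma):
    """Log of the density in Eq. (1); assumes Sigma is symmetric positive definite."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n = mu.shape[0]
    # Cholesky factor L with Sigma = L L^T.
    # np.linalg.cholesky raises LinAlgError if the factorization fails,
    # which happens when Sigma is not positive definite.
    L = np.linalg.cholesky(Sigma)
    # Solve L z = (x - mu), so that z^T z = (x - mu)^T Sigma^{-1} (x - mu).
    z = np.linalg.solve(L, x - mu)
    # log|Sigma| = 2 * sum of the logs of the diagonal entries of L.
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + z @ z)

def gaussian_density(x, mu, Sigma):
    """Density p(x; mu, Sigma) from Eq. (1)."""
    return np.exp(gaussian_log_density(x, mu, Sigma))

# Illustrative values only (not taken from the notes).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# At x = mu the exponent vanishes, leaving only the normalizing constant.
p_at_mean = gaussian_density(mu, mu, Sigma)
```

At $x = \mu$ the quadratic form is zero, so `p_at_mean` should equal $1 / ((2\pi)^{n/2} |\Sigma|^{1/2})$, which makes a convenient sanity check.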
Generally speaking, Gaussian random variables are extremely useful in machine learning
and statistics for two main reasons. First, they are extremely common when modeling “noise”
in statistical algorithms. Quite often, noise can be considered to be the accumulation of a
large number of small independent random perturbations affecting the measurement process;
by the Central Limit Theorem, summations of independent random variables will tend to
“look Gaussian.” Second, Gaussian random variables are convenient for many analytical
manipulations, because many of the integrals involving Gaussian distributions that arise in
practice have simple closed form solutions. In the remainder of this section, we will review
a number of useful properties of multivariate Gaussians.
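Consider a random vector $x \in \mathbb{R}^n$ with $x \sim \mathcal{N}(\mu, \Sigma)$. Suppose also that the variables in $x$ have been partitioned into two sets $x_A = [x_1 \cdots x_r]^T \in \mathbb{R}^r$ and $x_B = [x_{r+1} \cdots x_n]^T \in \mathbb{R}^{n-r}$ (and similarly for $\mu$ and $\Sigma$), such that
\[
x = \begin{bmatrix} x_A \\ x_B \end{bmatrix}, \qquad
\mu = \begin{bmatrix} \mu_A \\ \mu_B \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} \Sigma_{AA} & \Sigma_{AB} \\ \Sigma_{BA} & \Sigma_{BB} \end{bmatrix}.
\]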
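The block partition above can be sketched with NumPy slicing. The sizes `n` and `r`, the helper matrix `B`, and the particular values of `mu` and `Sigma` are hypothetical and chosen only to make the shapes visible.

```python
import numpy as np

n, r = 5, 2  # hypothetical sizes: x_A has r entries, x_B has n - r entries

# Illustrative mean and covariance. B @ B.T is positive semidefinite, and
# adding a positive multiple of the identity makes Sigma positive definite.
mu = np.arange(n, dtype=float)
B = np.arange(n * n, dtype=float).reshape(n, n) / 10.0
Sigma = B @ B.T + n * np.eye(n)

# Partition the mean vector.
mu_A, mu_B = mu[:r], mu[r:]

# Partition the covariance into its four blocks.
Sigma_AA = Sigma[:r, :r]   # r x r
Sigma_AB = Sigma[:r, r:]   # r x (n - r)
Sigma_BA = Sigma[r:, :r]   # (n - r) x r
Sigma_BB = Sigma[r:, r:]   # (n - r) x (n - r)

# Reassembling the blocks recovers the original mean and covariance.
mu_rebuilt = np.concatenate([mu_A, mu_B])
Sigma_rebuilt = np.block([[Sigma_AA, Sigma_AB],
                          [Sigma_BA, Sigma_BB]])
```

Because `Sigma` is symmetric, the off-diagonal blocks produced this way are transposes of one another.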
Here, $\Sigma_{AB} = \Sigma_{BA}^T$ since $\Sigma = E[(x - \mu)(x - \mu)^T] = \Sigma^T$. The following properties hold:
1. Normalization. The density function normalizes, i.e.,
\[
\int_x p(x; \mu, \Sigma)\, dx = 1.
\]
This property, though seemingly trivial at first glance, turns out to be immensely useful for evaluating all sorts of integrals, even ones which appear to have no relation to probability distributions at all (see Appendix A.1)!
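As a numerical sanity check of the normalization property, the sketch below approximates the integral of a two-dimensional Gaussian density with a Riemann sum on a grid. The values of `mu` and `Sigma`, the grid width of eight standard deviations, and the grid resolution are all assumptions chosen for illustration. The second quantity shows how normalization can be read as an integration formula: with $n = 2$, the integral of the unnormalized exponential should be close to $2\pi |\Sigma|^{1/2}$.

```python
import numpy as np

# Illustrative parameters (not from the notes); Sigma is symmetric positive definite.
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
Sigma_inv = np.linalg.inv(Sigma)

# For n = 2 the factor (2*pi)^(n/2) is simply 2*pi.
norm_const = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))

# Grid covering mu +/- 8 marginal standard deviations in each coordinate.
half_width = 8.0 * np.sqrt(np.diag(Sigma))
x1 = np.linspace(mu[0] - half_width[0], mu[0] + half_width[0], 801)
x2 = np.linspace(mu[1] - half_width[1], mu[1] + half_width[1], 801)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")

# Quadratic form (x - mu)^T Sigma^{-1} (x - mu) at every grid point.
D = np.stack([X1 - mu[0], X2 - mu[1]], axis=-1)   # shape (801, 801, 2)
quad = np.einsum("...i,ij,...j->...", D, Sigma_inv, D)

# Riemann-sum approximations of the two integrals.
cell_area = (x1[1] - x1[0]) * (x2[1] - x2[0])
unnormalized_integral = np.exp(-0.5 * quad).sum() * cell_area
integral = norm_const * unnormalized_integral
```

Under these assumptions, `integral` should come out very close to 1 and `unnormalized_integral` very close to `2 * np.pi * np.sqrt(np.linalg.det(Sigma))`.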
2. Marginalization. The marginal densities,
\[
p(x_A) = \int_{x_B} p(x_A, x_B; \mu, \Sigma)\, dx_B
\]
\[
p(x_B) = \int_{x_A} p(x_A, x_B; \mu, \Sigma)\, dx_A
\]
⁵ There are actually cases in which we would want to deal with multivariate Gaussian distributions where $\Sigma$ is positive semidefinite but not positive definite (i.e., $\Sigma$ is not full rank). In such cases, $\Sigma^{-1}$ does not exist, so the definition of the Gaussian density given in (1) does not apply. For instance, see the course lecture notes on “Factor Analysis.”