Farkas alternative and Duality Theorem
There are theorems called alternatives. They say that for a certain situation there are exactly two possibilities: one of them must take place, and they cannot both happen at once. An example of an alternative, assuming there is no state between life and death: a human being is either alive or dead. One side of the alternative, say alive, is called the obverse, and the other, in this case dead, the reverse. In computer science one uses the word xor for exclusive or: Mary xor Jane means Mary or Jane, but not both.
This set of notes proves one such theorem, called the Farkas alternative, and shows that, in fact, it underpins all the duality theory of linear programming. It underlies most of optimisation, being itself a particular case of the Separating Hyperplane Theorem.
First, some definitions (strictly speaking unnecessary, but de bon ton).
Cones. A set $C \subseteq \mathbb{R}^n$ is called a cone if for any $x \in C$ one has $\lambda x \in C$ for all $\lambda \geq 0$. This means that the origin $O$ is in $C$ and, geometrically, if any $x \neq 0$ is in $C$, then the whole ray $Ox$ is in $C$. With such a general definition, a cone is not necessarily a closed or convex set: for instance, the union of the two rays through $(1,0)$ and $(0,1)$ in $\mathbb{R}^2$ is a cone, but not a convex one.
Exercise: Later we shall deal with the notion of the dual cone. Namely, if $C$ is any cone in $\mathbb{R}^n$, its dual cone $C^*$ is defined as $C^* = \{y \in \mathbb{R}^n : y \cdot x \geq 0, \ \forall x \in C\}$. Geometrically, $C^*$ contains all vectors $y$ such that the angle between $y$ and any vector $x \in C$ is ninety degrees or less. From this point of view, it is clear that $(C^*)^* = C$, at least when $C$ is closed and convex, so the dual of the dual is the primal. Show this by definition.
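For a quick sanity check of the definition, take the single ray $C = \{\lambda (1,0) : \lambda \geq 0\} \subset \mathbb{R}^2$. Then
$$C^* = \{y \in \mathbb{R}^2 : y \cdot (1,0) \geq 0\} = \{(y_1, y_2) : y_1 \geq 0\},$$
the closed right half-plane, and dualising once more returns the ray: any $x$ with $x \cdot y \geq 0$ for all $y$ with $y_1 \geq 0$ must have $x_2 = 0$ (try $y = (0, \pm 1)$) and $x_1 \geq 0$, so $(C^*)^* = C$. Similarly, the orthant $\mathbb{R}^n_+$ is self-dual: $(\mathbb{R}^n_+)^* = \mathbb{R}^n_+$.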
If $A$ is an $m \times n$ matrix, consider the set
$$C_A = \{y \in \mathbb{R}^m : y = Ax, \ x \in \mathbb{R}^n_+\}.$$
Recall that $x \in \mathbb{R}^n_+$ means $x \geq 0$. The set $C_A$ is a closed convex cone, built on the columns $a^1, \ldots, a^n \in \mathbb{R}^m$ of $A$. It is a convex cone simply because $\mathbb{R}^n_+$ is one and $C_A$ is obtained from it via a linear transformation; closedness also holds, although it does not follow from this argument alone (the linear image of a closed set need not be closed in general), being a special property of finitely generated cones.
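For example, if
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$
then $C_A = \{x_1 (1,0)^T + x_2 (1,1)^T : x_1, x_2 \geq 0\}$ is the closed convex wedge in the plane lying between the rays through $(1,0)$ and $(1,1)$.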
Now let $A$ be an $m \times n$ matrix and $b \in \mathbb{R}^m$.
Theorem (Farkas alternative): One and only one of the following two cases is always true: $Ax = b$ has a solution $x \in \mathbb{R}^n$, $x \geq 0$, xor there exists $y \in \mathbb{R}^m$ such that $A^T y \geq 0$ and $y \cdot b < 0$.
Proof: Either $b \in C_A$ or not. If yes, then $b = Ax$ for some $x \geq 0$, by definition of $C_A$.
If not, then we can apply the Separating Hyperplane Theorem. The two sets $C_A$ and $\{b\}$ are closed and convex, and the latter set is bounded. Then there exists a hyperplane that strictly separates these two sets, i.e. for some $y \in \mathbb{R}^m$, $\beta \in \mathbb{R}$, the equation of the hyperplane itself is $y \cdot z + \beta = 0$, and for all $z \in C_A$ one has $y \cdot z + \beta > 0$, while $y \cdot b + \beta < 0$. Note that $y$ is the normal vector to the hyperplane, and $z \in \mathbb{R}^m$ is a variable.
To get more information about $y$ and $\beta$, let us now try some special points $z$ from $C_A$. First off, $0 \in C_A$, so try $z = 0$. This implies $y \cdot 0 + \beta > 0$, hence $\beta > 0$. Therefore $y \cdot b < 0$. Now try $z = M a^j$, where $a^j$ is the $j$th column of $A$, and $M > 0$ is a huge real. (For what $x \geq 0$ do we have $M a^j = Ax$?) It follows that for all $j$:
$$y \cdot a^j \geq -\frac{\beta}{M}, \quad \text{for any } M > 0.$$
Passing to the limit $M \to \infty$, we have that for all $j$, $y \cdot a^j \geq 0$, which in matrix notation is written exactly as $A^T y \geq 0$.
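The alternative can also be checked numerically: deciding feasibility of $Ax = b$, $x \geq 0$ is itself a linear programme, and when it is infeasible a certificate $y$ can be found by minimising $y \cdot b$ over $A^T y \geq 0$ with $y$ confined to a box (certificates can always be rescaled, so the box costs nothing). The sketch below assumes numpy and scipy are available; the helper name farkas_certificate and the tolerance are illustrative choices, not part of the notes.

```python
import numpy as np
from scipy.optimize import linprog

def farkas_certificate(A, b, tol=1e-9):
    """Return ('x', x) with Ax = b, x >= 0 if the obverse holds,
    otherwise ('y', y) with A^T y >= 0 and y . b < 0 (the reverse)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = A.shape

    # Obverse: pure feasibility of Ax = b, x >= 0 (zero objective).
    res = linprog(np.zeros(n), A_eq=A, b_eq=b,
                  bounds=[(0, None)] * n, method="highs")
    if res.status == 0:
        return "x", res.x

    # Reverse: minimise b . y subject to A^T y >= 0 (written as -A^T y <= 0),
    # boxing y into [-1, 1]^m so the LP stays bounded.
    res = linprog(b, A_ub=-A.T, b_ub=np.zeros(n),
                  bounds=[(-1, 1)] * m, method="highs")
    assert res.status == 0 and res.fun < -tol  # exactly one side must hold
    return "y", res.x

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
print(farkas_certificate(A, np.array([1.0, 1.0])))   # obverse: x = (1, 1)
print(farkas_certificate(A, np.array([-1.0, 1.0])))  # reverse: e.g. y = (1, 0)
```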
What if one replaces $Ax = b$ in the obverse of the Farkas alternative by $Ax \leq b$? The answer is easy: add slack variables. Indeed, $Ax \leq b$ with $x \geq 0$ has a solution if and only if $[A \; I](x, s)^T = b$ has a solution $(x, s) \geq 0$; this augments the matrix $A$ to $[A \; I]$, where $I$ is the $m \times m$ identity matrix. On the reverse side, $A^T y \geq 0$ should now apply to the augmented matrix, so it becomes $A^T y \geq 0$ and $Iy = y \geq 0$. In other words, here is one more formulation.
Farkas alternative, inequality formulation: One and only one of the following two cases is always true: $Ax \leq b$ has a solution $x \in \mathbb{R}^n_+$, xor there exists $y \in \mathbb{R}^m_+$ such that $A^T y \geq 0$ and $y \cdot b < 0$.
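In the same numerical sketch, the inequality form corresponds to handing the augmented matrix to the same routine (reusing A and farkas_certificate from the snippet above):

```python
# Inequality obverse Ax <= b, x >= 0 becomes [A I](x, s) = b with (x, s) >= 0.
A_aug = np.hstack([A, np.eye(A.shape[0])])
tag, v = farkas_certificate(A_aug, np.array([-1.0, 1.0]))
# tag == 'y': v satisfies A^T v >= 0, v >= 0 and v . b < 0.
print(tag, v)
```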
There is yet another interesting formulation that we'll meet later when speaking about Lagrange multipliers. It bears a name of its own, the Fredholm alternative. It is obtained by changing the obverse of Farkas, removing the non-negativity claim on $x$.