A simple way to memorize Farkas’ Lemma via LP duality

1. Weak duality

We start from the following primal linear program:

\begin{align*} \begin{aligned} \text{maximize}\quad & c^\top x \\ \text{subject to}\quad & A x \le b,\\ & x \ge 0. \end{aligned} \end{align*}

From this, let us introduce a Lagrange multiplier \(y \geq 0\) so that we can lift the constraint \(A x \leq b\) into the objective. Then,

\begin{align*} \begin{aligned} c^\top x &\le c^\top x + y^\top (b-Ax) &&\text{if } y\ge 0,\; b-Ax\ge 0\\ &= (c-A^\top y)^\top x + y^\top b\\ &\le y^\top b &&\text{if } x\ge 0,\; A^\top y\ge c. \end{aligned} \end{align*}

Note that in the second line, \(x\) plays the role of a Lagrange multiplier for the constraint \(A^\top y \geq c\). Assembling the conditions on \(y\), we obtain the dual linear program:

\begin{align*} \begin{aligned} \text{minimize}\quad & y^\top b \\ \text{subject to}\quad & A^\top y \ge c,\\ & y \ge 0. \end{aligned} \end{align*}

These inequalities express weak duality: every feasible objective value of the primal problem is \(\le\) every feasible objective value of the dual.
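As a quick sanity check, here is a small numerical illustration; the matrix and the two feasible points below are made up for the example:

```python
import numpy as np

# A made-up primal/dual pair:
#   maximize  c^T x  s.t.  A x <= b, x >= 0
#   minimize  y^T b  s.t.  A^T y >= c, y >= 0
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([1.0, 1.0])

x = np.array([1.0, 1.0])   # primal feasible: A x = [3, 4] <= b, x >= 0
y = np.array([1.0, 0.0])   # dual feasible:   A^T y = [1, 2] >= c, y >= 0

assert np.all(A @ x <= b) and np.all(x >= 0)    # primal feasibility
assert np.all(A.T @ y >= c) and np.all(y >= 0)  # dual feasibility

# Weak duality: c^T x <= y^T b for ANY such pair.
print(c @ x, y @ b)  # 2.0 4.0
```

Neither point is optimal; weak duality only says the two values bracket the optimum from below and above.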

2. Some consequences

2.1. Farkas’ Lemma

Consider the linear system

\begin{align*} Ax \le b,\qquad x\ge 0. \end{align*}

One version of Farkas’ Lemma says that exactly one of the following two systems has a solution:

  1. \(\exists x\ge 0\) with \(Ax\le b\).
  2. \(\exists y\ge 0\) with \(A^\top y\ge 0\) and \(y^\top b<0\).

Duality theory makes it easy to see why the two statements cannot both hold at the same time. We do not prove the other (harder) direction: that the non-existence of a solution of one system implies the existence of a solution of the other.

Take \(c=0\) in the primal problem. A vector \(y\) as in the second system is then dual feasible, so weak duality gives \[ 0 = c^\top x \le y^\top b. \] Hence, if \(x\) and \(y\) are both feasible, \(y^\top b < 0\) cannot hold. Conversely, if such a \(y\) with \(y^\top b < 0\) exists, the primal system cannot have a solution.
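To make the alternative concrete, here is a toy infeasible system together with a Farkas certificate; the numbers are made up for the example:

```python
import numpy as np

# The system  x1 + x2 <= 1,  -(x1 + x2) <= -2,  x >= 0  is infeasible:
# it asks for x1 + x2 <= 1 and x1 + x2 >= 2 simultaneously.
A = np.array([[ 1.0,  1.0],
              [-1.0, -1.0]])
b = np.array([1.0, -2.0])

# A certificate of infeasibility in the sense of system 2:
y = np.array([1.0, 1.0])
assert np.all(y >= 0)
assert np.all(A.T @ y >= 0)   # A^T y = [0, 0]
assert y @ b < 0              # y^T b = -1

# Given such a y, no x >= 0 with A x <= b can exist: it would imply
# 0 <= y^T (b - A x) = y^T b - (A^T y)^T x <= y^T b < 0, a contradiction.
print("certificate verified, y^T b =", y @ b)
```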

2.2. Strong duality and complementary slackness

It’s now just a small step to strong duality and complementary slackness. Suppose there exist slack vectors \(w \ge 0\) and \(z \ge 0\) such that

\begin{align*} Ax + w &= b, & y^\top A - z^\top &= c^\top. \end{align*}

Then,

\begin{align*} c^\top x &= c^\top x + y^\top (b - A x - w) \\ &= y^{\top} b + (c^\top - y^\top A + z^\top) x - z^\top x - y^\top w \\ &= y^{\top} b - z^\top x - y^\top w. \end{align*}

Therefore \(c^\top x = y^\top b\) iff \(z^\top x + y^\top w = 0\); since both terms are nonnegative, this holds iff

\begin{align*} z^\top x &= (y^\top A - c^\top) x = 0, & y^\top w &= y^\top (b - A x) = 0. \end{align*}

Moreover, because \(x, w, y, z \ge 0\), the sums \(z^\top x\) and \(y^\top w\) can vanish only if every term vanishes, so the complementary slackness conditions follow:

\begin{align*} \forall i:\; x_i (c^{\top} - y^{\top} A)_{i} &= 0, & \forall j:\; y_j (A x - b)_j &= 0. \end{align*}

In other words, for each index either the variable or the slack of the corresponding constraint must vanish.
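For a concrete check, take a small LP whose optimum is easy to find by hand (the instance is made up): maximize \(2x_1 + x_2\) subject to \(x_1 + x_2 \le 3\), \(x_1 \le 5\), \(x \ge 0\). Each zero variable pairs with a strictly positive slack, and vice versa:

```python
import numpy as np

# A made-up LP:  maximize 2*x1 + x2  s.t.  x1 + x2 <= 3,  x1 <= 5,  x >= 0.
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
b = np.array([3.0, 5.0])
c = np.array([2.0, 1.0])

x = np.array([3.0, 0.0])   # primal optimum (found by hand), value 6
y = np.array([2.0, 0.0])   # dual optimum   (found by hand), value 6

w = b - A @ x              # primal slacks: [0, 2]
z = A.T @ y - c            # dual slacks:   [0, 1]

assert c @ x == y @ b      # zero duality gap
assert np.all(x * z == 0)  # x_i > 0 forces z_i = 0
assert np.all(y * w == 0)  # y_j > 0 forces w_j = 0
print(w, z)  # [0. 2.] [0. 1.]
```

Here \(x_2 = 0\) pairs with the dual slack \(z_2 = 1 > 0\), and \(y_2 = 0\) pairs with the primal slack \(w_2 = 2 > 0\).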

2.3. Variations

Suppose we drop the condition \(x\geq 0\) from the LP \(\max\{c^\top x : A x \leq b,\; x \geq 0\}\). Then,

\begin{equation} c^{\top} x \leq c^{\top} x + y^{\top}(b- Ax) = (c^\top-y^{\top} A) x + y^{\top} b =y^{\top} b \quad\text{only if } y^{\top} A = c^{\top}, \end{equation}

because \(x\) is now unrestricted in sign, so the term \((c^\top-y^{\top} A) x\) stays bounded only when its coefficient vanishes.

Therefore the dual becomes \(\min\{y^{\top}b : y \geq 0,\; y^{\top}A = c^{\top}\}\).
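A tiny numeric illustration of this variant (the instance is made up): with \(x\) free, only multipliers satisfying \(y^\top A = c^\top\) yield a finite bound, and here that bound is attained.

```python
import numpy as np

# Made-up LP with free x:  maximize x1 - x2  s.t.  x1 - x2 <= 1,  x2 - x1 <= 0.
A = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])
b = np.array([1.0, 0.0])
c = np.array([1.0, -1.0])

# Dual feasibility now requires equality: y >= 0 and A^T y = c.
y = np.array([1.0, 0.0])
assert np.all(y >= 0) and np.allclose(A.T @ y, c)

# The dual bound y^T b = 1 is attained, e.g. by the free point x = (1, 0):
x = np.array([1.0, 0.0])
assert np.all(A @ x <= b)
assert c @ x == y @ b
print("optimal value:", c @ x)  # optimal value: 1.0
```

For any \(y\) with \(A^\top y \ne c\), the term \((c^\top - y^\top A)x\) can be made arbitrarily large by moving \(x\) in the right direction, so no bound results.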

2.4. Some further compatibility conditions

Assume next that the system \(A x = b\) has a solution \(x\). Then for any \(y\) we have \(y^{\top} A x = y^\top b\). In particular, if the left kernel \(\ker A^\top := \{y : y^\top A = 0\}\) is nontrivial, then \(y^\top b = 0\) for all \(y\in \ker A^\top\). In other words, a necessary (and in fact also sufficient) condition for the solvability of the system \(A x = b\) is that the right-hand side is orthogonal to all solutions of the adjoint homogeneous system; the argument above shows necessity. Conversely, if \(\ker A^\top = \{0\}\), that is, the homogeneous adjoint problem has only the trivial solution, then \(A x = b\) is solvable for every right-hand side, i.e., the vector \(b\) can be chosen freely.
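This orthogonality condition is easy to probe numerically (the matrix is made up): \(A\) below has rank 1, its left kernel is spanned by \((2, -1)\), and \(Ax = b\) is solvable exactly when \(b\) is orthogonal to that vector.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # rank 1
u = np.array([2.0, -1.0])    # spans the left kernel: u^T A = 0
assert np.allclose(u @ A, 0)

def solvable(b):
    """Check whether A x = b has a solution via the least-squares residual."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.allclose(A @ x, b)

b_good = np.array([1.0, 2.0])   # u @ b_good == 0  -> solvable
b_bad  = np.array([1.0, 1.0])   # u @ b_bad  == 1  -> not solvable

assert u @ b_good == 0 and solvable(b_good)
assert u @ b_bad != 0 and not solvable(b_bad)
```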

3. An interpretation of the optimal solution in terms of Lagrange multipliers

Here is an interpretation for the case where the duality gap is zero. In this case both problems have a solution. Since there is no duality gap, the constraint \(y^{\top} A \geq c^{\top}\) must in fact hold with equality in part of the coordinates: by complementary slackness there exists an optimal \(y^*\) with \((y^{*\top} A)_i = c_i\) whenever \(x^*_i > 0\). If we interpret \(c\) as the gradient of the function \(f(x) = c^{\top} x\), this says that \(c\) is a linear combination of those rows of \(A\) that are active at the optimum. The rows are the gradients of the hyperplanes that form the constraints. Hence the gradient of \(f(x) = c^{\top} x\) is, at the optimum, a linear combination of gradients of active constraints, and \(y^*\) is now seen to be a vector of Lagrange multipliers. This situation is completely analogous to the differentiable case, i.e. the problem of optimizing a differentiable function \(f\) subject to differentiable constraints \(g(x) = b\): at the optimum \(\nabla f = \lambda^\top \nabla g\), that is, \(\nabla f\) is a linear combination of the gradients of the active constraints \(g\).
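A numeric check of this interpretation (the instance is made up): at the optimum of a small LP with both constraints active, \(c\) is exactly the combination of the constraint gradients (rows of \(A\)) weighted by the optimal multipliers.

```python
import numpy as np

# Made-up LP:  maximize 2*x1 + x2  s.t.  x1 + x2 <= 3,  x1 <= 2,  x >= 0.
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
b = np.array([3.0, 2.0])
c = np.array([2.0, 1.0])

x = np.array([2.0, 1.0])   # optimum (by hand): both constraints are active
y = np.array([1.0, 1.0])   # optimal multipliers (by hand)

assert np.allclose(A @ x, b)   # both constraint hyperplanes are active
assert c @ x == y @ b          # zero duality gap, value 5

# The gradient of f(x) = c^T x is c; at the optimum it is the
# y-weighted combination of the active constraint gradients (rows of A):
assert np.allclose(c, y @ A)   # c = 1*(1,1) + 1*(1,0)
```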