International Journal of Computer Vision 56(3), 221–255, 2004
© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.
Lucas-Kanade 20 Years On: A Unifying Framework
SIMON BAKER AND IAIN MATTHEWS
The Robotics Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
simonb@cs.cmu.edu
iainm@cs.cmu.edu
Received July 10, 2002; Revised February 6, 2003; Accepted February 7, 2003
Abstract. Since the Lucas-Kanade algorithm was proposed in 1981, image alignment has become one of the most
widely used techniques in computer vision. Applications range from optical flow and tracking to layered motion,
mosaic construction, and face coding. Numerous algorithms have been proposed and a wide variety of extensions
have been made to the original formulation. We present an overview of image alignment, describing most of the
algorithms and their extensions in a consistent framework. We concentrate on the inverse compositional algorithm,
an efficient algorithm that we recently proposed. We examine which of the extensions to Lucas-Kanade can be used
with the inverse compositional algorithm without any significant loss of efficiency, and which cannot. In this paper,
Part 1 in a series of papers, we cover the quantity approximated, the warp update rule, and the gradient descent
approximation. In future papers, we will cover the choice of the error function, how to allow linear appearance
variation, and how to impose priors on the parameters.
Keywords: image alignment, Lucas-Kanade, a unifying framework, additive vs. compositional algorithms,
forwards vs. inverse algorithms, the inverse compositional algorithm, efficiency, steepest descent, Gauss-Newton,
Newton, Levenberg-Marquardt
1. Introduction
Image alignment consists of moving, and possibly deforming,
a template to minimize the difference between
the template and an image. Since the first use of image
alignment in the Lucas-Kanade optical flow algorithm
(Lucas and Kanade, 1981), image alignment
has become one of the most widely used techniques
in computer vision. Besides optical flow, some of its
other applications include tracking (Black and Jepson,
1998; Hager and Belhumeur, 1998), parametric and
layered motion estimation (Bergen et al., 1992), mosaic
construction (Shum and Szeliski, 2000), medical
image registration (Christensen and Johnson, 2001),
and face coding (Baker and Matthews, 2001; Cootes
et al., 1998).
The usual approach to image alignment is gradient
descent. A variety of other numerical algorithms
such as difference decomposition (Gleicher, 1997) and
linear regression (Cootes et al., 1998) have also been
proposed, but gradient descent is the de facto standard.
Gradient descent can be performed in a variety of
different ways, however. One difference between the
various approaches is whether they estimate an additive
increment to the parameters (the additive approach
(Lucas and Kanade, 1981)), or whether they estimate
an incremental warp that is then composed with the
current estimate of the warp (the compositional
approach (Shum and Szeliski, 2000)). Another difference
is whether the algorithm performs a Gauss-Newton, a
Newton, a steepest-descent, or a Levenberg-Marquardt
approximation in each gradient descent step.
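To make the distinction between the additive and compositional updates concrete, the following minimal sketch assumes a 2D affine warp with six parameters, written as a 3x3 homogeneous matrix. The parameterization and the function names (warp_matrix, additive_update, compositional_update) are illustrative assumptions, not definitions taken from the text above. The increment dp is taken as given; how it is estimated in each step (Gauss-Newton, Newton, steepest descent, or Levenberg-Marquardt) is not shown.

```python
# Illustrative sketch only: an assumed affine parameterization
# p = (p1, ..., p6) with W(x; p) = ((1+p1)x + p3*y + p5, p2*x + (1+p4)y + p6).

def warp_matrix(p):
    """Return the 3x3 homogeneous matrix of the assumed affine warp W(x; p)."""
    p1, p2, p3, p4, p5, p6 = p
    return [[1.0 + p1, p3, p5],
            [p2, 1.0 + p4, p6],
            [0.0, 0.0, 1.0]]

def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def params_from_matrix(m):
    """Recover (p1, ..., p6) from a 3x3 affine matrix."""
    return [m[0][0] - 1.0, m[1][0], m[0][1], m[1][1] - 1.0, m[0][2], m[1][2]]

def additive_update(p, dp):
    """Additive approach: add the increment to the parameters, p <- p + dp."""
    return [a + b for a, b in zip(p, dp)]

def compositional_update(p, dp):
    """Compositional approach: compose the current warp with the incremental
    warp, W(x; p) <- W(W(x; dp); p), i.e. the matrix product M(p) M(dp)."""
    return params_from_matrix(matmul3(warp_matrix(p), warp_matrix(dp)))

# Current warp: scale x by 1.5. Increment: translate by one pixel in x.
p = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
dp = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
p_additive = additive_update(p, dp)
p_compositional = compositional_update(p, dp)
```

Under these assumptions the two updates agree when both warps are pure translations, but differ in general: in the example the additive update yields a translation of 1 in x, whereas the compositional update yields 1.5, because the incremental translation is applied first and then scaled by the current warp.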
We propose a unifying framework for image alignment,
describing the various algorithms and their extensions
in a consistent manner. Throughout the framework
we concentrate on the inverse compositional