International Journal of Computer Vision 57(2), 137–154, 2004
© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.
Robust Real-Time Face Detection
PAUL VIOLA
Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
viola@microsoft.com
MICHAEL J. JONES
Mitsubishi Electric Research Laboratory, 201 Broadway, Cambridge, MA 02139, USA
mjones@merl.com
Received September 10, 2001; Revised July 10, 2003; Accepted July 11, 2003
Abstract. This paper describes a face detection framework that is capable of processing images extremely rapidly
while achieving high detection rates. There are three key contributions. The first is the introduction of a new
image representation called the “Integral Image” which allows the features used by our detector to be computed
very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning
algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of
potential features. The third contribution is a method for combining classifiers in a “cascade” which allows
background regions of the image to be quickly discarded while spending more computation on promising face-like
regions. A set of experiments in the domain of face detection is presented. The system yields face detection
performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and
Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per
second.
Keywords: face detection, boosting, human sensing
1. Introduction
This paper brings together new algorithms and insights
to construct a framework for robust and extremely rapid
visual detection. Toward this end we have constructed
a frontal face detection system that achieves detection
and false positive rates equivalent to
the best published results (Sung and Poggio, 1998;
Rowley et al., 1998; Osuna et al., 1997a; Schneiderman
and Kanade, 2000; Roth et al., 2000). This face detection
system is most clearly distinguished from previous
approaches by its ability to detect faces extremely
rapidly. Operating on 384 by 288 pixel images, faces
are detected at 15 frames per second on a conventional
700 MHz Intel Pentium III. In other face detection
systems, auxiliary information, such as image differences
in video sequences or pixel color in color images,
has been used to achieve high frame rates. Our
system achieves high frame rates working only with
the information present in a single grey scale image.
These alternative sources of information can also be
integrated with our system to achieve even higher frame
rates.
There are three main contributions of our face detection
framework. We will introduce each of these ideas
briefly below and then describe them in detail in
subsequent sections.
The first contribution of this paper is a new image
representation called an integral image that allows for
very fast feature evaluation. Motivated in part by the
work of Papageorgiou et al. (1998), our detection system
does not work directly with image intensities. Like