A Taxonomy and Evaluation of Dense Two-Frame
Stereo Correspondence Algorithms
Daniel Scharstein
Dept. of Math and Computer Science
Middlebury College
Middlebury, VT 05753
schar@middlebury.edu

Richard Szeliski
Microsoft Research
Microsoft Corporation
Redmond, WA 98052
szeliski@microsoft.com
Abstract
Stereo matching is one of the most active research areas in
computer vision. While a large number of algorithms for
stereo correspondence have been developed, relatively lit-
tle work has been done on characterizing their performance.
In this paper, we present a taxonomy of dense, two-frame
stereo methods. Our taxonomy is designed to assess the dif-
ferent components and design decisions made in individual
stereo algorithms. Using this taxonomy, we compare exist-
ing stereo methods and present experiments evaluating the
performance of many different variants. In order to estab-
lish a common software platform and a collection of data
sets for easy evaluation, we have designed a stand-alone,
flexible C++ implementation that enables the evaluation of
individual components and that can easily be extended to in-
clude new algorithms. We have also produced several new
multi-frame stereo data sets with ground truth and are making
both the code and data sets available on the Web. Finally,
we include a comparative evaluation of a large set of today’s
best-performing stereo algorithms.
1. Introduction
Stereo correspondence has traditionally been, and continues
to be, one of the most heavily investigated topics in computer
vision. However, it is sometimes hard to gauge progress in
the field, as most researchers only report qualitative results
on the performance of their algorithms. Furthermore, a sur-
vey of stereo methods is long overdue, with the last exhaus-
tive surveys dating back about a decade [7, 37, 25]. This
paper provides an update on the state of the art in the field,
with particular emphasis on stereo methods that (1) operate
on two frames under known camera geometry, and (2) pro-
duce a dense disparity map, i.e., a disparity estimate at each
pixel.
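To make the notion of a dense disparity map concrete, the following is a minimal C++ sketch, not the implementation described in this paper. It assumes rectified grayscale images, so that the match for a left-image pixel (x, y) lies at (x - d, y) in the right image, and it picks each disparity by a per-pixel winner-take-all over absolute intensity differences. The `GrayImage` type, the `denseDisparity` function, and the sign convention are hypothetical names and choices introduced only for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical minimal grayscale image type (row-major), for illustration only.
struct GrayImage {
    int width;
    int height;
    std::vector<int> pixels;  // expected size: width * height

    int at(int x, int y) const { return pixels[y * width + x]; }
};

// Returns one disparity per pixel of `left`, i.e., a dense disparity map.
// Assumption: the images are rectified, so the candidate match for
// left(x, y) at disparity d is right(x - d, y) on the same scanline.
std::vector<int> denseDisparity(const GrayImage& left,
                                const GrayImage& right,
                                int maxDisparity) {
    assert(left.width == right.width && left.height == right.height);
    std::vector<int> disparity(left.pixels.size(), 0);
    for (int y = 0; y < left.height; ++y) {
        for (int x = 0; x < left.width; ++x) {
            int bestCost = std::numeric_limits<int>::max();
            int bestD = 0;
            // Only test disparities whose match stays inside the right image.
            for (int d = 0; d <= maxDisparity && x - d >= 0; ++d) {
                int cost = std::abs(left.at(x, y) - right.at(x - d, y));
                if (cost < bestCost) {  // strict '<' keeps the smallest d on ties
                    bestCost = cost;
                    bestD = d;
                }
            }
            disparity[y * left.width + x] = bestD;
        }
    }
    return disparity;
}
```

The output has exactly one entry per input pixel, which is what distinguishes a dense method from a sparse or feature-based one. A purely local rule like this one is unreliable in textureless or occluded regions.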
Our goals are two-fold:
1. To provide a taxonomy of existing stereo algorithms
that allows the dissection and comparison of individual
algorithm components and design decisions;
2. To provide a test bed for the quantitative evaluation
of stereo algorithms. Towards this end, we are plac-
ing sample implementations of correspondence algo-
rithms along with test data and results on the Web at
www.middlebury.edu/stereo.
We emphasize calibrated two-frame methods in order to focus
our analysis on the essential components of stereo correspondence.
However, it would be relatively straightforward to generalize our
approach to include many multi-frame methods, in particular
multiple-baseline stereo [85] and its plane-sweep generalizations
[30, 113].
The requirement of dense output is motivated by modern
applications of stereo such as view synthesis and image-
based rendering, which require disparity estimates in all im-
age regions, even those that are occluded or without texture.
Thus, sparse and feature-based stereo methods are outside
the scope of this paper, unless they are followed by a surface-fitting
step, e.g., using triangulation, splines, or seed-and-grow
methods.
We begin this paper with a review of the goals and scope of
this study, which include the need for a coherent taxonomy
and a well thought-out evaluation methodology. We also
review disparity space representations, which play a central
role in this paper. In Section 3, we present our taxonomy
of dense two-frame correspondence algorithms. Section 4
discusses our current test bed implementation in terms of
the major algorithm components, their interactions, and the
parameters controlling their behavior. Section 5 describes
our evaluation methodology, including the methods we used
for acquiring calibrated data sets with known ground truth.
In Section 6 we present experiments evaluating the different
algorithm components, while Section 7 provides an overall
comparison of 20 current stereo algorithms. We conclude in
Section 8 with a discussion of planned future work.