Shape and Motion from Image Streams: a Factorization Method—Part 3
Detection and Tracking of Point Features
Technical Report CMU-CS-91-132
Carlo Tomasi Takeo Kanade
April 1991
Abstract
The factorization method described in this series of reports requires an algorithm to track the motion of features in an
image stream. Given the small inter-frame displacement made possible by the factorization approach, the best tracking
method turns out to be the one proposed by Lucas and Kanade in 1981.
The method defines the measure of match between fixed-size feature windows in the past and current frame as the
sum of squared intensity differences over the windows. The displacement is then defined as the one that minimizes
this sum. For small motions, a linearization of the image intensities leads to a Newton-Raphson style minimization.
In this report, after rederiving the method in a physically intuitive way, we answer the crucial question of how to
choose the feature windows that are best suited for tracking. Our selection criterion is based directly on the definition
of the tracking algorithm, and expresses how well a feature can be tracked. As a result, the criterion is optimal by
construction.
We show by experiment that the performance of both the selection and the tracking algorithm is adequate for our
factorization method, and we address the issue of how to detect occlusions. In the conclusion, we point out specific
open questions for future research.
Chapter 1
Introduction
The factorization method introduced in reports 1 and 2 of this series [12] [13] requires the selection and tracking of
features in an image stream. In this report we address the issues involved, and present our algorithm.
In general, two basic questions must be answered: how to select the features, and how to track them from frame
to frame. We base our solution to the tracking problem on a previous result by Lucas and Kanade [6], who proposed a
method for registering two images for stereo matching.
Their approach is to minimize the sum of squared intensity differences between a past and a current window.
Because of the small inter-frame motion, the current window can be approximated by a translation of the old one.
Furthermore, for the same reason, the image intensities in the translated window can be written as those in the original
window plus a residue term that depends almost linearly on the translation vector. As a result of these approximations,
one can write a linear 2 × 2 system whose unknown is the displacement vector between the two windows.
In practice, these approximations introduce errors, but a few iterations of the basic solution step suffice to converge.
The result is a simple, fast, and accurate registration method.
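As a sketch (not code from the report), the 2 × 2 system described above can be written out concretely. The function below performs one Newton-Raphson style step: it builds the coefficient matrix from the spatial gradients over the window and the right-hand side from the intensity difference, then solves for the displacement. Sign and gradient conventions vary between derivations; this one follows the model J(x) = I(x − d) used later in chapter 2.

```python
import numpy as np

def lk_step(I, J, gx, gy, w):
    """One Newton-Raphson style step of Lucas-Kanade registration.

    I, J : past and current feature windows (2-D arrays)
    gx, gy : spatial intensity gradients over the window
    w : weighting function over the window (e.g. all ones)

    Solves the 2x2 linear system G d = e for the displacement d,
    where G is built from outer products of the gradients and e
    from the intensity difference between the windows.
    """
    G = np.array([[np.sum(w * gx * gx), np.sum(w * gx * gy)],
                  [np.sum(w * gx * gy), np.sum(w * gy * gy)]])
    diff = I - J
    e = np.array([np.sum(w * diff * gx), np.sum(w * diff * gy)])
    return np.linalg.solve(G, e)

# Synthetic check: a quadratic intensity pattern shifted by d = (0.3, -0.2).
ys, xs = np.mgrid[-5:6, -5:6]
I = xs**2 + 3 * ys**2
J = (xs - 0.3)**2 + 3 * (ys + 0.2)**2   # I sampled at x - d
d = lk_step(I, J, 2 * xs, 6 * ys, np.ones_like(I, dtype=float))
```

On a constant-gradient window G is singular, which already hints at the selection criterion of chapter 4: the window must contain enough texture for the system to be well conditioned.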
The first question posed above, however, was left unanswered in [6]: how to select the windows that are suitable
for accurate tracking. In the literature, several definitions of a "good feature" have been proposed, based on an a priori
notion of what constitutes an "interesting" window. For example, Moravec and Thorpe propose to use windows with
high standard deviations in the spatial intensity profile [8], [11]; Marr, Poggio, and Ullman prefer zero crossings of
the Laplacian of the image intensity [7]; and Kitchen, Rosenfeld, Dreschler, and Nagel define corner features based on
first and second derivatives of the image intensity function [5], [2].
In contrast with these selection criteria, which are defined independently of the registration algorithm, we show in
this report that a criterion can be derived that explicitly optimizes the tracking performance. In other words, we define
a feature to be good if it can be tracked well.
In this report, we first pose the problem (chapter 2), and rederive the equations of Lucas and Kanade in a physically
intuitive way (chapter 3). Chapter 4 introduces the selection criterion. We then show by experiment (chapter 5) that
the performance of both selector and tracker is satisfactory in a wide variety of situations, and discuss the problem of
detecting feature occlusion. Finally, in chapter 6, we close with a discussion of the suitability of this approach to our
factorization method for the computation of shape and motion, and point out directions for further research.
Chapter 2
Feature Tracking
As the camera moves, the patterns of image intensities change in a complex way. In general, any function of three
variables I(x, y, t), where the space variables x and y as well as the time variable t are discrete and suitably bounded,
can represent an image sequence. However, images taken at near time instants are usually strongly related to each
other, because they refer to the same scene taken from only slightly different viewpoints.
We usually express this correlation by saying that there are patterns that move in an image stream. Formally, this
means that the function I(x, y, t) is not arbitrary, but satisfies the following property:
I(x, y, t + τ ) = I(x − ξ, y − η, t) ; (2.1)
in plain English, a later image taken at time t + τ can be obtained by moving every point in the current image, taken
at time t, by a suitable amount. The amount of motion d = (ξ, η) is called the displacement of the point at x = (x, y)
between time instants t and t + τ , and is in general a function of x, y, t, and τ.
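As a small illustration (not from the report), property (2.1) can be checked numerically for the idealized case of a uniform integer-pixel translation, using NumPy's wrap-around shift:

```python
import numpy as np

rng = np.random.default_rng(0)
frame_t = rng.random((8, 8))      # image I(., ., t)
xi, eta = 2, 1                    # displacement d = (xi, eta)

# Later frame obeying (2.1): I(x, y, t + tau) = I(x - xi, y - eta, t).
# np.roll shifts rows by eta and columns by xi, with wrap-around.
frame_t_tau = np.roll(frame_t, shift=(eta, xi), axis=(0, 1))
```

The wrap-around translation is of course an idealization; as the following paragraphs note, real images violate (2.1) at occluding boundaries and where appearance depends on viewpoint.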
Even in a static environment under a constant lighting, the property described by equation (2.1) is violated in many
situations. For instance, at occluding boundaries, points do not just move within the image, but appear and disappear.
Furthermore, the photometric appearance of a region on a visible surface changes when reflectivity is a function of the
viewpoint.
However, the invariant (2.1) is by and large satisfied at surface markings, and away from occluding contours. At
locations where the image intensity changes abruptly with x and y, the point of change remains well defined even in
spite of small variations of overall brightness around it.
Surface markings abound in natural scenes, and are not infrequent in man-made environments. In our experiments,
we found that markings are often sufficient to obtain both good motion estimates and relatively dense shape results.
As a consequence, this report is essentially concerned with surface markings.
The Approach
An important problem in finding the displacement d of a point from one frame to the next is that a single pixel cannot
be tracked, unless it has a very distinctive brightness with respect to all of its neighbors. In fact, the value of the pixel
can both change due to noise, and be confused with adjacent pixels. As a consequence, it is often hard or impossible
to determine where the pixel went in the subsequent frame, based only on local information.
Because of these problems, we do not track single pixels, but windows of pixels, and we look for windows that
contain sufficient texture. In chapter 4, we give a definition of what sufficient texture is for reliable feature tracking.
Unfortunately, different points within a window may behave differently. The corresponding three-dimensional
surface may be very slanted, and the intensity pattern in it can become warped from one frame to the next. Or the
window may be along an occluding boundary, so that points move at different velocities, and may even disappear or
appear anew.
This is a problem in two ways. First, how do we know that we are following the same window, if its contents change
over time? Second, if we measure "the" displacement of the window, how are the different velocities combined to give
the one resulting vector? Our solution to the first problem is residue monitoring: we keep checking that the appearance
of a window has not changed too much. If it has, we discard the window.
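Residue monitoring can be sketched as a simple threshold test (a minimal illustration under assumed names; the report does not prescribe a particular threshold or residue normalization):

```python
import numpy as np

def keep_feature(ref_window, cur_window, max_residue):
    """Residue monitoring: compare the current appearance of a tracked
    window against its reference appearance, and signal that the
    feature should be discarded when the mean squared intensity
    difference grows too large (e.g. due to occlusion)."""
    residue = np.mean((ref_window - cur_window) ** 2)
    return residue <= max_residue

ref = np.zeros((7, 7))
unchanged = keep_feature(ref, ref, max_residue=0.01)       # keeps the feature
occluded = keep_feature(ref, ref + 0.5, max_residue=0.01)  # discards it
```

The choice of `max_residue` is application-dependent; chapter 5 of the report discusses occlusion detection by experiment.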
The second problem could in principle be solved as follows: rather than describing window changes as simple
translations, we can model the changes as a more complex transformation, such as an affine map. In this way, different
velocities can be associated to different points of the window.
This approach was proposed already in [6], and was recently explored in a more general setting in [10]. We feel,
however, that in cases where the world is known to be rigid the danger of over-parametrizing the system outweighs
the advantages of a richer model. More parameters to estimate require the use of larger windows to constrain the
parameters sufficiently. On the other hand, using small windows implies that only few parameters can be estimated
reliably, but also alleviates the problems mentioned above.
We therefore choose to estimate only two parameters (the displacement vector) for small windows. Any discrepancy
between successive windows that cannot be explained by a translation is considered to be error, and the
displacement vector is chosen so as to minimize this residue error.
Formally, if we redefine J(x) = I(x, y, t + τ) and I(x − d) = I(x − ξ, y − η, t), where the time variable has
been dropped for brevity, our local image model is
J(x) = I(x − d) + n(x) , (2.2)
where n is noise.
The displacement vector d is then chosen so as to minimize the residue error defined by the following double
integral over the given window W:
ε = ∫_W [I(x − d) − J(x)]² w dx . (2.3)
In this expression, w is a weighting function. In the simplest case, w could be set to 1. Alternatively, w could be a
Gaussian-like function, to emphasize the central area of the window. The weighting function w could also depend on
the image intensity pattern: the relation (3.3) holds for planar patches, and w could be chosen, as suggested in [6], to
de-emphasize regions of high curvature.
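For an integer candidate displacement, the integral (2.3) reduces to a weighted sum of squared differences over the window. The sketch below (illustrative names, not the report's code) samples the previous frame at x − d and compares it against the current window:

```python
import numpy as np

def ssd_residue(prev, cur, top, left, size, d, w=None):
    """Discrete counterpart of the residue integral (2.3) for an
    integer candidate displacement d = (xi, eta): the weighted sum of
    squared differences between the previous frame sampled at x - d
    and the current window of side `size` anchored at (top, left).
    `w` is the weighting function (uniform when None)."""
    xi, eta = d
    J_w = cur[top:top + size, left:left + size]
    I_w = prev[top - eta:top - eta + size, left - xi:left - xi + size]
    if w is None:
        w = np.ones((size, size))
    return np.sum(w * (I_w - J_w) ** 2)

rng = np.random.default_rng(1)
prev = rng.random((16, 16))
cur = np.roll(prev, shift=(1, 2), axis=(0, 1))  # true displacement (2, 1)
best = ssd_residue(prev, cur, top=5, left=5, size=4, d=(2, 1))
wrong = ssd_residue(prev, cur, top=5, left=5, size=4, d=(0, 0))
```

A Gaussian-like array could be passed as `w` to emphasize the central area of the window, as suggested in the text; the linearization of chapter 3 avoids evaluating this residue exhaustively over all candidate displacements.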
Several ways have been proposed in the literature to minimize this residue (see [1] for a survey). When the
displacement d is much smaller than the window size, the linearization method presented in [6] is the most efficient
way to proceed.
In the next chapter, we rederive this method, and explain it in a physically intuitive way. Then, in chapter 4, we
show that the registration idea can be extended also to selecting good features to track. As a consequence, feature
selection is no longer based on an arbitrary criterion for deciding what constitutes a feature. Rather, a good feature is
defined as one that can be tracked well, in a precise mathematical sense.