INVITED
PAPER
Face Recognition by Humans:
Nineteen Results All Computer
Vision Researchers Should
Know About
Increased knowledge about the ways people recognize each other may help to
guide efforts to develop practical automatic face-recognition systems.
By Pawan Sinha, Benjamin Balas, Yuri Ostrovsky, and Richard Russell
ABSTRACT | A key goal of computer vision researchers is to
create automated face recognition systems that can equal, and
eventually surpass, human performance. To this end, it is
imperative that computational researchers know of the key
findings from experimental studies of face recognition by
humans. These findings provide insights into the nature of
cues that the human visual system relies upon for achieving its
impressive performance and serve as the building blocks for
efforts to artificially emulate these abilities. In this paper, we
present what we believe are 19 basic results, with implications
for the design of computational systems. Each result is
described briefly and appropriate pointers are provided to
permit an in-depth study of any particular result.
KEYWORDS | Benchmarks; configuration; face pigmentation;
face recognition; human vision; neural correlates; resolution;
visual development
I. INTRODUCTION
Notwithstanding the extensive research effort that has
gone into computational face recognition algorithms, we
have yet to see a system that can be deployed effectively in
an unconstrained setting, with all of the attendant
variability in imaging parameters such as sensor noise,
viewing distance, and illumination. The only system that
does seem to work well in the face of these challenges is
the human visual system. It makes eminent sense,
therefore, to attempt to understand the strategies this
biological system employs, as a first step towards eventually
translating them into machine-based algorithms. With this
objective in mind, we review here 19 important results
regarding face recognition by humans. While these
observations do not constitute a coherent theory of face
recognition in human vision (we simply do not have all the
pieces yet to construct such a theory), they do provide
useful hints and constraints for one. We believe that for
this reason, they are likely to be useful to computer vision
researchers in guiding their ongoing efforts. Of course, the
success of machine vision systems is not dependent on a
slavish imitation of their biological counterparts. Insights
into the functioning of the latter serve primarily as
potentially fruitful starting points for computational
investigations.
We have endeavored to bring together in one place
several diverse results to be able to provide the reader a
fairly comprehensive picture of our current understanding
regarding how humans recognize faces. Each of the results
is briefly described and, whenever possible, accompanied
by its implications for computer vision. While the
descriptions here are not extensive for reasons of space,
we have provided relevant pointers to the literature for a
more in-depth study. The results are organized along the
following broad themes.
Recognition as a function of available spatial resolution
Result 1: Humans can recognize familiar faces in
very low-resolution images.
Result 2: The ability to tolerate degradations
increases with familiarity.
Manuscript received July 12, 2005; revised March 15, 2006.
P. Sinha, B. Balas, and Y. Ostrovsky are with the Department of Brain and Cognitive
Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
(e-mail: psinha@mit.edu; bjbalas@mit.edu; yostr@mit.edu).
R. Russell is with the Department of Psychology, Harvard University, Cambridge,
MA 02138 USA (e-mail: rrussell@fas.harvard.edu).
Digital Object Identifier: 10.1109/JPROC.2006.884093
1948 Proceedings of the IEEE | Vol. 94, No. 11, November 2006 0018-9219/$20.00
© 2006 IEEE
Result 3: High-frequency information by itself is
insufficient for good face recognition
performance.
The nature of processing: Piecemeal versus holistic
Result 4: Facial features are processed holistically.
Result 5: Of the different facial features, eyebrows
are among the most important for
recognition.
Result 6: The important configural relationships
appear to be independent across the width
and height dimensions.
The nature of cues used: Pigmentation, shape and motion
Result 7: Face-shape appears to be encoded in a
slightly caricatured manner.
Result 8: Prolonged face viewing can lead to
high-level aftereffects, which suggest
prototype-based encoding.
Result 9: Pigmentation cues are at least as
important as shape cues.
Result 10: Color cues play a significant role,
especially when shape cues are degraded.
Result 11: Contrast polarity inversion dramatically
impairs recognition performance,
possibly due to compromised ability to use
pigmentation cues.
Result 12: Illumination changes influence
generalization.
Result 13: View-generalization appears to be
mediated by temporal association.
Result 14: Motion of faces appears to facilitate
subsequent recognition.
Developmental progression
Result 15: The visual system starts with a
rudimentary preference for face-like patterns.
Result 16: The visual system progresses from a
piecemeal to a holistic strategy over the first
several years of life.
Neural underpinnings
Result 17: The human visual system appears to
devote specialized neural resources for face
perception.
Result 18: Latency of responses to faces in
inferotemporal (IT) cortex is about 120 ms,
suggesting a largely feedforward computation.
Result 19: Facial identity and expression might be
processed by separate systems.
A. Recognition as a Function of Available
Spatial Resolution
1) Result 1: Humans Can Recognize Familiar Faces in Very
Low-Resolution Images: Progressive improvements in camera
resolutions provide ever-greater temptation to use
increasing amounts of detail in face representations in
machine vision systems. Higher image resolutions allow
recognition systems to discriminate between individuals
on the basis of fine differences in their facial features. The
advent of iris-based biometric systems is a case in point.
However, the problem that such details-based schemes
often have to contend with is that high-resolution images
are not always available. This is particularly true in
situations where individuals have to be recognized at a
distance. In order to design systems more robust against
image degradations, we can turn to the human visual
system for inspiration. Every day, we are confronted with
the task of face identification at a distance and must extract
the critical information from the resulting low-resolution
images. Precisely how does face identification performance
change as a function of image resolution?
Pioneering work on face recognition with low-resolution
imagery was done by Harmon and Julesz [30], [31].
Working with block averaged images of familiar faces, they
found high recognition accuracies even with images
containing just 16 × 16 blocks. Yip and Sinha [89] found
that subjects could recognize more than half of an
unprimed set of familiar faces that had been blurred to have
equivalent image resolutions of merely 7 × 10 pixels (see
Fig. 1), and recognition performance reached ceiling level
at a resolution of 19 × 27 pixels. While the remarkable
tolerance of the human visual system to resolution
reduction is now indisputable, we do not have a clear
idea of exactly how this is accomplished. At the very least,
this result demonstrates that fine featural details are not
necessary to obtain good face recognition performance.
Furthermore, given the indistinctness of the individual
features at low resolutions, it appears likely that
diagnosticity resides in their overall configuration. However,
precisely which aspects of this configuration are important,
and how we can computationally encode them, are open
questions.
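The block-averaging manipulation mentioned above can be illustrated with a minimal sketch. It is not the procedure used in the cited studies, whose exact parameters are not given here. The function name `block_average`, the list-of-lists image representation, and the synthetic gradient image are all assumptions made for illustration. Each output cell is simply the mean intensity of the pixels it covers.

```python
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel intensities


def block_average(image: Image, blocks_y: int, blocks_x: int) -> Image:
    """Reduce `image` to blocks_y x blocks_x cells, each the mean of its pixels."""
    height, width = len(image), len(image[0])
    if not (0 < blocks_y <= height and 0 < blocks_x <= width):
        raise ValueError("block counts must be between 1 and the image size")
    result = []
    for by in range(blocks_y):
        # Integer boundaries so that the blocks tile the image without gaps.
        y0, y1 = by * height // blocks_y, (by + 1) * height // blocks_y
        row = []
        for bx in range(blocks_x):
            x0, x1 = bx * width // blocks_x, (bx + 1) * width // blocks_x
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(pixels) / len(pixels))
        result.append(row)
    return result


# Hypothetical input: a 64 x 64 horizontal gradient standing in for a face photograph.
gradient = [[float(x) for x in range(64)] for _ in range(64)]
coarse = block_average(gradient, 16, 16)  # 16 x 16 blocks
```

Calling the same function with other block counts gives coarser or finer grids, which is one simple way to probe how much detail a recognition system actually needs.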
2) Result 2: The Ability to Tolerate Degradations Increases
With Familiarity: In trying to uncover the mechanisms
underlying the human ability to recognize highly degraded
face images, we might wonder whether this is the result of
some general purpose compensatory processes, i.e., a
biological instantiation of model-free “super resolution.”
However, the story appears to be more complicated. The
ability to handle degradations increases dramatically with
amount of familiarity. Bruce et al. [9] demonstrated
observers’ poor performance on the task of matching two
different photographs of an unfamiliar person. Burton et al.
[10] have shown that observers’ recognition performance
with low-quality surveillance video is much better when the
individuals pictured are familiar colleagues, rather than
those with whom the observers have interacted infrequently.
Additionally, body structure and gait information are
much less useful for identification than facial information,
even though the effective resolution in that region is very
limited. Recognition performance changes only slightly
after obscuring the gait or body, but is affected dramatically
when the face is hidden, as illustrated in Fig. 2. This does
not appear to be a skill that can be acquired through general
experience; even police officers with extensive forensic
experience perform poorly unless they are familiar with the
target individuals. The fundamental question this finding,
and others like it [49], [66], bring up is the following: How
does the facial representation and matching strategy used
by the visual system change with increasing familiarity, so
as to yield greater tolerance to degradations? We do not yet
know exactly what aspect of the increased experience with
a given individual leads to an increase in the robustness of
the encoding; is it the greater number of views seen or is
the robustness an epiphenomenon related to some
biological limitations such as slow memory consolidation
rates? Notwithstanding our limited understanding, some
implications for computer vision are already evident. In
considering which aspects of human performance to take
as benchmarks, we ought to draw a distinction between
familiar and unfamiliar face recognition. The latter may
end up being a much more modest goal than the former
and might constitute a false goal towards which to strive.
The appropriate benchmark for evaluating machine-based
face recognition systems is human performance with
familiar faces.
3) Result 3: High-Frequency Information by Itself Does Not
Lead to Good Face Recognition Performance: We have long
been enamored of edge maps as a powerful initial
representation for visual inputs. The belief is that edges capture
the most important aspects of images (the discontinuities)
while being largely invariant to shallow shading gradients
that are often the result of illumination variations. In the
context of human vision as well, line drawings appear to be
sufficient for recognition purposes. Caricatures and quick
pen portraits are often highly recognizable. Do these
observations mean that high spatial frequencies are
critical, or at least sufficient, for face recognition? Several
researchers have examined the contribution of different
spatial frequency bands to face recognition [14], [21].
Their findings suggest that high spatial frequencies might
not be too important for face perception. In the particular
domain of line drawings, Graham Davies and his
colleagues have reported [16] that images which contain
exclusively contour information are very difficult to
recognize (specifically, they found that subjects could
recognize only 47% of the line drawings compared to 90% of the
original photographs; see Fig. 3). How can we reconcile
such findings with the observed recognizability of line
drawings in everyday experience? Bruce and colleagues
[6], [7] have convincingly argued that such depictions do,
in fact, contain significant photometric cues and that the
contours included in such a depiction by an accomplished
artist correspond not just to a low-level edge map, but in
Fig. 2. Frames from video-sequences used in Burton et al. [10] study.
(a) Original input. (b) Body obscured. (c) Face obscured. Based on
results from such manipulations, researchers concluded that
recognition of familiar individuals in low-resolution video is based
largely on facial information.
Fig. 1. Unlike current machine-based systems, human observers are able to handle significant degradations in face images. For instance,
subjects are able to recognize more than half of all familiar faces shown to them at the resolution depicted here. Individuals shown in
order are: Michael Jordan, Woody Allen, Goldie Hawn, Bill Clinton, Tom Hanks, Saddam Hussein, Elvis Presley, Jay Leno,
Dustin Hoffman, Prince Charles, Cher, and Richard Nixon.
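For Result 3, the distinction between low and high spatial frequencies can be sketched crudely as follows. This is an illustrative assumption, not the filtering used in the studies cited: a box blur stands in for a low-pass filter, and the residual (image minus blur) stands in for the high-frequency, edge-like content. The names `box_blur` and `split_frequencies` and the synthetic step-edge image are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel intensities


def box_blur(image: Image, radius: int = 1) -> Image:
    """Low-pass stand-in: mean over a (2*radius+1)-wide window, clipped at the borders."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            values = [image[j][i] for j in ys for i in xs]
            row.append(sum(values) / len(values))
        blurred.append(row)
    return blurred


def split_frequencies(image: Image, radius: int = 1) -> Tuple[Image, Image]:
    """Return (low, high), where high is the residual image - low."""
    low = box_blur(image, radius)
    high = [[p - l for p, l in zip(image_row, low_row)]
            for image_row, low_row in zip(image, low)]
    return low, high


# Hypothetical input: a vertical step edge (dark left half, bright right half).
step = [[0.0] * 4 + [10.0] * 4 for _ in range(6)]
low, high = split_frequencies(step)
```

In this toy image the high-frequency residual is nonzero only in the two columns next to the step, which is the sense in which such a component resembles an edge map and discards the broad shading carried by the low-frequency part.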