BOOK: Image Alignment and Stitching: A Tutorial

An authoritative, original English-language reference on image alignment and stitching (free of the distortions a Chinese translation would introduce), suitable for researchers in video and image stitching who want a detailed account of the underlying algorithm principles.
1 Introduction

Algorithms for aligning images and stitching them into seamless photo-mosaics are among the oldest and most widely used in computer vision. Frame-rate image alignment is used in every camcorder that has an "image stabilization" feature. Image stitching algorithms create the high-resolution photo-mosaics used to produce today's digital maps and satellite photos. They also come bundled with most digital cameras currently being sold, and can be used to create beautiful ultra wide-angle panoramas.

An early example of a widely used image registration algorithm is the patch-based translational alignment (optical flow) technique developed by Lucas and Kanade [123]. Variants of this algorithm are used in almost all motion-compensated video compression schemes such as MPEG and H.263 [113]. Similar parametric motion estimation algorithms have found a wide variety of applications, including video summarization [20, 93, 111, 203], video stabilization [81], and video compression [95, 114]. More sophisticated image registration algorithms have also been developed for medical imaging and remote sensing; see [29, 71, 226] for some previous surveys of image registration techniques.

In the photogrammetry community, more manually intensive methods based on surveyed ground control points or manually registered tie points have long been used to register aerial photos into large-scale photo-mosaics [181]. One of the key advances in this community was the development of bundle adjustment algorithms that could simultaneously solve for the locations of all of the camera positions, thus yielding globally consistent solutions [207]. One of the recurring problems in creating photo-mosaics is the elimination of visible seams, for which a variety of techniques have been developed over the years [1, 50, 135, 136, 148].

In film photography, special cameras were developed at the turn of the century to take ultra wide-angle panoramas, often by exposing the film through a vertical slit as the camera rotated on its axis [131]. In the mid-1990s, image alignment techniques started being applied to the construction of wide-angle seamless panoramas from regular hand-held cameras [43, 124, 193, 194]. More recent work in this area has addressed the need to compute globally consistent alignments [167, 178, 199], the removal of "ghosts" due to parallax and object movement [1, 50, 178, 210], and dealing with varying exposures [1, 116, 124, 210]. (A collection of some of these papers can be found in [19].) These techniques have spawned a large number of commercial stitching products [43, 168], for which reviews and comparisons can be found on the Web.

While most of the above techniques work by directly minimizing pixel-to-pixel dissimilarities, a different class of algorithms works by extracting a sparse set of features and then matching these to each other [7, 30, 35, 38, 129, 227]. Feature-based approaches have the advantage of being more robust against scene movement and are potentially faster, if implemented the right way. Their biggest advantage, however, is the ability to "recognize panoramas", i.e., to automatically discover the adjacency (overlap) relationships among an unordered set of images, which makes them ideally suited for fully automated stitching of panoramas taken by casual users [30].

What, then, are the essential problems in image alignment and stitching? For image alignment, we must first determine the appropriate mathematical model relating pixel coordinates in one image to pixel coordinates in another.
Section 2 reviews these basic motion models. Next, we must somehow estimate the correct alignments relating various pairs (or collections) of images. Section 3 discusses how direct pixel-to-pixel comparisons combined with gradient descent (and other optimization techniques) can be used to estimate these parameters. Section 4 discusses how distinctive features can be found in each image and then efficiently matched to rapidly establish correspondences between pairs of images. When multiple images exist in a panorama, techniques must be developed to compute a globally consistent set of alignments and to efficiently discover which images overlap one another. These issues are discussed in Section 5.

For image stitching, we must first choose a final compositing surface onto which to warp and place all of the aligned images (Section 6). We also need to develop algorithms to seamlessly blend overlapping images, even in the presence of parallax, lens distortion, scene motion, and exposure differences (Section 6). The last section of this survey discusses additional applications of image stitching and open research problems.

2 Motion Models

Before we can register and align images, we need to establish the mathematical relationships that map pixel coordinates from one image to another. A variety of such parametric motion models are possible, from simple 2D transforms, to planar perspective models, 3D camera rotations, lens distortions, and the mapping to non-planar (e.g., cylindrical) surfaces [194].

To facilitate working with images at different resolutions, we adopt a variant of the normalized device coordinates used in computer graphics [145, 216]. For a typical (rectangular) image or video frame, we let the pixel coordinates range from $[-1, 1]$ along the longer axis, and $[-a, a]$ along the shorter, where $a$ is the inverse of the aspect ratio, as shown in Figure 2.1. (In computer graphics, it is usual to have both axes range from $[-1, 1]$, but this requires the use of two different focal lengths for the vertical and horizontal dimensions, and makes it more awkward to handle mixed portrait and landscape images.) For an image with width $W$ and height $H$, the equations mapping integer pixel coordinates $x = (x, y)$ to normalized device coordinates $\bar{x} = (\bar{x}, \bar{y})$ are

$\bar{x} = \dfrac{2x - W}{S}$ and $\bar{y} = \dfrac{2y - H}{S}$, where $S = \max(W, H)$.   (2.1)

Fig. 2.1: Mapping from pixel coordinates to normalized device coordinates.

Note that if we work with images in a pyramid, we need to halve the $S$ value after each decimation step rather than recomputing it from $\max(W, H)$, since the $(W, H)$ values may get rounded off or truncated in an unpredictable manner. Note that for the rest of this paper, we use normalized device coordinates when referring to pixel coordinates.
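To make the mapping of Equation (2.1) concrete, here is a small Python/NumPy sketch (not part of the original tutorial; the function name `to_ndc` and the example image sizes are my own) that implements the coordinate conversion and the pyramid rule of halving $S$ at each decimation step:

```python
import numpy as np

def to_ndc(xy, width, height, S=None):
    """Map integer pixel coordinates to normalized device coordinates (Eq. 2.1).

    The longer image axis maps to [-1, 1]; the shorter maps to [-a, a],
    where a is the inverse aspect ratio.  S defaults to max(W, H) but can
    be passed in explicitly (e.g., halved at each pyramid level).
    """
    if S is None:
        S = max(width, height)
    xy = np.asarray(xy, dtype=float)
    x_bar = (2.0 * xy[..., 0] - width) / S
    y_bar = (2.0 * xy[..., 1] - height) / S
    return np.stack([x_bar, y_bar], axis=-1)

# Example: a 640x480 landscape frame.
W, H = 640, 480
print(to_ndc([W, H], W, H))     # corner -> [1.0, 0.75]  (a = H/W = 0.75)

# When working with an image pyramid, halve S after each decimation step
# instead of recomputing max(W, H) from the (possibly rounded) sizes.
S = max(W, H)
for level in range(3):
    W, H, S = (W + 1) // 2, (H + 1) // 2, S / 2.0
    print(level, (W, H), to_ndc([W, H], W, H, S=S))
```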
2.1 2D (Planar) Motions

Having defined our coordinate system, we can now describe how coordinates are transformed. The simplest transformations occur in the 2D plane and are illustrated in Figure 2.2.

Translation. 2D translations can be written as $x' = x + t$ or

$x' = [\,I \;\; t\,]\,\bar{x}$,   (2.2)

where $I$ is the $(2 \times 2)$ identity matrix and $\bar{x} = (x, y, 1)$ is the homogeneous or projective 2D coordinate.

Fig. 2.2: Basic set of 2D planar transformations (translation, Euclidean, similarity, affine, and projective).

Rotation + Translation. This transformation is also known as 2D rigid body motion or the 2D Euclidean transformation (since Euclidean distances are preserved). It can be written as $x' = Rx + t$ or

$x' = [\,R \;\; t\,]\,\bar{x}$,   (2.3)

where

$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$   (2.4)

is an orthonormal rotation matrix with $R R^{T} = I$ and $|R| = 1$.

Scaled rotation. Also known as the similarity transform, this transform can be expressed as $x' = sRx + t$, where $s$ is an arbitrary scale factor. It can also be written as

$x' = [\,sR \;\; t\,]\,\bar{x} = \begin{bmatrix} a & -b & t_x \\ b & a & t_y \end{bmatrix} \bar{x}$,   (2.5)

where we no longer require that $a^2 + b^2 = 1$. The similarity transform preserves angles between lines.

Affine. The affine transform is written as $x' = A\bar{x}$, where $A$ is an arbitrary $2 \times 3$ matrix, i.e.,

$x' = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \end{bmatrix} \bar{x}$.   (2.6)

Parallel lines remain parallel under affine transformations.

Projective. This transform, also known as a perspective transform or homography, operates on homogeneous coordinates $\bar{x}$ and $\bar{x}'$,

$\bar{x}' \sim \tilde{H}\,\bar{x}$,   (2.7)

where $\sim$ denotes equality up to scale and $\tilde{H}$ is an arbitrary $3 \times 3$ matrix. Note that $\tilde{H}$ is itself homogeneous, i.e., it is only defined up to a scale. The resulting homogeneous coordinate $\bar{x}'$ must be normalized in order to obtain an inhomogeneous result $x'$, i.e.,

$x' = \dfrac{h_{00}x + h_{01}y + h_{02}}{h_{20}x + h_{21}y + h_{22}}$ and $y' = \dfrac{h_{10}x + h_{11}y + h_{12}}{h_{20}x + h_{21}y + h_{22}}$.   (2.8)

Perspective transformations preserve straight lines.

Hierarchy of 2D transformations. The preceding set of transformations are illustrated in Figure 2.2 and summarized in Table 2.1. The easiest way to think of these is as a set of (potentially restricted) $3 \times 3$ matrices operating on 2D homogeneous coordinate vectors. Hartley and Zisserman [86] contains a more detailed description of the hierarchy of 2D planar transformations.

Table 2.1: Hierarchy of 2D coordinate transformations. Each $2 \times 3$ matrix is extended with a third $[\,0^{T} \; 1\,]$ row to form a full $3 \times 3$ matrix for homogeneous coordinate transformations. Each transformation also preserves the properties listed in the rows below it.

    Name                 Matrix                          # d.o.f.   Preserves
    translation          $[\,I \;\; t\,]_{2\times3}$         2      orientation
    rigid (Euclidean)    $[\,R \;\; t\,]_{2\times3}$         3      lengths
    similarity           $[\,sR \;\; t\,]_{2\times3}$        4      angles
    affine               $[\,A\,]_{2\times3}$                6      parallelism
    projective           $[\,\tilde{H}\,]_{3\times3}$        8      straight lines

The above transformations form a nested set of groups, i.e., they are closed under composition and have an inverse that is a member of the same group. Each (simpler) group is a subset of the more complex group below it.

2.2 3D Transformations

A similar nested hierarchy exists for 3D coordinate transformations that can be denoted using $4 \times 4$ transformation matrices, with 3D equivalents to translation, rigid body (Euclidean) and affine transformations, and homographies (sometimes called collineations) [86].

The process of central projection maps 3D coordinates $p = (X, Y, Z)$ to 2D coordinates $x = (x, y, 1)$ through a pinhole at the camera origin onto a 2D projection plane a distance $f$ along the $z$ axis,

$x = f\dfrac{X}{Z}$, $\quad y = f\dfrac{Y}{Z}$,   (2.9)

as shown in Figure 2.3.

Fig. 2.3: Central projection, showing the relationship between the 3D and 2D coordinates $p$ and $x$, as well as the relationship between the focal length $f$ and the field of view $\theta$.

The relationship between the (unit-less) focal length $f$ and the field of view $\theta$ is given by

$f^{-1} = \tan\dfrac{\theta}{2}$ or $\theta = 2\tan^{-1}\dfrac{1}{f}$.   (2.10)

To convert the focal length $f$ to its more commonly used 35 mm equivalent, multiply the above number by 17.5 (the half-width of a 35 mm photo negative frame). To convert it to pixel coordinates, multiply it by $S/2$ (half-width for a landscape photo).

In the computer graphics literature, perspective projection is often written as a permutation matrix that permutes the last two elements of the homogeneous 4-vector $\bar{p} = (X, Y, Z, 1)$,

$\bar{x} \sim \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \bar{p}$,   (2.11)

followed by a scaling and translation into screen and z-buffer coordinates.

In computer vision, it is traditional to drop the z-buffer values, since these cannot be sensed in an image, and to write

$\bar{x} \sim \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \bar{p} = [\,K \;\; 0\,]\,\bar{p}$.   (2.12)
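To tie Sections 2.1 and 2.2 to something executable, the following Python/NumPy sketch (my own illustration, not from the tutorial; helper names such as `euclidean`, `similarity`, and `apply_homography` are invented) builds a few of the $2 \times 3$ transforms from Table 2.1, extends them to $3 \times 3$ matrices, and applies a general homography using the normalization of Equation (2.8):

```python
import numpy as np

def euclidean(theta, tx, ty):
    """2x3 rigid (rotation + translation) transform, Eqs. (2.3)-(2.4)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty]])

def similarity(a, b, tx, ty):
    """2x3 scaled rotation (similarity) transform, Eq. (2.5)."""
    return np.array([[a, -b, tx],
                     [b,  a, ty]])

def to_3x3(M):
    """Extend a 2x3 matrix with a [0 0 1] row (cf. Table 2.1)."""
    return np.vstack([M, [0.0, 0.0, 1.0]])

def apply_homography(H, x):
    """Apply a 3x3 homography to inhomogeneous 2D points, Eqs. (2.7)-(2.8)."""
    x = np.asarray(x, dtype=float)
    xh = np.concatenate([x, np.ones(x.shape[:-1] + (1,))], axis=-1)  # (x, y, 1)
    yh = xh @ H.T                      # homogeneous result, defined up to scale
    return yh[..., :2] / yh[..., 2:]   # normalize by the third coordinate

# Compose a Euclidean transform with a similarity.
E = to_3x3(euclidean(np.pi / 6, 0.1, -0.2))
S = to_3x3(similarity(0.9, 0.3, 0.0, 0.5))
H = S @ E
print(apply_homography(H, [[0.0, 0.0], [1.0, 0.75]]))
```

Because the transformations form nested groups that are closed under composition, the product `S @ E` above is itself a similarity, i.e., it stays within the hierarchy of Table 2.1.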

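The 3D projection relations of Section 2.2 can be exercised the same way. This sketch (again my own, with assumed helper names) converts between the unit-less focal length and the field of view via Equation (2.10), forms the $3 \times 4$ matrix $[K \;\; 0]$ of Equation (2.12), and projects a 3D point as in Equation (2.9):

```python
import numpy as np

def fov_to_focal(theta):
    """Unit-less focal length from the full field of view, Eq. (2.10)."""
    return 1.0 / np.tan(theta / 2.0)

def focal_to_fov(f):
    """Full field of view from the unit-less focal length, Eq. (2.10)."""
    return 2.0 * np.arctan(1.0 / f)

def project(p, f):
    """Central projection of a 3D point p = (X, Y, Z), Eqs. (2.9) and (2.12)."""
    K = np.diag([f, f, 1.0])
    P = np.hstack([K, np.zeros((3, 1))])          # 3x4 matrix [K | 0]
    p_bar = np.append(np.asarray(p, float), 1.0)  # homogeneous 4-vector
    x_bar = P @ p_bar
    return x_bar[:2] / x_bar[2]                   # (f X/Z, f Y/Z)

theta = np.deg2rad(60.0)              # a 60-degree field of view
f = fov_to_focal(theta)               # ~1.73 in normalized device units
print(f, np.rad2deg(focal_to_fov(f)))
print(f * 17.5)                       # approximate 35 mm-equivalent focal length
print(project([0.5, 0.25, 2.0], f))   # -> f * (0.25, 0.125)
```

The 35 mm-equivalent value printed in the example simply applies the factor of 17.5 mentioned after Equation (2.10).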