Machine Learning for High-Speed Corner Detection.pdf

所需积分/C币:11 2015-10-21 23:13:33 835KB PDF
收藏 收藏 2

FAST角点检测的最初始的英文原文,Edward Rosten 和 Tom Drummond 在2006年发表。我的博客里有详细的介绍这个算法,欢迎相互交流>_<
432 E. Rosten and t. drummond to DoG has been proposed which, provided that scales are v2 apart, speeds up computation by a factor of about two, compared to the striaghtforward imple mentation of Gaussian convolution 13 It is noted in 14 that the LoG is a particularly stable scale-space kernel Scale-space techniques have also been combined with the Harris approach in 15 which computes harris corners at multiple scales and retains only those which are also optima of the log response across scales Recently, scale invariance has been extended to consider features which are invariant to affine transformations141617 An edge(usually a step change in intensity) in an image corresponds to the boundary between two regions. At corners of regions, this boundary changes di- rection rapidly. Several techniques were developed which involved detecting and chaining edges with a view to finding corners in the chained edge by analysing the chain code 18, finding maxima of curvature19 20 21, change in direction22or change in appearance 23. Others avoid chaining edges and instead look for maxima of curvature 24 or change in direction 25 at places where the gradient is large Another class of corner detectors work by examining a small patch of an image to see if it looks like a corner. Since second derivatives are not computed, a noise reduction step(such as gaussian smoothing) is not required. Consequently, these corner detectors are computationally efficient since only a small number of pixels are examined for each corner detected. A corollary of this is that they tend to perform poorly on images with only large-scale features such as blurred images. The corner detector presented in this work belongs to this category The method presented in 26 assumes that a corner resembles a blurred wedge and finds the characteristics of the wedge(the amplitude, angle and blur by fitting it to the local image. The idea of the wedge is generalised in 27,where a method for calculating the corner strength is proposed which computes self similarity by looking at the proportion of pixels inside a disc whose intensity is within some threshold of the centre(nucleus) value. Pixels closer in value to the nucleus receive a higher weighting. This measure is known as the USan (the Univalue Segment Assimilating Nucleus). A low value for the Usan indicates a corner since the centre pixel is very different from most of its surroundings. A set of rules is used to suppress qualitatively "bad"features, and then local minima of the, SUSANS,Smallest USAN) are selected from the remaining candidates Trajkovic and Hedley 28 use a similar idea: a patch is not self-similar if pix els generally look different from the centre of the patch. This is measured by considering a circle. fc is the pixel value at the centre of the circle, and fp and fp/ are the pixel values at either end of a diameter line across the circle. The response function is defined as min(p-fc)+(fpI-fc) This can only be large in the case where there corner. The test is performed on a Bresenham circle. Since the circle is discretized, linear or circular interpo ation is used in between discrete orientations in order to give the detector a more isotropic response. To this end, the authors present a method whereby the Machine Learning for High-Speed Corner Detection 433 minimum response function at all interpolated positions between two pixels can be efficiently computed. Computing the response function requires performin a search over all orientations, but any single measurement provides an upper bound on the response. To speed up matching, the response in the horizontal and vertical directions only is checked. If the upper bound on the response is too low, then the potential corner is rejected. To speed up the method further, this fast check is first applied at a coarse scale A fast radial symmetry transform is developed in29 to detect points. Points have a high score when the gradient is both radially symmetric, strong, and of a uniform sign along the radius. The scale can be varied by changing the size of the area which is examined for radial symmetry An alternative method of examining a small patch of an image to see if it looks like a corner is to use machine learning to classify patches of the image as corners or non-corners. The examples used in the training set determine the type of features detected. In 30, a three layer neural network is trained to recognise corners where edges meet at a multiple of 45, near to the centre of an 8x 8 window. This is applied to images after edge detection and thinning. It is shown how the neural net learned a more general representation and was able to detect corners at a variety of angles 2 High-Speed Corner Detection 2.1 FAST: Features from Accelerated Segment Test The segment test criterion operates by considering a circle of sixteen pixels around the corner candidate p. The original detector 2 3 classifies p as a corner if there exists a set of n contiguous pixels in the circle which are all brighter than the intensity of the candidate pixel Ip plus a threshold t, or all darker than Ip-t, as illustrated in Figure l n was chosen to be twelve because it admits a high-speed test which can be used to exclude a very large number of non-corners: the test examines only the four pixels at 1, 5,9 and 13(the four compass directions). If p is a corner then at least three of these must all be brighter than Ip+t or darker than Ip-t. If neither of these is the case, then p cannot be a corner. The full segment test criterion can then be applied to the remaining candidates by examining all pixels in the circle. This detector in itself exhibits high performance, but there are several weaknesses 1. The high-speed test does not generalise well for n 12 2. The choice and ordering of the fast test pixels contains implicit assumptions about the distribution of feature appearance 3. Knowledge from the first 4 tests is discarded 4. Multiple features are detected adiacent to one another 2.2 Machine Learning a Corner Detector Here we present an approach which uses machine learning to address the first three points(the fourth is addressed in Section 2. 3). The process operates in 434 E. Rosten and t drummond 5 098 Fig. 1. 12 point segment test corner detection in an image patch. The highlighted squares are the pixels used in the corner detection. The pixel at p is the centre of a candidate corner. The arc is indicated by the dashed line passes through 12 contiguous pixels which are brighter than p by more than the threshold two stages. In order to build a corner detector for a given n, first, corners are detected from a set of images (preferably from the target application domain using the segment test criterion for n and a convenient threshold. This uses a slow algorithm which for each pixel simply tests all 16 locations on the circle around it For each location on the circle E (1.16, the pixel at that position relative to p(denoted by p- c)can have one of three states ≤Ip-t( darker) s, Ip-t<Ip-<Ip+t(similar) 5) b,Ip+t≤Ip →汇 (brighter Choosing an c and computing Sp-r for all pe P(the set of all pixels in all train- ing images partitions P into three subsets, Pa, Ps, Pb, where each p is assigned to pc Let Kn be a boolean variable which is true if p is a corner and false otherwise Stage 2 employs the algorithm used in ID3 31 and begins by selecting the which yields the most information about whether the candidate pixel is a corner measured by the entropy of K The entropy of K for the set P is H(P)=(c+clog(c+a-clog2c-clog2 c where c=plkp is true) (number of corners) p Kp is false number of non corners Machine Learning for High-Speed Corner Detection 435 The choice of then yields the information gain H(P)-H(Pd)-H(Ps-h(Po) Having selected the a which yields the most information, the process is applied recursively on all three subsets i.e. ab is selected to partition P in to Pb, d, Pb s P6. b, Is is selected to partition Ps in to Ps. d, Pss, Ps. b and so on, where each .c is chosen to yield maximum information about the set it is applied to. The process terminates when the entropy of a subset is zero. This means that all p in this subset have the same value of kp, i.e. they are either all corners or all non-corners. This is guaranteed to occur since K is an exact function of the learning data This creates a decision tree which can correctly classify all corners seen in the training set and therefore(to a close approximation)correctly embodies the rules of the chosen Fast corner detector. This decision tree is then converted into C-code, creating a long string of nested if-then-else statements which is compiled and used as a corner detector. For full optimisation, the code is compiled twice once to obtain profiling data on the test images and a second time with arc profiling enabled in order to allow reordering optimisations. In some cases, two of the three subtrees may be the same. In this case, the boolean test which separates them is removed Note that since the data contains incomplete coverage of all possible corners the learned detector is not precisely the same as the segment test detector. I would be relatively straightforward to modify the decision tree to ensure that it has the same results as the segment test algorithm, however, all feature detectors are heuristic to some degree, and the learned detector is merely a very slightly different heuristic to the segment test detector 2.3 Non-maximal Suppression Since the segment test does not compute a corner response function, non max imal suppression can not be applied directly to the resulting features. Conse- quently, a score function, V must be computed for each detected corner, and non-maximal suppression applied to this to remove corners which have an adja cent corner with higher v. There are several intuitive definitions for V The maximum value of n for which p is still a corner 2. The maximum value of t for which p is still a corner 3. The sum of the absolute difference between the pixels in the contiguous arc and the centre pixel Definitions l and 2] are very highly quantised measures, and many pixels share the same value of these. For speed of computation, a slightly modified version Bis used. V is given by V=max →汇 ∑|n-l 8 a∈ Sbright ∈S 436 E. Rosten and t drummond Sbright={x|l→≥Ip+t Shark={x|lp→x≤lp-t 2.4 Timing Results Timing tests were performed on a 2.6GHz Opteron and an 850MHz Pentium Ill processor. The timing data is taken over 1500 monochrome fields from a PAL video source(with a resolution of 768 x 288 pixels). The learned FAST detectors for nm=9 and 12 have been compared to the original fast detector to our implementation of the Harris and Dog(difference of Gaussians the detector used by SIFT)and to the reference implementation of SUSAN32 As can be seen in Tablel FAST in general offers considerably higher perfor- mance than the other tested feature detectors, and the learned Fast performs up to twice as fast as the handwritten version. Importantly, it is able to gen- erate an efficient detector for n=9, which(as will be shown in Section B) is the most reliable of the fast detectors On modern hardware. Fast consumes only a fraction of the time available during video processing and on low power hardware, it is the only one of the detectors tested which is capable of video rate processing at all Examining the decision tree shows that on average, 2. 26 (for m=9) and 2.39 12) questions are asked per pixel to determine whether or not it is feature. By contrast, the handwritten detector asks on average 2. 8 questions Interestingly, the difference in speed between the learned detector and the original FAsT are considerably less marked on the Opteron processor compared to the Pentium Ill. We believe that this is in part due to the opteron having (which assumes equal cost for all pixel accesses), compared to the pentium p a diminishing cost per pixel queried that is less well modelled by our system Table 1. Timing results for a selection of feature detectors run on fields(768 X 288) of a PaL video sequence in milliseconds, and as a percentage of the processing budget per frame. Note that since PAL and NTSC, Dv and 30Hz VGa (common for web cams) have approximately the same pixel rate, the percentages are widely applicable approximately 500 features per field are detected Detector Opteron 2.6GHz Pentium III 850MHz ns ns Fast n=9(non-max suppression) 1.33 6.65 5.29 26.5 Fas 9( 5.40 21.7 Fast n= 12(non-max suppression 1.34 6.70 23.0 12(raw) 1.175.85 21.5 Original FAST n= 12(non-max suppression) 1.59 9.60 48.0 Original FAST n= 12(raw) 1.49 7.45 9.25 48.5 Harris 24.0 120 166 830 DOG 60.1 301 1280 SUSAN 75837927.5 137.5 Machine Learning for High-Speed Corner Detection 437 3 A Comparison of Detector Repeatability Although there is a vast body of work on corner detection, there is much less on the subject of comparing detectors. Mohannah and Mokhtarian 33 evaluate performance by warping test images in an affine manner by a known amount They define the ' consistency of corner numbers'as CCN=100×11nm-na, where na is the number of features in the warped image and no is the number of features in the original image. They also define accuracy as ACU=100 where ng are the number of"ground truth corners(marked by humans) and na is the number of matched corners compared to the ground truth. This unfortu- nately relies on subjectively made devisions Trajkovic and Hedley 28 define stability to be the number of"strong'matches (matches detected over three frames in their tracking algorithm) divided by the total number of corners. This measurement is clearly dependent on both the tracking and matching methods used, but has the advantage that it can be tested on the date used by the system When measuring reliability, what is important is if the same real-world fea tures are detected from multiple views 1 This is the definition which will be used here. For an image pair, a feature is 'detected if is is extracted in one image and appears in the second. It is " repeated if it is also detected nearby in the second. The repeatability is the ratio of repeated features detected features 1, the test is performed on images of planar scenes so that the relationship between point positions is a homography Fiducial markers are projected on to the planar scene to allow accurate computation of this By modelling the surface as planar and using flat textures, this technique tests the feature detectors ability to deal with mostly affine warps (since image features are small) under realistic conditions. This test is not so well matched to our intended application domain, so instead, we use a 3D surface model to compute where detected features should appear in other views(illustrated in Figure 2). This allows the repeatability of the detectors to be analysed on features caused by geometry such as corners of polyhedra, occlusions and T-junctions We also allow bas-relief textures to be modelled with a fat plane so that the repeatability can be tested under non-affine warping A margin of error must be allowed because 1. The alignment is not perfect 2. The model is not perfect 3. The camera model(especially regarding radial distortion) is not perfect 4. The detector may find a maximum on a slightly different part of the corner This becomes more likely as the change in viewpoint and hence change in shape of the corner become large 438 E Rosten and T drummond Detect features in frame 1 Detect features in frame 2 Warp frame 1 to match frame 2 warped feature positions to detected features in frame 2 Fig. 2. Repeatability is tested by checking if the same real-world features are de tected in different views. a geometric model is used to compute where the features re project to Fig 3. Box dataset: photographs taken of a test rig(consisting of photographs pasted to the inside of a cuboid)with strong changes of perspective, changes in scale and large amounts of radial distortion. This tests the corner detectors on planar textures Instead of using fiducial markers, the 3D model is aligned to the scene by hand and this is then optimised using a blend of simulated annealing and gradient descent to minimise the Ssd between all pairs of frames and reprojections To compute the Ssd between frame i and reprojected frame 3, the position of all points in frame j are found in frame i. The images are then bandpass filtered High frequencies are removed to reduce noise, while low frequencies are removed to reduce the impact of lighting changes. To improve the speed of the system the Ssd is only computed using 1000 random points(as opposed to every point The datasets used are shown in Figure图 Figure团 and figure固 With these datasets, we have tried to capture a wide range of corner types(geometric and textura The repeatability is computed as the number of corners per frame is varied. For comparison we also include a scattering of random points as a baseline measure since in the limit ifevery pixel is detected as a corner, then the repeatability is 100% Machine Learning for High-Speed Corner Detection 439 Fig. 4. Maze dataset: photographs taken of a prop used in an augmented reality ap lication. This set consists of textural features undergoing projective warps as well as geometric features. There are also significant changes of scale Fig. 5. Bas-relief dataset: the model is a flat plane, but there are many objects with significant relief. This causes the appearance of features to change in a non affine way from different viewpoints To test robustness to image noise, increasing amounts of Gaussian noise were added to the bas-relief dataset lt should be noted that the noise added is in addition to the significant amounts of camera noise already present(from thermal noise, electrical interference, and etc) 4 Results and discussion Shi and Tomasi 7, derive their result for better feature detection on the as- sumption that the deformation of the features is affine. In the box and maze datasets, this assumption holds and can be seen in Figure 6B and Figure 6C the detector outperforms the Harris detector In the bas-relief dataset this assump tion does not hold, and interestingly, the Harris detector outperforms Shi and Tomasi detector in this case q Mikolajczyk and Schmid 15 evaluate the repeatability of the Harris-Laplace detector evaluated using the method in 34, where planar scenes are examined The results show that Harris-Laplace points outperform both Dog points and Harris points in repeatability. For the box dataset, our results verify that this is correct for up to about 1000 points per frame(typical numbers, probably commonly used); the results are somewhat less convincing in the other datasets where points undergo non-projective changes

试读 14P Machine Learning for High-Speed Corner Detection.pdf
立即下载 低至0.43元/次 身份认证VIP会员低至7折
    woniulvbu 谢谢分享,还可以
    BurtonMan 谢谢, 可以用,作者还是博客
    qq_36622174 谢谢分享,很有用的
    关注 私信 TA的资源
    Machine Learning for High-Speed Corner Detection.pdf 11积分/C币 立即下载
    Machine Learning for High-Speed Corner Detection.pdf第1页
    Machine Learning for High-Speed Corner Detection.pdf第2页
    Machine Learning for High-Speed Corner Detection.pdf第3页
    Machine Learning for High-Speed Corner Detection.pdf第4页
    Machine Learning for High-Speed Corner Detection.pdf第5页


    11积分/C币 立即下载 >