下载  >  开发技术  >  其它  > MSER 线性时间最大稳定极值区Linear Time Maximally Stable Extremal Regions

MSER 线性时间最大稳定极值区Linear Time Maximally Stable Extremal Regions 评分

本文提出了一种新的计算最大稳定极值区域(MSER)的算法。这个 标准算法利用联合查找数据结构 以像素为单位的准线性时间。新算法提供 完全相同的结果会产生真正的最坏情况线性时间。此外,新的 该算法占用的内存显著减少,并且具有更好的缓存位置。 从而加快执行速度。我们的CPU实现的性能是 作为基于标准的最先进的FPGA实现 算法
Linear Time Maximally Stable Extremal Regions 185 Fig. 2. Typical progress stages for the algorithm on the first pass. The top left pixel is used as the source pixel. The flooding occurs, with preference for dark regions, MSER are detected in the process, and eventually the whole image is fooded. The process will then be repeated with preference for bright regions of the landscape as they become accessible to the current body of water. That is, the flooding occurs in a very physically plausible manner. The water adapts to the actual landscape and remains one connected component connected to the point where it is being poured onto the landscape. The algorithm keeps track of the ' downhill stream'of water, which at any time amounts to information for at most as many pixel components as there are grey-levels in the image. When the downhill stream finds a minimum, the water starts filling it up, and when it fills back up, the corresponding pixels are processed for MSER Thus, the progress of the algorithm can be roughly described as a physical food-fill adapting to the landscape, see Fig 2 In the following section, we will first give an algorithm overview, followed by algorithm details and a description of a recommended actual implementation 186 D. Nister and h. stewenius 2 Algorithm Overview In this section we will give a high-level description of the algorithm For simplic ity, we will work on the dark to bright pass. Note however that in reality, the msER detection algorithm is simply carried out twice, one from dark to bright and one from bright to dark(with the landscape flipped upside down) 2.1 Basic Definitions Let the image I consist of n pixels indexed by the variable . Let each pixel be assigned the grey-level value f(c), taken from a set of m linearly ordered grey- level values. Let also the pixels be assigned some neighboring relation encoded by the function N(a), where N(a) denotes the set of pixels neighboring pixel a. An example is a two-dimensional image with four-connected pixels taking on grey levels 0,..., 255]. The generalization to arbitrary undirected finite connected graphs is straightforward A path from a pixel c to y is a sequence of pair-wise neighboring pixels starting with c and ending with y. A connected set X is a set of pixels that has a path entirely within X between any pair of pixels in X. A minimum is a connected set X of pixels with equal grey-level values, such that no pixel in X is connected to a pixel outside X with equal or smaller grey-level value. A sub-level set, pa- rameterized by grey-level y, is the set of pixels cEIl f(a)<y with grey-level smaller than or equal to y. As we trace through the sub-level sets by varying the grey-level from dark to bright, we get a sequence of sets on which we can in principle) perform connected component analysis on. The way the connected components evolve defines a tree, which we will refer to as the component tree defined roughly as follows. When a new component appears from one grey-level itp.255 Grey-Level Fig 3. The component tree with corresponding grey-levels. The nodes of the tree are drawn as vertical boxes where the vertical extent of the box shows the grey-level life span of each component Linear Time Maximally Stable Extremal Regions 187 Q○○○○○○○ Fig. 4. The data-structures. Pixel status is shown on the top right. The source pixel is marked S. The current pixel is marked C. The pixels waiting in the heap of bound- ary pixels (left) are marked with circles. The pixels that belong to a component on the component stack(bottom) are marked with their component number. A detected MSER is shown by a dashed ellipse. Pixels that have been fully processed are marked with an X. Note how the stack holds components with decreasing grey-level to the next (in the sense that we get a component that consist entirely of pix- els that were not in the previous sub-level set), this component of pixels is a minimum, which is made into a leaf node. When, as we increase the grey-level threshold, two or more components become joined into one, the joined compo- nent is assigned a new node and made into a parent of the original nodes. This process continues until the whole image is one component, which corresponds to the root node As the components evolve, we can also keep track of component information which will differ dependent on what we wish to extract. For example, it could consist of a linked list of the pixels in the component. It could also just consist of the first and second order moments of the regions, which can in principle be encoded by six sums∑1,∑x,∑y,∑xr,∑xy,∑ yy over the pixels in the component(although for numerical reasons, it is better to center these moments on something close to the centroid of the region). This is the output we need if we wish to turn the MSER into elliptical regions at the end of detection. An MSER is a component for which the relative growth-rate attains a minimum see matas for further details 2.2 Basic Algorithm We will now describe the algorithm from an abstract point of view. See Sec tion 2. 3on how to accomplish the steps efficiently, and Section B for analysis The algorithm needs the following data-structures 188 D. Nister and h. stewenius Queue current Clear heap Mark source Push empty START pixel onto stack and pIxel an d component heap and accessible make current onto stack consider new Accumulate Pop he Is there current pixel to nanother accessible Grey-level boundary pixels component on neighbor to ≤ current? top of stack current pixel? Empty? Queue STOP neighbor onto heap Same grey- level a previous pixel? N New grey Merge top twoYNew grey-level level 2nd on Process components on stack? component stack top of stack? f stack Fig. 5. State graph for the algorithm a binary mask of accessible pixels. These are the pixels to which the water already has access A priority queue of boundary pixels, where priority is minus the grey-level These pixels can be thought of as partially flooded pixels in the sense that s water has access to the pixel in question, but has either not yet entered,or not yet explored all the edges out from the pixel. Along with the pixel id an edge number indicating the next edge to be explored can be stored A stack C of component information. Each entry holds the pixels in a compo nent and/or the first and second order moments of the pixels in the compo nent, as well as the size history of the component and the current grey-level at which the component is being processed. The maximum number of entries on the stack will be the number of grey-levels During execution of the algorithm, the components in the component info stack C may not correspond to components in the component tree. Rather, there will a number of components representing the 'down-stream'of water streaming downhill towards a minimum, where each component is the set of pixels at a single grey-level that is part of the down-stream. A single component represents the pixels covered by the water currently filling back out of a minimum. The algorithm in a sense has two states, one where the down-stream is flowing down hill in search of a minimum. and one where a minimum has been found. and the water level is currently rising out of it Linear Time Maximally Stable Extremal Regions 189 删 / E Top of Stack Fig. 6. The state of the component stack just before a merge of two components. The water has filled up one basin(dark blue component) and returned out of it to spill over into a second (light blue component) and the bright component is just about to merge with the dark component. Notice the textured components that are waiting as part of the 'downhill stream' of water To execute the algorithm, a pixel from which flooding will proceed is arbitrary chosen. This pixel can be thought of as the point at which water is being poured on, and the output result will be the same regardless of which pixel is selected so we may simply pick the upper left corner of the image. We will refer to this as the source pixel. The algorithm is as follows, see also Fig 1. Clear the accessible pixel mask, the heap of boundary pixels and the compo nent stack. Push h a du ent onto the stack, with grey-level higher s than any allowed in the image 02, Make the source pixel(with its first edge)the current pixel, mark it as accessible and store the grey-level of it in the variable current-level 3. Push an empty component with current_evel onto the component stack 4. Explore the remaining edges to the neighbors of the current pixel, in order as follows: For each neighbor, check if the neighbor is already accessible. If it is not, mark it as accessible and retrieve its grey-level. If the grey-level is not lower than the current one, push it onto the heap of boundary pixels. If on the other hand the grey-level is lower than the current one, enter the current pixel back into the queue of boundary pixels for later processing(with the next edge number), consider the new pixel and its grey-level and go to B 5. Accumulate the current pixel to the component at the top of the stack(water saturates the current pixel 6. Pop the heap of boundary pixels. If the heap is empty, we are done. If the returned pixel is at the same grey-level as the previous, go to 4 7. The returned pixel is at a higher grey-level, so we must now process all components on the component stack until we reach the higher grey-level This is done with the Process Stack sub-routine, see below. Then go to 4 190 D. Nister and h. stewenius Fig. 7. For a ridge pixel, it is important for theoretical correctness of the algorithm that water is not allowed to ' simultaneously into several directions. Therefore, if a lower pixel is found as any of the neighbors of a pixel, the current pixel has to be stacked for later and the newly found pixel tended to fully before proceeding to the other neighb The Process stack sub-routine is as follows Sub-routine Process Stack(new-picelgreg_level) 1. Process component on the top of the stack. The next grey-level is the mini mum of new_picel-grey_level and the grey-level for the second component on the stack 2. If new-picel-grey_level is smaller than the grey-level on the second component on the stack, set the top of stack grey-level to mew-picel-grey_level and return from sub-routine (This occurs when the new pixel is at a grey-level for which there is not yet a component instantiated, so we let the top of stack be that level by just changing its grey-leve 3. Remove the top of stack and merge it into the second component on stack as follows: Add the first and second moment accumulators together and or join the pixel lists. Either merge the histories of the components, or take the history from the winner. Note here that the top of stack should be considered one time-step' back, so its current size is part of the history. Therefore the top of stack would be the winner if its current size is larger than the previous size of second on stack 4. If(new-picel-greg_level> top-of_stack-greg_level)go to l Here, the implementation of how to process a component follows the same lines as the standard algorithm, and depends on the information one wants about an MSER. It entails determining if a relative growth rate minimum has occurred and if so, harvesting the required information from the component and declarin an Mser detected. The data-structure of a component holds information about the previous sizes of the component, and at which grey-levels they occurred Linear Time Maximally Stable Extremal Regions 191 0 0 0 0 Fig 8. The heap of boundary pixels can be efficiently implemented by a binary bitmask with number of grey-level bits, where bits are set if there are pixels at that grey-level and a system of stacks, one for each grey-level It is immaterial to the result which order is used to explore the edges of a particular pixel, but it is important that if a lower grey-level pixel is found at the end of one of the edges, this pixel is tended to before any further edges are explored, see Fig The reason is that otherwise water could enter a 'ridge'pixel that has several edges to lower grey-level pixels from distinct basins, and water would into both basins simultaneously if we are not careful to process one of the edges first and fully saturate it before moving to the next edge. In practice it is possible that this would not be a common problem, it would perhaps just make the algorithm 'myopic' in the sense that water could tunnelthrough ridges, but by carefully saturating the edges in the correct order, our result will dhere exactly to the strict definition of MsEr 2.3 Implementation Details In this section we give some implementation details. For efficiency, we imple- mented the heap of boundary pixel as a system of stacks of pixels at various grey-levels, see Fig 8 a bitmask is keeping track of which out of the 256 grey levels have pixels waiting. This allows us to use a single instruction to find which is the smallest (or largest) occupied grey-level by using a machine in- struction that retrieves the id number of the least significant or most signif icant bit set(for 86 platforms bsf and bsr, or the more general 64-bit com pliant - ScanForward, - Bit Scan Reverse). This is of course limited to as many grey-levels as the processor word-length, but for 256 grey-levels we simply chain multiple such instructions. This is essentially a way to get the hardware to do the work, and lower the time it takes to push and pop from the heap. This time is inherently constant with regards to the number of pixels, and loga- rithmic in the number of grey-levels. We also perform an explicit check for urrent grey-level before doing the general pop. This can be done since the 192 D. Nister and h. stewenius grey-level is never smaller than the current. When it is equal, which is common some work can be saved The heap could be dynamically allocated and implemented as linked lists, but we prefer to count the number pixels at each grey-level in a single pre-sweep of the image, just like the standard algorithm does. This allows us to both pre allocate and use fixed arrays for the stacks without stacks ever running into one another. The total number of entries in the stacks is the number of pixels plus the number of grey-levels(due to one stop-element for each stack) Finally, to avoid having to perform explicit checks for the image boundary,a border of one pixel around the image is used, and the accessible mask is always set when a sweep starts. This can actually be implemented without explicitly sweeping the mask except for in a detector initialization step, since the sweep sets the whole mask. We then simply fip the border bits and invert the meaning of the mask on the dark-to-bright sweep, finally leaving the mask in its original state 3 Analysis An informal analysis for the algorithm is straightforward. It is easy to see that only one copy of a pixel can ever be on the heap of boundary pixels, and that a pixel can only return onto the heap at most as many times as it has neighbors Moreover, the components that are processed must have at least one new pixel in them, so the total amount of component processing is linear in the number of pixels. As it takes at most log(m) time to enter or pop a pixel from the heap the worst case execution time is therefore bounded by O((n +e) log(m)), where n is the number of pixels, m the number of grey-levels, and e is the number of edges in the image graph(such as e a 2n for four-connected images). If we regard m and the connectivity as constants, this simply amounts to O(n),linear time The memory usage is roughly a four-byte integer per pixel, which allows both the heap of boundary pixels, the accessible mask bit and the edge number to be stored while allowing a sufficient number of bits for the pixel coordinates Storage for only m components is required(apart of course from the storage required by detected MSER, which would clearly be needed for any algorithm 4 Experimental Validation As the results of our detector are the same as the original mser detector, which has been found in several studies to have excellent repeatability, our experimental focus is on execution time. As a reference, the recent paper 19 reports on an FPGa implementation with careful attention to detail, but based on the standard union-find algorithm, performing at 25 fps on images up to 350 X 350

...展开详情
所需积分/C币:15 上传时间:2019-01-03 资源大小:2.25MB
举报 举报 收藏 收藏
分享 分享