MSER 线性时间最大稳定极值区Linear Time Maximally Stable Extremal Regions 评分

本文提出了一种新的计算最大稳定极值区域（MSER）的算法。这个 标准算法利用联合查找数据结构 以像素为单位的准线性时间。新算法提供 完全相同的结果会产生真正的最坏情况线性时间。此外，新的 该算法占用的内存显著减少，并且具有更好的缓存位置。 从而加快执行速度。我们的CPU实现的性能是 作为基于标准的最先进的FPGA实现 算法
Linear Time Maximally Stable Extremal Regions 185 Fig. 2. Typical progress stages for the algorithm on the first pass. The top left pixel is used as the source pixel. The flooding occurs, with preference for dark regions, MSER are detected in the process, and eventually the whole image is fooded. The process will then be repeated with preference for bright regions of the landscape as they become accessible to the current body of water. That is, the flooding occurs in a very physically plausible manner. The water adapts to the actual landscape and remains one connected component connected to the point where it is being poured onto the landscape. The algorithm keeps track of the ' downhill stream'of water, which at any time amounts to information for at most as many pixel components as there are greylevels in the image. When the downhill stream finds a minimum, the water starts filling it up, and when it fills back up, the corresponding pixels are processed for MSER Thus, the progress of the algorithm can be roughly described as a physical foodfill adapting to the landscape, see Fig 2 In the following section, we will first give an algorithm overview, followed by algorithm details and a description of a recommended actual implementation 186 D. Nister and h. stewenius 2 Algorithm Overview In this section we will give a highlevel description of the algorithm For simplic ity, we will work on the dark to bright pass. Note however that in reality, the msER detection algorithm is simply carried out twice, one from dark to bright and one from bright to dark(with the landscape flipped upside down) 2.1 Basic Definitions Let the image I consist of n pixels indexed by the variable . Let each pixel be assigned the greylevel value f(c), taken from a set of m linearly ordered grey level values. Let also the pixels be assigned some neighboring relation encoded by the function N(a), where N(a) denotes the set of pixels neighboring pixel a. An example is a twodimensional image with fourconnected pixels taking on grey levels 0,..., 255]. The generalization to arbitrary undirected finite connected graphs is straightforward A path from a pixel c to y is a sequence of pairwise neighboring pixels starting with c and ending with y. A connected set X is a set of pixels that has a path entirely within X between any pair of pixels in X. A minimum is a connected set X of pixels with equal greylevel values, such that no pixel in X is connected to a pixel outside X with equal or smaller greylevel value. A sublevel set, pa rameterized by greylevel y, is the set of pixels cEIl f(a)<y with greylevel smaller than or equal to y. As we trace through the sublevel sets by varying the greylevel from dark to bright, we get a sequence of sets on which we can in principle) perform connected component analysis on. The way the connected components evolve defines a tree, which we will refer to as the component tree defined roughly as follows. When a new component appears from one greylevel itp.255 GreyLevel Fig 3. The component tree with corresponding greylevels. The nodes of the tree are drawn as vertical boxes where the vertical extent of the box shows the greylevel life span of each component Linear Time Maximally Stable Extremal Regions 187 Q○○○○○○○ Fig. 4. The datastructures. Pixel status is shown on the top right. The source pixel is marked S. The current pixel is marked C. The pixels waiting in the heap of bound ary pixels (left) are marked with circles. The pixels that belong to a component on the component stack(bottom) are marked with their component number. A detected MSER is shown by a dashed ellipse. Pixels that have been fully processed are marked with an X. Note how the stack holds components with decreasing greylevel to the next (in the sense that we get a component that consist entirely of pix els that were not in the previous sublevel set), this component of pixels is a minimum, which is made into a leaf node. When, as we increase the greylevel threshold, two or more components become joined into one, the joined compo nent is assigned a new node and made into a parent of the original nodes. This process continues until the whole image is one component, which corresponds to the root node As the components evolve, we can also keep track of component information which will differ dependent on what we wish to extract. For example, it could consist of a linked list of the pixels in the component. It could also just consist of the first and second order moments of the regions, which can in principle be encoded by six sums∑1,∑x,∑y,∑xr,∑xy,∑ yy over the pixels in the component(although for numerical reasons, it is better to center these moments on something close to the centroid of the region). This is the output we need if we wish to turn the MSER into elliptical regions at the end of detection. An MSER is a component for which the relative growthrate attains a minimum see matas for further details 2.2 Basic Algorithm We will now describe the algorithm from an abstract point of view. See Sec tion 2. 3on how to accomplish the steps efficiently, and Section B for analysis The algorithm needs the following datastructures 188 D. Nister and h. stewenius Queue current Clear heap Mark source Push empty START pixel onto stack and pIxel an d component heap and accessible make current onto stack consider new Accumulate Pop he Is there current pixel to nanother accessible Greylevel boundary pixels component on neighbor to ≤ current? top of stack current pixel? Empty? Queue STOP neighbor onto heap Same grey level a previous pixel? N New grey Merge top twoYNew greylevel level 2nd on Process components on stack? component stack top of stack? f stack Fig. 5. State graph for the algorithm a binary mask of accessible pixels. These are the pixels to which the water already has access A priority queue of boundary pixels, where priority is minus the greylevel These pixels can be thought of as partially flooded pixels in the sense that s water has access to the pixel in question, but has either not yet entered,or not yet explored all the edges out from the pixel. Along with the pixel id an edge number indicating the next edge to be explored can be stored A stack C of component information. Each entry holds the pixels in a compo nent and/or the first and second order moments of the pixels in the compo nent, as well as the size history of the component and the current greylevel at which the component is being processed. The maximum number of entries on the stack will be the number of greylevels During execution of the algorithm, the components in the component info stack C may not correspond to components in the component tree. Rather, there will a number of components representing the 'downstream'of water streaming downhill towards a minimum, where each component is the set of pixels at a single greylevel that is part of the downstream. A single component represents the pixels covered by the water currently filling back out of a minimum. The algorithm in a sense has two states, one where the downstream is flowing down hill in search of a minimum. and one where a minimum has been found. and the water level is currently rising out of it Linear Time Maximally Stable Extremal Regions 189 删 / E Top of Stack Fig. 6. The state of the component stack just before a merge of two components. The water has filled up one basin(dark blue component) and returned out of it to spill over into a second (light blue component) and the bright component is just about to merge with the dark component. Notice the textured components that are waiting as part of the 'downhill stream' of water To execute the algorithm, a pixel from which flooding will proceed is arbitrary chosen. This pixel can be thought of as the point at which water is being poured on, and the output result will be the same regardless of which pixel is selected so we may simply pick the upper left corner of the image. We will refer to this as the source pixel. The algorithm is as follows, see also Fig 1. Clear the accessible pixel mask, the heap of boundary pixels and the compo nent stack. Push h a du ent onto the stack, with greylevel higher s than any allowed in the image 02, Make the source pixel(with its first edge)the current pixel, mark it as accessible and store the greylevel of it in the variable currentlevel 3. Push an empty component with current_evel onto the component stack 4. Explore the remaining edges to the neighbors of the current pixel, in order as follows: For each neighbor, check if the neighbor is already accessible. If it is not, mark it as accessible and retrieve its greylevel. If the greylevel is not lower than the current one, push it onto the heap of boundary pixels. If on the other hand the greylevel is lower than the current one, enter the current pixel back into the queue of boundary pixels for later processing(with the next edge number), consider the new pixel and its greylevel and go to B 5. Accumulate the current pixel to the component at the top of the stack(water saturates the current pixel 6. Pop the heap of boundary pixels. If the heap is empty, we are done. If the returned pixel is at the same greylevel as the previous, go to 4 7. The returned pixel is at a higher greylevel, so we must now process all components on the component stack until we reach the higher greylevel This is done with the Process Stack subroutine, see below. Then go to 4 190 D. Nister and h. stewenius Fig. 7. For a ridge pixel, it is important for theoretical correctness of the algorithm that water is not allowed to ' simultaneously into several directions. Therefore, if a lower pixel is found as any of the neighbors of a pixel, the current pixel has to be stacked for later and the newly found pixel tended to fully before proceeding to the other neighb The Process stack subroutine is as follows Subroutine Process Stack(newpicelgreg_level) 1. Process component on the top of the stack. The next greylevel is the mini mum of new_picelgrey_level and the greylevel for the second component on the stack 2. If newpicelgrey_level is smaller than the greylevel on the second component on the stack, set the top of stack greylevel to mewpicelgrey_level and return from subroutine (This occurs when the new pixel is at a greylevel for which there is not yet a component instantiated, so we let the top of stack be that level by just changing its greyleve 3. Remove the top of stack and merge it into the second component on stack as follows: Add the first and second moment accumulators together and or join the pixel lists. Either merge the histories of the components, or take the history from the winner. Note here that the top of stack should be considered one timestep' back, so its current size is part of the history. Therefore the top of stack would be the winner if its current size is larger than the previous size of second on stack 4. If(newpicelgreg_level> topof_stackgreg_level)go to l Here, the implementation of how to process a component follows the same lines as the standard algorithm, and depends on the information one wants about an MSER. It entails determining if a relative growth rate minimum has occurred and if so, harvesting the required information from the component and declarin an Mser detected. The datastructure of a component holds information about the previous sizes of the component, and at which greylevels they occurred Linear Time Maximally Stable Extremal Regions 191 0 0 0 0 Fig 8. The heap of boundary pixels can be efficiently implemented by a binary bitmask with number of greylevel bits, where bits are set if there are pixels at that greylevel and a system of stacks, one for each greylevel It is immaterial to the result which order is used to explore the edges of a particular pixel, but it is important that if a lower greylevel pixel is found at the end of one of the edges, this pixel is tended to before any further edges are explored, see Fig The reason is that otherwise water could enter a 'ridge'pixel that has several edges to lower greylevel pixels from distinct basins, and water would into both basins simultaneously if we are not careful to process one of the edges first and fully saturate it before moving to the next edge. In practice it is possible that this would not be a common problem, it would perhaps just make the algorithm 'myopic' in the sense that water could tunnelthrough ridges, but by carefully saturating the edges in the correct order, our result will dhere exactly to the strict definition of MsEr 2.3 Implementation Details In this section we give some implementation details. For efficiency, we imple mented the heap of boundary pixel as a system of stacks of pixels at various greylevels, see Fig 8 a bitmask is keeping track of which out of the 256 grey levels have pixels waiting. This allows us to use a single instruction to find which is the smallest (or largest) occupied greylevel by using a machine in struction that retrieves the id number of the least significant or most signif icant bit set(for 86 platforms bsf and bsr, or the more general 64bit com pliant  ScanForward,  Bit Scan Reverse). This is of course limited to as many greylevels as the processor wordlength, but for 256 greylevels we simply chain multiple such instructions. This is essentially a way to get the hardware to do the work, and lower the time it takes to push and pop from the heap. This time is inherently constant with regards to the number of pixels, and loga rithmic in the number of greylevels. We also perform an explicit check for urrent greylevel before doing the general pop. This can be done since the 192 D. Nister and h. stewenius greylevel is never smaller than the current. When it is equal, which is common some work can be saved The heap could be dynamically allocated and implemented as linked lists, but we prefer to count the number pixels at each greylevel in a single presweep of the image, just like the standard algorithm does. This allows us to both pre allocate and use fixed arrays for the stacks without stacks ever running into one another. The total number of entries in the stacks is the number of pixels plus the number of greylevels(due to one stopelement for each stack) Finally, to avoid having to perform explicit checks for the image boundary,a border of one pixel around the image is used, and the accessible mask is always set when a sweep starts. This can actually be implemented without explicitly sweeping the mask except for in a detector initialization step, since the sweep sets the whole mask. We then simply fip the border bits and invert the meaning of the mask on the darktobright sweep, finally leaving the mask in its original state 3 Analysis An informal analysis for the algorithm is straightforward. It is easy to see that only one copy of a pixel can ever be on the heap of boundary pixels, and that a pixel can only return onto the heap at most as many times as it has neighbors Moreover, the components that are processed must have at least one new pixel in them, so the total amount of component processing is linear in the number of pixels. As it takes at most log(m) time to enter or pop a pixel from the heap the worst case execution time is therefore bounded by O((n +e) log(m)), where n is the number of pixels, m the number of greylevels, and e is the number of edges in the image graph(such as e a 2n for fourconnected images). If we regard m and the connectivity as constants, this simply amounts to O(n),linear time The memory usage is roughly a fourbyte integer per pixel, which allows both the heap of boundary pixels, the accessible mask bit and the edge number to be stored while allowing a sufficient number of bits for the pixel coordinates Storage for only m components is required(apart of course from the storage required by detected MSER, which would clearly be needed for any algorithm 4 Experimental Validation As the results of our detector are the same as the original mser detector, which has been found in several studies to have excellent repeatability, our experimental focus is on execution time. As a reference, the recent paper 19 reports on an FPGa implementation with careful attention to detail, but based on the standard unionfind algorithm, performing at 25 fps on images up to 350 X 350所需积分/C币：15 上传时间：20190103 资源大小：2.25MB