Google's famous RAISR super-resolution paper

2018-02-28 · 8.88 MB · PDF
Super-resolution, image processing. The paper claims a 10x to 100x speedup over current algorithms such as A+, without sacrificing reconstruction quality.
applying a set of pre-learned filters on the image patches, chosen by an efficient hashing mechanism. Note that the filters are learned based on pairs of LR and HR training image patches, and the hashing is done by estimating the local gradients' statistics. As a final step, in order to avoid artifacts, the initial upscaled image and its filtered version are locally blended by applying a weighted average, where the weights are a function of a structure descriptor. We harness the Census Transform (CT) [20] for the blending task, as it is an extremely fast and cheap descriptor of the image structure which can be utilized to detect structure deformations that occur due to the filtering step.

A closely related topic to SISR is image sharpening, which aims to amplify the structures/details of a blurry image. The basic sharpening techniques apply a linear filter on the image, as in the case of unsharp masking [21] or Difference of Gaussians (DoG) [22], [23]. These techniques are highly effective in terms of complexity, but tend to introduce artifacts such as over-sharpening, gradient reversals, noise amplification, and more. Similarly to SISR, improved results can be obtained by relying on patch priors, where sensitivity to the content/structure of the image is the key to artifact-free enhancement [24]-[28]. For example, at the cost of increased complexity compared to the linear approach, the edge-aware bilateral filter [29], [30], Non-Local Means [3] and guided filter [25] produce impressive sharpening effects.

As a way to generate high-quality sharp images, one can learn a mapping from LR images to their sharpened HR versions, thus achieving a built-in sharpening/contrast-enhancement effect "for free".
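As a rough illustration of the linear DoG sharpening baseline mentioned above (this is not code from the paper, and the parameter values are arbitrary), a minimal Difference-of-Gaussians sharpener can be sketched as:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_sharpen(image, sigma_fine=1.0, sigma_coarse=2.0, amount=1.0):
    """Add back a Difference-of-Gaussians band: the DoG isolates the
    frequencies living between the two Gaussian scales, so adding it
    back amplifies exactly that band."""
    img = np.asarray(image, dtype=np.float64)
    dog = gaussian_filter(img, sigma_fine) - gaussian_filter(img, sigma_coarse)
    return img + amount * dog
```

A flat region has no band-pass content and is left untouched, which is why such linear sharpeners are cheap but prone to halos and gradient reversals near strong edges, as the text notes.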
Furthermore, the learning stage is not limited to a linear degradation model (as in Eq. (1)); as such, learning a mapping from compressed LR images to their sharpened HR versions can be easily done, leading to an "all in one" mechanism that not only increases the image resolution, but also reduces compression artifacts and enhances the contrast of the image. Triggered by this observation, we develop a sharpener as well, which is of independent interest. The proposed sharpener is highly efficient and able to enhance both fine details (high frequencies) and the overall contrast of the image (mid-low frequencies). The proposed method has almost similar complexity to the linear sharpeners, while being competitive with far more complex techniques. The suggested sharpener is based on applying DoG filters [22], [23] on the image, which are capable of enhancing a wide range of frequencies. Next, a CT-based structure-aware blending step is applied as a way to prevent artifacts due to the added content-aware property (a similar mechanism to the one suggested in the context of SISR).

This paper is organized as follows: In Section II we describe the global learning and upscaling scheme, formulating the core engine of RAISR. In Section III we refine the global approach by integrating the initial upscaling kernel into the learning scheme. In Section IV we describe the overall learning and upscaling framework, including the hashing and blending steps. The sharpening algorithm is detailed in Section V. Experiments are brought in Section VI, comparing the proposed upscaling and sharpening algorithms with state-of-the-art methods. Conclusions and future research directions are given in Section VII.

II. FIRST STEPS: GLOBAL FILTER LEARNING

Given initial (e.g., bilinear in our case) upscaled versions of the training database images, y_i \in R^{M \times N}, with i = 1, ..., L, we aim to learn a d \times d filter h that minimizes the Euclidean distance between the collection {y_i} and the desired training HR images {x_i}.

Fig. 1. The basic learning and application scheme of a global filter that maps LR images to their HR versions: (a) Learning stage (cheap upscaling, breaking into patches, least-squares solver); (b) Upscaling stage (cheap upscaling, filtering, output image).

Formally, this is done by solving a least-squares minimization problem

    \min_h \sum_{i=1}^{L} \| A_i h - b_i \|_2^2,    (2)

where h \in R^{d^2} denotes the filter h \in R^{d \times d} in vector notation; A_i \in R^{MN \times d^2} is a matrix, composed of patches of size d \times d, extracted from the image y_i, each patch forming a row in the matrix. The vector b_i \in R^{MN} is composed of pixels from x_i, corresponding to the center coordinates of the y_i patches. The block diagram demonstrating the core idea of the learning process is given in Fig. 1a.

In practice, the matrix A_i can be very large, so we employ two separate approaches to control the computational complexity of estimating the filter. First, in general not all available patches need to be used in order to obtain a reliable estimate. In fact, we typically construct A_i and b_i by sampling K patches/pixels from the images on a fixed grid, where K << MN. Second, the minimization of the least-squares problem formulated in Eq. (2) can be recast in a way that significantly reduces both memory and computational requirements. To simplify the exposition, the following discussion is given in the context of filter learning based on just one image, but extending the idea to several images and filters is trivial. The proposed approach results in an efficient solution for the learning stage, where the memory requirements are only on the order of the size of the learned filter.
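The patch-extraction and least-squares step of Eq. (2) can be sketched as follows for a single image pair; this is a simplified illustration, not the paper's implementation, and the patch size and grid stride here are arbitrary choices:

```python
import numpy as np

def learn_global_filter(upscaled, target, d=5, stride=4):
    """Solve the least-squares problem of Eq. (2) for one image pair.

    Rows of A are d x d patches of the cheaply upscaled image (raveled);
    b holds the HR pixels at the patch centers. Sampling on a fixed grid
    (stride > 1) keeps A small, as the text suggests."""
    r = d // 2
    rows, rhs = [], []
    H, W = upscaled.shape
    for i in range(r, H - r, stride):
        for j in range(r, W - r, stride):
            rows.append(upscaled[i - r:i + r + 1, j - r:j + r + 1].ravel())
            rhs.append(target[i, j])
    A = np.asarray(rows, dtype=np.float64)
    b = np.asarray(rhs, dtype=np.float64)
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return h.reshape(d, d)
```

A sanity check: learning a mapping from an image to itself should recover a delta (identity) filter, since that reproduces each center pixel exactly.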
The solution is based on the observation that, instead of minimizing Eq. (2), we can minimize

    \min_h \| Q h - V \|_2^2,    (3)

where Q = A^T A and V = A^T b. Notice that Q is a small d^2 \times d^2 matrix, thus requiring relatively little memory. The same observation is valid for V, which requires less memory than holding the vector b. Furthermore, based on the inherent definition of matrix-matrix and matrix-vector multiplications, we in fact avoid holding the whole matrix (and vector) in memory. More specifically, Q can be computed cumulatively by summing chunks of rows (for example, sub-matrices A_j \in R^{q \times d^2}, q << MN), which can be multiplied independently, followed by an accumulation step, i.e.,

    Q = A^T A = \sum_j A_j^T A_j.    (4)

Fig. 2. Bilinear upscaling by a factor of 2 in each axis. There are four types of pixels, denoted by P1-P4, corresponding to the four kernels that are applied during the bilinear interpolation.

The same observation is true for the matrix-vector multiplication V = A^T b:

    V = A^T b = \sum_j A_j^T b_j,    (5)

where b_j \in R^q is a portion of the vector b, corresponding to the matrix A_j. Thus, the complexity of the proposed learning scheme in terms of memory is very low: it is on the order of the filter size. Moreover, using this observation we can parallelize the computation of A_j^T A_j and A_j^T b_j, leading to a speedup in the runtime. As for the least-squares solver itself, minimizing Eq. (3) can be done efficiently since Q is a positive semi-definite matrix, which perfectly suits a fast conjugate gradients solver [31].

To summarize, the learning stage is efficient both in terms of the memory requirements and the ability to parallelize. As displayed in Fig. 1b, at run-time, given an LR image (that is not in the training set), we produce its HR approximation by first interpolating it using the same cheap upscaling method (e.g., bilinear) that is used in the learning stage, followed by a filtering step with the pre-learned filter.
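A minimal sketch of the chunked accumulation described above, followed by a conjugate-gradients solve, might look like this (an illustration only, not the paper's code; the chunking granularity is arbitrary, and a plain CG loop stands in for the solver of [31]):

```python
import numpy as np

def cg_solve(Q, V, tol=1e-20):
    """Plain conjugate gradients for the PSD system Q h = V; `tol` is a
    threshold on the squared residual norm."""
    h = np.zeros_like(V)
    r = V.copy()
    p = r.copy()
    rs = r @ r
    if rs == 0.0:
        return h
    for _ in range(2 * len(V)):
        Qp = Q @ p
        alpha = rs / (p @ Qp)
        h = h + alpha * p
        r = r - alpha * Qp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return h

def learn_filter_chunked(chunks, d2):
    """Accumulate Q = A^T A and V = A^T b chunk by chunk, so only a
    d^2 x d^2 matrix and a d^2 vector are ever held in memory, then
    solve Q h = V."""
    Q = np.zeros((d2, d2))
    V = np.zeros(d2)
    for A_j, b_j in chunks:
        Q += A_j.T @ A_j   # one A_j^T A_j term of the Q accumulation
        V += A_j.T @ b_j   # one A_j^T b_j term of the V accumulation
    return cg_solve(Q, V)
```

Since the chunks are processed independently before the accumulation, the `A_j.T @ A_j` products are exactly the pieces that the text proposes to compute in parallel.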
III. REFINING THE CHEAP UPSCALING KERNEL: DEALING WITH ALIASING

The "cheap" upscaling method we employ as a first step can be any method, including a non-linear one. However, in order to keep the low complexity of the proposed approach, we use the bilinear interpolator as the initial upscaling method. Inspired by the work in [15], whatever the choice of the initial upscaling method, we make the observation that when aliasing is present in the input LR image, the output of the initial upscaler will generally not be shift-invariant to this aliasing.(1) As illustrated in Fig. 2, in the case of upscaling by a factor of 2 in each axis, the interpolation weights of the bilinear kernel vary according to the pixel's location. As can be seen, there are four possible kernels that are applied on the LR image, according to the type of the pixel, denoted by P1-P4.

(1) We also restrict the discussion mainly to the case of 2x upscaling to keep the discussion straightforward. Extensions will be discussed at the end of this section.

Fig. 3. Spatially varying learning scheme of four global filters, taking into consideration the internal structure of the bilinear interpolation: per pixel type P1-P4, the corresponding patches and pixels are fed to a separate least-squares solver.

Fig. 4. Visualization of the four global filters, corresponding to the P1-P4 types of pixels, in the pixel domain (a-d), along with their magnitudes in the frequency domain (e-h), where the warmer the color, the larger the value. The filters are learned on the image of Fig. 2.
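The pixel-type bookkeeping can be sketched in a few lines; the particular numbering of P1-P4 here is an arbitrary convention for illustration, not necessarily the paper's:

```python
def pixel_type(i, j, s=2):
    """Interpolation-kernel index for pixel (i, j) of an image upscaled
    by an integer factor s. There are s * s distinct types (P1-P4 when
    s = 2), repeating on an s x s lattice across the upscaled grid."""
    return (i % s) * s + (j % s)
```

Consistent with the discussion at the end of this section, a factor of 3 yields 9 distinct types and a factor of 4 yields 16, which is exactly the number of filters to be learned.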
Since a convolution of two linear filters can be unified into one filter (in our case, the first is the bilinear and the second is the pre-learned one), we should learn four different filters, corresponding to the four possible types of pixels, as demonstrated in Fig. 3.

The importance of this observation is illustrated in Fig. 4, which plots examples of actual learned filters, along with their magnitudes in the frequency domain. The obtained filters act like bandpass filters, amplifying the mid-frequencies and suppressing the high-frequencies (which contain aliasing components) of the interpolated image. The learned filters have similar magnitude responses (Fig. 4e-4h), but different phase responses (Fig. 4a-4d), standing in agreement with the four different shifted versions of the interpolation kernels. This observation is a promising way to further speed up the algorithm and reduce the overall complexity.

Fig. 5. Applying the four spatially varying pre-learned filters on an LR image: cheap upscaling, P1-P4 type filtering, and an aggregation step producing the output image.

On the application side, similarly to the core/naive upscaling idea, we first upscale the LR image using the bilinear interpolator. Then, differently from the naive approach, we apply the pre-learned filters according to the type of the pixel, followed by an aggregation step that simply combines the outcomes of the filtered patches (each resulting in a pixel) into an image. This process is illustrated in Fig. 5.

Notice that a similar observation holds for upscaling by any other integer factor s. For example, upscaling by a factor of 3 implies that we should learn 9 filters, one per each pixel type. Similarly, when upscaling by a factor of 4, there are 16 types of pixels. As already mentioned, in order to keep the flow of the explanations, we will concentrate on the 2x scenario, since the generalization to other scaling factors is straightforward.
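The filter-then-aggregate application step can be sketched as follows (an illustration only; `scipy.ndimage.correlate` stands in for the patch-wise filtering, and the pixel-type numbering is an arbitrary convention):

```python
import numpy as np
from scipy.ndimage import correlate

def apply_typed_filters(upscaled, filters, s=2):
    """Filter the cheaply upscaled image with one pre-learned filter per
    pixel type; pixel (i, j) keeps the response of filter number
    (i % s) * s + (j % s), which implements the aggregation of Fig. 5."""
    H, W = upscaled.shape
    img = np.asarray(upscaled, dtype=np.float64)
    responses = [correlate(img, np.asarray(f, dtype=np.float64)) for f in filters]
    ii, jj = np.indices((H, W))
    t = (ii % s) * s + (jj % s)    # pixel type of every output location
    out = np.empty((H, W))
    for k in range(s * s):
        out[t == k] = responses[k][t == k]
    return out
```

Filtering the whole image s*s times and masking is wasteful but keeps the sketch short; a real implementation would filter each phase of the lattice only once.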
IV. RAISR: HASHING-BASED LEARNING AND UPSCALING

Generally speaking, the global image filtering is fast and cheap, as it implies the application of one filter per patch. Since the learning scheme reduces the Euclidean distance between the HR and the interpolated versions of the LR images, the global filtering has the ability to improve the restoration performance of various linear upscaling methods. However, the global approach described so far is weaker than the state-of-the-art algorithms, e.g., the sparsity-based methods [8]-[11] or the neural-network based ones [16], which build upon a large number of parameters, minimizing highly nonlinear cost functions. In contrast to these methods, the global approach is not adaptive to the content of the image, and its learning stage estimates only a small number of parameters.

Adaptivity to the image content can be achieved by dividing the image patches into clusters and constructing an appropriate filter per each cluster (e.g., as done in [10], [11]). However, the clustering implies an increase in the overall complexity of the algorithm, which is an outgrowth that we want to avoid. Therefore, instead of applying "expensive" clustering (e.g., K-means [32], GMM [33], [34], dictionary learning [8], [9], [11], [14]-[18]), we suggest using an efficient hashing approach, leading to adaptive filtering that keeps the low complexity of the linear filtering.
More specifically, the local adaptivity is achieved by dividing the image patches into groups (called "buckets") based on informative and "cheap" geometry measures, which utilize the statistics of the gradients (a detailed description is given in Section IV-A). Then, similarly to the global approach, we also learn four filters, but this time per each bucket. As a consequence, the proposed learning scheme results in a hash-table of filters, where the hash-table keys are a function of the local gradients, and the hash-table entries are the corresponding pre-learned filters. An overview of the proposed hashing-based learning is shown in Fig. 6a.

Fig. 6. Hashing-based learning and upscaling schemes: (a) Learning stage (cheap upscaling, dividing patches/pixels into buckets, and a least-squares solver per pixel type P1-P4); (b) Upscaling stage (cheap upscaling, dividing patches into buckets, keyed P1-P4 type filtering, and aggregation into the output image). We suggest dividing the patches into "buckets", where each bucket contains patches with similar geometry (this can be considered a cheap clustering method). Then, a least-squares fitting is applied per each bucket and possible shift. At run-time, the hash-table key is computed per each patch, leading to the corresponding pre-learned locally adaptive filters.

Given the hash-table, containing filters per quantized edge-statistics descriptor (more details in Section IV-A), the upscaling procedure becomes very effective. Following Fig. 6b, we compute the hash-table key per each patch of the initial interpolated image, pointing to the relevant filters (four filters, one per patch type), to be applied on the corresponding patch.

Similarly to the global learning process (see Section II), we utilize the matrix-matrix and matrix-vector multiplications once again.
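The run-time flow of Fig. 6b can be sketched as below; `hash_key` and the `filter_table` layout are placeholders for the quantized gradient-statistics key of Section IV-A, not the paper's actual data structures:

```python
import numpy as np

def hashed_upscale(upscaled, filter_table, hash_key, d=5, s=2):
    """Per patch of the cheaply upscaled image: compute the hash key,
    fetch the pre-learned filter for that bucket and pixel type, and
    write one filtered output pixel.

    filter_table[key][t] holds a d x d filter for bucket `key` and
    pixel type `t`; hash_key(patch) is a stand-in for the quantized
    (angle, strength, coherence) descriptor."""
    r = d // 2
    H, W = upscaled.shape
    img = np.asarray(upscaled, dtype=np.float64)
    out = img.copy()   # border pixels keep the cheap-upscaled values
    for i in range(r, H - r):
        for j in range(r, W - r):
            patch = img[i - r:i + r + 1, j - r:j + r + 1]
            t = (i % s) * s + (j % s)          # pixel type P1-P4
            h = filter_table[hash_key(patch)][t]
            out[i, j] = float(np.sum(patch * h))
    return out
```

The key point of the design is visible here: filtering stays a single dot product per pixel, and the adaptivity costs only one cheap hash lookup, rather than a nearest-cluster search.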
Per each bucket we learn a filter h_q by minimizing the following cost function:

    \min_{h_q} \| A_q^T A_q h_q - A_q^T b_q \|_2^2,    (6)

where A_q and b_q are the patches and pixels that belong to the q-th bucket. In this case, the low memory requirements of the proposed learning process are crucial, especially for a large hash-table that requires millions of examples to produce a reliable estimate of the filters. As a consequence, by utilizing the observation described in Section II, we perform a sub-matrix accumulation on a sub-image block basis, leading to a learning process that can handle any desired number of examples.

A. Hash-Table Keys: Local Gradient Statistics (Angle, Strength, Coherence)

Naturally, there are many possible local geometry measures that can be used as the hash-table keys, whereas the statistics of the gradients have a major influence on the proposed approach. We suggest evaluating the local gradient characteristics via eigenanalysis [35], which yields the gradient angle and information about the strength and coherence of the nearby gradients. Eigenanalysis also helps in cases of thin lines, stripes and other scenarios where the mean gradient might be zero, yet the neighborhood exhibits a strong directionality.

The direction, strength and coherence are computed by utilizing the \sqrt{n} \times \sqrt{n} surroundings of each pixel, i.e., for the k-th pixel we consider all the pixels that are located at k_1, ..., k_n. The basic approach starts with the computation of an n \times 2 matrix, composed of the horizontal and vertical gradients, g_x and g_y, of the surroundings of the k-th pixel, expressed by

    G_k = [ g_x  g_y ].    (7)

As stated in [35], the local gradient statistics can be computed using the Singular Value Decomposition (SVD) of this matrix. The right singular vector corresponds to the gradient orientation, and the two singular values indicate the strength and spread of the gradients. Since this work is performed per-pixel, we hereby focus on efficiency.
We can compute those characteristics more efficiently using an eigen-decomposition of G_k^T G_k, which is a 2 \times 2 matrix that can be computed conveniently in closed form. Moreover, in order to incorporate a small neighborhood of gradient samples per pixel, we employ a diagonal weighting matrix W_k, constructed using a separable normalized Gaussian kernel, and analyze G_k^T W_k G_k. Following [35], the eigenvector \phi_1^k corresponding to the largest eigenvalue of G_k^T W_k G_k can be used to derive the angle of the gradient, \theta_k, given by

    \theta_k = \arctan( \phi_{1,2}^k / \phi_{1,1}^k ).    (8)

Notice that due to symmetry, a filter that corresponds to the angle \theta_k is identical to the one corresponding to \theta_k + 180\degree. As shown in [35], the square root of the largest eigenvalue, \sqrt{\lambda_1^k}, is analogous to the "strength" of the gradient. The square root of the smaller eigenvalue, \sqrt{\lambda_2^k}, can be considered as the "spread" of the local gradients; together they yield the coherence measure \mu_k = (\sqrt{\lambda_1^k} - \sqrt{\lambda_2^k}) / (\sqrt{\lambda_1^k} + \sqrt{\lambda_2^k}).

Fig. 7. Visualization of the learned filter sets for (a) 2x, (b) 3x and (c) 4x upscaling, learned using an angle, strength, and coherence based hashing scheme. Per each subset of filters, the gradient angle varies from left to right (0\degree to 180\degree); the top, middle, and bottom 3 rows correspond to low, medium and high coherence. Within each set of 3 rows, gradient strength increases from top to bottom. As can be inferred, the general trend is that as coherence increases, the directionality of the filter increases. Also, as strength increases, the intensity of the filter increases. Notice how the 3x and 4x upscaling filters are not simply scaled versions of the 2x filters, but have also extracted additional information from the training data.
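The closed-form 2 x 2 eigenanalysis can be sketched as follows (an illustration, not the paper's code); the Gaussian weighting W_k is folded into the optional `w` argument, and the final modulo folds theta and theta + 180 degrees together as the symmetry argument requires:

```python
import numpy as np

def gradient_statistics(gx, gy, w=None):
    """Angle, strength and coherence from the 2x2 weighted structure
    tensor G_k^T W_k G_k, computed in closed form (no SVD needed).

    gx, gy: flattened horizontal/vertical gradients of a neighborhood;
    w: optional Gaussian weights (uniform if None)."""
    gx = np.asarray(gx, dtype=np.float64)
    gy = np.asarray(gy, dtype=np.float64)
    w = np.ones_like(gx) if w is None else np.asarray(w, dtype=np.float64)
    a = np.sum(w * gx * gx)   # tensor entry (1,1)
    b = np.sum(w * gx * gy)   # tensor entry (1,2) == (2,1)
    c = np.sum(w * gy * gy)   # tensor entry (2,2)
    # Closed-form eigenvalues of [[a, b], [b, c]].
    tr, det = a + c, a * c - b * b
    disc = np.sqrt(max(tr * tr / 4.0 - det, 0.0))
    lam1, lam2 = tr / 2.0 + disc, max(tr / 2.0 - disc, 0.0)
    # Eigenvector (b, lam1 - a) of lam1 gives the dominant orientation.
    if b != 0.0:
        theta = np.arctan2(lam1 - a, b) % np.pi
    else:
        theta = 0.0 if a >= c else np.pi / 2.0
    s1, s2 = np.sqrt(lam1), np.sqrt(lam2)
    strength = s1
    coherence = (s1 - s2) / (s1 + s2 + 1e-12)  # epsilon guards flat patches
    return theta, strength, coherence
```

Quantizing these three numbers (plus the pixel type) is then all that the hash key of Section IV-A needs.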
