filter naturally has an O(N) time (in the number of pixels N)1 nonapproximate algorithm for both gray-scale and high-dimensional images, regardless of the kernel size and the intensity range. Typically, our CPU implementation achieves 40 ms per megapixel performing gray-scale filtering; to the best of our knowledge, this is one of the fastest edge-preserving filters.
A preliminary version of this paper was published in
ECCV ’10 [22]. It is worth mentioning that the guided filter has witnessed a series of new applications since then. The guided filter enables a high-quality real-time O(N) stereo matching algorithm [23]. A similar stereo method was proposed independently in [24]. The guided filter has also been applied in optical flow estimation [23], interactive image segmentation [23], saliency detection [25], and illumination rendering [26]. We believe that the guided filter has great potential in computer vision and graphics, given its simplicity, efficiency, and high quality. We have released public code to facilitate future studies [27].
2 RELATED WORK
We review edge-preserving filtering techniques in this
section. We categorize them as explicit/implicit weighted-
average filters and nonaverage ones.
2.1 Explicit Weighted-Average Filters
The bilateral filter [1] is perhaps the simplest and most
intuitive one among explicit weighted-average filters. It
computes the filtering output at each pixel as the average of
neighboring pixels, weighted by the Gaussian of both
spatial and intensity distance. The bilateral filter smooths
the image while preserving edges. It has been widely used
in noise reduction [28], HDR compression [15], multiscale
detail decomposition [29], and image abstraction [30]. It is
generalized to the joint bilateral filter in [14], where the
weights are computed from another guidance image rather
than the filtering input. The joint bilateral filter is
particularly favored when the image to be filtered is not
reliable to provide edge information, e.g., when it is very
noisy or is an intermediate result, such as in flash/no-flash
denoising [14], image upsampling [31], image deconvolution
[32], stereo matching [33], etc.
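To make the weighting scheme above concrete, the following is a minimal brute-force sketch of the (joint) bilateral filter in NumPy. The function name and parameters are illustrative, and this direct version runs in O(Nr²) time, unlike the accelerated variants discussed below.

```python
import numpy as np

def joint_bilateral(I, G, radius, sigma_s, sigma_r):
    """Brute-force joint bilateral filter, O(N r^2) time.

    I : 2D float array, the image to be filtered.
    G : 2D float array, the guidance image supplying range weights.
        Passing G = I gives the ordinary bilateral filter [1].
    """
    H, W = I.shape
    # Precompute the spatial Gaussian over the (2r+1)x(2r+1) window.
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(x**2 + y**2) / (2.0 * sigma_s**2))

    Ipad = np.pad(I, radius, mode='edge')
    Gpad = np.pad(G, radius, mode='edge')
    out = np.empty_like(I)
    for i in range(H):
        for j in range(W):
            win_I = Ipad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            win_G = Gpad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range weight: Gaussian of the guidance-intensity distance.
            rng = np.exp(-(win_G - G[i, j])**2 / (2.0 * sigma_r**2))
            w = spatial * rng
            out[i, j] = (w * win_I).sum() / w.sum()
    return out
```

With a small `sigma_r`, pixels across a strong edge receive near-zero range weight, so the step is preserved while each side is smoothed.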
The bilateral filter has limitations despite its popularity. It
has been noticed in [15], [16], and [8] that the bilateral filter
may suffer from “gradient reversal” artifacts. The reason is
that when a pixel (often on an edge) has few similar pixels
around it, the Gaussian weighted average is unstable. In this
case, the results may exhibit unwanted profiles around
edges, usually observed in detail enhancement or HDR
compression.
Another issue concerning the bilateral filter is the
efficiency. A brute-force implementation is O(Nr²) time with
kernel radius r. Durand and Dorsey [15] propose a piece-wise
linear model and enable FFT-based filtering. Paris and
Durand [17] formulate the gray-scale bilateral filter as a 3D
filter in a space-range domain, and downsample this domain
to speed up if the Nyquist condition is approximately true. In
the case of box spatial kernels, Weiss [34] proposes an
O(N log r) time method based on distributive histograms, and Porikli [18] proposes the first O(N) time method using
integral histograms. We point out that constructing the histograms essentially performs a 2D spatial filter in the space-range domain, followed by a 1D range filter.
Under this viewpoint, both [34] and [18] sample the signal
along the range domain but do not reconstruct it. Yang [19] proposes another O(N) time method which interpolates along the range domain to allow more aggressive subsampling. All of the above methods are linearly complex w.r.t. the
number of the sampled intensities (e.g., number of linear
pieces or histogram bins). They require coarse sampling to
achieve satisfactory speed, but at the expense of quality
degradation if the Nyquist condition is severely broken.
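The per-bin strategy described above, box filtering at each sampled intensity and then interpolating along the range domain in the spirit of [19], can be sketched as follows. The names `box_sum` and `fast_bilateral`, the bin count, and the linear interpolation scheme are illustrative choices, not a reproduction of any cited implementation.

```python
import numpy as np

def _sum1d(p, r, n):
    # Sliding window sums of rows i..i+2r via cumulative sums, O(N).
    C = np.concatenate([np.zeros((1, p.shape[1])), np.cumsum(p, axis=0)], axis=0)
    return C[2 * r + 1: 2 * r + 1 + n] - C[:n]

def box_sum(a, r):
    """Box-window sum of radius r via integral images: O(N) regardless of r."""
    H, W = a.shape
    p = np.pad(a, r, mode='edge')
    s = _sum1d(p, r, H)        # reduce rows   -> (H, W + 2r)
    return _sum1d(s.T, r, W).T  # reduce columns -> (H, W)

def fast_bilateral(I, r, sigma_r, nbins=16):
    """O(N * nbins) box-kernel bilateral approximation: one 2D box filter
    per sampled intensity level, then interpolation along the range axis."""
    levels = np.linspace(I.min(), I.max(), nbins)
    J = np.empty((nbins,) + I.shape)
    for k, b in enumerate(levels):
        Wb = np.exp(-(I - b)**2 / (2.0 * sigma_r**2))  # range weights vs. level b
        J[k] = box_sum(Wb * I, r) / box_sum(Wb, r)
    # Linearly interpolate J_k at each pixel's own intensity.
    t = (I - levels[0]) / (levels[-1] - levels[0]) * (nbins - 1)
    k0 = np.clip(np.floor(t).astype(int), 0, nbins - 2)
    frac = t - k0
    ii, jj = np.indices(I.shape)
    return (1 - frac) * J[k0, ii, jj] + frac * J[k0 + 1, ii, jj]
```

Coarser sampling (smaller `nbins`) trades quality for speed, which is precisely the degradation mode noted above when the Nyquist condition is broken.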
The space-range domain is generalized to higher dimension for color-weighted bilateral filtering [35]. The
expensive cost due to the high dimensionality can be
reduced by the Gaussian kd-trees [20], the Permutohedral
Lattices [21], or the Adaptive Manifolds [36]. But the
performance of these methods is not competitive for gray-
scale bilateral filters because they spend much extra time
preparing the data structures.
Given the limitations of the bilateral filter, people began
to investigate new designs of fast edge-preserving filters.
The O(N) time Edge-Avoiding Wavelets (EAW) [37] are
wavelets with explicit image-adaptive weights. But the
kernels of the wavelets are sparsely distributed in the image
plane, with constrained kernel sizes (to powers of two),
which may limit the applications. Recently, Gastal and
Oliveira [38] propose another O(N) time filter known as the Domain Transform filter. The key idea is to iteratively and separably apply 1D edge-aware filters. The O(N) time complexity is achieved by integral images or recursive filtering. We will compare with this filter in this paper.
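To convey the flavor of separable 1D edge-aware recursive filtering, the sketch below applies a single left-to-right recursive pass whose feedback coefficient shrinks across strong edges. It illustrates the idea rather than the exact domain-transform math of [38]; a full filter would also run right-to-left and vertical passes, typically over several iterations, and the distance measure here is an assumed simplification.

```python
import numpy as np

def edge_aware_1d(I, sigma_s, sigma_r):
    """One left-to-right recursive edge-aware pass over image rows.

    The feedback weight is a**d, where the "distance" d grows with the
    local gradient, so smoothing is suppressed across strong edges.
    """
    a = np.exp(-np.sqrt(2.0) / sigma_s)  # base feedback coefficient
    out = I.copy()
    for i in range(1, I.shape[1]):
        # Edge-aware distance between neighboring columns (assumed form).
        d = 1.0 + np.abs(I[:, i] - I[:, i - 1]) / sigma_r
        w = a ** d
        out[:, i] = (1.0 - w) * I[:, i] + w * out[:, i - 1]
    return out
```

Each pass touches every pixel once, which is how the recursive formulation attains O(N) time independent of the effective kernel size.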
2.2 Implicit Weighted-Average Filters
A series of approaches optimize a quadratic cost function
and solve a linear system, which is equivalent to implicitly
filtering an image by an inverse matrix. In image segmentation [39] and colorization [9], the affinities of this matrix are Gaussian functions of the color similarities. In image
matting, a matting Laplacian matrix [10] is designed to
enforce the alpha matte as a local linear transform of the
image colors. This matrix is also applied in haze removal
[11]. The weighted least squares filter in [8] adjusts the
matrix affinities according to the image gradients and
produces halo-free edge-preserving smoothing.
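As a minimal 1D illustration of such implicit filters, the sketch below minimizes a WLS-style quadratic cost with gradient-dependent affinities and solves the resulting sparse linear system. The function, its parameters, and the simple 1/(|∇g|^α + ε) affinity are illustrative assumptions, not the exact formulation of [8].

```python
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import spsolve

def wls_like_smooth(g, lam=1.0, alpha=1.2, eps=1e-4):
    """Minimize sum_i (u_i - g_i)^2 + lam * sum_i w_i (u_{i+1} - u_i)^2
    over a 1D signal g, where the affinity w_i shrinks on large input
    gradients. The minimizer solves (I + lam * L_w) u = g, with L_w the
    weighted Laplacian of the chain graph (assumed illustrative weights)."""
    n = len(g)
    w = 1.0 / (np.abs(np.diff(g))**alpha + eps)  # small weight across edges
    main = np.zeros(n)
    main[:-1] += w
    main[1:] += w
    L = diags([main, -w, -w], [0, -1, 1], format='csc')
    A = identity(n, format='csc') + lam * L
    return spsolve(A, g)
```

Because the rows of the Laplacian sum to zero, the filter exactly preserves the mean of the input; the cost of this quality is the sparse solve, which is the bottleneck discussed next.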
Although these optimization-based approaches often
generate high quality results, solving the linear system is
time-consuming. Direct solvers like Gaussian Elimination are not practical due to the memory-demanding “fill-in” problem [40], [41]. Iterative solvers like the Jacobi method, Successive Over-Relaxation (SOR), and Conjugate Gradients [40] are too slow to converge. Though carefully designed preconditioners [41] greatly reduce the number of iterations, the computational cost is still high. The multigrid method [42] is proven to be O(N) time for homogeneous Poisson equations, but its quality degrades when the matrix
becomes more inhomogeneous. Empirically, the implicit
weighted-average filters take at least a few seconds to
process a one megapixel image either by preconditioning
[41] or by multigrid [8].
2 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 35, NO. X, XXXXXXX 2013
1. In the literature, “O(N)/linear time” algorithms are sometimes referred to as “O(1)/constant time” (per pixel). We are particularly interested in the complexity of filtering the entire image, and we will use the term “O(N) time” throughout this paper.