Received July 22, 2016, accepted August 12, 2016, date of publication August 26, 2016, date of current version September 28, 2016.
Digital Object Identifier 10.1109/ACCESS.2016.2603220
A Depth Map Post-Processing Approach Based
on Adaptive Random Walk With Restart
HOSSEIN JAVIDNIA, (Student Member, IEEE), AND PETER CORCORAN, (Fellow, IEEE)
Department of Electronic Engineering, College of Engineering, National University of Ireland, Galway SW4 794, Ireland
Corresponding author: H. Javidnia (h.javidnia1@nuigalway.ie)
This work was supported by the Strategic Partnership Program of Science Foundation Ireland (SFI) and co-funded by SFI and FotoNation
Ltd., on Next Generation Imaging for Smartphone and Embedded Platforms under Project 13/SPP/I2868.
ABSTRACT Accurate depth estimation is still an important challenge after a decade, particularly from stereo
images. The accuracy comes from a good depth level and preserved structure. For this purpose, a depth post-
processing framework is proposed in this paper. The framework starts with the “Adaptive Random Walk
with Restart (2015)” algorithm. To refine the depth map generated by this method, we introduced a form of
median solver/filter based on the concept of the mutual structure, which refers to the structural information
in both images. This filter is further enhanced by a joint filter. Next, a transformation in image domain is
introduced to remove the artifacts that cause distortion in the image. The proposed post-processing method is
then compared with the top eight algorithms in the Middlebury benchmark. To explore how well this method
is able to compete with more widely known techniques, a comparison is performed with Google’s new depth
map estimation method. The experimental results demonstrate the accuracy and efficiency of the proposed
post-processing method.
INDEX TERMS Stereo matching, depth map, accuracy, edge preserving.
I. INTRODUCTION
A. STEREO DEPTH MAPS
In 3D computer graphics a depth map is an image or image
channel that contains information relating to the distance
to the surfaces of scene objects from a viewpoint [1]. The
depth information corresponds to luminance in proportion to
the distance from the camera. Near surfaces are depicted as
lighter while far surfaces are shown as darker. Estimating the
depth can be considered an important component of under-
standing geometric relations within a scene. In turn, such
relations help to provide a richer representation of objects and
their environment, often leading to improvements in existing
recognition tasks, as well as enabling further applications
such as robotics. In recent years, many new economical
devices, including time-of-flight [2], [3], structured light [4],
and the Kinect, were introduced for depth determination from
stereo images. Kinect captures pairs of synchronized
depth-color images for a scene within a range of several meters.
However, the resulting depth map cannot be used directly in
scene reconstruction because it has deficiencies such as gaps
due to occlusion, reflection and other optical factors.
In general, stereo matching algorithms are categorized into
two groups, local and global algorithms, following the
taxonomy scheme of Scharstein and Szeliski [5].
In local algorithms, the depth value at pixel P depends on
the intensity and color values of the window W in which P is
located. The initial matching cost is computed pixel-wise and
is often noisy, carrying little information in parts of the
image with smoother texture. Aggregating the cost over the
neighboring region therefore yields a more reliable depth
value for pixel P.
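As a hedged illustration of the aggregation idea (the symbols C, C_W and d are our notation and do not appear in the original text), a generic window-based local method can be written as a cost sum over W followed by a winner-take-all choice of disparity:

```latex
C_W(P, d) = \sum_{Q \in W} C(Q, d),
\qquad
d^{*}(P) = \arg\min_{d} \; C_W(P, d)
```

Here C(Q, d) stands for the pixel-wise matching cost of pixel Q at candidate disparity d. Practical methods usually weight the terms of the sum rather than adding them uniformly.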
On the other hand, global methods consider the overall
structure of the scene, smooth the image, and then solve a
cost optimization problem.
B. STEREO MATCHING ALGORITHMS
In the last decade, stereo matching has attracted a lot of
attention from researchers, and many matching algorithms
have been developed. Some of the most well-known and
studied algorithms are LIBELAS [6], iSGM [7], DBP [8],
and CostFilter [9]. LIBELAS [6] has been used since 2010
in various research studies. It is inspired by the
observation that, although many stereo correspondences are
highly ambiguous, some of them can be robustly matched.
VOLUME 4, 2016
2169-3536 2016 IEEE. Translations and content mining are permitted for academic research only.
Personal use is also permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
H. Javidnia, P. Corcoran: Depth Map Post-Processing Approach
While LIBELAS is fast, the accuracy of the estimated depth
map is poor.
iSGM [7] is an iterative variant of the semi-global matching
(SGM) technique with a refined concept of cost integration.
The accumulated cost buffer is evaluated into a prior
disparity map after horizontal and vertical integration.
DBP [8] is a global matching algorithm based on energy
minimization which, like all other global methods, contains
a data term and a smoothness term. Its main contribution
lies in the data term, which is approximated by a
color-weighted correlation. The data term is then refined
in occluded regions by employing the hierarchical loopy
belief propagation algorithm.
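For reference, the data and smoothness terms mentioned above are commonly combined in an energy of the following generic form. This is a textbook formulation under our own notation (D_p, V, lambda and the neighborhood set N are assumptions), not the specific energy used in [8]:

```latex
E(d) = \sum_{p} D_p(d_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} V(d_p, d_q)
```

D_p(d_p) measures how well disparity d_p explains the image data at pixel p, V penalizes disparity differences between neighboring pixels p and q, and lambda balances the two terms.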
CostFilter [9] is a framework for multiple applications,
such as computing disparity maps in real time. It aims to
be fast and edge-aware, and consists of three steps:
constructing a cost volume, fast cost volume filtering, and
winner-take-all label selection. The depth estimated by
this method suffers from blocky artifacts along edges and
corners, especially in regions with illumination
transitions. This causes a broken synthetic view along the
edges.
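The three steps named for CostFilter (cost volume, cost volume filtering, winner-take-all selection) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of [9]: it uses an absolute-difference cost and a plain box filter as a stand-in for the edge-aware filter, and the names `box_filter`, `wta_disparity` and `border_cost` are hypothetical.

```python
import numpy as np


def box_filter(img, radius):
    """Mean filter over a (2r+1) x (2r+1) window with edge replication."""
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def wta_disparity(left, right, max_disp, radius=2, border_cost=1.0):
    """Left-view disparity map from rectified grayscale images in [0, 1]."""
    h, w = left.shape
    # Step 1: cost volume, one absolute-difference slice per disparity.
    # Columns with no valid match keep the assumed constant border_cost.
    cost = np.full((max_disp + 1, h, w), border_cost, dtype=np.float64)
    for d in range(max_disp + 1):
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    # Step 2: filter every slice (box filter here, an assumption; an
    # edge-aware filter would be used in practice).
    for d in range(max_disp + 1):
        cost[d] = box_filter(cost[d], radius)
    # Step 3: winner-take-all, pick the disparity with the lowest cost.
    return np.argmin(cost, axis=0)
```

Because the box filter ignores image edges, this sketch would blur disparity boundaries; it only illustrates the data flow between the three steps.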
Other methods have tried to obtain better depth map
accuracy by combining Markov Random Fields (MRF) with
sophisticated global optimization techniques [10]–[13], but
obtaining good accuracy in depth estimation remains a
challenge, especially in images with sophisticated or very
simple texture.
Another approach to improving the accuracy of the depth
map, mostly by preserving the edges, uses Mutual
Information (MI) and SIFT features. A multisensor synthetic
aperture radar (SAR) image registration method was proposed
based on MI [14] and SIFT [15]. In this application, MI was
used to estimate the registration parameters, which were
later used by conjugate feature selection during the SIFT
matching phase to decrease the number of false matches.
Following the same idea, a stereo matching method based on
the combination of MI, SIFT, plane-fitting and
log-chromaticity color space was introduced in [16].
In general, finding a local matching method which performs
well in terms of both speed and accuracy is not
straightforward. Recently, however, employing the random
walk with restart along with an optimized matching cost has
shown that fast matching with reasonably accurate estimation
is possible. ARWR is a local matching algorithm based on
the random walk with restart method [17], and it is used as
the fundamental algorithm in this paper.
At this point it is timely to introduce the field of applica-
tion, which establishes requirements for a high performance
stereo disparity map. This work derives from research on
automotive street-scene analysis where it is important to
determine small objects in order to evaluate risks in the path
of a vehicle – e.g. distant pedestrians, animals, vehicles.
As most automotive imaging systems employ relatively small
sensors (2-4 MP) compared to consumer devices, it is
important to be able to run disparity mapping algorithms at
full native sensor resolution, in our case 2864 × 1924 pixels.
All current methods, as outlined above, suffer from
inaccurate depth around edges and corners, depth
discontinuities (especially in texture-less areas), depth
conflicts around areas with similar colors, and missing
depth within one depth level. By solving these challenges,
a depth map can present correct and accurate depth
information while respecting the structure of the reference
image.
C. FEATURES OF THE PROPOSED METHOD
This paper presents a method to refine the depth map
generated by the Adaptive Random Walk with Restart (ARWR)
algorithm in order to obtain significant improvements in
accuracy. The main features of the proposed method are:
1- A guided joint filter based on mutual information
is designed by diffusing the image domain.
2- Weights are allocated dynamically to the windows as
part of the joint filter. The weights are regenerated
every time the window moves to another patch of
pixels. Pixels are counted in the bins of a histogram
instead of storing the weights directly.
3- The proposed filter is rotation invariant because of
the joint mutual information. The filter can also be
applied repeatedly to remove more noise, while the
edges and corners are preserved because of the mutual
joint feature.
4- With this filter, the algorithm works better on high
resolution images than on low resolution images.
5- The filter can be used for upsampling/downsampling
purposes.
6- The method has the advantage of filling the depth map
in regions with missing depth values.
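Feature 2 mentions accumulating pixels in histogram bins rather than storing weights directly. The proposed filter itself is detailed in Section 3; the following is only a generic sketch of a weighted median computed from a weight histogram, assuming integer depth levels. The function name `histogram_weighted_median` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def histogram_weighted_median(values, weights, n_bins):
    """Weighted median of integer values in [0, n_bins).

    Each value adds its weight to the histogram bin of its depth level,
    so only n_bins accumulators are kept instead of one weight per pixel.
    """
    hist = np.bincount(values.ravel(), weights=weights.ravel(),
                       minlength=n_bins)
    cumulative = np.cumsum(hist)
    half = 0.5 * cumulative[-1]
    # First bin at which the accumulated weight reaches half of the total.
    return int(np.searchsorted(cumulative, half))
```

With uniform weights this reduces to the ordinary (lower) median of the window; larger weights pull the result toward the depth levels of the pixels that carry them.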
The rest of this paper is organized as follows.
In the next section the chosen method, ARWR, is presented
in detail. Section 3 provides the details of the proposed
post-processing filter. The evaluation and experimental
results are presented in Section 4, while conclusions are
drawn in Section 5. Two appendices linked to this paper
present extended numerical and visual results.
II. INTRODUCTION TO ADAPTIVE RANDOM
WALK WITH RESTART
In this section we describe the fundamentals and technical
details of the chosen stereo matching method, ARWR.
ARWR has acceptable and comparable performance in terms of
estimation and speed against other algorithms, but it is
still far from the top stereo matching algorithms on the
Middlebury benchmark in terms of accuracy.
This algorithm has several important advantages which
make it a suitable method for a variety of applications. It
is not affected by illumination variation because of the
gradient and census transforms, its processing time is fast
in comparison with recently studied methods, it performs
well in both outdoor and indoor environments, and it
provides an estimate of the depth in low texture scenes.
FIGURE 1. Overview of the adaptive random walk with restart.
One important advantage of this algorithm, which convinced
us to employ it as part of our approach, is its good
performance on high resolution images. A traditional way
to speed up stereo computation is to use image pyramids or
downsized images, which also reduces the disparity range.
This down-sampling in disparity computation causes some
small objects to be missed. Full disparity resolution at
large distances is vital for long range object detection.
With the chosen algorithm, the image does not need to be
down-sampled to speed up the method.
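The effect of down-sampling on the disparity range can be made explicit with the standard rectified pinhole stereo relation. The symbols below (focal length f in pixels, baseline B, depth Z, down-sampling factor s) are our assumptions and are not defined in the original text:

```latex
d = \frac{f B}{Z},
\qquad
d_s = \frac{d}{s},
\qquad
|\Delta Z| \approx \frac{Z^{2}}{f B}\,|\Delta d|
```

Down-sampling by s divides every disparity by s, so one disparity step at the reduced resolution corresponds to s steps at full resolution. By the third relation the smallest resolvable depth difference then grows by the same factor, and it grows quadratically with Z, which is why distant small objects are the first to be lost.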
The comparison of this method with several other methods
carried out in this paper showed that it provides acceptable
depth estimation on high resolution images of
2864 × 1924 pixels.
Acceptable depth estimation means that the algorithm does
not estimate different layers of depth within one object;
it respects the depth layers without conflict. This
feature, along with the fast processing time, makes the
algorithm suitable for high resolution real-time
applications. It also allows us to build a more accurate
filter, which is described later in the paper.
A. ALGORITHM DESIGN
The initial matching cost in ARWR is calculated pixel-wise
by employing the census transform and gradient image
matching. The census-based matching technique, or census
transform, was initially introduced by Zabih in 1994 [18].
It is a form of non-parametric local transform that maps
the intensity values of the pixels within a square window
to a bit string, thereby capturing the image structure. In
other words, it computes for every pixel a binary string
(census signature) by comparing its grey value with the
grey values in its neighborhood.
The census transform is robust to radiometric variations,
but noise in the local image structure is also encoded,
because the encoding is based on the intensity of the
pixels. The encoded noise introduces matching ambiguity,
especially in areas with repetitive or similar texture
patterns.
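The census signature and its matching cost can be sketched as follows. This is a minimal sketch, assuming a square window of radius r with edge replication and a Hamming distance between signatures; the window size and bit ordering used in ARWR are not stated here, and the function names are hypothetical.

```python
import numpy as np


def census_transform(gray, radius=1):
    """Census signature per pixel, packed into a uint64.

    A bit is 1 where the neighbour is darker than the centre pixel.
    The window has (2r+1)^2 - 1 neighbours, so radius must be <= 3
    for the signature to fit into 64 bits.
    """
    h, w = gray.shape
    padded = np.pad(gray, radius, mode="edge")
    signature = np.zeros((h, w), dtype=np.uint64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            if dy == radius and dx == radius:
                continue  # the centre pixel is not compared with itself
            neighbour = padded[dy:dy + h, dx:dx + w]
            bit = (neighbour < gray).astype(np.uint64)
            signature = (signature << np.uint64(1)) | bit
    return signature


def hamming_distance(a, b):
    """Number of differing bits between two arrays of signatures."""
    x = np.bitwise_xor(a, b)
    count = np.zeros(x.shape, dtype=np.uint8)
    while x.any():
        count += (x & np.uint64(1)).astype(np.uint8)
        x = x >> np.uint64(1)
    return count
```

Since only the ordering of intensities inside the window is encoded, any monotonically increasing change of brightness (for example a gain and an offset) leaves the signature unchanged, which is the robustness to radiometric variations noted above.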
To overcome this problem, gradient image matching is
employed as part of the local matching block in ARWR.
At this stage gradient images are computed using 5 × 5
Sobel filters. The whole ARWR process is shown in Fig. 1.
The green block in Fig. 1 shows the local matching block
including the transformation and matching parts.
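Gradient images of the kind used in the local matching block can be computed as in the sketch below. The text does not give the filter coefficients; the kernel here is one common 5 × 5 Sobel construction (smoothing vector [1, 4, 6, 4, 1] and derivative vector [-1, -2, 0, 2, 1]) and is therefore an assumption, as are the helper names.

```python
import numpy as np

# One common separable construction of the 5 x 5 Sobel kernel (assumed).
SMOOTH = np.array([1, 4, 6, 4, 1], dtype=np.float64)
DERIV = np.array([-1, -2, 0, 2, 1], dtype=np.float64)
SOBEL_X = np.outer(SMOOTH, DERIV)  # smooth vertically, differentiate in x
SOBEL_Y = SOBEL_X.T                # smooth horizontally, differentiate in y


def correlate2d(img, kernel):
    """Same-size 2D correlation with edge replication at the border."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)),
                    mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def gradient_images(gray):
    """Horizontal and vertical gradient images of a grayscale image."""
    return correlate2d(gray, SOBEL_X), correlate2d(gray, SOBEL_Y)
```

The kernel coefficients sum to zero, so adding a constant brightness offset to the image leaves both gradient images unchanged.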
The usual similarity criteria in stereo matching are
only strictly valid for surfaces with Lambertian (diffuse)
reflectance characteristics. Specular reflections are viewpoint
dependent and may cause large intensity difference at corre-
sponding image points. In the presence of specular reflection,
traditional stereo methods are often unable to establish any
correspondence, or the calculated disparity values tend to be
inaccurate.
In this case, gradient image matching makes the local
matching method more robust on non-Lambertian surfaces.
Noise variation can be critical to the performance of
local pixel-wise matching methods. For this reason the SLIC
(Simple Linear Iterative Clustering) algorithm is employed
in ARWR (the blue block in Fig. 1). SLIC is one of the
common super-pixel methods [19].
The local measurements in the matching block are more
robust to noise variation when super-pixels are treated as
the smallest units of the image to be matched to the target
image. Using super-pixels instead of individual pixels also
reduces the memory requirements of the whole algorithm.
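The memory reduction can be illustrated by collapsing a pixel-wise cost volume to one cost vector per super-pixel. This is a minimal sketch under assumptions: the label map is taken as given (for example from a SLIC implementation), mean pooling is our choice of aggregation, and the function name is hypothetical.

```python
import numpy as np


def aggregate_cost_by_superpixel(cost, labels):
    """Mean matching cost per super-pixel.

    cost:   (D, H, W) pixel-wise cost volume.
    labels: (H, W) integer label map with values in [0, K).
    Returns a (D, K) array, one cost vector per super-pixel.
    """
    n_disp = cost.shape[0]
    flat_labels = labels.ravel()
    n_segments = int(flat_labels.max()) + 1
    counts = np.bincount(flat_labels, minlength=n_segments)
    out = np.empty((n_disp, n_segments), dtype=np.float64)
    for d in range(n_disp):
        sums = np.bincount(flat_labels, weights=cost[d].ravel(),
                           minlength=n_segments)
        out[d] = sums / np.maximum(counts, 1)  # guard against empty labels
    return out
```

The volume shrinks from D × H × W entries to D × K, where K is the number of super-pixels, and averaging inside each segment damps pixel-level noise in the cost.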
In the last step of ARWR, shown as the pink block in
Fig. 1, the calculated matching cost is updated using the
RWR algorithm to determine the optimum disparity with
respect to occluded and discontinuity regions. The standard