3DR EXPRESS
Nonparametric Statistical and Clustering Based RGB-D
Dense Visual Odometry in a Dynamic Environment
Wugen Zhou · Xiaodong Peng · Haijiao Wang · Bo Liu
Received: 26 September 2018 / Revised: 17 February 2019 / Accepted: 18 February 2019
© 3D Display Research Center, Kwangwoon University and Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract The robustness of dense-visual-odometry remains a challenging problem when moving objects appear in the scene. In this paper, we propose a form of dense-visual-odometry that handles highly dynamic environments by using RGB-D data. Firstly, to find dynamic objects, we propose a multi-frame-based residual computing model, which takes a temporally distant frame into consideration to achieve temporally consistent motion segmentation. Then the proposed method combines a scene clustering model and a nonparametric statistical model to obtain weighted cluster-wise residuals, where each weight describes how strongly a cluster's residual is taken into account. Afterward, the motion segmentation labels and cluster weights are incorporated into the energy-function optimization of dense-visual-odometry to reduce the influence of moving objects. Finally, experimental results demonstrate that the proposed method performs better than state-of-the-art methods on many challenging sequences from a benchmark dataset, especially on highly dynamic sequences.
Keywords Visual odometry · Dynamic environment · SLAM · Nonparametric statistical · Motion segmentation
1 Introduction
Visual odometry estimation plays a key role in
simultaneous localization and mapping (SLAM) sys-
tems for navigation of robots and autonomous vehicles
[1]. In particular, visual odometry can provide 3D
motion estimation for legged robots and aerial vehicles,
whereas traditional encoder-based wheel odometry
gives only 2D motion estimates. Moreover,
many robotic applications such as object detection,
obstacle avoidance, semantic segmentation and place
recognition could benefit from rich visual information,
which has advantages for joint vision tasks [2–4].
Recently, many visual odometry methods have
emerged and achieved successful results. There are
two main groups of visual odometry methods. One is
Electronic supplementary material The online version of
this article (https://doi.org/10.1007/s13319-019-0220-4) con-
tains supplementary material, which is available to authorized
users.
W. Zhou · X. Peng (✉) · H. Wang · B. Liu
National Space Science Centre, Chinese Academy of
Sciences, Beijing 100190, China
e-mail: [email protected]
X. Peng
e-mail: [email protected]
H. Wang
e-mail: [email protected]
B. Liu
e-mail: [email protected]
W. Zhou · H. Wang
University of Chinese Academy of Sciences,
Beijing 100049, China
feature-based visual odometry [5–7]. These methods
estimate 6 degree-of-freedom poses by solving a
closed-form optimal problem using matched visual
correspondence features from source to target frames.
The other is the dense-visual-odometry [8–11]
approach, which takes the pose estimation as an
energy minimization by using the entire dense inten-
sity or depth difference between target and warped
source images.
Generally, dense-visual-odometry is more
stable and robust than the visual feature-based method
in static environments, especially in low-texture
environments [12]. However, the performance of most
of the state-of-the-art dense methods deteriorates
when dynamic objects appear in the scene because
of the assumption of a static world. Usually, the non-
static parts in most quasi-static environments, which
include small dynamic objects, can be viewed as noise.
This noise can be reduced by using some probabilistic
methods, such as the random sample consensus
(RANSAC) [13] or robust Huber function [14].
Unfortunately, most of these dense methods cannot
work correctly if the dynamic parts become signifi-
cant. Some other dense methods use segmentation
based on the cluster-wise [15] or nonparametric [16]
model to deal with dynamic scenes. However, they
cannot effectively remove dynamic objects and pre-
cisely estimate the ego-motion in highly dynamic
environments.
In this paper, we propose a dense approach based on
the nonparametric statistical model and the clustering
model to maintain the robustness of dense-visual-
odometry in highly dynamic scenes by using RGB-D
data. The main contributions of this work are as
follows: (i) the multi-frame-based residual computing
model is proposed to achieve temporally consistent
motion segmentation, leading to an improvement in
the precision of camera motion estimation; and (ii) the
clustering model and the nonparametric statistics
model are combined to obtain the weighted clusters
and then exclude dynamic clusters from the pose estima-
tion. The experiments show that our approach outper-
forms the state-of-the-art dense-visual-odometry
methods that are designed to handle dynamic scenes
on many sequences from the RGB-D benchmark
dataset.
2 Related Work
Currently, the handling of dynamic scenes remains a
challenging problem for visual SLAM. Motion seg-
mentation is regarded as a key step to deal with this
problem, which finds the key points or pixels on
dynamic objects and then removes them from the
optimization process.
Kundu et al. [17] proposed a monocular visual
SLAM with motion segmentation based on multi-view
geometry constraints. The method improves the
robustness of pose estimation by filtering feature
points that do not conform to constraints on the driving
road. However, the assumption is so strong that it
limits the application to rigid motion scenes. Tan et al.
[18] considered the difference in appearance and
structure between key frames and current frames to
filter out the effects of dynamic objects, but their
method is limited to small scenes. With the assump-
tion that all static background motions are equally
likely, scene motions are clustered by using sparse 3D
flow [13] or scene flow [19], and then camera motion
can be calculated by iterative refinement schemes such
as RANSAC. However, these methods may fail when
dynamic key points outnumber static key points. Other
approaches consider using multiple cameras to explic-
itly compensate for occlusion of dynamic objects [20],
or add an additional IMU sensor [21, 22] to alleviate
this problem. In addition, Li et al. [23] proposed a
static point weighting method for sparse 3D edge point
clouds. Their approach achieves state-of-the-art
accuracy. However, the methods mentioned above are
all based on sparse feature points, and therefore they
cannot perform dense motion segmentation and dense
mapping thereafter.
Different from these works, our approach is a
dense-visual-odometry method. Some related works
are discussed here. A solution for the joint estimation
of visual odometry and dense scene flow was proposed
by Jaimez et al. [15]. They used the background
segmentation and energy function optimization to
divide the scene into moving, stationary, and inter-
mediate state parts. However, some intermediate state
parts deteriorate the method’s performance since all
clusters are treated equally in the optimization of
energy function. Our approach is also related to the
work of Kim et al. [16], who proposed to leverage
accumulated depth residuals from multiple previous
frames to model a static background by the
nonparametric method. The method is based on pixel-
wise segmentation and the statistical model for pose
estimation, which is susceptible to independent non-
rigid body motion of past frames, and therefore some
dynamic pixels may be included in the energy function
minimization. Meanwhile, Scona et al. [24] proposed
StaticFusion. They maintain a static background
environment mapping used for pose estimation in a
model-to-frame way. However, StaticFusion cannot
work on scenes with fast camera motion due to not
having enough time for mapping. Finally, Sun et al.
[25] proposed a motion removal approach as a pre-
processing step and integrated it into the front end of
RGB-D SLAM. However, the disadvantage of this
method is that they can deal with only one motion,
instead of multiple moving objects in dynamic scenes.
Another line of related work consists of off-line methods.
Roussos et al. [26] proposed an approach of multi-
body motion segmentation and reconstruction based
on the energy function. The algorithm effectively
gives the camera pose, scene depth, and 3D recon-
struction in dynamic scenes. Unfortunately, the
method processes RGB-D data in a batch way and
hence can be seen as an off-line system. Wang et al.
[27] estimated dense optical flow from frames, where
dynamic objects can be excluded by clustering motion
patterns based on optical flow. However, due to the
large amount of calculations, they could not achieve
real-time performance.
Regarding motion segmentation, Stückler et al. [28]
proposed an efficient real-time dense motion segmen-
tation, whose weakness is that it is only applicable to
rigid body segmentation. Although some unsupervised
learning based methods [29–31] were recently pro-
posed and achieved good results, they cannot always
perform well in other special dynamic scenes since
they need a large dataset for training a network; thus,
they suffer from the generalization problem.
3 Dense Visual Odometry Approach
3.1 Overview
In this paper, we propose a visual odometry approach
based on the nonparametric statistical model and the
clustering model. The overview is shown in Fig. 1.
First, K-means clustering was used to segment each
frame into N clusters based on depth and intensity.
Each cluster was considered to be a rigid body, and
thus the pixel-wise motion segmentation problem was
simplified into a cluster-wise segmentation. Second,
the initial camera pose was calculated by minimizing
photometric and depth residuals with a Cauchy M-esti-
mator, and then the estimated poses were used to warp
previous frames to the current frame coordinate. After
regularization, these warped frames were used to
compute temporally consistent residuals for each
cluster, which ensured the continuity of clusters'
motion. Third, these temporally consistent residuals were used
to build a nonparametric statistical model based on the
t-distribution and to find moving objects by utilizing a
dynamic threshold condition. Finally, the probability
confidence of each static cluster based on the statistical
model was regarded as a weight that would be incorpo-
rated into the energy function optimization for
obtaining a more accurate camera pose estimation.
Afterward, the warp function was updated based on a
new estimated transformation for the next iteration.
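To make the cluster weighting step concrete, the sketch below shows one plausible way to turn cluster-wise residuals into t-distribution-based weights and to fold those weights into a Gauss-Newton pose update. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names, the fixed degrees of freedom nu, the MAD-based scale estimate, and the use of NumPy are all ours.

```python
import numpy as np

def cluster_weights_from_residuals(residuals, nu=5.0):
    """Assign each cluster a weight from a Student-t model of its averaged residual.

    residuals: 1-D array, one averaged (photometric + depth) residual per cluster.
    nu:        degrees of freedom of the t-distribution (assumed fixed here).
    Returns weights; small residuals give weights near 1, large residuals near 0.
    """
    # Robust scale estimate (median absolute deviation) so that a few highly
    # dynamic clusters do not inflate the scale of the static background.
    sigma = 1.4826 * np.median(np.abs(residuals - np.median(residuals))) + 1e-8
    r = residuals / sigma
    # Classic t-distribution weight used in iteratively re-weighted least squares.
    return (nu + 1.0) / (nu + r ** 2)

def weighted_pose_update(J, r, weights_per_pixel):
    """One Gauss-Newton step for the 6-DoF twist, down-weighting dynamic clusters.

    J: (N, 6) stacked Jacobian of the residuals w.r.t. the se(3) increment.
    r: (N,)   stacked photometric/depth residuals.
    weights_per_pixel: (N,) weight of the cluster each pixel belongs to.
    """
    W = weights_per_pixel
    H = J.T @ (W[:, None] * J)      # weighted normal equations
    b = -J.T @ (W * r)
    return np.linalg.solve(H, b)    # 6-vector twist increment
```

In such a scheme, clusters labeled dynamic by the threshold test would receive zero weight, and the remaining static clusters would contribute in proportion to their statistical confidence, mirroring the weighted energy minimization described above.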
3.2 Preliminaries
Since the RGB-D sensor simultaneously provides a color image and a depth image, a pair of frames $(I_{k-1}, Z_{k-1})$ and $(I_k, Z_k)$ is given as input, where $I(\mathbf{x}) \in \mathbb{R}$ and $Z(\mathbf{x}) \in \mathbb{R}$ represent the intensity and depth, respectively, of pixel $\mathbf{x} = (u, v)^T \in \mathbb{R}^2$. Intensity is converted from the color image ($0.299R + 0.587G + 0.114B$). In homogeneous coordinates, given a 3D point $\mathbf{p} = (X_k, Y_k, Z_k, 1)^T$, the projection function and its inverse between the 3D point and its pixel on the image are as follows:

$$\mathbf{x} = \pi(\mathbf{p}_k) = \left( \frac{X_k f_x}{Z_k} + o_x, \; \frac{Y_k f_y}{Z_k} + o_y \right) \quad (1)$$

$$\mathbf{p}_k = \pi^{-1}(\mathbf{x}, Z_k) = \left( \frac{u - o_x}{f_x} Z_k, \; \frac{v - o_y}{f_y} Z_k, \; Z_k, \; 1 \right) \quad (2)$$

where $f_x$ and $f_y$ are the focal lengths and $(o_x, o_y)$ is the principal point.
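For reference, a direct transcription of Eqs. (1) and (2) might look like the following sketch; the function names, the explicit intrinsics arguments (fx, fy, ox, oy), and the added intensity-conversion helper are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def project(p, fx, fy, ox, oy):
    """Eq. (1): project a homogeneous 3D point p = (X, Y, Z, 1) to pixel (u, v)."""
    X, Y, Z = p[0], p[1], p[2]
    return np.array([X * fx / Z + ox, Y * fy / Z + oy])

def backproject(x, Z, fx, fy, ox, oy):
    """Eq. (2): lift pixel x = (u, v) with depth Z back to a homogeneous 3D point."""
    u, v = x
    return np.array([(u - ox) / fx * Z, (v - oy) / fy * Z, Z, 1.0])

def intensity_from_rgb(rgb):
    """Intensity as used by the paper: 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```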
As the camera moves, the 3D point $\mathbf{p}$ in the previous frame's camera coordinate can be transformed rigidly to the current frame with the transformation matrix $\mathbf{T}^{k}_{k-1} \in SE(3)$. The new coordinate of the 3D point in the current camera coordinate can be obtained by the following function:
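The equation itself lies beyond this excerpt. As a hedged sketch, assuming the standard homogeneous convention in which the warped point is $\mathbf{p}_k = \mathbf{T}^{k}_{k-1}\,\mathbf{p}_{k-1}$ with a 4×4 matrix, the transform could be written as:

```python
import numpy as np

def transform_point(T, p):
    """Rigidly move a homogeneous 3D point p with T in SE(3) (4x4 matrix)."""
    return T @ p

# Hypothetical usage: warp a point from the previous frame into the current
# frame; the result could then be re-projected with the pinhole model of Eq. (1).
T_prev_to_curr = np.eye(4)                  # placeholder pose (identity = no motion)
p_prev = np.array([0.5, -0.2, 2.0, 1.0])    # homogeneous point in the previous frame
p_curr = transform_point(T_prev_to_curr, p_prev)
```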