3DR EXPRESS
Nonparametric Statistical and Clustering Based RGB-D
Dense Visual Odometry in a Dynamic Environment
Wugen Zhou · Xiaodong Peng · Haijiao Wang · Bo Liu
Received: 26 September 2018 / Revised: 17 February 2019 / Accepted: 18 February 2019
© 3D Display Research Center, Kwangwoon University and Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract The robustness of dense-visual-odometry remains a challenging problem when moving objects appear in the scene. In this paper, we propose a form of dense-visual-odometry that handles highly dynamic environments by using RGB-D data. Firstly, to find dynamic objects, we propose a multi-frame-based residual computing model, which takes a temporally distant frame into consideration to achieve temporally consistent motion segmentation. Then the proposed method combines a scene clustering model and a nonparametric statistical model to obtain weighted cluster-wise residuals, where each weight describes how strongly a cluster's residual is taken into account. Afterward, the motion segmentation labels and cluster weights are incorporated into the energy-function optimization of dense-visual-odometry to reduce the influence of moving objects. Finally, experimental results demonstrate that the proposed method performs better than state-of-the-art methods on many challenging sequences from a benchmark dataset, especially on highly dynamic sequences.
Keywords Visual odometry · Dynamic environment · SLAM · Nonparametric statistical · Motion segmentation
1 Introduction
Visual odometry estimation plays a key role in
simultaneous localization and mapping (SLAM) sys-
tems for navigation of robots and autonomous vehicles
[1]. In particular, visual odometry can provide 3D
motion estimation for legged robots and aerial vehicles,
whereas traditional encoder-based wheel odometry
gives only 2D motion estimates. Moreover,
many robotic applications such as object detection,
obstacle avoidance, semantic segmentation and place
recognition could benefit from rich visual information,
which has advantages for joint vision tasks [2–4].
Recently, many visual odometry methods have
emerged and achieved successful results. There are
two main groups of visual odometry methods. One is
Electronic supplementary material The online version of
this article (https://doi.org/10.1007/s13319-019-0220-4) con-
tains supplementary material, which is available to authorized
users.
W. Zhou · X. Peng (✉) · H. Wang · B. Liu
National Space Science Centre, Chinese Academy of
Sciences, Beijing 100190, China
e-mail: [email protected]
X. Peng
e-mail: [email protected]
H. Wang
e-mail: [email protected]
B. Liu
e-mail: [email protected]
W. Zhou · H. Wang
University of Chinese Academy of Sciences,
Beijing 100049, China
feature-based visual odometry [5–7]. These methods
estimate 6 degree-of-freedom poses by solving a
closed-form optimal problem using matched visual
correspondence features from source to target frames.
The other is the dense-visual-odometry [8–11]
approach, which takes the pose estimation as an
energy minimization by using the entire dense inten-
sity or depth difference between target and warped
source images.
Generally, dense-visual-odometry is more
stable and robust than the visual feature-based method
in static environments, especially in low-texture
environments [12]. However, the performance of most
of the state-of-the-art dense methods deteriorates
when dynamic objects appear in the scene because
of the assumption of a static world. Usually, the non-
static parts in most quasi-static environments, which
include small dynamic objects, can be viewed as noise.
This noise can be reduced by using some probabilistic
methods, such as the random sample consensus
(RANSAC) [13] or robust Huber function [14].
Unfortunately, most of these dense methods cannot
work correctly if the dynamic parts become signifi-
cant. Some other dense methods use segmentation
based on the cluster-wise [15] or nonparametric [16]
model to deal with dynamic scenes. However, they
cannot effectively remove dynamic objects and pre-
cisely estimate the ego-motion in highly dynamic
environments.
In this paper, we propose a dense approach based on
the nonparametric statistical model and the clustering
model to maintain the robustness of dense-visual-
odometry in highly dynamic scenes by using RGB-D
data. The main contributions of this work are as
follows: (i) the multi-frame-based residual computing
model is proposed to achieve temporally consistent
motion segmentation, leading to an improvement in
the precision of camera motion estimation; and (ii) the
clustering model and the nonparametric statistics
model are combined to obtain the weighted clusters
and then exclude dynamic clusters from the pose estima-
tion. The experiments show that our approach outper-
forms the state-of-the-art dense-visual-odometry
methods that are designed to handle dynamic scenes
on many sequences from the RGB-D benchmark
dataset.
2 Related Work
Currently, the handling of dynamic scenes remains a
challenging problem for visual SLAM. Motion seg-
mentation is regarded as a key step to deal with this
problem, which finds the key points or pixels on
dynamic objects and then removes them from the
optimization process.
Kundu et al. [17] proposed a monocular visual
SLAM with motion segmentation based on multi-view
geometry constraints. The method improves the
robustness of pose estimation by filtering feature
points that do not conform to constraints on the driving
road. However, the assumption is so strong that it
limits the application to rigid motion scenes. Tan et al.
[18] considered the difference in appearance and
structure between key frames and current frames to
filter out the effects of dynamic objects, but their
method is limited to small scenes. With the assump-
tion that all static background motions are equally
likely, scene motions are clustered by using sparse 3D
flow [13] or scene flow [19], and then camera motion
can be calculated by iterative refinement schemes such
as RANSAC. However, these methods may fail when
dynamic key points outnumber static key points. Other
approaches consider using multiple cameras to explic-
itly compensate for occlusion of dynamic objects [20],
or add an additional IMU sensor [21, 22] to alleviate
this problem. In addition, Li et al. [23] proposed a
static point weighting method for sparse 3D edge point
clouds. Their approach achieves state-of-the-art
accuracy. However, the methods mentioned above are
all based on sparse feature points, and therefore they
cannot perform dense motion segmentation and dense
mapping thereafter.
Different from these works, our approach is a
dense-visual-odometry method. Some related works
are discussed here. A solution for the joint estimation
of visual odometry and dense scene flow was proposed
by Jaimez et al. [15]. They used the background
segmentation and energy function optimization to
divide the scene into moving, stationary, and inter-
mediate state parts. However, some intermediate state
parts deteriorate the method’s performance since all
clusters are treated equally in the optimization of
energy function. Our approach is also related to the
work of Kim et al. [16], who proposed to leverage
accumulated depth residuals from multiple previous
frames to model a static background by the
nonparametric method. The method is based on pixel-
wise segmentation and the statistical model for pose
estimation, which is susceptible to independent non-
rigid body motion of past frames, and therefore some
dynamic pixels may be included in the energy function
minimization. Meanwhile, Scona et al. [24] proposed
StaticFusion. They maintain a static background
environment mapping used for pose estimation in a
model-to-frame way. However, StaticFusion cannot
work on scenes with fast camera motion due to not
having enough time for mapping. Finally, Sun et al.
[25] proposed a motion removal approach as a pre-
processing step and integrated it into the front end of
RGB-D SLAM. However, the disadvantage of this
method is that they can deal with only one motion,
instead of multiple moving objects in dynamic scenes.
Another line of related work consists of off-line methods.
Roussos et al. [26] proposed an approach of multi-
body motion segmentation and reconstruction based
on the energy function. The algorithm effectively
gives the camera pose, scene depth, and 3D recon-
struction in dynamic scenes. Unfortunately, the
method processes RGB-D data in a batch way and
hence can be seen as an off-line system. Wang et al.
[27] estimated dense optical flow from frames, where
dynamic objects can be excluded by clustering motion
patterns based on optical flow. However, due to the
large amount of calculations, they could not achieve
real-time performance.
Regarding motion segmentation, Stückler et al. [28]
proposed an efficient real-time dense motion segmen-
tation, whose weakness is that it is only applicable to
rigid body segmentation. Although some unsupervised
learning based methods [29–31] were recently pro-
posed and achieved good results, they cannot always
perform well in other special dynamic scenes since
they need a large dataset for training a network; thus,
they suffer from the generalization problem.
3 Dense Visual Odometry Approach
3.1 Overview
In this paper, we propose a visual odometry approach
based on the nonparametric statistical model and the
clustering model. The overview is shown in Fig. 1.
First, K-means clustering was used to segment each
frame into N clusters based on depth and intensity.
Each cluster was considered to be a rigid body, and
thus the pixel-wise motion segmentation problem was
simplified into a cluster-wise segmentation. Second,
the initial camera pose was calculated by minimizing
photometric and depth residuals with a Cauchy M-esti-
mator, and then the estimated poses were used to warp
previous frames to the current frame coordinate. After
regularization, these warped frames were used to
compute temporally consistent residuals for each
cluster, which ensured the continuity of clusters'
motion. Third, these temporally consistent residuals were used
to build a nonparametric statistical model based on the
t-distribution and to find moving objects by utilizing a
dynamic threshold condition. Finally, the probability
confidence of each static cluster based on the statistical
model was regarded as a weight that would be incorpo-
rated into the energy function optimization for
obtaining a more accurate camera pose estimation.
Afterward, the warp function was updated based on a
new estimated transformation for the next iteration.
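To make the cluster weighting step concrete, the sketch below shows one plausible way to turn cluster-wise residuals into t-distribution-based weights and to fold those weights into a Gauss-Newton pose update. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names, the fixed degrees of freedom nu, the MAD-based scale estimate, and the use of NumPy are all ours.

```python
import numpy as np

def cluster_weights_from_residuals(residuals, nu=5.0):
    """Assign each cluster a weight from a Student-t model of its averaged residual.

    residuals: 1-D array, one averaged (photometric + depth) residual per cluster.
    nu:        degrees of freedom of the t-distribution (assumed fixed here).
    Returns weights; small residuals give weights near 1, large residuals near 0.
    """
    # Robust scale estimate (median absolute deviation) so that a few highly
    # dynamic clusters do not inflate the scale of the static background.
    sigma = 1.4826 * np.median(np.abs(residuals - np.median(residuals))) + 1e-8
    r = residuals / sigma
    # Classic t-distribution weight used in iteratively re-weighted least squares.
    return (nu + 1.0) / (nu + r ** 2)

def weighted_pose_update(J, r, weights_per_pixel):
    """One Gauss-Newton step for the 6-DoF twist, down-weighting dynamic clusters.

    J: (N, 6) stacked Jacobian of the residuals w.r.t. the se(3) increment.
    r: (N,)   stacked photometric/depth residuals.
    weights_per_pixel: (N,) weight of the cluster each pixel belongs to.
    """
    W = weights_per_pixel
    H = J.T @ (W[:, None] * J)      # weighted normal equations
    b = -J.T @ (W * r)
    return np.linalg.solve(H, b)    # 6-vector twist increment
```

In such a scheme, clusters labeled dynamic by the threshold test would receive zero weight, and the remaining static clusters would contribute in proportion to their statistical confidence, mirroring the weighted energy minimization described above.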
3.2 Preliminaries
Since the RGB-D sensor simultaneously provides a color image and a depth image, a pair of frames $(I_{k-1}, Z_{k-1})$ and $(I_k, Z_k)$ is given as input, where $I(\mathbf{x}) \in \mathbb{R}$ and $Z(\mathbf{x}) \in \mathbb{R}$ represent the intensity and depth, respectively, of pixel $\mathbf{x} = (u, v)^T \in \mathbb{R}^2$. Intensity is converted from the color image ($0.299R + 0.587G + 0.114B$). In homogeneous coordinates, given a 3D point $\mathbf{p} = (X_k, Y_k, Z_k, 1)^T$, the projection function and its inverse between the 3D point and its pixel on the image are as follows:

$$\mathbf{x} = \pi(\mathbf{p}_k) = \left( \frac{X_k f_x}{Z_k} + o_x, \; \frac{Y_k f_y}{Z_k} + o_y \right) \quad (1)$$

$$\mathbf{p}_k = \pi^{-1}(\mathbf{x}, Z_k) = \left( \frac{u - o_x}{f_x} Z_k, \; \frac{v - o_y}{f_y} Z_k, \; Z_k, \; 1 \right) \quad (2)$$

where $f_x$ and $f_y$ are the focal lengths and $(o_x, o_y)$ is the principal point.
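For reference, a direct transcription of Eqs. (1) and (2) might look like the following sketch; the function names, the explicit intrinsics arguments (fx, fy, ox, oy), and the added intensity-conversion helper are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def project(p, fx, fy, ox, oy):
    """Eq. (1): project a homogeneous 3D point p = (X, Y, Z, 1) to pixel (u, v)."""
    X, Y, Z = p[0], p[1], p[2]
    return np.array([X * fx / Z + ox, Y * fy / Z + oy])

def backproject(x, Z, fx, fy, ox, oy):
    """Eq. (2): lift pixel x = (u, v) with depth Z back to a homogeneous 3D point."""
    u, v = x
    return np.array([(u - ox) / fx * Z, (v - oy) / fy * Z, Z, 1.0])

def intensity_from_rgb(rgb):
    """Intensity as used by the paper: 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```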
As the camera moves, the 3D point $\mathbf{p}$ in the previous frame's camera coordinate can be transformed rigidly to the current frame with the transformation matrix $\mathbf{T}^{k}_{k-1} \in SE(3)$. The new coordinate of the 3D point in the current camera coordinate can be obtained by the following function:
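The equation itself lies beyond this excerpt. As a hedged sketch, assuming the standard homogeneous convention in which the warped point is $\mathbf{p}_k = \mathbf{T}^{k}_{k-1}\,\mathbf{p}_{k-1}$ with a 4×4 matrix, the transform could be written as:

```python
import numpy as np

def transform_point(T, p):
    """Rigidly move a homogeneous 3D point p with T in SE(3) (4x4 matrix)."""
    return T @ p

# Hypothetical usage: warp a point from the previous frame into the current
# frame; the result could then be re-projected with the pinhole model of Eq. (1).
T_prev_to_curr = np.eye(4)                  # placeholder pose (identity = no motion)
p_prev = np.array([0.5, -0.2, 2.0, 1.0])    # homogeneous point in the previous frame
p_curr = transform_point(T_prev_to_curr, p_prev)
```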