Trilateral Constrained Sparse Representation for Kinect Depth Hole Filling (CSDN Library resource)


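The research paper "Trilateral constrained sparse representation for Kinect depth hole filling" was written by Zhongyuan Wang of Wuhan University and co-authors, and published in Pattern Recognition Letters, volume 65 (2015), pages 95 to 102. It examines how a trilateral constrained sparse representation can be used to fill holes in Microsoft Kinect depth maps.

Key point 1: Kinect depth maps. Kinect is a motion-sensing controller developed by Microsoft that can be used for motion-controlled games and virtual reality. A Kinect depth map is the 3D depth image the device obtains by infrared sensing. It encodes the distance of each object in the scene and is a key data source for stereo vision applications.

Key point 2: Defects in depth maps. Because of measurement errors or interference noise, Kinect depth maps often contain holes and noise. These defects seriously degrade the usefulness of depth maps in stereo vision. Holes are regions where depth data is missing, while noise reduces the accuracy of the depth data.

Key point 3: Hole filling. Hole filling means using an algorithm to fill the missing parts of a depth map so as to obtain more complete depth information. Traditional filtering and inpainting techniques have been applied to hole filling, but they either perform poorly on large holes or introduce artifacts such as blurring, jagging, and ringing near depth discontinuities.

Key point 4: Sparse representation. In a sparse representation of high-dimensional data, only a few elements are non-zero. In image restoration, sparse representation effectively retains the salient features of natural images, so it is favored over other regression models such as ridge regression.

Key point 5: Trilateral constrained sparse representation. The paper proposes a trilateral constrained sparse representation to improve the recovery of Kinect depth maps. The method places an intensity similarity constraint and a spatial distance constraint between the reference patches and the target patch on the sparsity penalty term, and a position constraint on the centroid pixel of the target patch on the data-fidelity term.

Key point 6: Bilateral filtering and locality learning. The method is inspired by locality learning and bilateral filtering. Locality learning captures the local structure of neighboring pixels and so gives a better model of local image characteristics. Bilateral filtering is an edge-preserving filter that smooths an image while keeping its edges.

Key point 7: Depth map recovery. Depth map recovery means reconstructing, from a damaged depth map, depth information as close as possible to the original scene. It is an important step in stereo vision applications. By learning from the accompanying color image, the proposed method can produce a solution to the hole-filling problem that is optimal in terms of depth prediction accuracy.

Key point 8: Experimental results. The authors ran a range of experiments on real-world Kinect depth maps and public datasets. The results show that the proposed method outperforms state-of-the-art methods in filling quality in both flat and discontinuous regions, indicating that it effectively solves the hole-filling problem and improves depth map quality.

Together these points cover the acquisition of Kinect depth maps, the handling of depth holes, and the proposal and application of the trilateral constrained sparse representation to depth map recovery, providing a useful theoretical and practical reference for Kinect depth map processing.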


Pattern Recognition Letters 65 (2015) 95–102

Contents lists available at ScienceDirect

Pattern Recognition Letters

journal homepage: www.elsevier.com/locate/patrec

Trilateral constrained sparse representation for Kinect depth hole filling ✩

Zhongyuan Wang a,∗, Jinhui Hu, ShiZheng Wang, Tao Lu

NERCMS, School of Computer, Wuhan University, 430072, China

School of Electrical & Electronic Engineering, Nanyang Technological University, 639798, Singapore

School of Computer, Wuhan Institute of Technology, 430073, China

article info

Article history:

Received 14 September 2014

Available online 3 August 2015

Keywords:

Sparse representation

Kinect

Depth map

Hole filling

abstract

Due to measurement errors or interference noise, Kinect depth maps exhibit severe defects of holes and noise, which significantly affect their applicability to stereo vision. Filtering and inpainting techniques have been extensively applied to hole filling. However, they either fail to fill in large holes or introduce other artifacts near depth discontinuities, such as blurring, jagging, and ringing. The emerging reconstruction-based methods employ underlying regularized representation models to obtain relatively accurate combination coefficients, leading to improved depth recovery results. Sparse representation helps retain the salient features of natural images and is thus favored over other regression models, e.g., ridge regression, in image restoration. However, its naive application to depth map recovery hardly affords satisfactory depth prediction. Motivated by locality learning and bilateral filtering, this paper advocates a trilateral constrained sparse representation for Kinect depth recovery, which imposes the constraints of intensity similarity and spatial distance between the reference patches and the target patch on the sparsity penalty term, as well as the position constraint of the centroid pixel in the target patch on the data-fidelity term. Learning from the accompanying color image, this method can produce the optimal solution to the hole-filling problem in terms of depth prediction accuracy. Various experimental results on real-world Kinect maps and public datasets show that the proposed method outperforms state-of-the-art methods in filling both flat and discontinuous regions.

1. Introduction

Microsoft Kinect is a representative RGB-D sensor that has achieved great success in a wide variety of vision-related applications such as augmented reality, robotics, and human–computer interaction. The performance of these applications largely depends on the quality of the acquired depth images. It has been observed that Kinect depth maps suffer from various defects, including holes, wrong or inaccurate depth measurements, and interference noise. Depth holes may occur in a depth image on the boundaries of objects, on smooth and shiny surfaces, and in other scattered locations [1]. Because depth information is unavailable in holes and the depth discontinuities between objects should be preserved, the recovery of Kinect depth maps has become a challenging problem.

To refine the quality of stereo depth, numerous methods have been developed to fill in holes, improve accuracy, and suppress noise simultaneously. The simplest choice is to recursively apply a filter to the depth data, such as a plain median filter [2] or a Gaussian filter [3]; however, this also significantly blurs sharp depth edges. To fill in holes while preserving sharp edges, bilateral filtering [4] and non-local filtering [5] are favored for their edge-preserving behavior, but they may also cause distortion in non-hole regions. Unlike spatial filtering within a single depth map, Matyunin et al. [6] took temporal information into account in depth recovery, but restoring a target frame with this approach requires multiple consecutive frames around it and thus introduces delay. Fu et al. [7] employed an inter-frame padding scheme to recover missing depth values from consecutive depth maps. The compensated depth sequence is then smoothed by a divisive normalized bilateral filter. However, this method does not consider temporal depth inconsistency, in which the depth of a particular pixel keeps changing from one frame to the next even when the scene is stationary. Thus, it may result in edge fattening or shrinking effects. Overall, depth enhancement that relies only on depth data can hardly fill in missing pixels and eliminate artifacts thoroughly.

✩ This paper has been recommended for acceptance by Prof. A. Heyden.

∗ Corresponding author. Tel./fax: +86 2787648233.

E-mail address: wzy_hope@163.com (Z. Wang).

http://dx.doi.org/10.1016/j.patrec.2015.07.025

A Kinect depth map is associated with a complementary color image of relatively high quality. The paired images share strong structural correlations and can be combined to aid depth recovery. Following this idea, a large number of proposals have been put forward, which can be roughly classified into three categories: filtering-based, inpainting-based, and reconstruction-based methods. Qi et al. [8] proposed a fusion-based method using a non-local filtering scheme for restoring depth maps. He et al. [9] proposed a guided filter that can preserve sharp edges and avoid reversal artifacts when smoothing a depth map. Dakkak et al. [10] proposed an iterative diffusion method that uses both the available depth values and color segmentation results to recover missing depth information, but the reported results are sensitive to segmentation accuracy. To obtain more precise filter coefficients, Camplani et al. [11] used a joint bilateral filter to calculate the weights of available depth pixels according to the collocated pixels in the color image. Based on a joint histogram, Min et al. [12] instead proposed a weighted mode filter to prevent the output depth values from being blurred at depth boundaries. However, filtering-based approaches often yield poor results near depth discontinuities, especially when large holes occur in a depth map.

Table 1
Advantages and disadvantages of three categories of representative methods.

Category | Advantages | Disadvantages | Representatives
Filtering-based | Simple to implement and yield a clean image | Smooth out depth discontinuities and may fail to fill in large holes | Refs. [8–12]
Inpainting-based | Achieve good quality in smooth regions | Introduce artifacts, e.g., jagging, blurring, and ringing, around thin structures or sharp discontinuities | Refs. [14–17]
Reconstruction-based | Preserve sufficient accuracy in flat regions and sharp discontinuities around object edges simultaneously | Must be guided by accompanying color images, and incorrect predictions may occur | Refs. [18–24,28]
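As a rough illustration of the filtering-based category, the sketch below fills depth holes with a joint bilateral weighting, where the weights come from spatial distance and from similarity in the aligned color (here grayscale) image. It is a minimal sketch, not the implementation of any cited method: the function name `joint_bilateral_fill`, the convention that a depth value of 0 marks a hole, and all parameter values are assumptions made for this example.

```python
import numpy as np

def joint_bilateral_fill(depth, gray, radius=3, sigma_s=2.0, sigma_r=10.0):
    """Fill hole pixels (assumed to be marked by depth == 0) with a weighted
    mean of the valid depth values in a (2*radius+1)^2 window."""
    depth = np.asarray(depth, dtype=np.float64)
    guide = np.asarray(gray, dtype=np.float64)
    h, w = depth.shape
    out = depth.copy()
    for y, x in zip(*np.nonzero(depth == 0)):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        d = depth[y0:y1, x0:x1]
        g = guide[y0:y1, x0:x1]
        yy, xx = np.mgrid[y0:y1, x0:x1]
        # Spatial weight: neighbours closer to the hole pixel count more.
        w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
        # Range weight: neighbours with similar guide intensity count more.
        w_r = np.exp(-((g - guide[y, x]) ** 2) / (2 * sigma_r ** 2))
        weights = w_s * w_r * (d > 0)  # only valid depth values contribute
        total = weights.sum()
        if total > 0:
            out[y, x] = (weights * d).sum() / total
    return out
```

Because each hole is filled only from valid depth values inside a small window, a hole wider than the window stays empty, which matches the limitation noted above for large holes.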

Inpainting techniques seem more promising for depth hole filling than filtering, interpolation, and extrapolation algorithms. A popular inpainting algorithm is the fast marching method (FMM) by Telea [13], but it performs poorly on depth maps because it is designed for generic color images. With an aligned color image, Liu et al. [14] proposed an extended FMM approach to guide depth inpainting. Structure-based inpainting [15] fills the holes by propagating structure into the target regions via diffusion. The diffusion process blurs the filled holes, and texture is thus lost. Xu et al. [16] further introduced exemplar-based texture synthesis into structure propagation so that the blurring effects can be partly avoided. To prevent edge fattening or shrinking after hole inpainting, Miao et al. [17] used the fluctuating edge region in the depth map to assist hole completion. However, the missing depth values near the object contour are directly assigned the mean of the available depth values in the fluctuating edge region, which is inaccurate for representing depth contours.

The methods reviewed above achieve good quality in smooth depth regions but may introduce artifacts, e.g., jagging, blurring, and ringing, around thin structures or sharp discontinuities. Reconstruction-based methods apply image synthesis techniques to predict missing depth values. Since the reconstruction coefficients are resolved in a closed-loop scheme by minimizing residuals, higher hole-filling accuracy is achievable. A variety of representation models have been employed to formulate hole-filling problems. Chen et al. [18,19] cast depth recovery as an energy minimization problem that addresses depth hole filling and denoising simultaneously. In [20], an additional total variation (TV) regularization term is introduced to produce smooth depth maps with sharp boundaries. Yang et al. [21] proposed an adaptive color-guided autoregressive (AR) model for high-quality depth recovery, where the depth recovery task is converted into a minimization of AR prediction errors subject to measurement consistency. The AR predictor for each pixel is constructed according to both the local correlation in the initial depth map and the nonlocal similarity in the accompanying high-quality color image. In contrast to the bilateral filtering methods [4,11], obtaining reconstruction coefficients by solving a minimization problem can avoid incorrect prediction in hole filling. However, overemphasis on energy minimization [18,21] or on the total variation penalty [20] is not conducive to preserving depth discontinuities.
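The core idea of the reconstruction-based category, solving for combination coefficients by minimizing a residual and then reusing them on depth, can be sketched with a plain ridge regression. This is a generic illustration under stated assumptions, not the model of any cited reference: the helper names `ridge_weights` and `predict_depth`, the patch layout (one vectorized color patch per column), and the value of `lam` are all hypothetical.

```python
import numpy as np

def ridge_weights(X, y, lam=1e-2):
    """Closed-form ridge solution w = (X^T X + lam*I)^{-1} X^T y.
    X: (pixels_per_patch, n_refs) reference color patches as columns.
    y: (pixels_per_patch,) target color patch around the hole pixel."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

def predict_depth(X_color, y_color, ref_depths, lam=1e-2):
    """Learn the weights on color patches, then apply them to the known
    depth values at the reference patch centres."""
    w = ridge_weights(X_color, y_color, lam)
    return float(w @ ref_depths)

# Usage: three reference patches; the target patch equals the first one,
# so the predicted depth should be close to the first reference depth.
X = np.eye(4)[:, :3]
pred = predict_depth(X, X[:, 0], np.array([100.0, 200.0, 300.0]), lam=1e-8)
```

The squared penalty spreads weight over many references, which is one way to see why a ridge-type model tends to smooth across depth edges.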

Sparse representation (SR) has proven successful for natural images, where a sparsity prior on an over-complete dictionary helps solve inverse problems such as denoising and inpainting. Such priors would be expected to also play a crucial role in solving the depth recovery problem. Following this assumption, SR has recently been applied to stereo vision [22–24], showing promising results in depth map denoising, depth estimation, and scene reconstruction. However, because the depth values in hole regions are unavailable, the reconstruction coefficients have to be learnt from the complementary color images. Otherwise, the generated coefficients are not applicable when naively used for depth prediction.

To help readers quickly grasp the main features of the three categories of representative methods discussed above (i.e., filtering-based, inpainting-based, and reconstruction-based methods), we briefly summarize their advantages and disadvantages in Table 1.

Inspired by the success of locality constraints [25] in image classification [26] and image super-resolution [27], in our previous work [28] we employed a color-image-guided locality regularized representation (LRR) to determine the optimal weights from collocated patches in the color image. The locality constraint demands that the reconstruction rely only on the most relevant pixels rather than all pixels, and so gives impressive results for Kinect depth hole filling. Nevertheless, the sharp depth edges between objects cannot be adequately retained due to the inherent over-smoothing of the underlying ridge regression (RR) model. Besides, [28] only accounts for the impact of locality (that is, intensity similarity in Euclidean distance) on coefficient learning but ignores two other important factors: geometric distance and position. Since depth maps are flat within objects and sharp at boundaries, spatially neighboring pixels are more likely to share similar depth information. Therefore, the spatial distances of the reference patches from the center patch should be taken into account when formulating the regularized cost function. In addition, because the center pixel is of greater concern, the fitting accuracy of individual pixels in a patch should not be treated equally but should be correlated with their coordinates. In this paper, we represent the missing depth in occluded regions as a linear combination of the surrounding available depth values and establish a trilateral constrained sparse representation (TCSR) to solve for the optimal weights with the help of the associated color image. TCSR comprises a similarity-distance-inducing weighted sparsity penalty term and a position-inducing weighted data-fidelity term, and thus not only readily captures the salient features of depth images but also considerably improves representation accuracy.
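To make the structure of such a model concrete, the sketch below solves a weighted sparse coding problem with a per-coefficient l1 penalty and a per-pixel fidelity weight, using plain ISTA (iterative soft thresholding). The exact TCSR cost function and solver belong to Section 2 and are not reproduced here, so everything in this block is an assumption for illustration: the objective 0.5*||sqrt(fidelity)*(y - Xw)||^2 + lam*sum(penalty*|w|), the particular weight formulas in `trilateral_weights`, and all names and parameter values.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft thresholding, the proximal map of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def weighted_sparse_code(X, y, penalty, fidelity, lam=0.1, n_iter=500):
    """ISTA for min_w 0.5*||sqrt(fidelity)*(y - X w)||^2 + lam*sum(penalty*|w|).
    penalty: (n_refs,) per-coefficient l1 weights.
    fidelity: (pixels_per_patch,) per-pixel data-fidelity weights."""
    s = np.sqrt(fidelity)
    Xs, ys = X * s[:, None], y * s
    # Step size 1/L, with L the squared spectral norm of the weighted matrix.
    step = 1.0 / (np.linalg.norm(Xs, 2) ** 2 + 1e-12)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = Xs.T @ (Xs @ w - ys)
        w = soft_threshold(w - step * grad, step * lam * penalty)
    return w

def trilateral_weights(X, y, ref_offsets, pix_offsets, sigma_d=3.0, sigma_p=1.5):
    """Hypothetical weights (not the paper's formulas).
    ref_offsets: (n_refs, 2) offsets of reference patches from the target patch.
    pix_offsets: (pixels_per_patch, 2) offsets of pixels from the patch centre."""
    intensity = np.linalg.norm(X - y[:, None], axis=0)        # dissimilar refs cost more
    spatial = np.exp(np.linalg.norm(ref_offsets, axis=1) / sigma_d)  # distant refs cost more
    penalty = intensity * spatial
    fidelity = np.exp(-np.sum(pix_offsets ** 2, axis=1) / (2 * sigma_p ** 2))  # centre fits best
    return penalty, fidelity
```

In this sketch the weights would be learned on color patches and the missing depth predicted as `w @ ref_depths`, as in the text's linear-combination formulation. Unlike a ridge penalty, the weighted l1 penalty drives the coefficients of dissimilar or distant references to exactly zero.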

The remainder of this paper is organized as follows. Section 2 describes the proposed method based on constrained sparse representation in detail. Experimental results and analysis are provided in Section 3, and we conclude this paper in Section 4.

2. Proposed method

In this section, we focus on the trilateral constrained sparse representation model and its optimization.
