Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference
Yao Yao¹∗, Zixin Luo¹, Shiwei Li¹, Tianwei Shen¹, Tian Fang²†, Long Quan¹
¹ The Hong Kong University of Science and Technology
{yyaoag, zluoag, slibc, tshenaa, quan}@cse.ust.hk
² Shenzhen Zhuke Innovation Technology (Altizure)
fangtian@altizure.com
Abstract
Deep learning has recently demonstrated excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This dramatically reduces the memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on the recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to the memory constraint. Code is available at https://github.com/YoYo000/MVSNet.
1. Introduction
Multi-view stereo (MVS) aims to recover the dense representation of the scene given multi-view images and calibrated cameras. While traditional methods [24, 10, 29, 9] have achieved excellent reconstruction performance, recent works [14, 13, 30] show that learned approaches are able to produce results comparable to the traditional state of the art. In particular, MVSNet [30] proposed a deep architecture for depth map estimation, which significantly boosts the reconstruction completeness and the overall quality.
∗ Intern at Shenzhen Zhuke Innovation Technology (Altizure).
† Corresponding author.
One of the key advantages of learning-based MVS is the cost volume regularization, where most networks apply multi-scale 3D CNNs [14, 15, 30] to regularize the 3D cost volume. However, this step is extremely memory expensive: it operates on 3D volumes and the memory requirement grows cubically with the model resolution (Fig. 1 (d)). Consequently, current learned MVS algorithms can hardly be scaled up to high-resolution scenarios.
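
The cubic growth described above can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the paper: the helper names `volume_bytes` and `map_bytes` are hypothetical, and the count covers only one single-channel float32 buffer. Real networks hold many multi-channel intermediate activations, so actual GPU usage would be considerably higher.

```python
def volume_bytes(height, width, depth, channels=1, bytes_per_value=4):
    """Bytes needed to hold a full H x W x D cost volume at once."""
    return height * width * depth * channels * bytes_per_value


def map_bytes(height, width, channels=1, bytes_per_value=4):
    """Bytes needed to hold a single H x W cost map (one depth slice)."""
    return height * width * channels * bytes_per_value


# Assumption: a cubic volume with the same resolution n along every axis.
for n in (128, 256, 512):
    full = volume_bytes(n, n, n)
    one_slice = map_bytes(n, n)
    print(f"n={n}: full volume {full / 2**20:.1f} MiB, "
          f"one depth slice {one_slice / 2**20:.3f} MiB")
```

Doubling the resolution multiplies the full-volume figure by eight but the single-slice figure only by four, which is the gap between cubic and quadratic growth.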
Recent works on 3D deep learning also acknowledge this problem. OctNet [23] and O-CNN [27] exploit the sparsity in 3D data and introduce the octree structure to 3D CNNs. SurfaceNet [14] and DeepMVS [13] apply an engineered divide-and-conquer strategy to the MVS reconstruction. MVSNet [30] builds the cost volume upon the reference camera frustum to decouple the reconstruction into smaller problems of per-view depth map estimation. However, when it comes to high-resolution 3D reconstruction (e.g., volume size > 512³ voxels), these methods will either fail or take a long time to process.
To this end, we present a novel scalable multi-view stereo framework, dubbed R-MVSNet, based on the recurrent neural network. The proposed network is built upon the MVSNet architecture [30], but regularizes the cost volume in a sequential manner using the convolutional gated recurrent unit (GRU) rather than 3D CNNs. With the sequential processing, the online memory requirement of the algorithm is reduced from cubic to quadratic in the model resolution (Fig. 1 (c)). As a result, R-MVSNet is applicable to high-resolution 3D reconstruction with unlimited depth-wise resolution.
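
To illustrate the sequential idea, here is a toy sketch in pure Python of a GRU-style update applied to one 2D cost map at a time. It is not the authors' implementation. It assumes a single channel, replaces the learned 3x3 convolutions with a fixed box filter, uses hand-picked scalar gate weights (`PARAMS`) with no biases, and feeds in synthetic cost maps. The names `gru_step`, `regularize_sequentially`, and `toy_cost_maps` are hypothetical. What it does show is that only one H x W state is carried between depth slices.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def box3x3(m):
    """Zero-padded 3x3 box filter: a fixed stand-in for a learned convolution."""
    h, w = len(m), len(m[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if 0 <= i + di < h and 0 <= j + dj < w:
                        total += m[i + di][j + dj]
            out[i][j] = total / 9.0
    return out


def gru_step(cost_map, state, p):
    """One GRU update on a single H x W cost map; returns the new H x W state."""
    h, w = len(state), len(state[0])
    x = box3x3(cost_map)
    hs = box3x3(state)
    # Reset gate decides how much of the previous state feeds the candidate.
    reset = [[sigmoid(p["wr"] * x[i][j] + p["ur"] * hs[i][j]) for j in range(w)]
             for i in range(h)]
    gated = box3x3([[reset[i][j] * state[i][j] for j in range(w)] for i in range(h)])
    new_state = []
    for i in range(h):
        row = []
        for j in range(w):
            update = sigmoid(p["wz"] * x[i][j] + p["uz"] * hs[i][j])
            candidate = math.tanh(p["wh"] * x[i][j] + p["uh"] * gated[i][j])
            # Blend the old state and the candidate with the update gate.
            row.append((1.0 - update) * state[i][j] + update * candidate)
        new_state.append(row)
    return new_state


def regularize_sequentially(cost_maps, params):
    """Consume cost maps slice by slice, carrying only one H x W state."""
    state = None
    for cost_map in cost_maps:
        if state is None:
            state = [[0.0] * len(cost_map[0]) for _ in cost_map]
        state = gru_step(cost_map, state, params)
        yield state


# Hypothetical gate weights chosen by hand; a real network would learn them.
PARAMS = {"wz": 1.0, "uz": 0.5, "wr": 1.0, "ur": 0.5, "wh": 1.0, "uh": 0.5}


def toy_cost_maps(depth, height, width):
    """Lazily generate synthetic cost maps so no 3D volume is ever stored."""
    for d in range(depth):
        yield [[math.sin(0.3 * d + 0.1 * (i + j)) for j in range(width)]
               for i in range(height)]


slices_seen = 0
for regularized in regularize_sequentially(toy_cost_maps(8, 4, 5), PARAMS):
    slices_seen += 1
print("processed", slices_seen, "depth slices with one 4 x 5 state")
```

Because both the input and the output are generators, adding more depth hypotheses lengthens the loop but does not enlarge the working set, which mirrors the cubic-to-quadratic reduction described above.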
We first evaluate R-MVSNet on the DTU [1], Tanks and Temples [17] and ETH3D [25] datasets, where our method produces results comparable to, or even better than, the state-of-the-art MVSNet [30]. Next, we demonstrate the scalability of the proposed method on several large-scale scenarios with detailed analysis of the memory consumption. R-MVSNet is much more efficient than other methods in GPU memory and is the first learning-based approach applicable to such wide depth range scenes, e.g., the advanced set of the Tanks and Temples dataset [17].
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
978-1-7281-3293-8/19/$31.00 ©2019 IEEE
DOI 10.1109/CVPR.2019.00567