MeshFlow: Minimum Latency Online Video Stabilization

Shuaicheng Liu(1), Ping Tan(2), Lu Yuan(3), Jian Sun(3), and Bing Zeng(1)

(1) University of Electronic Science and Technology of China, Chengdu, China
    {liushuaicheng,eezeng}@uestc.edu.cn
(2) Simon Fraser University, Burnaby, Canada
(3) Microsoft Research Asia, Beijing, China
    {luyuan,jiansun}@microsoft.com
Abstract. Many existing video stabilization methods stabilize videos off-line, i.e., as a post-processing tool for pre-recorded videos. Some methods can stabilize videos online, but they either require additional hardware sensors (e.g., a gyroscope) or adopt a single parametric motion model (e.g., affine, homography), which is problematic for representing spatially-variant motions. In this paper, we propose a technique for online video stabilization with only one frame of latency using a novel MeshFlow motion model. The MeshFlow is a spatially smooth sparse motion field with motion vectors only at the mesh vertexes. In particular, the motion vectors at the matched feature points are transferred to their corresponding nearby mesh vertexes. The MeshFlow is produced by assigning each vertex a unique motion vector via two median filters. The path smoothing is conducted on the vertex profiles, which are motion vectors collected at the same vertex location in the MeshFlow over time. The profiles are smoothed adaptively by a novel smoothing technique, namely the Predicted Adaptive Path Smoothing (PAPS), which only uses motions from the past. In this way, the proposed method not only handles spatially-variant motions but also works online in real time, offering potential for a variety of intelligent applications (e.g., security systems, robotics, UAVs). The quantitative and qualitative evaluations show that our method can produce results comparable with the state-of-the-art off-line methods.
Keywords: Online video stabilization · MeshFlow · Vertex profile
1 Introduction
Most existing video stabilization methods stabilize videos offline [1–5], where the
videos have already been recorded. These methods post-process shaky videos by
estimating and smoothing camera motions for the stabilized results. Typically, to
stabilize the motion at each time instance, they require not only camera motions
in the past but also camera motions in the future for high quality stabilization.
There is an increasing demand for online video stabilization, where the video
is stabilized on the spot during capturing. For example, a robot or drone often
carries a wireless video camera so that a remote operator is aware of the situation.
Ideally, the operator wants to see the video stabilized as soon as it appears on
the monitor for immediate responses. Offline stabilization is not suitable for this application, though it produces strongly stabilized results.
Online stabilization is challenging mainly for two reasons. Firstly, the camera motion estimation is difficult. Some online stabilization methods use a gyroscope [6,7] for real-time motion estimation. However, gyro-based methods can only capture rotational motion, leaving translational motion untouched. High quality video stabilization requires handling of spatially-variant motion, which is often due to parallax and camera translation, a common problem in general scenes with depth changes. Spatially-variant motion is complicated: it cannot be represented by a single homography [1,3]. Recent methods [4,5,8] divide the video frame into several regions, but this strategy is computationally expensive and hinders real-time applications. Enforcing spatial-temporal coherence during camera motion smoothing further complicates this approach.
Secondly, successful camera motion filtering often requires future frames. Some online video stabilization methods [9-11] use the single homography model and buffer some future frames. For example, the method of [10] requires a minimum of one second of delay. The temporal buffer is needed to adaptively set the smoothing strength so as to avoid artifacts caused by excessive smoothing. Reducing this buffer of future frames will significantly deteriorate the results.
We design an online video stabilization method with minimum latency by solving the two aforementioned challenges. Our method only requires past motions for high quality motion filtering. We propose a novel motion model, MeshFlow, which is a spatially smooth sparse motion field with motion vectors defined only at the mesh vertexes. It can be regarded as a down-sampled dense flow. Specifically, we place a regular 2D mesh on the video frame. We then track image corners between consecutive frames, which yields a motion vector at each feature location. Next, these motion vectors are transferred to their corresponding nearby mesh vertexes, such that each vertex accumulates several motions from its surrounding features. The MeshFlow is a sparse 2D array of motion vectors consisting of motions at all mesh vertexes.
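To make the data structure concrete, the following minimal sketch (our own illustration, not code from the paper; the frame size is an assumption) builds the regular mesh and the array holding one motion vector per vertex:

```python
import numpy as np

# Illustrative frame size; the paper's mesh is 16 x 16 cells,
# giving a 17 x 17 grid of vertexes.
FRAME_W, FRAME_H = 1280, 720
MESH_COLS, MESH_ROWS = 16, 16

# Pixel coordinates of the mesh vertexes (a regular grid over the frame).
xs = np.linspace(0, FRAME_W, MESH_COLS + 1)
ys = np.linspace(0, FRAME_H, MESH_ROWS + 1)
vertex_x, vertex_y = np.meshgrid(xs, ys)

# The MeshFlow itself: one 2D motion vector per vertex -- a sparse,
# heavily down-sampled stand-in for a dense optical flow field.
mesh_flow = np.zeros((MESH_ROWS + 1, MESH_COLS + 1, 2), dtype=np.float32)
```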
With regard to the camera motion smoothing, we design a filter to smooth the temporal changes of the motion vector at each mesh vertex. Because this filter is applied at every mesh vertex, it naturally deals with spatially-variant motion. The uniqueness of this filter is that it mainly requires previous motions for strong stabilization. This is achieved by predicting an appropriate smoothing strength from the camera motion at previous frames. In this way, it achieves adaptive smoothing that avoids excessive cropping and wobble distortions. We call this filter Predicted Adaptive Path Smoothing (PAPS).
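The precise form of PAPS is defined later in the paper; purely as an illustration of causal, adaptive path smoothing, the sketch below filters one vertex profile using only past motions and shrinks the smoothing kernel when recent motion is large. The function name, window length, and adaptive rule here are our own simplifications, not the paper's formulation.

```python
import numpy as np

def smooth_vertex_profile(path_history, new_motion, base_sigma=3.0):
    """Causally smooth one vertex profile (simplified stand-in for PAPS).

    `path_history` is a list of accumulated path positions at this vertex
    for past frames only; no future frames are used.
    """
    # Integrate the newest per-frame motion into the camera path.
    last = path_history[-1] if path_history else np.zeros(2, np.float32)
    path_history.append(last + np.asarray(new_motion, np.float32))
    path = np.asarray(path_history, dtype=np.float32)

    # Predict an adaptive strength from recent motion magnitude:
    # fast camera motion -> weaker smoothing, limiting cropping and wobble.
    if len(path) > 1:
        recent = np.linalg.norm(np.diff(path[-10:], axis=0), axis=1).mean()
    else:
        recent = 0.0
    sigma = base_sigma / (1.0 + recent)

    # Gaussian weights over past samples only (a causal filter).
    t = np.arange(len(path), dtype=np.float32)
    w = np.exp(-0.5 * ((t - t[-1]) / max(sigma, 1e-3)) ** 2)
    w /= w.sum()
    return (w[:, None] * path).sum(axis=0)  # smoothed position for this frame
```

One such profile would be kept per mesh vertex, which is how a per-vertex filter of this kind handles spatially-variant motion by construction.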
In summary, the main contributions of this paper are: (1) a computationally efficient motion model, MeshFlow, for spatially-variant motion representation; and (2) an adaptive smoothing method, PAPS, designed for the new model for online processing with only one frame of latency. We evaluate our method
on various challenging videos and demonstrate its effectiveness in terms of both visual quality and efficiency. (Project page: http://www.liushuaicheng.org/eccv2016/index.html)
2 Related Work
According to the adopted motion model, video stabilization methods can be
categorised into 3D [8,12,13], 2D [1,3,14], and 2.5D [2,15] approaches.
The 3D methods estimate camera motions in 3D space for stabilization. Buehler et al. [16] stabilized videos under projective 3D reconstruction. Liu et al. [8] applied Structure from Motion (SfM) to the video frames and used content-preserving warps for novel view synthesis. Zhou et al. [13] introduced 3D plane constraints for improved warping quality. Smith et al. [17] and Liu et al. [5] adopted a light field camera and a Kinect camera, respectively, to acquire 3D structures. The methods of [6,7] and [18] used a gyroscope to estimate 3D rotations. Some 2.5D approaches relax the full 3D requirement to partial 3D information that is embedded in long feature tracks. Goldstein and Fattal [15] used "epipolar transfer" to extend the length of feature tracks, while Liu et al. [2] smoothed feature tracks in a subspace so as to maintain the 3D constraints. Later, the subspace approach was extended to stereoscopic videos [19]. All these methods either conduct expensive and brittle 3D reconstruction or require additional hardware sensors for stabilization. In contrast, our method is a sensor-free approach that neither recovers 3D structures nor relies on long feature tracks.
The 2D methods use a series of 2D linear transformations (e.g., affines, homographies) for motion estimation and smooth them for stabilized videos [1,20-22]. Grundmann et al. [3] employed cinematography rules for camera path design. Later, they extended their approach by dividing a single homography into a homography array [14] such that rolling shutter distortions could be well compensated. Wang et al. [4] divided frames into triangles and smoothed feature trajectories with a spatial-temporal optimization. Liu et al. [5] smoothed bundled paths for spatially-variant motions. Bai et al. [23] extended the bundled paths by introducing user interactions. Liu et al. [24] proposed to replace the smoothing of feature tracks with the smoothing of pixel profiles and showed several advantages over smoothing traditional feature tracks. Inspired by [24], we propose to smooth vertex profiles, a sparse version of pixel profiles, for improved robustness and efficiency, which facilitates an online system with spatially-variant motion representation.
3 MeshFlow
In this section, we introduce the MeshFlow motion model. Figure 1 shows a comparison between the SteadyFlow [24] and our MeshFlow. Compared with the SteadyFlow, which calculates dense optical flow and extracts pixel profiles at all pixel locations for stabilization, our MeshFlow is computationally more efficient.
(a) A Pixel Profile in SteadyFlow [24]  (b) A Vertex Profile in MeshFlow
Fig. 1. (a) Pixel profiles [24] collect motion vectors at the same pixel location in SteadyFlow over time for all pixel locations. Motions of SteadyFlow come from dense optical flow. (b) Vertex profiles only collect motion vectors in MeshFlow at mesh vertexes. Motions of MeshFlow come from feature matches between adjacent frames.
We only operate on a sparse regular grid of vertex profiles, such that the expensive optical flow can be replaced with cheap feature matches. The two representations are similar in that both encode strong spatial smoothness; they differ in that one is dense and the other is sparse, and their motion estimation methods are totally different. Next, we show how to estimate spatially coherent motions at the mesh vertexes.
Fig. 2. (a) A pair of matched features (red dots) between frame t and t − 1. (b) The
arrow indicates the motion of the feature point at frame t. (c) The motion is propagated
to the nearby vertexes. (Color figure online)
3.1 Motion Propagation
We match image features between neighboring frames. Figure 2 shows an example. Suppose {p, p̂} is the p-th matched feature pair, with p at frame t and p̂ at frame t − 1 (p and p̂ denote the image coordinates of the features). The motion v_p at feature location p can be computed as v_p = p − p̂ (see the dashed arrow in Fig. 2(a)). The mesh vertexes near the feature p should have a motion similar to v_p. Therefore, we define an ellipse centered at p (dashed circle in Fig. 2(b)) and assign v_p to the vertexes within the ellipse (see Fig. 2(c)). Specifically, we detect FAST features [25] and track them by KLT [26] to the adjacent frame. We place a uniform grid mesh with 16 × 16 regular cells onto each frame. (We draw the mesh as 8 × 8 in all figures for clearer illustration.)
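A rough sketch of this propagation step, assuming OpenCV is available, is given below. The function name and the radius value are our own choices, a fixed-radius circle stands in for the paper's ellipse, and only the first of the paper's two median filters (the per-vertex median over accumulated motions) is shown:

```python
import cv2
import numpy as np

def estimate_mesh_flow(prev_gray, curr_gray, rows=16, cols=16, radius=100.0):
    """Sparse MeshFlow between two grayscale frames (illustrative sketch)."""
    h, w = prev_gray.shape

    # Detect FAST features [25] in the previous frame and track them
    # into the current frame with KLT [26].
    fast = cv2.FastFeatureDetector_create()
    kps = fast.detect(prev_gray)
    pts = cv2.KeyPoint_convert(kps).reshape(-1, 1, 2).astype(np.float32)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)

    ok = status.ravel() == 1
    p_curr = nxt.reshape(-1, 2)[ok]                 # p, at frame t
    p_prev = pts.reshape(-1, 2)[ok]                 # p_hat, at frame t - 1
    motions = p_curr - p_prev                       # v_p = p - p_hat

    # Regular mesh: rows x cols cells -> (rows+1) x (cols+1) vertexes.
    vx, vy = np.meshgrid(np.linspace(0, w, cols + 1),
                         np.linspace(0, h, rows + 1))
    verts = np.stack([vx.ravel(), vy.ravel()], axis=1)

    # Propagate each feature motion to nearby vertexes and take the
    # per-vertex median of the accumulated motions.
    field = np.zeros((verts.shape[0], 2), np.float32)
    for i, v in enumerate(verts):
        near = np.linalg.norm(p_curr - v, axis=1) < radius
        if near.any():
            field[i] = np.median(motions[near], axis=0)
    return field.reshape(rows + 1, cols + 1, 2)
```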