Maggioni_Efficient_Multi-Stage_Video_CVPR_2021_supplemental.pdf
Efficient Multi-Stage Video Denoising with Recurrent Spatio-Temporal Fusion
Supplementary Materials
Matteo Maggioni*, Yibin Huang*, Cheng Li*, Shuai Xiao, Zhongqian Fu, Fenglong Song
Huawei Noah’s Ark Lab
{matteo.maggioni, huangyibin1, licheng89, xiaoshuai7, fuzhongqian, songfenglong}@huawei.com
1. Implementation Aspects
1.1. Learnable Invertible Transforms
Color Transform. The C × C color transform matrix is analogous to a YUV transformation in the RGB domain. A YUV transform matrix has size C = 3; however, the proposed model is designed for raw data, so in our case the matrix has size C = 4, in order to transform each color in the CFA Bayer pattern (e.g., RG₁G₂B). Practically, the matrix is defined as [2]
M =
\begin{bmatrix}
 0.5    &  0.5    &  0.5    &  0.5    \\
-0.5    &  0.5    &  0.5    & -0.5    \\
 0.65   &  0.2784 & -0.2784 & -0.65   \\
-0.2784 &  0.65   & -0.65   &  0.2784
\end{bmatrix}
=
\begin{bmatrix} Y \\ U \\ V \\ W \end{bmatrix}
\quad (1)
where each row has unit norm and corresponds to a different
color transform basis. The luminance component Y can be
easily recognized in the first row of (1), and unsurprisingly
it corresponds to an (energy-preserving) average of the four
input color channels. In our context, the matrix M will be
used to initialize the 1 × 1 × C × C kernel of a (point-wise)
convolutional layer.
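As a concrete illustration, the matrix in (1) can be applied as a point-wise transform over the four packed Bayer channels; the NumPy sketch below uses a per-pixel `einsum` as a stand-in for the 1 × 1 convolutional layer, and verifies that the rows of M are (approximately) orthonormal, so the transpose inverts the transform.

```python
import numpy as np

# Color transform matrix M from Eq. (1): rows are the Y, U, V, W bases.
M = np.array([
    [ 0.5,     0.5,     0.5,     0.5   ],
    [-0.5,     0.5,     0.5,    -0.5   ],
    [ 0.65,    0.2784, -0.2784, -0.65  ],
    [-0.2784,  0.65,   -0.65,    0.2784],
])

# Each row has (approximately) unit norm and the rows are mutually
# orthogonal, so M is close to orthonormal and M^T inverts it.
assert np.allclose(np.linalg.norm(M, axis=1), 1.0, atol=1e-4)
assert np.allclose(M @ M.T, np.eye(4), atol=1e-4)

# A packed RG1G2B raw frame: H x W spatial sites, 4 color channels.
rng = np.random.default_rng(0)
raw = rng.random((32, 32, 4))

# Point-wise (1x1) convolution == per-pixel matrix multiply.
yuvw = np.einsum('hwc,dc->hwd', raw, M)

# Inverse transform via the transpose; the round-trip recovers the input.
back = np.einsum('hwd,cd->hwc', yuvw, M.T)
assert np.allclose(back, raw, atol=1e-3)
```

In the model itself, M only initializes the 1 × 1 × C × C kernel; after training, the learned matrix may drift from orthonormality unless invertibility is otherwise enforced.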
Frequency Transform. As initialization values for our learnable frequency transform we use filters obtained from standard wavelet families. In fact, each wavelet type has a pair of decomposition filters, a low-pass ψ_L and a high-pass ψ_H, as well as a complementary pair of reconstruction filters, again a low-pass φ_L and a high-pass φ_H. These are all real 1-D filters of size 1 × n, where n ∈ ℕ⁺ is an even integer. We use these filters to generate the corresponding n × n convolutional kernels. For example, the 2-D LL decomposition kernel is obtained as ψ_L ⊗ ψ_L, where ⊗ denotes the outer product. We show all components involved in the learning and application of the frequency transform in Fig. 1.
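For instance, taking the Haar family (an assumption for illustration; the initialization is not tied to one specific wavelet), the four 2-D kernels, the strided-convolution decomposition, and the transposed-convolution reconstruction can be sketched as:

```python
import numpy as np

# Haar decomposition filters (n = 2): low-pass and high-pass.
s = 1.0 / np.sqrt(2.0)
psi_L = np.array([s,  s])
psi_H = np.array([s, -s])

# 2-D kernels as outer products of the 1-D filters: LL, LH, HL, HH.
kernels = [np.outer(a, b) for a in (psi_L, psi_H) for b in (psi_L, psi_H)]

def dwt2(x):
    """Stride-2 convolution of x with the four 2x2 kernels (analysis)."""
    H, W = x.shape
    return [
        np.array([[np.sum(x[i:i+2, j:j+2] * k) for j in range(0, W, 2)]
                  for i in range(0, H, 2)])
        for k in kernels
    ]

def idwt2(subbands):
    """Transposed stride-2 convolution: scatter coefficients back (synthesis)."""
    h, w = subbands[0].shape
    x = np.zeros((2 * h, 2 * w))
    for band, k in zip(subbands, kernels):
        for i in range(h):
            for j in range(w):
                x[2*i:2*i+2, 2*j:2*j+2] += band[i, j] * k
    return x

# For orthonormal Haar the reconstruction filters equal the decomposition
# filters, so analysis followed by synthesis is the identity.
rng = np.random.default_rng(1)
x = rng.random((8, 8))
assert np.allclose(idwt2(dwt2(x)), x)
```

With learned filters the identity is no longer automatic, which is why the forward and inverse kernels are both derived from the filters being trained.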
[Figure 1 diagram: the forward and inverse wavelet filters are learned, expanded into forward and inverse 2-D kernels via the outer product, and applied as a strided convolution before the model and a transposed convolution after it in the frequency domain; their composition reduces to the identity.]
Figure 1: Frequency transform: convolutional kernels correspond to the outer product ⊗ of the learned filters.

1.2. Models
VBM4D. VBM4D [5] is a traditional algorithm originally designed to remove independent and identically distributed zero-mean Gaussian noise from grayscale or RGB video. However, in our experiments we apply VBM4D to sRGB videos generated by an ISP [8] applied to the noisy raw data. The noise is therefore neither independent, nor identically distributed, nor white. These are not ideal conditions for VBM4D, but we optimize its σ parameter, which controls the amount of denoising, to maximize the PSNR on the validation data: we simply perform a grid search to find the best σ for each ISO setting and each dataset.
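The per-ISO grid search can be sketched as follows; `denoise` is a hypothetical placeholder for running VBM4D at a given σ (not an actual VBM4D binding), and the toy shrinkage denoiser exists only to make the sketch runnable.

```python
import numpy as np

def psnr(clean, est, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((clean - est) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def grid_search_sigma(pairs, denoise, sigmas):
    """Pick the sigma maximizing mean PSNR over (noisy, clean) pairs."""
    best_sigma, best_psnr = None, -np.inf
    for sigma in sigmas:
        score = np.mean([psnr(c, denoise(n, sigma)) for n, c in pairs])
        if score > best_psnr:
            best_sigma, best_psnr = sigma, score
    return best_sigma, best_psnr

# Toy stand-in denoiser: shrink toward the frame mean with strength s.
rng = np.random.default_rng(2)
clean = rng.random((16, 16))
noisy = clean + 0.1 * rng.normal(size=clean.shape)
toy = lambda x, s: (1 - s) * x + s * np.mean(x)

sigma, score = grid_search_sigma([(noisy, clean)], toy, np.linspace(0, 1, 21))
```

Since the grid includes σ = 0 (the identity), the selected setting can never score below the unprocessed input; in practice each ISO level and dataset gets its own σ.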
FastDVDnet. We use the original FastDVDnet implementation provided by the authors [6]. FastDVDnet is designed for Gaussian noise removal and uses a uniform noise map, corresponding to the variance of the distribution, as an additional input to the network. Since we deal with signal-dependent noise, we replace the uniform map with the variance map computed according to the raw noise model defined in (2) of the main paper. In order to decrease model complexity, we reduce the number of channels. Specifically, in the 82.61 GFLOPs version we use 8 channels in the input layers, 16 channels in the highest-resolution scale, and 24 everywhere else; in the 22.16 GFLOPs version we use 8 channels everywhere.
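Assuming the common affine signal-dependent raw model σ²(y) = a·y + b as a stand-in for (2) of the main paper (whose exact parametrization is not reproduced in this supplement), the per-pixel variance map could be built and attached to the network input as:

```python
import numpy as np

def variance_map(raw, shot_gain, read_var):
    """Per-pixel noise variance under an affine (shot + read) model.

    raw: noisy raw frame, used as an estimate of the clean signal;
    shot_gain, read_var: hypothetical per-ISO calibration parameters.
    """
    return np.clip(shot_gain * raw + read_var, 0.0, None)

rng = np.random.default_rng(3)
frame = rng.random((4, 32, 32))          # 4 packed Bayer channels
var = variance_map(frame, 0.01, 1e-4)    # illustrative parameter values

# Concatenate as extra input channels, replacing the uniform Gaussian
# noise map that FastDVDnet normally receives.
net_input = np.concatenate([frame, var], axis=0)   # shape (8, 32, 32)
```

The clipping guards against negative variance estimates when the noisy frame is used in place of the unknown clean signal.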