Research on a CNN-Based Fast VVC Inter Coding Method, Its Applications, and Its Performance Gains
Uploaded 2025-01-02 20:07:27 (PDF, 888 KB)
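Summary: This paper proposes a fast inter coding method for Versatile Video Coding (VVC). A multi-information fusion convolutional neural network (MF-CNN) terminates the Quad-Tree plus Multi-type Tree (QTMT) coding unit partition process early, and a content complexity-based early Merge mode decision skips time-consuming prediction modes. The paper also presents an overall framework diagram of the complete fast coding flow, and its experiments show that the method reduces encoding time by 30.63% on average, at the cost of a Bjøntegaard Delta Bit Rate (BDBR) increase of about 3%.

Intended audience: Researchers familiar with video coding, especially those interested in deep learning-based optimization, and engineers involved in designing next-generation coding standards.

Use cases and goals: The method targets real-time multimedia applications of VVC. It aims to speed up encoding while limiting the loss in compression efficiency, so that the new coding tools can run effectively on devices with limited computing power.

Additional notes: The paper explains the design and technical details of each module, including the MF-CNN architecture, the different AKCG configurations, and the operation of the two fast decision subsystems. It also provides extensive comparative experimental data supporting the effectiveness of the proposed optimizations.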
1260 IEEE SIGNAL PROCESSING LETTERS, VOL. 28, 2021
A CNN-Based Fast Inter Coding Method for VVC
Zhaoqing Pan, Senior Member, IEEE, Peihan Zhang, Bo Peng, Member, IEEE, Nam Ling, Fellow, IEEE,
and Jianjun Lei, Senior Member, IEEE
Abstract—Versatile Video Coding (VVC) achieves superior
coding efficiency compared with High Efficiency Video
Coding (HEVC), but this coding performance comes at the
cost of several computationally expensive coding tools, such
as Quad-Tree plus Multi-type Tree (QTMT)-based Coding Units
(CUs) and multiple inter prediction modes. To reduce the
computational complexity of VVC, a CNN-based fast inter coding
method is proposed in this paper. First, a multi-information
fusion CNN (MF-CNN) model is proposed to terminate the
QTMT-based CU partition process early by jointly using
multi-domain information. Then, a content complexity-based
early Merge mode decision is proposed to skip the
time-consuming inter prediction modes by considering the CU
prediction residuals and the confidence of the MF-CNN.
Experimental results show that the proposed method reduces
VVC encoding time by 30.63% on average, while the Bjøntegaard
Delta Bit Rate (BDBR) increases by about 3%.
Index Terms—Versatile Video Coding (VVC), Quad-Tree plus
Multi-type Tree (QTMT), early Merge mode decision, CNN.
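The two decisions named in the abstract can be pictured as a pair of gates in the CU encoding loop. The sketch below is only an illustration under stated assumptions: the `CuStats` fields, the threshold names and values, and the use of a mean absolute residual as the complexity measure are hypothetical placeholders, not the authors' actual rules, which this excerpt does not specify.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the values used in the paper are not given here.
CONF_TERMINATE = 0.9   # confidence needed to stop the QTMT partition search
CONF_MERGE = 0.8       # confidence needed before trusting the Merge result
RESIDUAL_LOW = 2.0     # residual level treated as "simple content"


@dataclass
class CuStats:
    no_split_confidence: float  # assumed CNN output: probability of "do not split"
    mean_abs_residual: float    # assumed complexity proxy from the Merge residual


def plan_cu(stats: CuStats) -> dict:
    """Decide which expensive encoder steps are still worth running."""
    # Gate 1: a confident "do not split" prediction ends the partition search.
    terminate = stats.no_split_confidence >= CONF_TERMINATE
    # Gate 2: simple content plus a confident network keeps the Merge mode
    # and skips the remaining inter prediction modes.
    early_merge = (stats.mean_abs_residual <= RESIDUAL_LOW
                   and stats.no_split_confidence >= CONF_MERGE)
    return {"try_splits": not terminate,
            "try_other_inter_modes": not early_merge}
```

For example, `plan_cu(CuStats(0.95, 1.0))` skips both the further splits and the remaining inter modes, whereas `plan_cu(CuStats(0.5, 10.0))` leaves the full search in place.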
I. INTRODUCTION
With the increase of video resolutions, the demand for
more effective video coding technologies has increased
rapidly. To meet this demand, the Joint Video Experts Team (JVET)
has developed the latest video coding standard, called Versatile
Video Coding (VVC) [1]. By introducing a series of new
high-complexity coding technologies, VVC has achieved a substantial
coding performance improvement over High Efficiency
Video Coding (HEVC) [2]–[7]. However, its extremely high
computational complexity is a bottleneck for applying VVC
in real-time multimedia applications.
Manuscript received April 15, 2021; revised May 26, 2021; accepted May
30, 2021. Date of publication June 7, 2021; date of current version June 28,
2021. This work was supported in part by the National Key R&D Program of
China under Grant 2018YFE0203900; in part by the National Natural Science
Foundation of China under Grants 61931014, 61722112, 61520106002, and
61971232; in part by the Natural Science Foundation of Tianjin under Grant
18JCJQJC45800; and in part by the Natural Science Foundation of Jiangsu
Province of China under Grant BK20201391. The associate editor coordinating
the review of this manuscript and approving it for publication was Prof. Xun
Cao. (Corresponding author: Peihan Zhang.)
Zhaoqing Pan, Peihan Zhang, Bo Peng, and Jianjun Lei are with the
School of Electrical and Information Engineering, Tianjin University, Tianjin
300072, China (e-mail: zqpan3-c@my.cityu.edu.hk; peihan_zhang@tju.edu.cn;
bpeng@tju.edu.cn; jjlei@tju.edu.cn).
Nam Ling is with the Department of Computer Engineering, Santa Clara
University, Santa Clara, CA 95053 USA (e-mail: nling@scu.edu).
Digital Object Identifier 10.1109/LSP.2021.3086692
To improve the coding efficiency, the Quad-Tree plus Multi-type
Tree (QTMT) partition structure is adopted in VVC. In the
Coding Unit (CU) encoding process, the CU is recursively split
into sub-CUs according to the QTMT, and the best partition
mode is the one with the minimum Rate Distortion (RD) cost.
The QTMT partition structure allows CUs to be square or
rectangular, which dramatically increases the encoding
complexity. Hence, simplifying the QTMT-based CU partition
process can significantly decrease the computational complexity
of VVC. Besides, advanced prediction modes have been
introduced in VVC to improve the inter prediction accuracy,
such as affine motion compensation prediction, adaptive motion
vector resolution, and bi-directional optical flow. These
advanced prediction techniques also increase the computational
complexity of VVC. To reduce the VVC encoding
complexity, these modes can be conditionally skipped.
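The recursive search described above, and the saving that early termination offers, can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the VVC reference encoder: the cost function (sum of squared deviations plus a fixed penalty `LAMBDA` per CU), the minimum CU side `MIN_SIZE`, and the `stop` predicate are all hypothetical. The five split modes and the rule that a quad-tree split is no longer allowed below a binary or ternary split follow the QTMT structure.

```python
LAMBDA = 10.0   # hypothetical rate penalty charged for every coded CU
MIN_SIZE = 4    # smallest CU side allowed in this toy model


def leaf_cost(block, x, y, w, h):
    """Toy RD cost of coding the CU unsplit: distortion proxy plus LAMBDA."""
    px = [block[j][i] for j in range(y, y + h) for i in range(x, x + w)]
    mean = sum(px) / len(px)
    return sum((p - mean) ** 2 for p in px) + LAMBDA


def sub_cus(x, y, w, h, mode):
    """Sub-CU rectangles for a split mode, or [] if the split is not allowed."""
    if mode == "QT" and w == h and w >= 2 * MIN_SIZE:
        a, b = w // 2, h // 2
        return [(x, y, a, b), (x + a, y, a, b), (x, y + b, a, b), (x + a, y + b, a, b)]
    if mode == "BT_H" and h >= 2 * MIN_SIZE:
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "BT_V" and w >= 2 * MIN_SIZE:
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "TT_H" and h >= 4 * MIN_SIZE:   # 1:2:1 ternary split
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    if mode == "TT_V" and w >= 4 * MIN_SIZE:
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    return []


def best_partition(block, x, y, w, h, allow_qt=True, stop=None, stats=None):
    """Return the minimum toy RD cost over all partitions of the CU."""
    if stats is not None:
        stats["evaluated"] += 1
    best = leaf_cost(block, x, y, w, h)
    # Early termination: a predictor (a CNN in the paper) may end the search.
    if stop is not None and stop(x, y, w, h):
        return best
    for mode in ("QT", "BT_H", "BT_V", "TT_H", "TT_V"):
        if mode == "QT" and not allow_qt:
            continue
        parts = sub_cus(x, y, w, h, mode)
        if not parts:
            continue
        # Quad-tree splits stay available only along a pure quad-tree path.
        cost = sum(best_partition(block, *p, allow_qt=(mode == "QT"),
                                  stop=stop, stats=stats) for p in parts)
        best = min(best, cost)
    return best
```

On a 16x16 block whose left half is 0 and right half is 100, the full search settles on a cost of `2 * LAMBDA` (one vertical binary split) after evaluating several hundred candidate CUs, while a `stop` predicate that always fires evaluates a single CU. The gap between those two counts is the complexity that a reliable early-termination predictor tries to recover without choosing a worse partition.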
To reduce the computational complexity of the CU encoding
process, many fast CU encoding methods have been proposed.
These methods can be roughly classified into two categories,
namely statistical analysis-based methods [8]–[14] and
CNN-based methods [15], [16]. The statistical analysis-based methods
simplify the CU encoding process by modeling the relationship
between statistical features and the CU mode parameters.
In [8], Tang et al. proposed a fast CU encoding method for
intra and inter coding, in which the CU partition process is
terminated early by using the edge features extracted by the Canny
edge detector. In addition, the three-frame difference is used
to measure the motion activity of the CU content in inter
coding. In [9], Chen et al. observed that uniform areas are
usually encoded with large CUs, and non-uniform areas
are usually encoded with small CUs. Based on this analysis,
the CU partition process is terminated early according to the
variance and gradient information of the CU. In [10], Cui et al.
simplified the CU partition process by using directional
gradient information. In [11], Saldanha et al. utilized the variance and
the best angular intra prediction mode of the current CU to skip
the horizontal or vertical partition. In [12], Yang et al. proposed
a fast intra coding scheme consisting of a cascade decision
structure-based fast QTMT partition decision method and a
gradient descent-based fast intra mode decision method. In [13], Dong
et al. proposed an adaptive mode pruning method to skip the
non-promising modes and a mode-dependent termination method to
skip the intra predictions of the remaining depth levels. Although
these statistical analysis-based fast CU encoding methods can
improve the computational efficiency, the effectiveness of the
statistical features depends on the researchers' experience. To avoid
designing features manually, CNN-based methods have
emerged. These methods learn the features for CU size decision
automatically through convolution operations. In [15], Tang et al.
proposed a shape-adaptive CNN-based fast CU partition decision
for intra coding, which handles CUs of various sizes by utilizing a
variable-size pooling layer. In [16], Tissier et al. trained a CNN
1070-9908 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.