

1260 IEEE SIGNAL PROCESSING LETTERS, VOL. 28, 2021

A CNN-Based Fast Inter Coding Method for VVC

Zhaoqing Pan, Senior Member, IEEE, Peihan Zhang, Bo Peng, Member, IEEE, Nam Ling, Fellow, IEEE, and Jianjun Lei, Senior Member, IEEE

Abstract—Versatile Video Coding (VVC) achieves superior coding efficiency compared with High Efficiency Video Coding (HEVC), but this performance comes at the cost of several computationally expensive coding tools, such as Quad-Tree plus Multi-type Tree (QTMT)-based Coding Units (CUs) and multiple inter prediction modes. To reduce the computational complexity of VVC, a CNN-based fast inter coding method is proposed in this paper. First, a multi-information fusion CNN (MF-CNN) model is proposed to terminate the QTMT-based CU partition process early by jointly using multi-domain information. Then, a content complexity-based early Merge mode decision is proposed to skip the time-consuming inter prediction modes by considering the CU prediction residuals and the confidence of the MF-CNN. Experimental results show that the proposed method reduces VVC encoding time by 30.63% on average, while the Bjøntegaard Delta Bit Rate (BDBR) increases by about 3%.

Index Terms—Versatile Video Coding (VVC), Quad-Tree plus Multi-type Tree (QTMT), early Merge mode decision, CNN.

I. INTRODUCTION

With the increase of video resolutions, the demand for more effective video coding technologies has grown rapidly. To meet this demand, the Joint Video Experts Team (JVET) has developed the latest video coding standard, called Versatile Video Coding (VVC) [1]. By introducing a series of new high-complexity coding technologies, VVC achieves a substantial coding performance improvement over High Efficiency Video Coding (HEVC) [2]–[7]. However, its extremely high computational complexity is a bottleneck for applying VVC in real-time multimedia applications.

To improve the coding efficiency, the Quad-Tree plus Multi-type Tree (QTMT) partition structure is adopted in VVC. In the Coding Unit (CU) encoding process, the CU is recursively split into sub-CUs according to the QTMT, and the best partition mode is the one with the minimum Rate Distortion (RD) cost.

Manuscript received April 15, 2021; revised May 26, 2021; accepted May 30, 2021. Date of publication June 7, 2021; date of current version June 28, 2021. This work was supported in part by the National Key R&D Program of China under Grant 2018YFE0203900; in part by the National Natural Science Foundation of China under Grants 61931014, 61722112, 61520106002, and 61971232; in part by the Natural Science Foundation of Tianjin under Grant 18JCJQJC45800; and in part by the Natural Science Foundation of Jiangsu Province of China under Grant BK20201391. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Xun Cao. (Corresponding author: Peihan Zhang.)

Zhaoqing Pan, Peihan Zhang, Bo Peng, and Jianjun Lei are with the School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China (e-mail: zqpan3-c@my.cityu.edu.hk; peihan_zhang@tju.edu.cn; bpeng@tju.edu.cn; jjlei@tju.edu.cn).

Nam Ling is with the Department of Computer Engineering, Santa Clara University, Santa Clara, CA 95053 USA (e-mail: nling@scu.edu).

Digital Object Identifier 10.1109/LSP.2021.3086692

The QTMT partition structure allows CUs to be square or rectangular, which dramatically increases the encoding complexity. Hence, simplifying the QTMT-based CU partition process can significantly decrease the computational complexity of VVC. In addition, advanced prediction modes have been introduced in VVC to improve inter prediction accuracy, such as affine motion compensation prediction, adaptive motion vector resolution, and bi-directional optical flow. These advanced prediction techniques also increase the computational complexity of VVC. To reduce the VVC encoding complexity, these modes can be conditionally skipped.
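The recursive RD search over the QTMT structure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the VTM reference encoder: the five split modes (quad, horizontal and vertical binary, horizontal and vertical ternary) follow the QTMT structure, while the toy cost function, the constants MIN_SIZE, LAMBDA, and LEAF_BITS, and all function names are hypothetical choices made for this example. The additional split restrictions that a real VVC encoder applies are omitted.

```python
# Toy exhaustive QTMT partition search (illustrative assumptions throughout).
MODES = ("QT", "BT_H", "BT_V", "TT_H", "TT_V")
MIN_SIZE = 4      # assumed minimum CU side length
LAMBDA = 50.0     # assumed Lagrange multiplier (toy value)
LEAF_BITS = 8.0   # assumed fixed signalling cost per leaf CU (toy value)


def split(x, y, w, h, mode):
    """Return the sub-block rectangles for one split mode, or [] if not allowed."""
    if mode == "QT" and w == h and w >= 2 * MIN_SIZE:
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_H" and h >= 2 * MIN_SIZE:
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "BT_V" and w >= 2 * MIN_SIZE:
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "TT_H" and h >= 4 * MIN_SIZE:
        q = h // 4  # 1:2:1 ternary split
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    if mode == "TT_V" and w >= 4 * MIN_SIZE:
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    return []


def leaf_cost(frame, x, y, w, h):
    """Toy RD cost of coding the block unsplit: SSE around the mean plus a rate term."""
    pixels = [frame[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    mean = sum(pixels) / len(pixels)
    distortion = sum((p - mean) ** 2 for p in pixels)
    return distortion + LAMBDA * LEAF_BITS


def best_partition(frame, x, y, w, h, stats):
    """Try 'no split' and every allowed split recursively; keep the minimum cost."""
    stats["evaluated"] += 1
    best_cost, best_tree = leaf_cost(frame, x, y, w, h), ("LEAF", x, y, w, h)
    for mode in MODES:
        subs = split(x, y, w, h, mode)
        if not subs:
            continue
        cost, children = 0.0, []
        for sx, sy, sw, sh in subs:
            c, t = best_partition(frame, sx, sy, sw, sh, stats)
            cost += c
            children.append(t)
        if cost < best_cost:
            best_cost, best_tree = cost, (mode, children)
    return best_cost, best_tree


if __name__ == "__main__":
    # 16x16 block with a vertical edge in the middle.
    frame = [[0] * 8 + [255] * 8 for _ in range(16)]
    stats = {"evaluated": 0}
    cost, tree = best_partition(frame, 0, 0, 16, 16, stats)
    print(tree[0], cost, stats["evaluated"])
```

Because every allowed split is evaluated recursively, the number of candidate CUs visited grows quickly with block size. Early termination of the partition search and conditional skipping of prediction modes both aim to prune this search.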

To reduce the computational complexity of the CU encoding process, many fast CU encoding methods have been proposed. These methods can be roughly classified into two categories, namely statistical analysis-based methods [8]–[14] and CNN-based methods [15], [16]. The statistical analysis-based methods simplify the CU encoding process by modeling the relationship between statistical features and the CU mode parameter.
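As a rough illustration of the statistical analysis-based idea, the sketch below computes two hand-crafted features of a CU (pixel variance and mean absolute gradient) and stops further splitting when both are small. The feature definitions, the thresholds var_thr and grad_thr, and the function names are assumptions for this example and do not reproduce the rule of any cited method.

```python
def cu_features(block):
    """Return (variance, mean absolute gradient) of a 2-D list of luma samples."""
    h, w = len(block), len(block[0])
    n = h * w
    mean = sum(map(sum, block)) / n
    variance = sum((p - mean) ** 2 for row in block for p in row) / n
    grad = 0
    for r in range(h):
        for c in range(w):
            if c + 1 < w:  # horizontal first difference
                grad += abs(block[r][c + 1] - block[r][c])
            if r + 1 < h:  # vertical first difference
                grad += abs(block[r + 1][c] - block[r][c])
    return variance, grad / n


def should_stop_splitting(block, var_thr=25.0, grad_thr=2.0):
    """Hypothetical early-termination rule: a smooth CU is not split further."""
    variance, gradient = cu_features(block)
    return variance < var_thr and gradient < grad_thr


if __name__ == "__main__":
    flat = [[100] * 8 for _ in range(8)]
    checker = [[255 * ((r + c) % 2) for c in range(8)] for r in range(8)]
    print(should_stop_splitting(flat), should_stop_splitting(checker))
```

In a rule of this kind the thresholds have to be tuned by hand for each feature, so the quality of the decision depends on how the features and thresholds are chosen.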

In [8], Tang et al. proposed a fast CU encoding method for intra and inter coding, in which the CU partition process is terminated early by using the edge features extracted by the Canny edge detector. In addition, the three-frame difference is used to measure the motion activity of the CU content in inter coding. In [9], Chen et al. observed that uniform areas are usually encoded with large CUs and non-uniform areas with small CUs. Based on this analysis, the CU partition process is terminated early according to the variance and gradient information of the CU. In [10], Cui et al. simplified the CU partition process by using directional gradient information. In [11], Saldanha et al. utilized the variance and the best angular intra prediction mode of the current CU to skip the horizontal or vertical partition. In [12], Yang et al. proposed a fast intra coding scheme consisting of a cascade decision structure-based fast QTMT partition decision method and a gradient descent-based fast intra mode decision method. In [13], Dong et al. proposed an adaptive mode pruning method to skip non-promising modes and a mode-dependent termination method to skip the intra predictions of the remaining depth levels. Although these statistical analysis-based fast CU encoding methods can improve computational efficiency, the effectiveness of the statistical features depends on the researchers' experience. To avoid hand-crafting features, CNN-based methods have emerged. These methods automatically learn the features for CU size decision through convolution operations. In [15], Tang et al. proposed a shape-adaptive CNN-based fast CU partition decision for intra coding, which handles CUs of various sizes by using a variable-size pooling layer. In [16], Tissier et al. trained a CNN

