Research and Implementation of Adaptive Neural Video Encoders for Video Compression (CSDN Library resource)

166 views · Uploaded 2024-12-26 23:23:42 · 1.07 MB PDF


Slimmable Video Codec

Zhaocheng Liu, Luis Herranz, Fei Yang, Saiping Zhang, Shuai Wan, Marta Mrak and Marc Górriz Blanch

School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China

Computer Vision Center, Universitat Autònoma de Barcelona, 08193 Barcelona, Spain

BBC Research & Development, The Lighthouse, White City Place, 201 Wood Lane, London, UK

State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, China

liuzhaocheng@mail.nwpu.edu.cn

Abstract

Neural video compression has emerged as a novel paradigm combining trainable multilayer neural networks and machine learning, achieving competitive rate-distortion (RD) performance, but it remains impractical due to heavy neural architectures with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements without harming RD performance. In this paper we propose a slimmable video codec (SlimVC) by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all of which are important requirements for practical video compression.

1. Introduction

During the last two decades, video has become the dominant form of communication in the digital society. This has led to explosive growth, with video content accounting for more than 80% of global data traffic. The basic (lossy) video compression objective consists of transmitting as few bits as possible (i.e. minimizing rate) while representing the input sequence at a certain level of fidelity (i.e. distortion). Video is now consumed on heterogeneous devices ranging from TV sets to smartphones. Furthermore, real-time video conferencing has become a household technology, pervasive in work and educational environments. These practical scenarios impose additional constraints on the design of video codecs, such as dynamically controllable rate, low computational and memory footprint, and low latency. Together with the previous rate and distortion objectives, they make up the more challenging problem of practical video compression.

L.H. acknowledges the support of the Ramón y Cajal grant RYC2019-027020-I (MICINN, Spain).
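The rate and distortion objectives described in the introduction are commonly combined into a single Lagrangian cost. The formulation below is a standard way of writing that tradeoff, not an equation taken from this text; the symbols $R$, $D$, $\lambda$ and $\theta$ are notation assumed here for illustration.

```latex
\min_{\theta} \; \mathcal{L}(\theta) = R(\theta) + \lambda \, D(\theta), \qquad \lambda > 0
```

Here $R(\theta)$ stands for the expected bitrate and $D(\theta)$ for the expected distortion of a codec with parameters $\theta$. Under this reading, a model "optimized for a single RD tradeoff" corresponds to training with one fixed value of $\lambda$, whereas a dynamically controllable rate requires covering a range of values.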

In parallel, the deep learning revolution has motivated a new compression paradigm based on parametric encoders and decoders implemented as deep neural networks that are optimized with data. This compression approach has been applied successfully first to images [4, 5, 7] and then to videos [6, 13]. It contrasts with the traditional hybrid video coding paradigm, based on block-based linear transforms and carefully engineered coding tools (e.g. H.264/AVC, H.265/HEVC). Because they focus on improving rate-distortion performance, most neural image and video codecs are impractical, since they require heavy and complex networks. Practical aspects, on the other hand, have always been carefully considered in the design of traditional codecs. In contrast to previous works, our paper focuses chiefly on those practical constraints, proposing a lightweight and flexible design for practical neural video compression.

Our design is based on a slimmable autoencoder augmented with a slimmable temporal entropy model, and is motivated by two recent works. Starting from the empirical observation that lower rates do not require the use of full capacity, Yang et al. [12] proposed the slimmable compressive autoencoder (SlimCAE) architecture, where slimming becomes a flexible mechanism to both vary the rate-distortion tradeoff and control the complexity. However, extending SlimCAE to video by including temporal prediction is not trivial, since most designs require additional modules to estimate and compensate motion (e.g. optical flow nets, motion compensation nets). Slimmable designs of such modules are not straightforward, nor is their potential interplay with other elements in the compression framework. Recently, Sun et al. [9] proposed the spatiotemporal entropy model (STEM), a motion-free frame-
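The slimming idea mentioned above (using only part of a network's capacity when less is needed) can be illustrated with a minimal sketch. This is a toy fully connected layer written for illustration under stated assumptions, not the SlimCAE or SlimVC implementation; the class name `SlimmableLinear`, its methods, and the chosen widths are all hypothetical.

```python
import random


class SlimmableLinear:
    """Toy fully connected layer whose active width can be chosen per call.

    Hypothetical illustration only: narrower sub-layers reuse the top-left
    corner of one shared weight matrix instead of storing separate weights.
    """

    def __init__(self, max_in, max_out, seed=0):
        rng = random.Random(seed)
        self.max_in = max_in
        self.max_out = max_out
        # One shared weight matrix of shape (max_out, max_in).
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(max_in)]
                       for _ in range(max_out)]
        self.bias = [0.0] * max_out

    def forward(self, x, out_width):
        """Apply the sub-layer using len(x) inputs and out_width outputs."""
        in_width = len(x)
        if not (0 < in_width <= self.max_in and 0 < out_width <= self.max_out):
            raise ValueError("requested width exceeds layer capacity")
        return [
            sum(self.weight[o][i] * x[i] for i in range(in_width)) + self.bias[o]
            for o in range(out_width)
        ]

    @staticmethod
    def macs(in_width, out_width):
        """Multiply-accumulate operations for one forward pass at this width."""
        return in_width * out_width


layer = SlimmableLinear(max_in=8, max_out=8)
x_full = [1.0] * 8
y_full = layer.forward(x_full, out_width=8)      # full capacity
y_slim = layer.forward(x_full[:4], out_width=4)  # half width, same weights
```

In this toy setting, halving both the input and output widths cuts the multiply-accumulate count to a quarter, which hints at why slimming can act as a handle on computational cost and memory footprint. How widths map to rate-distortion operating points in an actual codec is not covered by this sketch.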
