1670 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 22, NO. 12, DECEMBER 2012
the video encoders. Additionally, a greater emphasis is placed
on subjective video quality analysis than was applied
in [13], as the most important measure of video quality is
the subjective perception of quality as experienced by human
observers.
The paper is organized as follows. Section II describes the
syntax features of the investigated video coding standards
and highlights the main coding tools that contribute to the
coding efficiency improvement from one standard generation
to the next. The uniform encoding approach that is used for
all standards discussed in this paper is described in Section
III. In Section IV, the current performance of the HEVC
reference implementation is investigated in terms of a tool-by-
tool analysis, and in comparison with previous standards,
as assessed by objective quality measurement, particularly
peak signal-to-noise ratio (PSNR). Section V provides results
of the subjective quality testing of HEVC in comparison
to the previous best-performing standard, H.264/MPEG-4
AVC.
II. Syntax Overview
The basic design of all major video coding standards since
H.261 (in 1990) [14] follows the so-called block-based hybrid
video coding approach. Each block of a picture is either intra-
picture coded (also known as coded in an intra coding mode),
without referring to other pictures of the video sequence, or it
is temporally predicted (i.e., inter-picture coded, also known
as coded in an inter coding mode), where the prediction signal
is formed by a displaced block of an already coded picture.
The latter technique is also referred to as motion-compensated
prediction and represents the key concept for utilizing the
large amount of temporal redundancy in video sequences. The
prediction error signal (or the complete intra-coded block)
is processed using transform coding for exploiting spatial
redundancy. The transform coefficients that are obtained by
applying a decorrelating (linear or approximately linear) trans-
form to the input signal are quantized and then entropy coded
together with side information such as coding modes and
motion parameters. Although all considered standards follow
the same basic design, they differ in various aspects, which
finally results in a significantly improved coding efficiency
from one generation of standard to the next. In the following,
we provide an overview of the main syntax features for the
considered standards. The description is limited to coding tools
for progressive-scan video that are relevant for the comparison
in this paper. For further details, the reader is referred to the
draft HEVC standard [4], the prior standards [5], [8]–[10], and
corresponding books [6], [7], [15] and overview articles [3],
[11], [12].
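The block-based hybrid coding loop described above can be sketched in a few lines of Python. This is a deliberately minimal 1-D illustration, not code from any standard: the function names, block size, and sample values are invented, and the decorrelating transform that a real codec applies to the residual before quantization is omitted for brevity.

```python
# Minimal 1-D sketch of the hybrid coding loop: predict a block,
# quantize the prediction residual, and reconstruct. A real coder
# would apply a decorrelating transform (e.g., a DCT) to the
# residual before quantization; that step is omitted here.

def motion_compensated_prediction(ref_frame, x, mv):
    """Inter prediction: copy a displaced 4-sample block from the
    previously coded reference frame."""
    return ref_frame[x + mv : x + mv + 4]

def quantize(coeffs, step):
    """Scalar quantization of the residual (transform omitted)."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [l * step for l in levels]

# 1-D "frames" of 8 samples; the coded block covers samples 2..5.
ref = [10, 12, 50, 52, 54, 56, 20, 22]
cur = [11, 13, 51, 53, 55, 57, 21, 23]

pred = motion_compensated_prediction(ref, 2, 0)      # zero motion vector
residual = [c - p for c, p in zip(cur[2:6], pred)]   # [1, 1, 1, 1]
levels = quantize(residual, step=2)                  # small residual -> all zero
recon = [p + r for p, r in zip(pred, dequantize(levels, step=2))]
```

At this step size the small residual quantizes to zero, so only the coding mode and motion parameters would need to be entropy coded, which mirrors the skipped-block syntax discussed below. The decoder repeats the prediction, dequantization, and addition steps to obtain the same reconstruction.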
In order to specify conformance points facilitating interop-
erability for different application areas, each standard defines
particular profiles. A profile specifies a set of coding tools
that can be employed in generating conforming bitstreams. We
concentrate on the profiles that provide the best coding effi-
ciency for progressive-scanned 8-bit-per-sample video with the
4:2:0 chroma sampling format, as the encoding of interlaced-
scan video, high bit depths, and non-4:2:0 material has not
been a central focus of the HEVC project in developing
the first version of the standard.
A. ITU-T Rec. H.262 | ISO/IEC 13818-2 (MPEG-2 Video)
H.262/MPEG-2 Video [5] was developed as an official joint
project of ITU-T and ISO/IEC JTC 1. It was finalized in 1994
and is still widely used for digital television and the DVD-
Video optical disc format. As with its predecessors
H.261 [14] and MPEG-1 Video [16], each picture of a video
sequence is partitioned into macroblocks (MBs), which consist
of a 16 × 16 luma block and, in the 4:2:0 chroma sampling
format, two associated 8 × 8 chroma blocks. The standard
defines three picture types: I, P, and B pictures. I and P
pictures are always coded in display/output order. In I pictures,
all MBs are coded in intra coding mode, without referencing
other pictures in the video sequence. An MB in a P picture
can be either transmitted in intra or in inter mode. For the
inter mode, the last previously coded I or P picture is used
as reference picture. The displacement of an inter MB in
a P picture relative to the reference picture is specified by
a half-sample precision motion vector. The prediction signal
at half-sample locations is obtained by bilinear interpolation.
In general, the motion vector is differentially coded using
the motion vector of the MB to the left as a predictor.
The standard includes syntax features that allow a partic-
ularly efficient signaling of zero-valued motion vectors. In
H.262/MPEG-2 Video, B pictures have the property that they
are coded after, but displayed before the previously coded
I or P picture. For a B picture, two reference pictures can
be employed: the I/P picture that precedes the B picture
in display order and the I/P picture that succeeds it. When
only one motion vector is used for motion compensation
of an MB, the chosen reference picture is indicated by the
coding mode. B pictures also provide an additional coding
mode, for which the prediction signal is obtained by averaging
prediction signals from both reference pictures. For this mode,
which is referred to as the biprediction or bidirectional predic-
tion mode, two motion vectors are transmitted. Consecutive
runs of inter MBs in B pictures that use the same motion
parameters as the MB to their left and do not include a
prediction error signal can be indicated by a particularly
efficient syntax.
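The biprediction mode amounts to averaging the two motion-compensated prediction signals. The following one-line sketch is illustrative; the rounding offset is an assumption, not the standard's exact formula.

```python
# Biprediction sketch: average the forward prediction (from the
# preceding I/P picture) and the backward prediction (from the
# succeeding I/P picture), sample by sample, with rounding.

def bipredict(fwd_pred, bwd_pred):
    return [(a + b + 1) // 2 for a, b in zip(fwd_pred, bwd_pred)]

fwd = [100, 104, 108, 112]   # block predicted from the preceding picture
bwd = [102, 106, 110, 118]   # block predicted from the succeeding picture
print(bipredict(fwd, bwd))   # [101, 105, 109, 115]
```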
For transform coding of intra MBs and the prediction errors
of inter MBs, a discrete cosine transform (DCT) is applied to
blocks of 8 × 8 samples. The DCT coefficients are quantized
using a scalar quantizer. For intra MBs, the reconstruction
values are uniformly distributed, while for inter MBs, the
distance between zero and the first nonzero reconstruction
values is increased to three halves of the quantization step size.
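The two reconstruction grids can be illustrated as follows. This sketch ignores the quantization weighting matrices, mismatch control, and clipping, and the function names are invented; it only reproduces the spacing described above, with the first nonzero inter reconstruction value at three halves of the step size.

```python
# Reconstruction (dequantization) grids of a scalar quantizer, in
# the spirit of H.262/MPEG-2 (simplified: no weighting matrices,
# no mismatch control, no clipping).

def dequant_intra(level, step):
    """Intra blocks: uniformly spaced reconstruction values."""
    return level * step

def dequant_inter(level, step):
    """Inter blocks: first nonzero reconstruction value sits at
    3/2 of the step size; spacing beyond that remains `step`."""
    if level == 0:
        return 0
    sign = 1 if level > 0 else -1
    return sign * ((2 * abs(level) + 1) * step // 2)

step = 8
print([dequant_intra(l, step) for l in (-2, -1, 0, 1, 2)])
# intra grid: [-16, -8, 0, 8, 16]
print([dequant_inter(l, step) for l in (-2, -1, 0, 1, 2)])
# inter grid: [-20, -12, 0, 12, 20]
```

The widened dead zone around zero for inter blocks reflects the sharply peaked distribution of prediction residuals, for which mapping small values to zero costs little distortion.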
The intra DC coefficients are differentially coded using the
intra DC coefficient of the block to their left (if available) as
their predicted value. For perceptual optimization, the standard
supports the usage of quantization weighting matrices, by
which effectively different quantization step sizes can be used
for different transform coefficient frequencies. The transform
coefficients of a block are scanned in a zigzag manner
and transmitted using 2-D run-level variable-length coding
(VLC). Two VLC tables are specified for quantized transform