Alliance for Open Media
Codec Working Group
Document: CWG-B078[o]_v1
Tool Description for AV1 and libaom
Date: October 4, 2021
Status: Output document
Purpose: Information
Author(s): Xin Zhao, Shan Liu, Adrian Grange, Andrey Norkin
Email(s): xinzzhao@tencent.com, shanl@tencent.com, agrange@google.com,
anorkin@netflix.com
Source: Tencent, Google, Netflix
Abstract
This document provides a description of the main coding features in libaom, a software
implementation of the AV1 standard specification. Both normative decoding processes and some
key encoder algorithms are described in this document.
CONTENTS
Abstract
1 Introduction
2 Abbreviations
3 Tool description
3.1 Block partitioning
3.1.1 Coding block partitioning
3.1.2 Transform block partitioning
3.2 Intra prediction
3.2.1 Directional intra prediction
3.2.2 Non-directional intra prediction
3.2.3 Recursive intra prediction
3.2.4 Chroma from luma prediction
3.2.5 Intra prediction mode signalling
3.3 Inter prediction
3.3.1 Reference frame system
3.3.2 Spatial motion vector prediction
3.3.3 Temporal motion vector prediction
3.3.4 Dynamic motion vector prediction
3.3.5 Inter prediction mode signalling
3.3.6 Translational motion compensation
3.3.7 Warped motion compensation
3.3.8 Overlapped block motion compensation
3.3.9 Compound inter prediction
3.3.10 Compound interintra prediction
3.4 Transform coding
3.4.1 Core transforms
3.4.2 Transform selection and signalling
3.5 Quantization
3.6 Entropy coding
3.6.1 Multisymbol arithmetic coding engine
3.6.2 Coefficient coding
3.7 Loop filtering and post-processing
3.7.1 Deblocking filter
3.7.2 Constrained directional enhancement filter
3.7.3 Loop restoration filter
3.7.4 Frame super-resolution
3.7.5 Film grain synthesis
3.8 Screen content coding
3.8.1 Intra block copy
3.8.2 Palette mode
3.8.3 Encoder content-type detection
4 References
1 Introduction
The Alliance for Open Media Video 1 (AV1) codec follows a hybrid video coding structure built
from a few major function blocks: prediction, transform, quantization, entropy coding, and loop
filtering. Each function block applies a particular class of coding tools to its input, and its
output is either fed into another function block or taken as the final output of the codec. The
function blocks are connected in a specific design and work together to achieve substantial data
compression. The function blocks included in the AV1 reference codec libaom [1] are summarized
as follows and described in detail in Section 3.
• Block partitioning
- Coding block partitioning [2]
- Transform block partitioning [2]
• Intra prediction
- Directional intra prediction [2]
- Non-directional intra prediction [2]
- Recursive intra prediction [2]
- Chroma from luma (CfL) prediction [3]
- Intra prediction mode signalling
• Inter prediction
- Reference frame system [2]
- Spatial motion vector prediction
- Temporal motion vector prediction
- Dynamic motion vector prediction [4]
- Inter prediction mode signalling
- Translational motion compensation
- Warped motion compensation [5]
- Overlapped block motion compensation [6]
- Compound inter prediction
- Compound inter-intra prediction
• Transform coding
- Core transforms
- Transform selection and signalling
• Quantization
• Entropy coding
- Multi-symbol arithmetic coding engine [8]
- Coefficient coding [9]
• Loop filtering and post-processing
- Deblocking filter
- Constrained directional enhancement filter [10]
- Loop restoration filter [11]
- Frame super-resolution
- Film grain synthesis [12]
• Screen content coding
- Intra block copy [13]
- Palette mode
- Encoder content-type detection
The coding features described for each building block are all included in the libaom [1]
implementation of the AV1 codec.
In this document, syntax elements are written using Courier font, e.g., syntax element
base_q_idx.
2 Abbreviations
For the purposes of this document, the following abbreviations apply:
ARF alternate reference frame
AV1 AOMedia Video 1
BV block vector
CDEF constrained directional enhancement filter
CfL chroma from luma
DRL dynamic reference list
EOB end of block
FIR finite impulse response
IntraBC intra block copy
LR loop restoration
LRU loop restoration unit
MV motion vector
OBMC overlapped block motion compensation
SGF self-guided filter
3 Tool description
3.1 Block partitioning
3.1.1 Coding block partitioning
Coding blocks of different sizes are used in AV1. The largest coding blocks, called superblocks,
have sizes of either 128×128 or 64×64, with the default size being 128×128. The size is signalled
in the sequence header. The minimum coding block size is 4×4.
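As a rough illustration of how the superblock size relates to frame dimensions, the sketch below counts the superblocks needed to cover a frame. It assumes that superblocks tile the frame on a regular grid and that partial superblocks are used at the right and bottom edges; the function name `superblock_grid` is hypothetical and is not a libaom identifier.

```python
# Illustrative sketch only; names are not taken from libaom.
SUPERBLOCK_SIZES = (128, 64)  # the two superblock sizes named in the text
MIN_BLOCK_SIZE = 4            # minimum coding block size (4x4)


def superblock_grid(frame_width, frame_height, sb_size=128):
    """Return (columns, rows) of superblocks needed to cover a frame.

    Assumption: superblocks form a regular grid, with partial superblocks
    at the right and bottom frame edges (hence the ceiling division).
    """
    if sb_size not in SUPERBLOCK_SIZES:
        raise ValueError("superblock size must be 128 or 64")
    cols = (frame_width + sb_size - 1) // sb_size
    rows = (frame_height + sb_size - 1) // sb_size
    return cols, rows
```

Under these assumptions, a 1920×1080 frame is covered by 15×9 superblocks of size 128×128, or by 30×17 superblocks of size 64×64.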
Superblocks can be further partitioned into smaller coding blocks. The partitioning strategy is
signalled in the bitstream. Besides the no-partitioning mode, PARTITION_NONE, there are up to
nine supported partitioning modes (see Figure 1). These include three 4-partition modes:
PARTITION_SPLIT, PARTITION_VERT_4, and PARTITION_HORZ_4; four 3-partition (T-shaped)
modes: PARTITION_HORZ_A, PARTITION_HORZ_B, PARTITION_VERT_A, and
PARTITION_VERT_B; and two 2-partition modes: PARTITION_HORZ and PARTITION_VERT.
Figure 1: Coding block partitioning modes
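The partitioning modes listed above can be sketched as sub-block rectangles within a square block. The geometry of the T-shaped modes (which half is split in two) and the ordering of the sub-blocks are assumptions here, based on the mode names rather than on the figure; the function name `sub_blocks` is hypothetical.

```python
# Illustrative sketch only; sub-block ordering and the T-shape layouts
# are assumptions, and the names are not taken from libaom.
def sub_blocks(mode, size):
    """Return a list of (x, y, width, height) sub-blocks of a size x size block."""
    h = size // 2  # half of the block dimension
    q = size // 4  # quarter of the block dimension
    layouts = {
        "PARTITION_NONE": [(0, 0, size, size)],
        # Two-partition modes
        "PARTITION_HORZ": [(0, 0, size, h), (0, h, size, h)],
        "PARTITION_VERT": [(0, 0, h, size), (h, 0, h, size)],
        # Four-partition modes
        "PARTITION_SPLIT": [(0, 0, h, h), (h, 0, h, h), (0, h, h, h), (h, h, h, h)],
        "PARTITION_HORZ_4": [(0, i * q, size, q) for i in range(4)],
        "PARTITION_VERT_4": [(i * q, 0, q, size) for i in range(4)],
        # Three-partition (T-shaped) modes: one half is split into two squares
        "PARTITION_HORZ_A": [(0, 0, h, h), (h, 0, h, h), (0, h, size, h)],
        "PARTITION_HORZ_B": [(0, 0, size, h), (0, h, h, h), (h, h, h, h)],
        "PARTITION_VERT_A": [(0, 0, h, h), (0, h, h, h), (h, 0, h, size)],
        "PARTITION_VERT_B": [(0, 0, h, size), (h, 0, h, h), (h, h, h, h)],
    }
    return layouts[mode]
```

For every mode, the sub-block areas sum to the area of the parent block, which gives a quick sanity check on the layouts.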
Among all the partitioning modes, only PARTITION_SPLIT allows recursive partitioning; that is,
the subpartitions can be further partitioned. For all other partitioning modes, the subpartitions
cannot be further partitioned. Furthermore, PARTITION_VERT_4 and PARTITION_HORZ_4
modes are not allowed for 8×8 or 128×128 block sizes, and T-shaped partitioning modes are not
allowed for 8×8 blocks.
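The restrictions above can be sketched as a small filter over the partitioning modes. This is a minimal sketch covering only the rules stated in this section, for square blocks from 8×8 to 128×128; the function names `allowed_partition_modes` and `can_recurse` are hypothetical.

```python
# Illustrative sketch only; it encodes just the restrictions stated in the
# text, and the names are not taken from libaom.
ALL_MODES = [
    "PARTITION_NONE", "PARTITION_HORZ", "PARTITION_VERT", "PARTITION_SPLIT",
    "PARTITION_HORZ_A", "PARTITION_HORZ_B", "PARTITION_VERT_A", "PARTITION_VERT_B",
    "PARTITION_HORZ_4", "PARTITION_VERT_4",
]
FOUR_WAY_STRIPES = {"PARTITION_HORZ_4", "PARTITION_VERT_4"}
T_SHAPED = {"PARTITION_HORZ_A", "PARTITION_HORZ_B",
            "PARTITION_VERT_A", "PARTITION_VERT_B"}


def allowed_partition_modes(size):
    """Return the modes permitted for a square size x size block (8..128)."""
    if size not in (8, 16, 32, 64, 128):
        raise ValueError("sketch covers square blocks from 8x8 to 128x128")
    excluded = set()
    if size in (8, 128):
        excluded |= FOUR_WAY_STRIPES  # VERT_4 / HORZ_4 not allowed
    if size == 8:
        excluded |= T_SHAPED          # T-shaped modes not allowed
    return [m for m in ALL_MODES if m not in excluded]


def can_recurse(mode):
    """Only PARTITION_SPLIT sub-blocks may be partitioned further."""
    return mode == "PARTITION_SPLIT"
```

With these rules, an 8×8 block is left with four choices (PARTITION_NONE, PARTITION_HORZ, PARTITION_VERT, and PARTITION_SPLIT), while a 128×128 block has eight.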
3.1.2 Transform block partitioning
Both intra and inter coding blocks can be further partitioned into multiple transform blocks, with a
partitioning depth of up to two levels. The transform block size is determined by the transform
partitioning mode and the coding block size. The mapping from the transform size of the current
depth to the transform size of the next depth is shown in Table 1.
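The depth-by-depth mapping can be sketched as a lookup that is applied once per partitioning level. The dictionary below is limited to the square sizes and the 4×8, 8×4, 8×16, and 16×8 entries of Table 1; the names `NEXT_DEPTH_TX` and `tx_size_at_depth` are hypothetical and are not libaom identifiers.

```python
# Illustrative sketch only; covers a subset of the Table 1 rows, and the
# names are not taken from libaom.
NEXT_DEPTH_TX = {
    "TX_4X4": "TX_4X4",
    "TX_8X8": "TX_4X4",
    "TX_16X16": "TX_8X8",
    "TX_32X32": "TX_16X16",
    "TX_64X64": "TX_32X32",
    "TX_4X8": "TX_4X4",
    "TX_8X4": "TX_4X4",
    "TX_8X16": "TX_8X8",
    "TX_16X8": "TX_8X8",
}
MAX_TX_DEPTH = 2  # partitioning depth of up to two levels


def tx_size_at_depth(tx_size, depth):
    """Apply the next-depth mapping `depth` times, starting from tx_size."""
    if not 0 <= depth <= MAX_TX_DEPTH:
        raise ValueError("transform partitioning depth must be 0, 1, or 2")
    for _ in range(depth):
        tx_size = NEXT_DEPTH_TX[tx_size]
    return tx_size
```

For example, starting from TX_64X64, two levels of partitioning lead to TX_16X16, while TX_4X4 maps to itself at every depth.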
Table 1: Transform partitioning size setting
Current depth                   Next depth
Enumerator    Transform size    Enumerator    Transform size
TX_4X4        4×4               TX_4X4        4×4
TX_8X8        8×8               TX_4X4        4×4
TX_16X16      16×16             TX_8X8        8×8
TX_32X32      32×32             TX_16X16      16×16
TX_64X64      64×64             TX_32X32      32×32
TX_4X8        4×8               TX_4X4        4×4
TX_8X4        8×4               TX_4X4        4×4
TX_8X16       8×16              TX_8X8        8×8
TX_16X8       16×8              TX_8X8        8×8