Opus Audio Codec Algorithm

Internet Engineering Task Force (IETF)                         JM. Valin
Request for Comments: 6716                           Mozilla Corporation
Category: Standards Track                                         K. Vos
ISSN: 2070-1721                                  Skype Technologies S.A.
                                                           T. Terriberry
                                                     Mozilla Corporation
                                                          September 2012

Definition of the Opus Audio Codec

Abstract

This document defines the Opus interactive speech and audio codec.
Opus is designed to handle a wide range of interactive audio
applications, including Voice over IP, videoconferencing, in-game
chat, and even live, distributed music performances. It scales from
low bitrate narrowband speech at 6 kbit/s to very high quality stereo
music at 510 kbit/s. Opus uses both Linear Prediction (LP) and the
Modified Discrete Cosine Transform (MDCT) to achieve good compression
of both speech and music.
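
To give a feel for the bitrate range quoted above, the following is a minimal sketch that converts a constant bitrate into a payload size per frame. The function name `bytes_per_frame` is hypothetical, and the 20 ms frame duration used in the usage note is an assumption chosen for illustration; it is not stated in the abstract.

```c
#include <assert.h>
#include <stdint.h>

/* Payload bytes needed to carry one frame at a constant bitrate.
 * bitrate_bps: target bitrate in bits per second.
 * frame_ms:    frame duration in milliseconds (an assumed parameter).
 * The result is rounded down to whole bytes. */
int32_t bytes_per_frame(int32_t bitrate_bps, int32_t frame_ms)
{
    assert(bitrate_bps >= 0 && frame_ms >= 0);
    /* 64-bit intermediate so bitrate * duration cannot overflow. */
    int64_t bits = (int64_t)bitrate_bps * frame_ms / 1000;
    return (int32_t)(bits / 8);
}
```

Under the assumed 20 ms frame, `bytes_per_frame(6000, 20)` gives 15 bytes and `bytes_per_frame(510000, 20)` gives 1275 bytes, so the two ends of the range differ by a factor of 85 in payload size.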

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6716.

Valin, et al. Standards Track [Page 1]

RFC 6716

Interactive Audio Codec September 2012

This document is subject to BCP 78 and the IETF Trust’s Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

The licenses granted by the IETF Trust to this RFC under Section 3.c
of the Trust Legal Provisions shall also include the right to extract
text from Sections 1 through 8 and Appendix A and Appendix B of this
RFC and create derivative works from these extracts, and to copy,
publish, display and distribute such derivative works in any medium
and for any purpose, provided that no such derivative work shall be
presented, displayed or published in a manner that states or implies
that it is part of this RFC or any other IETF Document.

Table of Contents

1. Introduction
   1.1. Notation and Conventions
2. Opus Codec Overview
   2.1. Control Parameters
      2.1.1. Bitrate
      2.1.2. Number of Channels (Mono/Stereo)
      2.1.3. Audio Bandwidth
      2.1.4. Frame Duration
      2.1.5. Complexity
      2.1.6. Packet Loss Resilience
      2.1.7. Forward Error Correction (FEC)
      2.1.8. Constant/Variable Bitrate
      2.1.9. Discontinuous Transmission (DTX)
3. Internal Framing
   3.1. The TOC Byte
   3.2. Frame Packing
      3.2.1. Frame Length Coding
      3.2.2. Code 0: One Frame in the Packet
      3.2.3. Code 1: Two Frames in the Packet, Each with Equal
             Compressed Size
      3.2.4. Code 2: Two Frames in the Packet, with Different
             Compressed Sizes
      3.2.5. Code 3: A Signaled Number of Frames in the Packet
   3.3. Examples
   3.4. Receiving Malformed Packets
4. Opus Decoder
   4.1. Range Decoder
      4.1.1. Range Decoder Initialization
      4.1.2. Decoding Symbols
      4.1.3. Alternate Decoding Methods
      4.1.4. Decoding Raw Bits
      4.1.5. Decoding Uniformly Distributed Integers
      4.1.6. Current Bit Usage
   4.2. SILK Decoder
      4.2.1. SILK Decoder Modules
      4.2.2. LP Layer Organization
      4.2.3. Header Bits
      4.2.4. Per-Frame LBRR Flags
      4.2.5. LBRR Frames
      4.2.6. Regular SILK Frames
      4.2.7. SILK Frame Contents
         4.2.7.1. Stereo Prediction Weights
         4.2.7.2. Mid-Only Flag
         4.2.7.3. Frame Type
         4.2.7.4. Subframe Gains
         4.2.7.5. Normalized Line Spectral Frequency (LSF) and
                  Linear Predictive Coding (LPC) Coefficients
         4.2.7.6. Long-Term Prediction (LTP) Parameters
         4.2.7.7. Linear Congruential Generator (LCG) Seed
         4.2.7.8. Excitation
         4.2.7.9. SILK Frame Reconstruction
      4.2.8. Stereo Unmixing
      4.2.9. Resampling
   4.3. CELT Decoder
      4.3.1. Transient Decoding
      4.3.2. Energy Envelope Decoding
      4.3.3. Bit Allocation
      4.3.4. Shape Decoding
      4.3.5. Anti-collapse Processing
      4.3.6. Denormalization
      4.3.7. Inverse MDCT
   4.4. Packet Loss Concealment (PLC)
      4.4.1. Clock Drift Compensation
   4.5. Configuration Switching
      4.5.1. Transition Side Information (Redundancy)
      4.5.2. State Reset
      4.5.3. Summary of Transitions
5. Opus Encoder
   5.1. Range Encoder
      5.1.1. Encoding Symbols
      5.1.2. Alternate Encoding Methods
      5.1.3. Encoding Raw Bits
      5.1.4. Encoding Uniformly Distributed Integers
      5.1.5. Finalizing the Stream
      5.1.6. Current Bit Usage
   5.2. SILK Encoder
      5.2.1. Sample Rate Conversion
      5.2.2. Stereo Mixing
      5.2.3. SILK Core Encoder
   5.3. CELT Encoder
      5.3.1. Pitch Pre-filter
      5.3.2. Bands and Normalization
      5.3.3. Energy Envelope Quantization
      5.3.4. Bit Allocation
      5.3.5. Stereo Decisions
      5.3.6. Time-Frequency Decision
      5.3.7. Spreading Values Decision
      5.3.8. Spherical Vector Quantization
6. Conformance
   6.1. Testing
   6.2. Opus Custom
7. Security Considerations
8. Acknowledgements
9. References
   9.1. Normative References
   9.2. Informative References
Appendix A. Reference Implementation
   A.1. Extracting the Source
   A.2. Up-to-Date Implementation
   A.3. Base64-Encoded Source Code
   A.4. Test Vectors
Appendix B. Self-Delimiting Framing

1. Introduction

The Opus codec is a real-time interactive audio codec designed to
meet the requirements described in [REQUIREMENTS]. It is composed of
a layer based on Linear Prediction (LP) [LPC] and a layer based on
the Modified Discrete Cosine Transform (MDCT) [MDCT]. The main idea
behind using two layers is as follows: in speech, linear prediction
techniques (such as Code-Excited Linear Prediction, or CELP) code low
frequencies more efficiently than transform (e.g., MDCT) domain
techniques, while the situation is reversed for music and higher
speech frequencies. Thus, a codec with both layers available can
operate over a wider range than either one alone and can achieve
better quality by combining them than by using either one
individually.
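
As a rough illustration of why prediction helps on strongly correlated signals, the sketch below applies a fixed second-order linear predictor and measures the energy left in the prediction residual. This is not the LP layer of Opus: the function names, the fixed integer coefficients, and the test signal are all assumptions made for the example, and a real codec estimates its coefficients per frame.

```c
#include <assert.h>
#include <stddef.h>

/* Second-order linear predictor: pred[n] = a1*x[n-1] + a2*x[n-2].
 * Returns the energy (sum of squares) of the residual x[n] - pred[n]
 * for n >= 2. Coefficients are plain integers to keep the sketch exact. */
long lp_residual_energy(const int *x, size_t len, int a1, int a2)
{
    assert(x != NULL);
    long energy = 0;
    for (size_t n = 2; n < len; n++) {
        long pred = (long)a1 * x[n - 1] + (long)a2 * x[n - 2];
        long err = x[n] - pred;
        energy += err * err;
    }
    return energy;
}

/* Energy of the signal itself over the same range, for comparison. */
long signal_energy(const int *x, size_t len)
{
    assert(x != NULL);
    long energy = 0;
    for (size_t n = 2; n < len; n++)
        energy += (long)x[n] * x[n];
    return energy;
}
```

For a periodic test signal such as `{1000, 1000, 0, -1000, -1000, 0, ...}`, which satisfies x[n] = x[n-1] - x[n-2], the predictor with a1 = 1 and a2 = -1 leaves a residual energy of zero, while the signal energy is large. An encoder that transmits only the residual therefore has far less to code when the predictor fits the signal well.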

The primary normative part of this specification is provided by the
source code in Appendix A. Only the decoder portion of this software
is normative, though a significant amount of code is shared by both
the encoder and decoder. Section 6 provides a decoder conformance
test. The decoder contains a great deal of integer and fixed-point
arithmetic that needs to be performed exactly, including all rounding
considerations, so any useful specification requires domain-specific
symbolic language to adequately define these operations.
Additionally, any conflict between the symbolic representation and
the included reference implementation must be resolved. For the
practical reasons of compatibility and testability, it would be
advantageous to give the reference implementation priority in any
disagreement. The C language is also one of the most widely
understood, human-readable symbolic representations for machine
behavior. For these reasons, this RFC uses the reference
implementation as the sole symbolic representation of the codec.
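
The point about exact rounding can be made concrete with a small sketch. The two functions below both multiply Q15 fixed-point values (integers scaled by 2^15) but differ in how they discard the low bits, and they can return different results for the same inputs. The names `q15_mul_trunc` and `q15_mul_round` are hypothetical and are not taken from the reference implementation; inputs are restricted to non-negative values so that the right shift is fully defined by the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* Q15 multiply that truncates: drops the low 15 bits of the product. */
int16_t q15_mul_trunc(int16_t a, int16_t b)
{
    assert(a >= 0 && b >= 0);
    int32_t prod = (int32_t)a * b;
    return (int16_t)(prod >> 15);
}

/* Q15 multiply that rounds to nearest by adding half an LSB
 * (1 << 14) before dropping the low 15 bits. */
int16_t q15_mul_round(int16_t a, int16_t b)
{
    assert(a >= 0 && b >= 0);
    int32_t prod = (int32_t)a * b;
    return (int16_t)((prod + (1 << 14)) >> 15);
}
```

With a = 16384 (0.5 in Q15) and b = 3, the exact product is 1.5 units of the least significant bit; the truncating version returns 1 and the rounding version returns 2. Two decoders that disagreed on such a detail would drift apart, which is why a bit-exact description is needed.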

While the symbolic representation is unambiguous and complete, it is
not always the easiest way to understand the codec’s operation. For
this reason, this document also describes significant parts of the
codec in prose and takes the opportunity to explain the rationale
behind many of the more surprising elements of the design. These
descriptions are intended to be accurate and informative, but the
limitations of common English sometimes result in ambiguity, so it is
expected that the reader will always read them alongside the symbolic
representation. Numerous references to the implementation are
provided for this purpose. The descriptions sometimes differ from
the reference in ordering or through mathematical simplification
wherever such deviation makes an explanation easier to understand.
For example, the right shift and left shift operations in the
reference implementation are often described using division and
