www.spiritDSP.com Copyright © 2011 SPIRIT
SPIRIT MP3 Decoder
User's Guide
Version 1.3
October 2010
Contents
About This Document
  Decoder Input Format
  Excluded features
  Decoder Output Format
  MP3 Introduction
  MP3 format history
  MP3 bitstream overview
  Frame header format
  Standard Compliance and Testing Procedure
  Library description
  Description of Structures
    TSpiritMP3Decoder
    TSpiritMP3Info
  Description of the Callback Functions
    fnSpiritMP3ReadCallback ()
    fnSpiritMP3ProcessCallback ()
  Library API Description
    SpiritMP3DecoderInit ()
    SpiritMP3Decode ()
  Source Samples
    Sample 1: MP3 to PCM file decoder
  Frequently Asked Questions
About This Document
This document describes the API of the SPIRIT MPEG audio (MP3) decoder. The decoder is available as:
Portable floating-point ANSI C source code;
Portable fixed-point ANSI C source code (full standard compliance);
Portable fixed-point ANSI C source code (limited accuracy);
Assembly optimized library for various DSP and RISC platforms.
Decoder Input Format
The MP3 (MPEG layer 3) audio decoder supports the following formats:
MPEG-1, 2 or 2.5 formats.
Layers 1, 2 and 3.
VBR and Free-Format streams.
Mono or stereo input streams.
Excluded features
By default, the MP3 decoder does not verify the CRC code. Some MP3 files in circulation carry an
incorrectly encoded CRC, and such files could not be decoded if the CRC check were enabled.
This behavior matches that of most PC-based decoders.
Decoder Output Format
Signed 16-bit stereo PCM.
Note that the output audio format does not depend on the input format. In case of mono input, both output
channels will contain identical data. Stereo samples are stored in interleaved order; the left channel goes
first. Note that the size of one output sample is always 32 bits (2 channels * 16 bits).
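The interleaved layout described above can be illustrated with a minimal C sketch. The helper names pcm_bytes and split_channels are hypothetical, introduced here for illustration only; they are not part of the decoder API. The sketch assumes a compiler that provides <stdint.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to hold nSamples output samples.
   One output sample = 2 channels * 16 bits = 4 bytes. */
static size_t pcm_bytes(size_t nSamples)
{
    return nSamples * 2 * sizeof(int16_t);
}

/* Split interleaved output (L0, R0, L1, R1, ...) into two channel buffers.
   Hypothetical helper: the decoder itself always delivers interleaved data. */
static void split_channels(const int16_t *pcm, size_t nSamples,
                           int16_t *left, int16_t *right)
{
    size_t i;
    for (i = 0; i < nSamples; i++) {
        left[i]  = pcm[2 * i];      /* the left channel goes first */
        right[i] = pcm[2 * i + 1];  /* the right channel follows   */
    }
}
```

For a mono input stream, the two buffers produced by split_channels would hold identical data, since the decoder duplicates the single channel into both output channels.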
MP3 Introduction
MP3 refers to the MPEG Layer 3 audio compression scheme, which shrinks audio files with only a small
sacrifice in sound quality. MP3 files can be compressed at different rates, but the more they are shrunk, the
worse the sound quality. A typical MP3 compression ratio is about 10:1, which yields a file of about 3 MB
for a three-minute CD-quality track.
The MPEG audio compression algorithm was developed by the Moving Picture Experts Group (MPEG),
as a part of the International Organization for Standardization (ISO) standard for the high-fidelity
compression of digital audio. The MPEG-1 audio compression standard is one part of a multi-part
standard that addresses the compression of video (ISO/IEC 11172-2), the compression of audio (ISO/IEC
11172-3), and the synchronization of the audio, video, and related data streams (ISO/IEC 11172-1). While
the MPEG audio compression algorithm is lossy, it can often provide "transparent", perceptually lossless,
compression.
The ISO/IEC 11172-3 standard specifies three similar formats, called “layers”, for audio compression.
All layers implement lossy compression of digital audio at sampling rates of 32, 44.1 and 48 kHz using
sub-band coding that takes psychoacoustic principles into account. The main idea of these algorithms is to
shape the quantization noise in the frequency domain so that it becomes undetectable by the average listener.
The next step in the evolution of the standard is the MPEG-2 audio standard (ISO/IEC 13818-3). MPEG-2
extends MPEG-1 to the sampling rates of 16, 22.05 and 24 kHz. MPEG-2 also introduces support for
multi-channel audio; however, this multi-channel extension is not considered in this document.
There is also an unofficial extension of Layer-3 audio to lower sampling frequencies (8, 11.025 and 12 kHz),
called “MPEG 2.5 audio”.
The widespread term “MP3” denotes Layer-3 audio (regardless of MPEG version); however,
“MP3” is sometimes used to denote the whole range of MPEG audio algorithms. In this document,
“MP3” refers to all MPEG audio versions and layers.
MP3 format history
1987 The Fraunhofer Institut in Erlangen, Germany, starts work on perceptual audio coding in
the framework of the EUREKA project EU147, Digital Audio Broadcasting (DAB).
1988 MPEG (Moving Picture Experts Group) is established. It is not an organization in
itself, but a subcommittee of ISO/IEC (International Organization for Standardization/International
Electrotechnical Commission).
1989 The Fraunhofer Institut receives a patent for MP3 in Germany.
1992 Fraunhofer's algorithm is integrated into the emerging MPEG-1 standard.
1993 The MPEG-1 (ISO/IEC 11172-3:1993) standard is published.
1997 From 1997 onwards, the MP3 format is supported by several open-source projects.
1998 The MPEG-2 (ISO/IEC 13818-3:1998) standard is published. The key difference from the MPEG-1
standard is support for lower sampling frequencies (16-24 kHz). The original MPEG-1 supports
frequencies from 32 to 48 kHz.
1998 WinAMP (a free music player for Windows) appears.
1998 One of the first portable MP3 players, Diamond Multimedia's Rio 300, is released.
1999 MP3 music distribution over the Internet becomes very popular. Many sites
emerge that allow free music downloads. The famous Napster (later closed following an RIAA
lawsuit) appears in 1999.
The MP3 format remains the de facto standard for audio compression. The key reasons for its
success are:
MP3 format provides good audio quality at high compression ratios (near-CD-quality for 10x
compression).
MP3 appeared at the “right time”, just as the demand for audio distribution over the Internet
was rising.
Due to its “open” nature, MP3 has support from the open-source community. As a result, numerous
free encoders and players (WinAmp, MusicMatch Jukebox, etc.) are available.
MP3 bitstream overview
The MP3 compressed bit stream can have one of several predefined fixed bit rates. The ranges of
allowable bit rates are shown in Table 1. Depending on the audio sampling rate, this translates to
compression factors ranging from 2.3 to 48. Different parts of the stream may be encoded with different
bit rates, producing a so-called “Variable Bit Rate” (VBR) stream. In addition, the standard provides a
“free-format” bit rate mode to support fixed bit rates other than the predefined rates. The coded bit stream
supports an optional Cyclic Redundancy Check (CRC) error detection code.
MPEG version                      Sampling rates (kHz)   Bitrate, kbps
                                                         Layer-1       Layer-2       Layer-3
MPEG-1 (ISO/IEC 11172-3)          32, 44.1, 48           32-448        32-384        32-320
MPEG-2 (ISO/IEC 13818-3)          16, 22.05, 24          32-256        8-160         8-160
MPEG-2.5 (Unofficial extension)   8, 11.025, 12          Nonexistent   Nonexistent   8-160
Table 1. Bitrates and sampling rates for the MP3 audio
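The compression factors quoted above (2.3 to 48) appear to follow from comparing the bit rate of uncompressed 16-bit stereo PCM with the extreme MPEG-1 bit rates in Table 1. The sketch below shows the arithmetic; the function name compression_factor is hypothetical, and the choice of 16-bit stereo PCM as the reference is an assumption, not something stated in this guide.

```c
/* Ratio between the uncompressed PCM bit rate and the MP3 bit rate.
   Assumes 16-bit PCM as the uncompressed reference (an assumption). */
static double compression_factor(unsigned long sample_rate_hz,
                                 unsigned channels,
                                 unsigned bitrate_kbps)
{
    double pcm_kbps = (double)sample_rate_hz * 16.0 * channels / 1000.0;
    return pcm_kbps / bitrate_kbps;
}
```

Under these assumptions, 48 kHz stereo at 32 kbps gives 1536 / 32 = 48, and 32 kHz stereo at 448 kbps gives 1024 / 448, or about 2.3.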
The MP3 compressor encodes audio in fixed-size blocks. Each block, called a “frame”, encodes a
predefined number of PCM samples (see Table 2). Decoding must start from the beginning of a frame;
however, it is possible to decode only a part of a frame. The minimum decodable unit size is shown in
Table 2.
MPEG version                      Frame size/minimum decodable unit, samples
                                  Layer-1       Layer-2       Layer-3
MPEG-1 (ISO/IEC 11172-3)          384/32        1152/96       1152/576
MPEG-2 (ISO/IEC 13818-3)          384/32        1152/96       576/576
MPEG-2.5 (Unofficial extension)   Nonexistent   Nonexistent   576/576
Table 2. Frame sizes for the MP3 audio
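The frame sizes in Table 2 can be captured in a small lookup, which also gives the playback duration of one frame. This is an illustrative sketch; the names MpegVersion, samples_per_frame and frame_duration_ms are hypothetical and are not part of the decoder API.

```c
/* MPEG audio versions covered by Table 2 (illustrative names). */
typedef enum { MPEG_V1, MPEG_V2, MPEG_V2_5 } MpegVersion;

/* Number of PCM samples encoded by one frame, per Table 2.
   Returns 0 for combinations that do not exist. */
static unsigned samples_per_frame(MpegVersion version, int layer)
{
    switch (layer) {
    case 1:  return (version == MPEG_V2_5) ? 0 : 384;
    case 2:  return (version == MPEG_V2_5) ? 0 : 1152;
    case 3:  return (version == MPEG_V1) ? 1152 : 576;
    default: return 0;
    }
}

/* Duration of one frame in milliseconds at the given sampling rate. */
static double frame_duration_ms(MpegVersion version, int layer,
                                unsigned long sample_rate_hz)
{
    return 1000.0 * samples_per_frame(version, layer) / sample_rate_hz;
}
```

For example, an MPEG-1 Layer-3 frame at 44.1 kHz covers 1152 / 44100 s, or roughly 26 ms of audio.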
The frame format is shown in Figure 1. The frame starts with a 32-bit frame header, which is used for
stream synchronization and encodes basic audio information, such as sample rate, number of channels,
bitrate, etc. The 16-bit CRC field is optional. Note that the CRC is calculated not over the entire frame, but
only over the last 16 bits of the frame header and the side info.
Layer-1:
  Header (32 bits) | CRC (0-16 bits) | Bit Allocation (128-256 bits) | Scalefactors (0-384 bits) |
  Audio samples: 12 granules * 32 samples = 384 | Ancillary data
Layer-2:
  Header (32 bits) | CRC (0-16 bits) | Bit Allocation (26-188 bits) | SCFSI (0-60 bits) |
  Scalefactors (0-1080 bits) | Audio samples: 12 granules * 96 samples = 1152 | Ancillary data
Layer-3:
  Header (32 bits) | CRC (0-16 bits) | Side info (72-256 bits) |
  Scalefactors and Huffman-encoded data (“part 2&3”): 1 or 2 granules * 576 samples
  (0-4095 bits per granule) | Ancillary data
Figure 1. MP3 frame format.
There is no main file header in MPEG audio files. An MPEG audio file is built up from a sequence of
successive frames. An MPEG stream cannot have any “gaps” between the frames. The frame size can be
determined from the information contained in the frame header, so it is possible to determine the position of
the next frame header if the position of the previous one is known. The MPEG audio stream can contain
ancillary data, which are used to maintain the specified bit rate (i.e. frame size) when the bit reservoir cannot
be increased or is not used.
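The relation between header fields and frame size can be sketched as follows. This is the commonly used frame-length formula for MPEG audio (bytes = samples per frame / 8 * bitrate / sampling rate, plus an optional padding slot), written here as an illustration rather than as the decoder's own implementation. The function name frame_size_bytes is hypothetical, and the bitrate, sampling rate and padding flag are assumed to have been extracted from the frame header already.

```c
/* Frame size in bytes for a fixed-bitrate frame.
   is_mpeg1: nonzero for MPEG-1, zero for MPEG-2 and MPEG-2.5.
   layer:    1, 2 or 3.
   padding:  value of the header padding bit (0 or 1).
   Not applicable to free-format frames, whose bitrate is not in the header. */
static unsigned long frame_size_bytes(int is_mpeg1, int layer,
                                      unsigned long bitrate_bps,
                                      unsigned long sample_rate_hz,
                                      unsigned padding)
{
    unsigned long coef;

    if (layer == 1) {
        /* 384 samples per frame; Layer-1 slots are 4 bytes long. */
        return (12UL * bitrate_bps / sample_rate_hz + padding) * 4UL;
    }
    /* 1152 samples per frame -> 144; 576 samples per frame -> 72. */
    coef = (layer == 3 && !is_mpeg1) ? 72UL : 144UL;
    return coef * bitrate_bps / sample_rate_hz + padding;
}
```

For example, an MPEG-1 Layer-3 frame at 128 kbps and 44.1 kHz occupies 417 bytes, or 418 bytes when the padding bit is set. Adding this size to the position of the current header gives the position of the next one.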
In the Variable Bit Rate (VBR) stream, the frame size can differ across the stream. There are no special
“flags” etc. to figure out whether the MPEG audio stream is VBR or not. The decoder should always assume
that frame sizes may differ from frame to frame.
A rarely used feature of MPEG audio is the “free-format” mode. In this mode the frames in the stream
can have an arbitrary bitrate (possibly different from the fixed set of bitrates specified in the standard). The
free-format frame size cannot be calculated from the information contained in the frame header. To
support the free-format mode, the decoder must determine the free-format frame size by searching for
several consecutive frame headers. VBR is not allowed in free-format mode, and free-format
frames cannot be mixed with fixed-bitrate frames.
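The header search described above can be sketched as follows: starting at a free-format header, scan forward for the next header with the same version, layer, bitrate index and sampling rate, and take the distance as the frame size. This is a simplified illustration, not the decoder's actual algorithm. It assumes the standard MPEG audio header layout (sync bits, version and layer in the first two bytes; bitrate index, sampling-rate index and padding bit in the third byte), and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if h points at a plausible free-format header (bitrate index 0). */
static int is_free_format_header(const uint8_t *h)
{
    return h[0] == 0xFF &&
           (h[1] & 0xE0) == 0xE0 &&  /* remaining sync bits          */
           (h[1] & 0x18) != 0x08 &&  /* version is not reserved      */
           (h[1] & 0x06) != 0x00 &&  /* layer is not reserved        */
           (h[2] & 0xF0) == 0x00 &&  /* bitrate index 0: free format */
           (h[2] & 0x0C) != 0x0C;    /* sampling rate not reserved   */
}

/* Nonzero if b matches a in sync, version, layer, bitrate index and
   sampling rate. Protection, padding and private bits may differ. */
static int same_stream_header(const uint8_t *a, const uint8_t *b)
{
    return b[0] == 0xFF &&
           ((a[1] ^ b[1]) & 0xFE) == 0 &&
           ((a[2] ^ b[2]) & 0xFC) == 0;
}

/* Distance in bytes from the header at buf[0] to the next matching header,
   or 0 if buf does not start with a free-format header or none is found. */
static size_t free_format_frame_size(const uint8_t *buf, size_t len)
{
    size_t pos;

    if (len < 4 || !is_free_format_header(buf))
        return 0;
    for (pos = 4; pos + 4 <= len; pos++) {
        if (same_stream_header(buf, buf + pos))
            return pos;
    }
    return 0;
}
```

A single match can be a false sync inside the audio payload, and the distance includes the padding slot of the first frame when its padding bit is set. A robust decoder would therefore confirm the candidate size against several consecutive headers before trusting it.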
In the case of Layer I or Layer II, frames are totally independent entities: each frame contains all the
information required for its decoding, so the decoding process can start right after the first frame header is found in the
stream. However, in case of Layer III, frames are not always independent. The structure of the Layer-3 bit