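SoX is an open-source Linux audio tool that many audio editing applications build on. It is the best-known open-source converter for sound file formats and has been ported to many operating systems, including DOS, Windows, OS/2, Sun, NeXT, Unix and Linux. The SoX project was founded by Lance Norskog and later refined by many developers; it now supports a large number of sound file formats and processing effects, covering essentially all common formats. More usefully, SoX can filter audio and convert sample rates, which helps those who develop or maintain telephony voice platforms. SoX also includes some DSP algorithms that interested readers can download and study. SoX may be used for any purpose, but source distributions must include the copyright notice and binary distributions must credit the authors.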
SoX(1) Sound eXchange SoX(1)
NAME
SoX − Sound eXchange, the Swiss Army knife of audio manipulation
SYNOPSIS
sox [global-options] [format-options] infile1
[[format-options] infile2]... [format-options] outfile
[effect [effect-options]] ...
play [global-options] [format-options] infile1
[[format-options] infile2]... [format-options]
[effect [effect-options]] ...
rec [global-options] [format-options] outfile
[effect [effect-options]] ...
DESCRIPTION
Introduction
SoX reads and writes audio files in most popular formats and can optionally apply effects to them. It can
combine multiple input sources, synthesise audio, and, on many systems, act as a general purpose audio
player or a multi-track audio recorder. It also has limited ability to split the input into multiple output files.
All SoX functionality is available using just the sox command. To simplify playing and recording audio, if
SoX is invoked as play, the output file is automatically set to be the default sound device, and if invoked as
rec, the default sound device is used as an input source. Additionally, the soxi(1) command provides a
convenient way to just query audio file header information.
The heart of SoX is a library called libSoX. Those interested in extending SoX or using it in other
programs should refer to the libSoX manual page: libsox(3).
SoX is a command-line audio processing tool, particularly suited to making quick, simple edits and to batch
processing. If you need an interactive, graphical audio editor, use audacity(1).
***
The overall SoX processing chain can be summarised as follows:
Input(s) → Combiner → Effects → Output(s)
Note however, that on the SoX command line, the positions of the Output(s) and the Effects are swapped
w.r.t. the logical flow just shown. Note also that whilst options pertaining to files are placed before their
respective file name, the opposite is true for effects. To show how this works in practice, here is a selection
of examples of how SoX might be used. The simple
sox recital.au recital.wav
translates an audio file in Sun AU format to a Microsoft WAV file, whilst
sox recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm
performs the same format translation, but also applies four effects (down-mix to one channel, sample rate
change, fade-in, normalise), and stores the result at a bit-depth of 16.
sox -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav
converts ‘raw’ (a.k.a. ‘headerless’) audio to a self-describing file format,
sox slow.aiff fixed.aiff speed 1.027
adjusts audio speed,
sox short.wav long.wav longer.wav
concatenates two audio files, and
sox -m music.mp3 voice.wav mixed.flac
mixes together two audio files.
play "The Moonbeams/Greatest/*.ogg" bass +3
plays a collection of audio files whilst applying a bass boosting effect,
play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 1 0.1
plays a synthesised ‘A minor seventh’ chord with a pipe-organ sound,
rec -c 2 radio.aiff trim 0 30:00
records half an hour of stereo audio, and
play -q take1.aiff & rec -M take1.aiff take1-dub.aiff
(with POSIX shell and where supported by hardware) records a new track in a multi-track recording.
Finally,
rec -r 44100 -b 16 -e signed-integer -p \
silence 1 0.50 0.1% 1 10:00 0.1% | \
sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
newfile : restart
records a stream of audio such as LP/cassette and splits it into multiple audio files at points with 2 seconds of
silence. Also, it does not start recording until it detects that audio is playing, and it stops after it sees 10 minutes
of silence.
N.B. The above is just an overview of SoX’s capabilities; detailed explanations of how to use all SoX
parameters, file formats, and effects can be found below in this manual, in soxformat(7), and in soxi(1).
File Format Types
SoX can work with ‘self-describing’ and ‘raw’ audio files. ‘self-describing’ formats (e.g. WAV, FLAC,
MP3) have a header that completely describes the signal and encoding attributes of the audio data that
follows. ‘raw’ or ‘headerless’ formats do not contain this information, so the audio characteristics of these
must be described on the SoX command line or inferred from those of the input file.
The following four characteristics are used to describe the format of audio data such that it can be pro-
cessed with SoX:
sample rate
The sample rate in samples per second (‘Hertz’ or ‘Hz’). Digital telephony traditionally uses a
sample rate of 8000 Hz (8 kHz), though these days, 16 and even 32 kHz are becoming more
common. Audio Compact Discs use 44100 Hz (44.1 kHz). Digital Audio Tape and many computer
systems use 48 kHz. Professional audio systems often use 96 kHz.
sample size
The number of bits used to store each sample. Today, 16-bit is commonly used. 8-bit was popular
in the early days of computer audio. 24-bit is used in the professional audio arena. Other sizes are
also used.
data encoding
The way in which each audio sample is represented (or ‘encoded’). Some encodings have variants
with different byte-orderings or bit-orderings. Some compress the audio data so that the stored
audio data takes up less space (i.e. disk space or transmission bandwidth) than the other format
parameters and the number of samples would imply. Commonly-used encoding types include
floating-point, µ-law, ADPCM, signed-integer PCM, MP3, and FLAC.
channels
The number of audio channels contained in the file. One (‘mono’) and two (‘stereo’) are widely
used. ‘Surround sound’ audio typically contains six or more channels.
The term ‘bit-rate’ is a measure of the amount of storage occupied by an encoded audio signal over a unit
of time. It can depend on all of the above and is typically denoted as a number of kilo-bits per second
(kbps). An A-law telephony signal has a bit-rate of 64 kbps. MP3-encoded stereo music typically has a
bit-rate of 128-196 kbps. FLAC-encoded stereo music typically has a bit-rate of 550-760 kbps.
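As a worked illustration of the 64 kbps figure, the bit-rate of audio stored at a fixed number of bits per sample is simply sample rate × sample size × channels. The sketch below is not part of SoX; the function name pcm_bit_rate_kbps is made up for this example, and it assumes the traditional 8000 Hz, 8-bit, mono telephony parameters mentioned above. It does not apply to variable-rate compressed encodings such as MP3 or FLAC.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bit-rate in kilobits per second for audio stored at a fixed sample size."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# A-law telephony: 8000 Hz, 8 bits per sample, one channel
print(pcm_bit_rate_kbps(8000, 8, 1))    # 64.0

# Audio CD: 44100 Hz, 16 bits per sample, two channels
print(pcm_bit_rate_kbps(44100, 16, 2))  # 1411.2
```

Comparing the CD figure with the typical FLAC and MP3 bit-rates quoted above gives a rough sense of how much each encoding compresses the data.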
Most self-describing formats also allow textual ‘comments’ to be embedded in the file that can be used to
describe the audio in some way, e.g. for music, the title, the author, etc.
One important use of audio file comments is to convey ‘Replay Gain’ information. SoX supports applying
Replay Gain information (for certain input file formats only; currently, at least FLAC and Ogg Vorbis), but
not generating it. Note that by default, SoX copies input file comments to output files that support com-
ments, so output files may contain Replay Gain information if some was present in the input file. In this
case, if anything other than a simple format conversion was performed then the output file Replay Gain
information is likely to be incorrect and so should be recalculated using a tool that supports this (not SoX).
The soxi(1) command can be used to display information from audio file headers.
Determining & Setting The File Format
There are several mechanisms available for SoX to use to determine or set the format characteristics of an
audio file. Depending on the circumstances, individual characteristics may be determined or set using dif-
ferent mechanisms.
To determine the format of an input file, SoX will use, in order of precedence and as given or available:
1. Command-line format options.
2. The contents of the file header.
3. The filename extension.
To set the output file format, SoX will use, in order of precedence and as given or available:
1. Command-line format options.
2. The filename extension.
3. The input file format characteristics, or the closest that is supported by the output file type.
For all files, SoX will exit with an error if the file type cannot be determined. Command-line format options
may need to be added or changed to resolve the problem.
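The input-file precedence rule above can be pictured as a "first available source wins" lookup. The following is only a conceptual sketch, not SoX's actual implementation: the function resolve_input_type and its three parameters are hypothetical, and it resolves a single value, whereas SoX may settle each format characteristic through a different mechanism.

```python
from typing import Optional


def resolve_input_type(cli_type: Optional[str],
                       header_type: Optional[str],
                       extension: Optional[str]) -> str:
    """Return the first available source, in the documented order of precedence."""
    # 1. command-line format option, 2. file header, 3. filename extension
    for candidate in (cli_type, header_type, extension):
        if candidate:
            return candidate
    # Mirrors the documented behaviour of exiting with an error.
    raise ValueError("file type cannot be determined; add a format option")


print(resolve_input_type(None, "wav", "dat"))  # wav
print(resolve_input_type("raw", None, "wav"))  # raw
```

The second call shows why a command-line option is the usual fix when detection fails: it overrides whatever the header or extension would otherwise suggest.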
Playing & Recording Audio
The play and rec commands are provided so that basic playing and recording is as simple as
play existing-file.wav
and
rec new-file.wav
These two commands are functionally equivalent to
sox existing-file.wav -d
and
sox -d new-file.wav
Of course, further options and effects (as described below) can be added to the commands in either form.
***
Some systems provide more than one type of (SoX-compatible) audio driver, e.g. ALSA & OSS, or
SUNAU & AO. Systems can also have more than one audio device (a.k.a. ‘sound card’). If more than one
audio driver has been built-in to SoX, and the default selected by SoX when recording or playing is not the
one that is wanted, then the AUDIODRIVER environment variable can be used to override the default.
For example (on many systems):
set AUDIODRIVER=oss
play ...
The AUDIODEV environment variable can be used to override the default audio device, e.g.
set AUDIODEV=/dev/dsp2
play ...
sox ... -t oss
or
set AUDIODEV=hw:soundwave,1,2
play ...
sox ... -t alsa
Note that the way of setting environment variables varies from system to system—for some specific exam-
ples, see ‘SOX_OPTS’ below.
When playing a file with a sample rate that is not supported by the audio output device, SoX will automati-
cally invoke the rate effect to perform the necessary sample rate conversion. For compatibility with old
hardware, the default rate quality level is set to ‘low’. This can be changed by explicitly specifying the rate
effect with a different quality level, e.g.
play ... rate -m
or by using the --play-rate-arg option (see below).
***
On some systems, SoX allows audio playback volume to be adjusted whilst using play. Where supported,
this is achieved by tapping the ‘v’ & ‘V’ keys during playback.
To help with setting a suitable recording level, SoX includes a peak-level meter which can be invoked
(before making the actual recording) as follows:
rec -n
The recording level should be adjusted (using the system-provided mixer program, not SoX) so that the
meter is at most occasionally full scale, and never ‘in the red’ (an exclamation mark is shown). See also -S
below.
Accuracy
Many file formats that compress audio discard some of the audio signal information whilst doing so.
Converting to such a format and then converting back again will not produce an exact copy of the original
audio. This is the case for many formats used in telephony (e.g. A-law, GSM) where low signal bandwidth
is more important than high audio fidelity, and for many formats used in portable music players (e.g. MP3,
Vorbis) where adequate fidelity can be retained even with the large compression ratios that are needed to
make portable players practical.
Formats that discard audio signal information are called ‘lossy’. Formats that do not are called ‘lossless’.
The term ‘quality’ is used as a measure of how closely the original audio signal can be reproduced when
using a lossy format.
Audio file conversion with SoX is lossless when it can be, i.e. when not using lossy compression, when not
reducing the sampling rate or number of channels, and when the number of bits used in the destination
format is not less than in the source format. E.g. converting from an 8-bit PCM format to a 16-bit PCM
format is lossless but converting from an 8-bit PCM format to (8-bit) A-law isn’t.
N.B. SoX converts all audio files to an internal uncompressed format before performing any audio
processing. This means that manipulating a file that is stored in a lossy format can cause further losses in audio
fidelity. E.g. with
sox long.mp3 short.mp3 trim 10
SoX first decompresses the input MP3 file, then applies the trim effect, and finally creates the output MP3
file by re-compressing the audio—with a possible reduction in fidelity above that which occurred when the
input file was created. Hence, if what is ultimately desired is lossily compressed audio, it is highly recom-
mended to perform all audio processing using lossless file formats and then convert to the lossy format only
at the final stage.
N.B. Applying multiple effects with a single SoX invocation will, in general, produce more accurate results
than those produced using multiple SoX invocations.
Dithering
Dithering is a technique used to maximise the dynamic range of audio stored at a particular bit-depth. Any
distortion introduced by quantisation is decorrelated by adding a small amount of white noise to the signal.
In most cases, SoX can determine whether the selected processing requires dither and will add it during
output formatting if appropriate.
Specifically, by default, SoX automatically adds TPDF dither when the output bit-depth is less than 24 and
any of the following are true:
• bit-depth reduction has been specified explicitly using a command-line option
• the output file format supports only bit-depths lower than that of the input file format
• an effect has increased effective bit-depth within the internal processing chain
For example, adjusting volume with vol 0.25 requires two additional bits in which to losslessly store its
results (since 0.25 decimal equals 0.01 binary). So if the input file bit-depth is 16, then SoX’s internal
representation will utilise 18 bits after processing this volume change. In order to store the output at the same
depth as the input, dithering is used to remove the additional bits.
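The "two additional bits" arithmetic can be checked with a short calculation: a gain with a finite binary expansion needs as many extra bits as it has binary digits after the point. This is an illustrative sketch only, not how SoX tracks precision internally; the helper extra_bits_for_gain is a made-up name, and it takes the gain as a decimal string so that the value is interpreted exactly.

```python
from fractions import Fraction


def extra_bits_for_gain(gain: str) -> int:
    """Binary digits after the point needed to represent `gain` exactly.

    Raises ValueError when the gain has no finite binary expansion.
    """
    denominator = Fraction(gain).denominator
    # A finite binary expansion exists only if the denominator is a power of two.
    if denominator & (denominator - 1):
        raise ValueError(f"{gain} has no finite binary representation")
    return denominator.bit_length() - 1


print(extra_bits_for_gain("0.25"))       # 2, since 0.25 decimal = 0.01 binary
print(16 + extra_bits_for_gain("0.25"))  # 18 bits after vol 0.25 on 16-bit input
```

A gain such as 0.1 has no finite binary expansion, so this helper raises an error for it rather than returning a bit count.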
Use the -V option to see what processing SoX has automatically added. The -D option may be given to
override automatic dithering. To invoke dithering manually (e.g. to select a noise-shaping curve), see the
dither effect.
Clipping
Clipping is distortion that occurs when an audio signal level (or ‘volume’) exceeds the range of the chosen
representation. In most cases, clipping is undesirable and so should be corrected by adjusting the level
prior to the point (in the processing chain) at which it occurs.
In SoX, clipping could occur, as you might expect, when using the vol or gain effects to increase the audio
volume. Clipping could also occur with many other effects, when converting one format to another, and
even when simply playing the audio.
Playing an audio file often involves resampling, and processing by analogue components can introduce a
small DC offset and/or amplification, all of which can produce distortion if the audio signal level was
initially too close to the clipping point.
For these reasons, it is usual to make sure that an audio file’s signal level has some ‘headroom’, i.e. it does
not exceed a particular level below the maximum possible level for the given representation. Some
standards bodies recommend as much as 9 dB headroom, but in most cases, 3 dB (≈ 70% linear) is enough. Note
that this wisdom seems to have been lost in modern music production; in fact, many CDs, MP3s, etc. are
now mastered at levels above 0 dBFS i.e. the audio is clipped as delivered.
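The "3 dB ≈ 70% linear" figure follows from the standard amplitude relation: linear = 10^(dB/20). A minimal sketch of that conversion is shown below; db_to_linear is a name chosen for this example and is not a SoX function.

```python
def db_to_linear(db: float) -> float:
    """Convert an amplitude level in decibels to a linear ratio of full scale."""
    return 10 ** (db / 20)


print(round(db_to_linear(-3), 3))  # 0.708, i.e. about 70% of full scale
print(round(db_to_linear(-9), 3))  # 0.355
```

So a peak level held 3 dB below full scale leaves roughly 30% of the linear range unused as a safety margin, and 9 dB of headroom leaves about 65%.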
SoX’s stat and stats effects can assist in determining the signal level in an audio file. The gain or vol effect
can be used to prevent clipping, e.g.
sox dull.wav bright.wav gain -6 treble +6
guarantees that the treble boost will not clip.
If clipping occurs at any point during processing, SoX will display a warning message to that effect.
See also -G and the gain and norm effects.
Input File Combining
SoX’s input combiner can be configured (see OPTIONS below) to combine multiple files using any of the
following methods: ‘concatenate’, ‘sequence’, ‘mix’, ‘mix-power’, ‘merge’, or ‘multiply’. The default
method is ‘sequence’ for play, and ‘concatenate’ for rec and sox.
For all methods other than ‘sequence’, multiple input files must have the same sampling rate. If necessary,
separate SoX invocations can be used to make sampling rate adjustments prior to combining.
If the ‘concatenate’ combining method is selected (usually, this will be by default) then the input files must
also have the same number of channels. The audio from each input will be concatenated in the order given
to form the output file.
The ‘sequence’ combining method is selected automatically for play. It is similar to ‘concatenate’ in that
the audio from each input file is sent serially to the output file. However, here the output file may be closed
and reopened at the corresponding transition between input files. This may be just what is needed when
sending different types of audio to an output device, but is not generally useful when the output is a normal
file.
If either the ‘mix’ or ‘mix-power’ combining method is selected then two or more input files must be given
and will be mixed together to form the output file. The number of channels in each input file need not be
the same, but SoX will issue a warning if they are not and some channels in the output file will not contain
audio from every input file. A mixed audio file cannot be un-mixed without reference to the original input
files.
If the ‘merge’ combining method is selected then two or more input files must be given and will be merged
together to form the output file. The number of channels in each input file need not be the same. A merged
audio file comprises all of the channels from all of the input files. Un-merging is possible using multiple
invocations of SoX with the remix effect. For example, two mono files could be merged to form one stereo
file. The first and second mono files would become the left and right channels of the stereo file.
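To make the two-mono-files example concrete, merging can be thought of as pairing up the samples of each input so that every output frame carries one sample per input channel. The sketch below is a conceptual model only and says nothing about how SoX implements merging; merge_mono is a hypothetical helper, and requiring equal-length inputs is a simplification made for this example.

```python
def merge_mono(left, right):
    """Pair two mono sample sequences into stereo (left, right) frames."""
    if len(left) != len(right):
        # Simplification for this sketch: inputs must be the same length.
        raise ValueError("inputs must contain the same number of samples")
    return list(zip(left, right))


first = [10, 20, 30]   # samples from the first mono file
second = [-1, -2, -3]  # samples from the second mono file

stereo = merge_mono(first, second)
print(stereo)  # [(10, -1), (20, -2), (30, -3)]

# Un-merging recovers the original channels unchanged.
print([frame[0] for frame in stereo])  # [10, 20, 30]
```

Because each channel survives intact, merging is reversible, whereas mixing sums the inputs into shared channels and cannot be undone without the originals.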