# voixen-vad
WebRTC-based Voice Activity Detection library
Voice Activity Detection based on the method used in the upcoming [WebRTC](http://www.webrtc.org) HTML5 standard.
Extracted from [Chromium](https://chromium.googlesource.com/external/webrtc/+/branch-heads/43/webrtc/common_audio/vad/) for
stand-alone use as a library.
The sample MPEG audio decoder is a stripped-down libmpg123 from [mpg123](http://www.mpg123.de).
The Node.js bindings provide a simple way to do VAD on PCM audio input. Input data needs to be constant bitrate normalised
float (-1..+1) PCM audio samples. Detection results are returned using an async callback and additionally via events.
Supported sample rates are:
- 8000Hz*
- 16000Hz*
- 32000Hz
- 48000Hz
*recommended sample rate for best performance/accuracy tradeoff
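
Many audio sources deliver signed 16-bit integer PCM rather than normalised floats. The following is a minimal sketch of one way to convert such data into the float format described above. `int16ToFloat32Buffer` is a hypothetical helper and not part of this library, and the little-endian byte order is an assumption about the input and the host platform.

```javascript
// Convert signed 16-bit little-endian PCM to normalised 32-bit floats.
// `int16ToFloat32Buffer` is a hypothetical helper, not part of this library.
function int16ToFloat32Buffer(pcm16) {
  var sampleCount = Math.floor(pcm16.length / 2)
  var out = Buffer.alloc(sampleCount * 4)
  for (var i = 0; i < sampleCount; i++) {
    // Dividing by 32768 maps -32768..32767 into the range -1..+1
    out.writeFloatLE(pcm16.readInt16LE(i * 2) / 32768, i * 4)
  }
  return out
}

// Three samples: silence, half scale, negative full scale
var pcm16 = Buffer.alloc(6)
pcm16.writeInt16LE(0, 0)
pcm16.writeInt16LE(16384, 2)
pcm16.writeInt16LE(-32768, 4)
var floats = int16ToFloat32Buffer(pcm16)
console.log(floats.readFloatLE(4)) // 0.5
```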
## Installation
## API
```javascript
var VAD = require('vad').VAD
```
### VAD(mode)
Create a new `VAD` object using the given mode. The `mode` parameter is optional.
#### .processAudio(samples, samplerate, callback)
Analyse the given samples (`Buffer` object containing normalised 32bit float values) and report the detection
result via `callback` and as an event.
#### .on(event, callback)
Subscribe to an event emitted by the VAD instance after detection. The event data provided to the callback is a number that
represents the voice detection result.
Supported event names are:
- 'event': VAD processing finished successfully or with an error
- 'voice': Human speech was detected
- 'silence': Silence/non-speech was detected
- 'noise': [not implemented yet]
- 'error': an error occurred during detection
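
As a rough illustration of the event names listed above, the sketch below counts detection results by subscribing to 'voice', 'silence' and 'error'. `countDetections` is a hypothetical helper, and a plain Node.js `EventEmitter` stands in for the VAD instance so the snippet runs without the native module. It assumes only that the instance exposes `.on(event, callback)` as documented here.

```javascript
var EventEmitter = require('events').EventEmitter

// Attach counters to any object exposing `.on(event, callback)`.
// `countDetections` is a hypothetical helper, not part of this library.
function countDetections(vad) {
  var counts = { voice: 0, silence: 0, error: 0 }
  vad.on('voice', function() { counts.voice++ })
  vad.on('silence', function() { counts.silence++ })
  vad.on('error', function() { counts.error++ })
  return counts
}

// Stand-in emitter for demonstration only; with the real library
// you would pass a `new VAD()` instance instead.
var fakeVad = new EventEmitter()
var counts = countDetections(fakeVad)
fakeVad.emit('voice')
fakeVad.emit('silence')
fakeVad.emit('voice')
console.log(counts)
```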
### Event codes
Event codes are passed to the `processAudio` callback and to event handlers subscribed to the general
'event'-event.
#### VAD.EVENT_ERROR
Constant for voice detection errors. Passed to 'error' event handlers.
#### VAD.EVENT_SILENCE
Constant for voice detection results with no detected voices.
Passed to 'silence' event handlers.
#### VAD.EVENT_VOICE
Constant for voice detection results with detected voice.
Passed to 'voice' event handlers.
#### VAD.EVENT_NOISE
Constant for voice detection results with detected noise.
Not implemented yet
### Available VAD Modes
These constants can be used as the `mode` parameter of the `VAD` constructor to
configure the VAD algorithm.
#### VAD.MODE_NORMAL
Constant for normal voice detection mode. Suitable for high bitrate, low-noise data.
May classify noise as voice, too. The default value if `mode` is omitted in the constructor.
#### VAD.MODE_LOW_BITRATE
Detection mode optimised for low-bitrate audio.
#### VAD.MODE_AGGRESSIVE
Detection mode best suited for somewhat noisy, lower quality audio.
#### VAD.MODE_VERY_AGGRESSIVE
Detection mode with lowest miss-rate. Works well for most inputs.
### toFloatArray(buffer)
Utility function that converts a `Buffer` object to a `TypedArray` of type `Float32Array`.
Works with node <0.12 as well as recent versions. Introduced as a node version-agnostic shim.
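
To make the conversion concrete, here is a minimal sketch of a `Buffer` to `Float32Array` copy. `bufferToFloat32Array` is a hypothetical stand-in written for illustration. The library's own `toFloatArray` may be implemented differently, and the little-endian byte order is an assumption.

```javascript
// Copy a Buffer of little-endian 32-bit floats into a Float32Array.
// Hypothetical stand-in for illustration, not the library's `toFloatArray`.
function bufferToFloat32Array(buffer) {
  var sampleCount = Math.floor(buffer.length / 4)
  var result = new Float32Array(sampleCount)
  for (var i = 0; i < sampleCount; i++) {
    result[i] = buffer.readFloatLE(i * 4)
  }
  return result
}

var buf = Buffer.alloc(8)
buf.writeFloatLE(0.25, 0)
buf.writeFloatLE(-0.5, 4)
var samples = bufferToFloat32Array(buf)
console.log(samples.length) // 2
```

Copying sample by sample avoids alignment concerns that can arise when a typed array view is created directly over a pooled `Buffer`.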
## Notes
The library is designed with input streams in mind: sample buffers fed to `processAudio` should be
rather short (36ms to 144ms, depending on your needs) and the sample rate no higher than 32kHz. Sample rates higher
than 16kHz provide no benefit to the VAD algorithm, as human voice patterns center around 4000 to 6000Hz. Minding the
Nyquist frequency yields sample rates between 8000 and 12000Hz for best results.
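
The buffer durations above translate into byte counts as sketched below. `bytesForDuration` and `splitIntoChunks` are hypothetical helpers, not part of this library, and the calculation assumes mono audio with 4 bytes per 32-bit float sample.

```javascript
var BYTES_PER_SAMPLE = 4 // 32-bit float, mono (assumption)

// Number of bytes covering `ms` milliseconds of audio at `sampleRate`
function bytesForDuration(ms, sampleRate) {
  return Math.round(sampleRate * ms / 1000) * BYTES_PER_SAMPLE
}

// Split a Buffer into chunks of at most `chunkBytes` bytes
function splitIntoChunks(buffer, chunkBytes) {
  var chunks = []
  for (var offset = 0; offset < buffer.length; offset += chunkBytes) {
    chunks.push(buffer.slice(offset, Math.min(offset + chunkBytes, buffer.length)))
  }
  return chunks
}

// 36ms at 16kHz is 576 samples, i.e. 2304 bytes
var chunkBytes = bytesForDuration(36, 16000)
var chunks = splitIntoChunks(Buffer.alloc(5000), chunkBytes)
console.log(chunkBytes, chunks.length)
```

Each resulting chunk could then be passed to `processAudio` on its own. The last chunk may be shorter than the others.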
## Example
```javascript
var VAD = require('vad').VAD
var pcmInputStream = getReadableAudioStreamSomehow()
var pcmOutputStream = getWritableStreamSomehow()
var vad = new VAD(VAD.MODE_LOW_BITRATE)
vad.on('voice', function() {
  console.info('Voice detected!')
})
// this example tries to remove non-speech from an audio file
pcmInputStream.on('data', function(chunk) {
  // assume audio data is 32bit float @ 16kHz
  vad.processAudio(chunk, 16000, function(error, event) {
    if (event === VAD.EVENT_VOICE) {
      pcmOutputStream.write(chunk)
    }
  })
})
```
## License
[MIT](LICENSE)