function [gci,goi] = dypsa(s,fs)
%DYPSA Derive glottal closure instances from speech [gci,goi] = (s,fs)
% Note: Needs to be combined with a voiced-voiceless detector to eliminate
% spurious closures in unvoiced and silent regions.
%
% Inputs:
% s is the speech signal
% fs is the sampling frequncy
%
% Outputs:
% gci is a vector of glottal closure sample numbers
% gco is a vector of glottal opening sample numbers derived from
% an assumed constant closed-phase fraction
%
% References:
% [1] P. A. Naylor, A. Kounoudes, J. Gudnason, and M. Brookes, Estimation of Glottal Closure
% Instants in Voiced Speech using the DYPSA Algorithm, IEEE Trans on Speech and Audio
% Processing, vol. 15, pp. 3443, Jan. 2007.
% [2] M. Brookes, P. A. Naylor, and J. Gudnason, A Quantitative Assessment of Group Delay Methods
% for Identifying Glottal Closures in Voiced Speech, IEEE Trans on Speech & Audio Processing,
% vol. 14, no. 2, pp. 456466, Mar. 2006.
% [3] A. Kounoudes, P. A. Naylor, and M. Brookes, The DYPSA algorithm for estimation of glottal
% closure instants in voiced speech, in Proc ICASSP 2002, vol. 1, Orlando, 2002, pp. 349352.
% [4] C. Ma, Y. Kamp, and L. F. Willems, A Frobenius norm approach to glottal closure detection
% from the speech signal, IEEE Trans. Speech Audio Processing, vol. 2, pp. 258265, Apr. 1994.
% [5] A. Kounoudes, Epoch Estimation for Closed-Phase Analysis of Speech, PhD Thesis,
% Imperial College, 2001.
% Algorithm Parameters
% The following parameters are defined in voicebox()
%
% dy_cpfrac=0.3; % presumed closed phase fraction of larynx cycle
% dy_cproj=0.2; % cost of projected candidate
% dy_cspurt=-0.45; % cost of a talkspurt
% dy_dopsp=1; % Use phase slope projection (1) or not (0)?
% dy_ewdly=0.0008; % window delay for energy cost function term [~ energy peak delay from closure] (sec)
% dy_ewlen=0.003; % window length for energy cost function term (sec)
% dy_ewtaper=0.001; % taper length for energy cost function window (sec)
% dy_fwlen=0.00045; % window length used to smooth group delay (sec)
% dy_fxmax=500; % max larynx frequency (Hz)
% dy_fxmin=50; % min larynx frequency (Hz)
% dy_fxminf=60; % min larynx frequency (Hz) [used for Frobenius norm only]
% dy_gwlen=0.0030; % group delay evaluation window length (sec)
% dy_lpcdur=0.020; % lpc analysis frame length (sec)
% dy_lpcn=2; % lpc additional poles
% dy_lpcnf=0.001; % lpc poles per Hz (1/Hz)
% dy_lpcstep=0.010; % lpc analysis step (sec)
% dy_nbest=5; % Number of NBest paths to keep
% dy_preemph=50; % pre-emphasis filter frequency (Hz) (to avoid preemphasis, make this very large)
% dy_spitch=0.2; % scale factor for pitch deviation cost
% dy_wener=0.3; % DP energy weighting
% dy_wpitch=0.5; % DP pitch weighting
% dy_wslope=0.1; % DP group delay slope weighting
% dy_wxcorr=0.8; % DP cross correlation weighting
% dy_xwlen=0.01; % cross-correlation length for waveform similarity (sec)
% Revision History:
%
% 3.0 - 29 Jun 2006 - Rewrote DP function to improve speed
% 2.6 - 29 Jun 2006 - Tidied up algorithm parameters
% 2.4 - 10 Jun 2006 - Made into a single file aand put into VOICEBOX
% 2.3 - 18 Mar 2005 - Removed 4kHz filtering of phase-slope function
% 2.2 - 05 Oct 2004 - dpgci uses the slopes returned from xewgrdel
% - gdwav from speech with fs<9000 is not filtered
% - Various outputs and inputs of functions have been
% removed since now there is no plotting
% 1.0 - 30 Jan 2001 - Initial version [5]
% Bugs:
% 1. Allow the projections only to extend to the end of the larynx cycle
% 2. Compensate for false pitch period cost at the start of a voicespurt
% 3. Should include energy and pahse-slope costs for the first closeure of a voicespurt
% 4. should delete candidates that are too close to the beginning or end of speech for the cost measures
% currently this is 200 samples fixed in the main routine but it should adapt to window lengths of
% cross-correlation, lpc and energy measures.
% 5. Should have an integrated voiced/voiceless detector
% 6. Allow dypsa to be called in chunks for a long speech file
% 7. Do forward & backward passes to allow for gradient descent and/or discriminative training
% 8. Glottal opening approximation does not really belong in this function
% 9. The cross correlation window is asymmetric (and overcomplex) for no very good reason
% 10. Incorporate -0.5 factor into dy_wxcorr and abolish erroneous (nx2-1)/(nx2-2) factor
% 11. Add in talkspurt cost at the beginning rather than the end of a spurt (more efficient)
% 12. Remove qmin>2 condition from voicespurt start detection (DYPSA 2 compatibility) in two places
% 13. Include energy and phase-slope costs at the start of a voicespurt
% 14. Single-closure voicespurt are only allowed if nbest=1 (should always be forbidden)
% 15. Penultimate closure candidate is always acceptd
% 16. Final element of gcic, Cfn and Ch is unused
% 17. Needs to cope better with irregular voicing (e.g. creaky voice)
% 18. Should give non-integer GCI positions for improved accuracy
% 19. Remove constraint that first voicespurt cannot begin until qrmax after the first candidate
% Copyright (C) Tasos Kounoudes, Jon Gudnason, Patrick Naylor and Mike Brookes 2006
% Version: $Id: dypsa.m,v 3.6 2007/05/04 07:01:38 dmb Exp $
%
% VOICEBOX is a MATLAB toolbox for speech processing.
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% http://www.gnu.org/copyleft/gpl.html or by writing to
% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Extract algorithm constants from VOICEBOX
dy_preemph=voicebox('dy_preemph');
dy_lpcstep=voicebox('dy_lpcstep');
dy_lpcdur=voicebox('dy_lpcdur');
dy_dopsp=voicebox('dy_dopsp'); % Use phase slope projection (1) or not (0)?
dy_ewtaper=voicebox('dy_ewtaper'); % Prediction order of FrobNorm method in seconds
dy_ewlen=voicebox('dy_ewlen'); % windowlength of FrobNorm method in seconds
dy_ewdly=voicebox('dy_ewdly'); % shift for assymetric speech shape at start of voiced cycle
dy_cpfrac=voicebox('dy_cpfrac'); % presumed ratio of larynx cycle that is closed
dy_lpcnf=voicebox('dy_lpcnf'); % lpc poles per Hz (1/Hz)
dy_lpcn=voicebox('dy_lpcn'); % lpc additional poles
lpcord=ceil(fs*dy_lpcnf+dy_lpcn); % lpc poles
%PreEmphasise input speech
s_used=filter([1 -exp(-2*pi*dy_preemph/fs)],1,s);
% perform LPC analysis, AC method with Hamming windowing
[ar, e, k] = lpcauto(s_used,lpcord,flo
没有合适的资源?快使用搜索试试~ 我知道了~
资源推荐
资源详情
资源评论
收起资源包目录
语音开发Matlab工具箱 (174个子文件)
dypsa.m 26KB
gaussmix.m 19KB
fxrapt.m 16KB
estnoisem.m 15KB
readsfs.m 15KB
readsph.m 11KB
gaussmixp.m 11KB
specsub.m 10KB
activlev.m 10KB
ssubmmse.m 10KB
writewav.m 9KB
readwav.m 8KB
voicebox.m 8KB
Contents.m 8KB
readaif.m 8KB
kmeanhar.m 8KB
gausprod.m 7KB
windows.m 6KB
maxgauss.m 5KB
kmeans.m 5KB
writehtk.m 5KB
readhtk.m 5KB
fram2wav.m 5KB
randvec.m 5KB
lpccovar.m 5KB
melcepst.m 5KB
txalign.m 5KB
gmmlpdf.m 5KB
histndim.m 5KB
distisar.m 5KB
lpcconv.m 4KB
lpcauto.m 4KB
momfilt.m 4KB
findpeaks.m 4KB
distitar.m 4KB
entropy.m 4KB
distchar.m 4KB
distispf.m 4KB
disteusq.m 4KB
specsubm.m 4KB
distitpf.m 4KB
rotqr2eu.m 3KB
lpcifilt.m 3KB
maxfilt.m 3KB
melbankm.m 3KB
readcnx.m 3KB
distchpf.m 3KB
glotlf.m 3KB
lpcrf2ar.m 3KB
zoomfft.m 3KB
bitsprec.m 3KB
ldatrace.m 3KB
rotro2eu.m 3KB
unixwhich.m 3KB
readau.m 3KB
schmitt.m 3KB
rotro2qr.m 3KB
lpcar2am.m 3KB
lpcar2fm.m 3KB
peak2dquad.m 3KB
dlyapsq.m 3KB
rotqr2ro.m 3KB
figbolden.m 3KB
randfilt.m 3KB
randiscr.m 2KB
frac2bin.m 2KB
frq2erb.m 2KB
dualdiag.m 2KB
irdct.m 2KB
huffman.m 2KB
ewgrpdel.m 2KB
erb2frq.m 2KB
finishat.m 2KB
rotation.m 2KB
midi2frq.m 2KB
lpcrr2am.m 2KB
spgrambw.m 2KB
roteu2qr.m 2KB
frq2mel.m 2KB
mel2frq.m 2KB
choosenk.m 2KB
zerocros.m 2KB
lin2pcma.m 2KB
meansqtf.m 2KB
pcma2lin.m 2KB
rotro2pl.m 2KB
rotqr2mr.m 2KB
enframe.m 2KB
lpcrf2rr.m 2KB
cep2pow.m 2KB
rdct.m 2KB
rotpl2ro.m 2KB
frq2midi.m 2KB
lognmpdf.m 2KB
lin2pcmu.m 2KB
pow2cep.m 2KB
usasi.m 2KB
irfft.m 2KB
ccwarpf.m 2KB
lpcar2ls.m 2KB
共 174 条
- 1
- 2
资源评论
- huanglingyue05032012-12-05正在用matlab做语音处理方面的工作,这个很实用
- donghaiyw2012-09-27很全,以后开发的基础
- daitaohao2012-12-15还可以,挺多的资料
liza4323
- 粉丝: 0
- 资源: 7
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功