语音开发Matlab工具箱资源-CSDN文库

共174个文件

m：174个

语音处理

Matlab

5星 · 超过95%的资源需积分: 11 6 浏览量 2011-04-18 22:44:21 上传评论收藏 264KB ZIP 举报

资源推荐

资源详情

资源评论

收起资源包目录

语音开发Matlab工具箱（174个子文件）

dypsa.m 26KB

gaussmix.m 19KB

fxrapt.m 16KB

estnoisem.m 15KB

readsfs.m 15KB

readsph.m 11KB

gaussmixp.m 11KB

specsub.m 10KB

activlev.m 10KB

ssubmmse.m 10KB

writewav.m 9KB

readwav.m 8KB

voicebox.m 8KB

Contents.m 8KB

readaif.m 8KB

kmeanhar.m 8KB

gausprod.m 7KB

windows.m 6KB

maxgauss.m 5KB

kmeans.m 5KB

writehtk.m 5KB

readhtk.m 5KB

fram2wav.m 5KB

randvec.m 5KB

lpccovar.m 5KB

melcepst.m 5KB

txalign.m 5KB

gmmlpdf.m 5KB

histndim.m 5KB

distisar.m 5KB

lpcconv.m 4KB

lpcauto.m 4KB

momfilt.m 4KB

findpeaks.m 4KB

distitar.m 4KB

entropy.m 4KB

distchar.m 4KB

distispf.m 4KB

disteusq.m 4KB

specsubm.m 4KB

distitpf.m 4KB

rotqr2eu.m 3KB

lpcifilt.m 3KB

maxfilt.m 3KB

melbankm.m 3KB

readcnx.m 3KB

distchpf.m 3KB

glotlf.m 3KB

lpcrf2ar.m 3KB

zoomfft.m 3KB

bitsprec.m 3KB

ldatrace.m 3KB

rotro2eu.m 3KB

unixwhich.m 3KB

readau.m 3KB

schmitt.m 3KB

rotro2qr.m 3KB

lpcar2am.m 3KB

lpcar2fm.m 3KB

peak2dquad.m 3KB

dlyapsq.m 3KB

rotqr2ro.m 3KB

figbolden.m 3KB

randfilt.m 3KB

randiscr.m 2KB

frac2bin.m 2KB

frq2erb.m 2KB

dualdiag.m 2KB

irdct.m 2KB

huffman.m 2KB

ewgrpdel.m 2KB

erb2frq.m 2KB

finishat.m 2KB

rotation.m 2KB

midi2frq.m 2KB

lpcrr2am.m 2KB

spgrambw.m 2KB

roteu2qr.m 2KB

frq2mel.m 2KB

mel2frq.m 2KB

choosenk.m 2KB

zerocros.m 2KB

lin2pcma.m 2KB

meansqtf.m 2KB

pcma2lin.m 2KB

rotro2pl.m 2KB

rotqr2mr.m 2KB

enframe.m 2KB

lpcrf2rr.m 2KB

cep2pow.m 2KB

rdct.m 2KB

rotpl2ro.m 2KB

frq2midi.m 2KB

lognmpdf.m 2KB

lin2pcmu.m 2KB

pow2cep.m 2KB

usasi.m 2KB

irfft.m 2KB

ccwarpf.m 2KB

lpcar2ls.m 2KB

共 174 条

function [gci,goi] = dypsa(s,fs) %DYPSA Derive glottal closure instances from speech [gci,goi] = (s,fs) % Note: Needs to be combined with a voiced-voiceless detector to eliminate % spurious closures in unvoiced and silent regions. % % Inputs: % s is the speech signal % fs is the sampling frequncy % % Outputs: % gci is a vector of glottal closure sample numbers % gco is a vector of glottal opening sample numbers derived from % an assumed constant closed-phase fraction % % References: % [1] P. A. Naylor, A. Kounoudes, J. Gudnason, and M. Brookes, Estimation of Glottal Closure % Instants in Voiced Speech using the DYPSA Algorithm, IEEE Trans on Speech and Audio % Processing, vol. 15, pp. 3443, Jan. 2007. % [2] M. Brookes, P. A. Naylor, and J. Gudnason, A Quantitative Assessment of Group Delay Methods % for Identifying Glottal Closures in Voiced Speech, IEEE Trans on Speech & Audio Processing, % vol. 14, no. 2, pp. 456466, Mar. 2006. % [3] A. Kounoudes, P. A. Naylor, and M. Brookes, The DYPSA algorithm for estimation of glottal % closure instants in voiced speech, in Proc ICASSP 2002, vol. 1, Orlando, 2002, pp. 349352. % [4] C. Ma, Y. Kamp, and L. F. Willems, A Frobenius norm approach to glottal closure detection % from the speech signal, IEEE Trans. Speech Audio Processing, vol. 2, pp. 258265, Apr. 1994. % [5] A. Kounoudes, Epoch Estimation for Closed-Phase Analysis of Speech, PhD Thesis, % Imperial College, 2001. % Algorithm Parameters % The following parameters are defined in voicebox() % % dy_cpfrac=0.3; % presumed closed phase fraction of larynx cycle % dy_cproj=0.2; % cost of projected candidate % dy_cspurt=-0.45; % cost of a talkspurt % dy_dopsp=1; % Use phase slope projection (1) or not (0)? % dy_ewdly=0.0008; % window delay for energy cost function term [~ energy peak delay from closure] (sec) % dy_ewlen=0.003; % window length for energy cost function term (sec) % dy_ewtaper=0.001; % taper length for energy cost function window (sec) % dy_fwlen=0.00045; % window length used to smooth group delay (sec) % dy_fxmax=500; % max larynx frequency (Hz) % dy_fxmin=50; % min larynx frequency (Hz) % dy_fxminf=60; % min larynx frequency (Hz) [used for Frobenius norm only] % dy_gwlen=0.0030; % group delay evaluation window length (sec) % dy_lpcdur=0.020; % lpc analysis frame length (sec) % dy_lpcn=2; % lpc additional poles % dy_lpcnf=0.001; % lpc poles per Hz (1/Hz) % dy_lpcstep=0.010; % lpc analysis step (sec) % dy_nbest=5; % Number of NBest paths to keep % dy_preemph=50; % pre-emphasis filter frequency (Hz) (to avoid preemphasis, make this very large) % dy_spitch=0.2; % scale factor for pitch deviation cost % dy_wener=0.3; % DP energy weighting % dy_wpitch=0.5; % DP pitch weighting % dy_wslope=0.1; % DP group delay slope weighting % dy_wxcorr=0.8; % DP cross correlation weighting % dy_xwlen=0.01; % cross-correlation length for waveform similarity (sec) % Revision History: % % 3.0 - 29 Jun 2006 - Rewrote DP function to improve speed % 2.6 - 29 Jun 2006 - Tidied up algorithm parameters % 2.4 - 10 Jun 2006 - Made into a single file aand put into VOICEBOX % 2.3 - 18 Mar 2005 - Removed 4kHz filtering of phase-slope function % 2.2 - 05 Oct 2004 - dpgci uses the slopes returned from xewgrdel % - gdwav from speech with fs<9000 is not filtered % - Various outputs and inputs of functions have been % removed since now there is no plotting % 1.0 - 30 Jan 2001 - Initial version [5] % Bugs: % 1. Allow the projections only to extend to the end of the larynx cycle % 2. Compensate for false pitch period cost at the start of a voicespurt % 3. Should include energy and pahse-slope costs for the first closeure of a voicespurt % 4. should delete candidates that are too close to the beginning or end of speech for the cost measures % currently this is 200 samples fixed in the main routine but it should adapt to window lengths of % cross-correlation, lpc and energy measures. % 5. Should have an integrated voiced/voiceless detector % 6. Allow dypsa to be called in chunks for a long speech file % 7. Do forward & backward passes to allow for gradient descent and/or discriminative training % 8. Glottal opening approximation does not really belong in this function % 9. The cross correlation window is asymmetric (and overcomplex) for no very good reason % 10. Incorporate -0.5 factor into dy_wxcorr and abolish erroneous (nx2-1)/(nx2-2) factor % 11. Add in talkspurt cost at the beginning rather than the end of a spurt (more efficient) % 12. Remove qmin>2 condition from voicespurt start detection (DYPSA 2 compatibility) in two places % 13. Include energy and phase-slope costs at the start of a voicespurt % 14. Single-closure voicespurt are only allowed if nbest=1 (should always be forbidden) % 15. Penultimate closure candidate is always acceptd % 16. Final element of gcic, Cfn and Ch is unused % 17. Needs to cope better with irregular voicing (e.g. creaky voice) % 18. Should give non-integer GCI positions for improved accuracy % 19. Remove constraint that first voicespurt cannot begin until qrmax after the first candidate % Copyright (C) Tasos Kounoudes, Jon Gudnason, Patrick Naylor and Mike Brookes 2006 % Version: $Id: dypsa.m,v 3.6 2007/05/04 07:01:38 dmb Exp $ % % VOICEBOX is a MATLAB toolbox for speech processing. % Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % This program is free software; you can redistribute it and/or modify % it under the terms of the GNU General Public License as published by % the Free Software Foundation; either version 2 of the License, or % (at your option) any later version. % % This program is distributed in the hope that it will be useful, % but WITHOUT ANY WARRANTY; without even the implied warranty of % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the % GNU General Public License for more details. % % You can obtain a copy of the GNU General Public License from % http://www.gnu.org/copyleft/gpl.html or by writing to % Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Extract algorithm constants from VOICEBOX dy_preemph=voicebox('dy_preemph'); dy_lpcstep=voicebox('dy_lpcstep'); dy_lpcdur=voicebox('dy_lpcdur'); dy_dopsp=voicebox('dy_dopsp'); % Use phase slope projection (1) or not (0)? dy_ewtaper=voicebox('dy_ewtaper'); % Prediction order of FrobNorm method in seconds dy_ewlen=voicebox('dy_ewlen'); % windowlength of FrobNorm method in seconds dy_ewdly=voicebox('dy_ewdly'); % shift for assymetric speech shape at start of voiced cycle dy_cpfrac=voicebox('dy_cpfrac'); % presumed ratio of larynx cycle that is closed dy_lpcnf=voicebox('dy_lpcnf'); % lpc poles per Hz (1/Hz) dy_lpcn=voicebox('dy_lpcn'); % lpc additional poles lpcord=ceil(fs*dy_lpcnf+dy_lpcn); % lpc poles %PreEmphasise input speech s_used=filter([1 -exp(-2*pi*dy_preemph/fs)],1,s); % perform LPC analysis, AC method with Hamming windowing [ar, e, k] = lpcauto(s_used,lpcord,flo

评论收藏

内容反馈