function [gci,goi] = dypsa(s,fs)
%DYPSA Derive glottal closure instances from speech [gci,goi] = (s,fs)
% Note: Needs to be combined with a voiced-voiceless detector to eliminate
% spurious closures in unvoiced and silent regions.
%
% Inputs:
% s is the speech signal
% fs is the sampling frequncy
%
% Outputs:
% gci is a vector of glottal closure sample numbers
% gco is a vector of glottal opening sample numbers derived from
% an assumed constant closed-phase fraction
%
% References:
% [1] P. A. Naylor, A. Kounoudes, J. Gudnason, and M. Brookes, Estimation of Glottal Closure
% Instants in Voiced Speech using the DYPSA Algorithm, IEEE Trans on Speech and Audio
% Processing, vol. 15, pp. 3443, Jan. 2007.
% [2] M. Brookes, P. A. Naylor, and J. Gudnason, A Quantitative Assessment of Group Delay Methods
% for Identifying Glottal Closures in Voiced Speech, IEEE Trans on Speech & Audio Processing,
% vol. 14, no. 2, pp. 456466, Mar. 2006.
% [3] A. Kounoudes, P. A. Naylor, and M. Brookes, The DYPSA algorithm for estimation of glottal
% closure instants in voiced speech, in Proc ICASSP 2002, vol. 1, Orlando, 2002, pp. 349352.
% [4] C. Ma, Y. Kamp, and L. F. Willems, A Frobenius norm approach to glottal closure detection
% from the speech signal, IEEE Trans. Speech Audio Processing, vol. 2, pp. 258265, Apr. 1994.
% [5] A. Kounoudes, Epoch Estimation for Closed-Phase Analysis of Speech, PhD Thesis,
% Imperial College, 2001.
% Algorithm Parameters
% The following parameters are defined in voicebox()
%
% dy_cpfrac=0.3; % presumed closed phase fraction of larynx cycle
% dy_cproj=0.2; % cost of projected candidate
% dy_cspurt=-0.45; % cost of a talkspurt
% dy_dopsp=1; % Use phase slope projection (1) or not (0)?
% dy_ewdly=0.0008; % window delay for energy cost function term [~ energy peak delay from closure] (sec)
% dy_ewlen=0.003; % window length for energy cost function term (sec)
% dy_ewtaper=0.001; % taper length for energy cost function window (sec)
% dy_fwlen=0.00045; % window length used to smooth group delay (sec)
% dy_fxmax=500; % max larynx frequency (Hz)
% dy_fxmin=50; % min larynx frequency (Hz)
% dy_fxminf=60; % min larynx frequency (Hz) [used for Frobenius norm only]
% dy_gwlen=0.0030; % group delay evaluation window length (sec)
% dy_lpcdur=0.020; % lpc analysis frame length (sec)
% dy_lpcn=2; % lpc additional poles
% dy_lpcnf=0.001; % lpc poles per Hz (1/Hz)
% dy_lpcstep=0.010; % lpc analysis step (sec)
% dy_nbest=5; % Number of NBest paths to keep
% dy_preemph=50; % pre-emphasis filter frequency (Hz) (to avoid preemphasis, make this very large)
% dy_spitch=0.2; % scale factor for pitch deviation cost
% dy_wener=0.3; % DP energy weighting
% dy_wpitch=0.5; % DP pitch weighting
% dy_wslope=0.1; % DP group delay slope weighting
% dy_wxcorr=0.8; % DP cross correlation weighting
% dy_xwlen=0.01; % cross-correlation length for waveform similarity (sec)
% Revision History:
%
% 3.0 - 29 Jun 2006 - Rewrote DP function to improve speed
% 2.6 - 29 Jun 2006 - Tidied up algorithm parameters
% 2.4 - 10 Jun 2006 - Made into a single file aand put into VOICEBOX
% 2.3 - 18 Mar 2005 - Removed 4kHz filtering of phase-slope function
% 2.2 - 05 Oct 2004 - dpgci uses the slopes returned from xewgrdel
% - gdwav from speech with fs<9000 is not filtered
% - Various outputs and inputs of functions have been
% removed since now there is no plotting
% 1.0 - 30 Jan 2001 - Initial version [5]
% Bugs:
% 1. Allow the projections only to extend to the end of the larynx cycle
% 2. Compensate for false pitch period cost at the start of a voicespurt
% 3. Should include energy and pahse-slope costs for the first closeure of a voicespurt
% 4. should delete candidates that are too close to the beginning or end of speech for the cost measures
% currently this is 200 samples fixed in the main routine but it should adapt to window lengths of
% cross-correlation, lpc and energy measures.
% 5. Should have an integrated voiced/voiceless detector
% 6. Allow dypsa to be called in chunks for a long speech file
% 7. Do forward & backward passes to allow for gradient descent and/or discriminative training
% 8. Glottal opening approximation does not really belong in this function
% 9. The cross correlation window is asymmetric (and overcomplex) for no very good reason
% 10. Incorporate -0.5 factor into dy_wxcorr and abolish erroneous (nx2-1)/(nx2-2) factor
% 11. Add in talkspurt cost at the beginning rather than the end of a spurt (more efficient)
% 12. Remove qmin>2 condition from voicespurt start detection (DYPSA 2 compatibility) in two places
% 13. Include energy and phase-slope costs at the start of a voicespurt
% 14. Single-closure voicespurt are only allowed if nbest=1 (should always be forbidden)
% 15. Penultimate closure candidate is always acceptd
% 16. Final element of gcic, Cfn and Ch is unused
% 17. Needs to cope better with irregular voicing (e.g. creaky voice)
% 18. Should give non-integer GCI positions for improved accuracy
% 19. Remove constraint that first voicespurt cannot begin until qrmax after the first candidate
% Copyright (C) Tasos Kounoudes, Jon Gudnason, Patrick Naylor and Mike Brookes 2006
% Version: $Id: dypsa.m,v 3.6 2007/05/04 07:01:38 dmb Exp $
%
% VOICEBOX is a MATLAB toolbox for speech processing.
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% http://www.gnu.org/copyleft/gpl.html or by writing to
% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Extract algorithm constants from VOICEBOX
dy_preemph=voicebox('dy_preemph');
dy_lpcstep=voicebox('dy_lpcstep');
dy_lpcdur=voicebox('dy_lpcdur');
dy_dopsp=voicebox('dy_dopsp'); % Use phase slope projection (1) or not (0)?
dy_ewtaper=voicebox('dy_ewtaper'); % Prediction order of FrobNorm method in seconds
dy_ewlen=voicebox('dy_ewlen'); % windowlength of FrobNorm method in seconds
dy_ewdly=voicebox('dy_ewdly'); % shift for assymetric speech shape at start of voiced cycle
dy_cpfrac=voicebox('dy_cpfrac'); % presumed ratio of larynx cycle that is closed
dy_lpcnf=voicebox('dy_lpcnf'); % lpc poles per Hz (1/Hz)
dy_lpcn=voicebox('dy_lpcn'); % lpc additional poles
lpcord=ceil(fs*dy_lpcnf+dy_lpcn); % lpc poles
%PreEmphasise input speech
s_used=filter([1 -exp(-2*pi*dy_preemph/fs)],1,s);
% perform LPC analysis, AC method with Hamming windowing
[ar, e, k] = lpcauto(s_used,lpcord,flo