LPCvocoder-ucla-working-2002.rar_LPC_LPCvocoder_WORKING

11 files in total

m: 6

wav: 5


169 views, uploaded 2022-07-14 03:56:24, 212KB RAR

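The title "LPC vocoder-ucla-working-2002.rar" refers to an archive containing an implementation of linear predictive coding (LPC), a speech coding technique. The archive may come from work carried out at the University of California, Los Angeles (UCLA) in 2002, and it is labelled "WORKING_predic", which suggests a working, verified LPC predictor.

LPC is widely used in speech coding, audio compression, and signal processing. It models each sample as a linear combination of the preceding samples, so a coder can store or transmit the predictor coefficients, a gain, and a description of the excitation instead of the raw waveform. This makes LPC particularly useful in speech communication, where it keeps speech intelligible at relatively low bit rates.

The description ".m file for lps prediction" indicates that the archive contains MATLAB files (.m files). "lps" is most likely a typo for "lpc": the bundled proclpc.m computes LPC coefficients, and nothing in the file list refers to a separate spectral-prediction method. MATLAB is a numerical computing and visualization environment well suited to this kind of signal processing.

The tags are "lpc", "lpc_vocoder", "working", "prediction", and "vocoder_lpc". "lpc_vocoder" identifies the code as an LPC speech coder, "working" and "prediction" emphasize that the code runs and performs prediction, and "vocoder_lpc" appears to be an alternative spelling of the same category.

According to the file list, the archive holds six MATLAB files (Main.m, proclpc.m, synlpc1.m, synlpc2.m, speechcoder1.m, speechcoder2.m) and five sample recordings (zescene1.wav to zescene5.wav). proclpc.m performs the analysis: pre-emphasis, framing, computation of the predictor coefficients, and inverse filtering to obtain the residual. According to its header comment, the synlpc routine inverts this transform. The .wav files presumably serve as test input; the listing shows no separate documentation or result files.

The archive is a research-grade LPC speech coding tool suited to academic study or engineering work, especially where speech must be transmitted efficiently over limited bandwidth. Working through this LPC vocoder is a way to learn both the theory of linear predictive coding and how to apply it in a real project.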
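As a rough illustration of the prediction principle described above (this is not code from the archive), the MATLAB sketch below fits a two-tap predictor to a damped sinusoid by least squares. The test signal, the order of 2, and all variable names are assumptions chosen for the example. The sign convention also differs from proclpc.m: these weights predict s(n) directly, whereas proclpc.m stores the negated weights after a leading 1.

```matlab
% Hypothetical test signal: a damped sinusoid, which obeys
% s(n) = 2*r*cos(w)*s(n-1) - r^2*s(n-2) exactly.
r = 0.95;
w = 2*pi*0.05;
n = (0:199)';
s = (r.^n) .* sin(w*n);

% Least-squares fit of an order-2 predictor from past samples.
X = [s(2:end-1), s(1:end-2)];   % columns: s(n-1), s(n-2)
y = s(3:end);                   % target: s(n)
a = X \ y;                      % predictor weights

% Prediction error: the part of the signal the predictor cannot explain.
predErr = y - X*a;
fprintf('a = [%.4f %.4f], residual energy = %.3g\n', ...
        a(1), a(2), sum(predErr.^2));
```

Because this signal follows a second-order recurrence exactly, the residual is close to zero. Real speech leaves a non-trivial residual, which is what proclpc.m returns as resid.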


LPC vocoder-ucla-working-2002.rar (11 files)

LPC vocoder-ucla-working-2002

zescene4.wav 32KB

zescene2.wav 32KB

Main.m 1KB

synlpc2.m 4KB

proclpc.m 6KB

zescene3.wav 188KB

zescene1.wav 48KB

synlpc1.m 4KB

zescene5.wav 32KB

speechcoder2.m 2KB

speechcoder1.m 1KB

function [aCoeff,resid,pitch,G,parcor,stream] = proclpc(data,sr,L,fr,fs,preemp)
% USAGE: [aCoeff,resid,pitch,G,parcor,stream] = proclpc(data,sr,L,fr,fs,preemp)
%
% This function computes the LPC (linear-predictive coding) coefficients that
% describe a speech signal. The LPC coefficients are a short-time measure of
% the speech signal which describe the signal as the output of an all-pole
% filter. This all-pole filter provides a good description of the speech
% articulators; thus LPC analysis is often used in speech recognition and
% speech coding systems. The LPC parameters are recalculated, by default in
% this implementation, every 20ms.
%
% The results of LPC analysis are a new representation of the signal
%     s(n) = G e(n) - sum from 1 to L a(i)s(n-i)
% where s(n) is the original data. a(i) and e(n) are the outputs of the LPC
% analysis with a(i) representing the LPC model. The e(n) term represents
% either the speech source's excitation, or the residual: the details of the
% signal that are not captured by the LPC coefficients. The G factor is a
% gain term.
%
% LPC analysis is performed on a monaural sound vector (data) which has been
% sampled at a sampling rate of "sr". The following optional parameters modify
% the behaviour of this algorithm.
% L - The order of the analysis. There are L+1 LPC coefficients in the output
%     array aCoeff for each frame of data. L defaults to 13.
% fr - Frame time increment, in ms. The LPC analysis is done starting every
%     fr ms in time. Defaults to 20ms (50 LPC vectors a second)
% fs - Frame size in ms. The LPC analysis is done by windowing the speech
%     data with a rectangular window that is fs ms long. Defaults to 30ms
% preemp - This variable is the epsilon in a digital one-zero filter which
%     serves to preemphasize the speech signal and compensate for the 6dB
%     per octave rolloff in the radiation function. Defaults to .9378.
%
% The output variables from this function are
% aCoeff - The LPC analysis results, a(i). One column of L+1 numbers (a
%     leading 1 followed by the L coefficients) for each frame of data
% resid - The LPC residual, e(n). One column of sr*fs/1000 samples
%     representing the excitation or residual of the LPC filter.
% pitch - A frame-by-frame estimate of the pitch of the signal, calculated
%     by finding the peak in the residual's autocorrelation for each frame.
% G - The LPC gain for each frame.
% parcor - The parcor coefficients. The parcor coefficients give the ratio
%     between adjacent sections in a tubular model of the speech
%     articulators. There are L parcor coefficients for each frame of
%     speech.
% stream - The LPC analysis' residual or excitation signal as one long vector.
%     Overlapping frames of the resid output combined into a new
%     one-dimensional signal and post-filtered.
%
% The synlpc routine inverts this transform and returns the original speech
% signal.
%
% This code was graciously provided by:
%     Delores Etter (University of Colorado, Boulder) and
%     Professor Geoffrey Orsak (Southern Methodist University)
% It was first published in
%     Orsak, G.C. et al. "Collaborative SP education using the Internet and
%     MATLAB" IEEE SIGNAL PROCESSING MAGAZINE Nov. 1995. vol.12, no.6, pp.
%     23-32.
% Modified and debugging plots added by Kate Nguyen and Malcolm Slaney

% A more complete set of routines for LPC analysis can be found at
%     http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html

% (c) 1998 Interval Research Corporation

if (nargin<3), L = 13; end
if (nargin<4), fr = 20; end
if (nargin<5), fs = 30; end
if (nargin<6), preemp = .9378; end

[row col] = size(data);
if col==1
    data=data';
end

nframe = 0;
msfr = round(sr/1000*fr);                  % Convert ms to samples
msfs = round(sr/1000*fs);                  % Convert ms to samples
duration = length(data);
speech = filter([1 -preemp], 1, data)';    % Preemphasize speech
msoverlap = msfs - msfr;
ramp = [0:1/(msoverlap-1):1]';             % Compute part of window

for frameIndex=1:msfr:duration-msfs+1      % frame rate=20ms
    frameData = speech(frameIndex:(frameIndex+msfs-1));    % frame size=30ms
    nframe = nframe+1;
    autoCor = xcorr(frameData);            % Compute the autocorrelation
    autoCorVec = autoCor(msfs+[0:L]);

    % Levinson's method
    err(1) = autoCorVec(1);
    k(1) = 0;
    A = [];
    for index=1:L
        numerator = [1 A.']*autoCorVec(index+1:-1:2);
        denominator = -1*err(index);
        k(index) = numerator/denominator;  % PARCOR coeffs
        A = [A+k(index)*flipud(A); k(index)];
        err(index+1) = (1-k(index)^2)*err(index);
    end

    aCoeff(:,nframe) = [1; A];
    parcor(:,nframe) = k';

    % Calculate the filter response
    % by evaluating the z-transform
    if 0
        gain=0;
        cft=0:(1/255):1;
        for index=1:L
            gain = gain + aCoeff(index,nframe)*exp(-i*2*pi*cft).^index;
        end
        gain = abs(1./gain);
        spec(:,nframe) = 20*log10(gain(1:128))';
        plot(20*log10(gain));
        title(nframe);
        drawnow;
    end

    % Calculate the filter response
    % from the filter's impulse
    % response (to check above).
    if 0
        impulseResponse = filter(1, aCoeff(:,nframe), [1 zeros(1,255)]);
        freqResp = 20*log10(abs(fft(impulseResponse)));
        plot(freqResp);
    end

    errSig = filter([1 A'],1,frameData);   % find excitation noise

    G(nframe) = sqrt(err(L+1));            % gain
    autoCorErr = xcorr(errSig);            % calculate pitch & voicing information
    [B,I] = sort(autoCorErr);
    num = length(I);
    if B(num-1) > .01*B(num)
        pitch(nframe) = abs(I(num) - I(num-1));
    else
        pitch(nframe) = 0;
    end

    % calculate additional info to improve the compressed sound quality
    resid(:,nframe) = errSig/G(nframe);
    if(frameIndex==1)                      % add residual frames using a trapezoidal window
        stream = resid(1:msfr,nframe);
    else
        stream = [stream;
                  overlap+resid(1:msoverlap,nframe).*ramp;
                  resid(msoverlap+1:msfr,nframe)];
    end
    if(frameIndex+msfr+msfs-1 > duration)
        stream = [stream; resid(msfr+1:msfs,nframe)];
    else
        overlap = resid(msfr+1:msfs,nframe).*flipud(ramp);
    end
end
stream = filter(1, [1 -preemp], stream)';
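Two steps of proclpc can be checked in isolation. The sketch below first works through the millisecond-to-sample conversion and the resulting frame count, then runs the same Levinson recursion on a short frame and compares it with a direct solve of the Toeplitz normal equations. The 8000 Hz sampling rate, the one-second duration, the ten-sample test frame, and the order of 3 are assumptions made for illustration; the autocorrelation is computed with a plain loop so the sketch does not rely on xcorr.

```matlab
% --- Framing arithmetic (assumed: 1 s of audio at 8000 Hz, default fr/fs) ---
sr = 8000; fr = 20; fs = 30;
msfr = round(sr/1000*fr);            % frame step in samples
msfs = round(sr/1000*fs);            % frame length in samples
msoverlap = msfs - msfr;             % samples shared by consecutive frames
duration = sr;                       % one second of samples
nframes = numel(1:msfr:duration-msfs+1);
fprintf('step=%d, length=%d, overlap=%d, frames=%d\n', ...
        msfr, msfs, msoverlap, nframes);

% --- Levinson recursion on a hypothetical test frame ---
x = [1 2 3 2 1 0 -1 -2 -1 0]';
L = 3;
N = length(x);
acf = zeros(L+1,1);                  % autocorrelation lags 0..L
for lag = 0:L
    acf(lag+1) = sum(x(1:N-lag) .* x(1+lag:N));
end

A = [];                              % predictor coefficients a(1..order)
predErr = acf(1);                    % prediction error energy
k = zeros(L,1);                      % PARCOR (reflection) coefficients
for index = 1:L
    k(index) = -([1 A.'] * acf(index+1:-1:2)) / predErr;
    A = [A + k(index)*flipud(A); k(index)];
    predErr = (1 - k(index)^2) * predErr;
end

% Reference: solve R*a = -acf(2:L+1) directly, R being the Toeplitz
% autocorrelation matrix. Both columns should agree.
R = toeplitz(acf(1:L));
Aref = R \ (-acf(2:L+1));
disp([A Aref]);
```

The recursion costs on the order of L^2 operations per frame instead of the L^3 of a general linear solve, and it yields the PARCOR coefficients and the final error energy (whose square root proclpc uses as the gain G) as by-products.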
