1 Voice Analytics
1.1 Business context & Introduction
The project consists of two sections: voice analytics and automated agent quality
assurance.
The Voice Analytics section serves both as a preliminary step for the quality assurance process
and as an independent component that provides the telesales team with helper functions, such as
improving speech quality and visualizing call transcripts.
The Voice Analytics section consists of several parts and functions, applied in the following order:
1. Speech Super Resolution
2. Speech Noise Reduction
3. Voice Activity Detection
4. Speaker Diarization
5. Speech to Text
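The ordering above can be sketched as a simple chain of stages. This is only a minimal illustration: the stage names mirror the list, but `passthrough`, `STAGES`, and `run_pipeline` are hypothetical placeholders introduced here, not functions from Malaya-speech or from the project code.

```python
from typing import Any, Callable, List, Tuple

# A stage is a (name, function) pair. Real stages change the data type along
# the way (audio -> speech segments -> speaker turns -> text), hence Any.
Stage = Tuple[str, Callable[[Any], Any]]


def passthrough(data: Any) -> Any:
    """Hypothetical placeholder that returns its input unchanged."""
    return data


# Stage order follows the list above; each placeholder would be replaced by
# the actual model call for that step.
STAGES: List[Stage] = [
    ("speech_super_resolution", passthrough),
    ("speech_noise_reduction", passthrough),
    ("voice_activity_detection", passthrough),
    ("speaker_diarization", passthrough),
    ("speech_to_text", passthrough),
]


def run_pipeline(data: Any, stages: List[Stage] = STAGES) -> Tuple[Any, List[str]]:
    """Apply each stage in order and record the names of the stages applied."""
    applied: List[str] = []
    for name, stage in stages:
        data = stage(data)
        applied.append(name)
    return data, applied


# Toy input: a short list of samples standing in for a call recording.
result, applied = run_pipeline([0.0, 0.1, -0.1])
print(applied)
```

Keeping the stages in one ordered list makes it easy to run only a prefix of the chain, for example when only the enhanced audio is needed and no transcript.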
The Voice Analytics section relies on the Malaya-speech package, a speech toolkit for
Bahasa Malaysia, and uses its pretrained models for Speech Super Resolution, Speech Noise
Reduction, and Speech to Text. Additionally, we retrain the speech-to-text model using call logs
from Income agents. For Voice Activity Detection and Speaker Diarization, we apply customized
functions and models.
1.2 Setup
For the Voice Analytics section, setup consists mainly of preparing the environment.
First, we need multiple packages for model inference and some other helper functions:
1. TensorFlow
2. Malaya-speech
3. soundfile
4. pydub
5. librosa
Second, we need an additional package for model retraining:
1. warp-rnnt
Pretrained models are downloaded from Hugging Face and saved in the /.cache/huggingface folder.
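Before running inference or retraining, it can help to confirm that the packages listed above are installed. The following is a minimal sketch using only the standard library. The import names (`malaya_speech`, `warp_rnnt`, and so on) are assumptions about how each package is imported and may differ from the names used at install time, and `missing_modules` is a hypothetical helper.

```python
import importlib.util
from typing import List

# Assumed import names for the packages listed above.
INFERENCE_MODULES = ["tensorflow", "malaya_speech", "soundfile", "pydub", "librosa"]
RETRAINING_MODULES = ["warp_rnnt"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for label, modules in [
        ("inference", INFERENCE_MODULES),
        ("retraining", RETRAINING_MODULES),
    ]:
        missing = missing_modules(modules)
        if missing:
            print(f"Missing {label} packages: {', '.join(missing)}")
        else:
            print(f"All {label} packages found.")
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as TensorFlow.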
1.3 Functions walkthrough
The overall flow is shown in the following flow chart: