# Digital Human Intelligent Dialogue System - Linly-Talker â 'Interactive Dialogue with Your Virtual Self'
<div align="center">
<h1>Linly-Talker WebUI</h1>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/Kedreamix/Linly-Talker)
<img src="docs/linly_logo.png" /><br>
[![Open In Colab](https://img.shields.io/badge/Colab-F9AB00?style=for-the-badge&logo=googlecolab&color=525252)](https://colab.research.google.com/github/Kedreamix/Linly-Talker/blob/main/colab_webui.ipynb)
[![Licence](https://img.shields.io/badge/LICENSE-MIT-green.svg?style=for-the-badge)](https://github.com/Kedreamix/Linly-Talker/blob/main/LICENSE)
[![Huggingface](https://img.shields.io/badge/ð¤%20-Models%20Repo-yellow.svg?style=for-the-badge)](https://huggingface.co/Kedreamix/Linly-Talker)
[**English**](./README.md) | [**ä¸æç®ä½**](./README_zh.md)
</div>
**2023.12 Update** ð
**Users can upload any images for the conversation**
**2024.01 Update** ðð
- **Exciting news! I've now incorporated both the powerful GeminiPro and Qwen large models into our conversational scene. Users can now upload images during the conversation, adding a whole new dimension to the interactions.**
- **The deployment invocation method for FastAPI has been updated.**
- **The advanced settings options for Microsoft TTS have been updated, increasing the variety of voice types. Additionally, video subtitles have been introduced to enhance visualization.**
- **Updated the GPT multi-turn conversation system to establish contextual connections in dialogue, enhancing the interactivity and realism of the digital persona.**
**2024.02 Update** ð
- **Updated Gradio to the latest version 4.16.0, providing the interface with additional functionalities such as capturing images from the camera to create digital personas, among others.**
- **ASR and THG have been updated. FunASR from Alibaba has been integrated into ASR, enhancing its speed significantly. Additionally, the THG section now incorporates the Wav2Lip model, while ER-NeRF is currently in preparation (Coming Soon).**
- **I have incorporated the GPT-SoVITS model, which is a voice cloning method. By fine-tuning it with just one minute of a person's speech data, it can effectively clone their voice. The results are quite impressive and worth recommending.**
- **I have integrated a web user interface (WebUI) that allows for better execution of Linly-Talker.**
**2024.04 Update** ð
- **Updated the offline mode for Paddle TTS, excluding Edge TTS.**
- **Updated ER-NeRF as one of the choices for Avatar generation.**
- **Updated app_talk.py to allow for the free upload of voice and images/videos for generation without being based on a dialogue scenario.**
**2024.05 Update** ð
- **Updated the beginner-friendly AutoDL deployment tutorial, and also updated the codewithgpu image, allowing for one-click experience and learning.**
- **Updated WebUI.py: Linly-Talker WebUI now supports multiple modules, multiple models, and multiple options**
**2024.06 Update** ð
- **Integrated MuseTalk into Linly-Talker and updated the WebUI, enabling basic real-time conversation capabilities.**
- **The refined WebUI defaults to not loading the LLM model to reduce GPU memory usage. It directly responds with text to complete voiceovers. The enhanced WebUI features three main functions: personalized character generation, multi-turn intelligent dialogue with digital humans, and real-time MuseTalk conversations. These improvements reduce previous GPU memory redundancies and add more prompts to assist users effectively.**
**2024.08 Update** ð
- **Updated CosyVoice to offer high-quality text-to-speech (TTS) functionality and voice cloning capabilities; also upgraded to Wav2Lipv2 to enhance overall performance.**
**2024.09 Update** ð
- **Added Linly-Talker API documentation, providing detailed interface descriptions to help users access Linly-Talkerâs features via the API.**
---
<details>
<summary>Content</summary>
<!-- TOC -->
- [Digital Human Intelligent Dialogue System - Linly-Talker â 'Interactive Dialogue with Your Virtual Self'](#digital-human-intelligent-dialogue-system---linly-talker--interactive-dialogue-with-your-virtual-self)
- [Introduction](#introduction)
- [TO DO LIST](#to-do-list)
- [Example](#example)
- [Setup Environment](#setup-environment)
- [API Documentation](#api-documentation)
- [ASR - Speech Recognition](#asr---speech-recognition)
- [Whisper](#whisper)
- [FunASR](#funasr)
- [Coming Soon](#coming-soon)
- [TTS - Text To Speech](#tts---text-to-speech)
- [Edge TTS](#edge-tts)
- [PaddleTTS](#paddletts)
- [Coming Soon](#coming-soon-1)
- [Voice Clone](#voice-clone)
- [GPT-SoVITSï¼Recommendï¼](#gpt-sovitsrecommend)
- [XTTS](#xtts)
- [CosyVoice](#cosyvoice)
- [Coming Soon](#coming-soon-2)
- [THG - Avatar](#thg---avatar)
- [SadTalker](#sadtalker)
- [Wav2Lip](#wav2lip)
- [Wav2Lipv2](#wav2lipv2)
- [ER-NeRF](#er-nerf)
- [MuseTalk](#musetalk)
- [Coming Soon](#coming-soon-3)
- [LLM - Conversation](#llm---conversation)
- [Linly-AI](#linly-ai)
- [Qwen](#qwen)
- [Gemini-Pro](#gemini-pro)
- [ChatGPT](#chatgpt)
- [ChatGLM](#chatglm)
- [GPT4Free](#gpt4free)
- [LLM Multiple Model Selection](#llm-multiple-model-selection)
- [Coming Soon](#coming-soon-4)
- [Optimizations](#optimizations)
- [Gradio](#gradio)
- [Start WebUI](#start-webui)
- [WebUI](#webui)
- [Old Verison](#old-verison)
- [Folder structure](#folder-structure)
- [Reference](#reference)
- [License](#license)
- [Star History](#star-history)
<!-- /TOC -->
</details>
## Introduction
Linly-Talker is an innovative digital human conversation system that integrates the latest artificial intelligence technologies, including Large Language Models (LLM) ð¤, Automatic Speech Recognition (ASR) ðï¸, Text-to-Speech (TTS) ð£ï¸, and voice cloning technology ð¤. This system offers an interactive web interface through the Gradio platform ð, allowing users to upload images ð· and engage in personalized dialogues with AI ð¬.
The core features of the system include:
1. **Multi-Model Integration**: Linly-Talker combines major models such as Linly, GeminiPro, Qwen, as well as visual models like Whisper, SadTalker, to achieve high-quality dialogues and visual generation.
2. **Multi-Turn Conversational Ability**: Through the multi-turn dialogue system powered by GPT models, Linly-Talker can understand and maintain contextually relevant and coherent conversations, significantly enhancing the authenticity of the interaction.
3. **Voice Cloning**: Utilizing technologies like GPT-SoVITS, users can upload a one-minute voice sample for fine-tuning, and the system will clone the user's voice, enabling the digital human to converse in the user's voice.
4. **Real-Time Interaction**: The system supports real-time speech recognition and video captioning, allowing users to communicate naturally with the digital human via voice.
5. **Visual Enhancement**: With digital human generation technologies, Linly-Talker can create realistic digital human avatars, providing a more immersive experience.
The design philosophy of Linly-Talker is to create a new form of human-computer interaction that goes beyond simple Q&A. By integrating advanced technologies, it offers an intelligent digital human capable of understanding, responding to, and simulating human communication.
![The system architecture of multimodal humanâcomputer interaction.](docs/HOI_en.png)
> [!NOTE]
>
> You can watch the demo video [here](https://www.bilibili.com/video/BV1rN4y1a76x/).
>
> I have recorded a series of videos on Bilibili, which also represent every step of my updates and methods of use. For detailed information, please refer to [Digital Human Dialogue System - Linly-Talker Collection](https://space.bilib
萧鼎
- 粉丝: 3w+
- 资源: 157
最新资源
- 纤维混凝土,纤维骨料细观尺度混凝土模型,可控制骨料,纤维的尺寸和体积率 四面体网格划分及六面体网格投影 模型可用于abaqus Ansys ls-dyna flac3d等有限元软件
- 户外储能电源方案双向逆变器板资料,原理文件,PCB文件,源代码,电感与变压器规格参数,户外储能电源2KW(最大3KW)双向逆变电源生产资料,本生产资料含有前级DCDC源程序,后级的SPWM 本户外储能
- maxwell外转子电机设计,外转子电机电磁仿真
- Shizuku_.apk
- 光储交直流微电网离并网变 仿真模型由光伏PV及其DC DC变器、储能及其双向DC DC变器、直流负载、逆变器、交流负载、断路器以及交流主网组成的光储交直流微电网 光储交直流电网 运行目标: 储能控制
- prescan,carsim,simulink三软件联合仿真,实现弯道超车,避撞前方机动车,使用frent坐标系下五次多项式规划加模型预测控制,有横向轨迹跟踪对比图,仿真图 可包调试运行 需要安装
- 模拟IC设计,Sigma-delta ADC,三阶单环Sigma-Delta 调制器,实际电路和仿真状态,enob为16;并有文档相关说明;用好需要有一定的模拟IC基础,适合sd adc入门用
- 双向Buck-Boost电路仿真,闭环控制算法
- Comsol等离子体仿真,Ar细通道棒板流注放电 电子密度,电子温度等
- Comsol等离子体仿真,Ar棒板粗通道流注放电 电子密度,电子温度,电场强度等 5.5,6.0版本
- 2408统计表 ,2408统计表
- COMSOL电磁超声仿真: Crack detection in L-shaped aluminum plate via electromagnetic ultrasonic measurements
- Comsol波导BIC
- 光伏储能基于VSG同步发电机控制的并网仿真模型 基于Matlab Simulink仿真平台 储能为buck-boost电路(双向DC DC变) 光伏为boost电路 主电路采用三相全桥PWM逆变器 1
- 斜碰,侧碰,偏置碰撞有限元模型,ls dyna
- 两极式单相光伏并网仿真 前极:Boost电路+电导增量法 后极:桥式逆变+L型滤波+电压外环电流内环控制 并网电流和电网电压同频同相,单位功率因数并网,谐波失真率0.39%,并网效率高 有配套vid
资源上传下载、课程学习等过程中有任何疑问或建议,欢迎提出宝贵意见哦~我们会及时处理!
点击此处反馈