# Digital Human Intelligent Dialogue System - Linly-Talker â 'Interactive Dialogue with Your Virtual Self'
<div align="center">
<h1>Linly-Talker WebUI</h1>
[](https://github.com/Kedreamix/Linly-Talker)
<img src="docs/linly_logo.png" /><br>
[](https://colab.research.google.com/github/Kedreamix/Linly-Talker/blob/main/colab_webui.ipynb)
[](https://github.com/Kedreamix/Linly-Talker/blob/main/LICENSE)
[](https://huggingface.co/Kedreamix/Linly-Talker)
[**English**](./README.md) | [**ä¸æç®ä½**](./README_zh.md)
</div>
**2023.12 Update** ð
**Users can upload any images for the conversation**
**2024.01 Update** ðð
- **Exciting news! I've now incorporated both the powerful GeminiPro and Qwen large models into our conversational scene. Users can now upload images during the conversation, adding a whole new dimension to the interactions.**
- **The deployment invocation method for FastAPI has been updated.**
- **The advanced settings options for Microsoft TTS have been updated, increasing the variety of voice types. Additionally, video subtitles have been introduced to enhance visualization.**
- **Updated the GPT multi-turn conversation system to establish contextual connections in dialogue, enhancing the interactivity and realism of the digital persona.**
**2024.02 Update** ð
- **Updated Gradio to the latest version 4.16.0, providing the interface with additional functionalities such as capturing images from the camera to create digital personas, among others.**
- **ASR and THG have been updated. FunASR from Alibaba has been integrated into ASR, enhancing its speed significantly. Additionally, the THG section now incorporates the Wav2Lip model, while ER-NeRF is currently in preparation (Coming Soon).**
- **I have incorporated the GPT-SoVITS model, which is a voice cloning method. By fine-tuning it with just one minute of a person's speech data, it can effectively clone their voice. The results are quite impressive and worth recommending.**
- **I have integrated a web user interface (WebUI) that allows for better execution of Linly-Talker.**
**2024.04 Update** ð
- **Updated the offline mode for Paddle TTS, excluding Edge TTS.**
- **Updated ER-NeRF as one of the choices for Avatar generation.**
- **Updated app_talk.py to allow for the free upload of voice and images/videos for generation without being based on a dialogue scenario.**
**2024.05 Update** ð
- **Updated the beginner-friendly AutoDL deployment tutorial, and also updated the codewithgpu image, allowing for one-click experience and learning.**
- **Updated WebUI.py: Linly-Talker WebUI now supports multiple modules, multiple models, and multiple options**
**2024.06 Update** ð
- **Integrated MuseTalk into Linly-Talker and updated the WebUI, enabling basic real-time conversation capabilities.**
- **The refined WebUI defaults to not loading the LLM model to reduce GPU memory usage. It directly responds with text to complete voiceovers. The enhanced WebUI features three main functions: personalized character generation, multi-turn intelligent dialogue with digital humans, and real-time MuseTalk conversations. These improvements reduce previous GPU memory redundancies and add more prompts to assist users effectively.**
**2024.08 Update** ð
- **Updated CosyVoice to offer high-quality text-to-speech (TTS) functionality and voice cloning capabilities; also upgraded to Wav2Lipv2 to enhance overall performance.**
**2024.09 Update** ð
- **Added Linly-Talker API documentation, providing detailed interface descriptions to help users access Linly-Talkerâs features via the API.**
---
<details>
<summary>Content</summary>
<!-- TOC -->
- [Digital Human Intelligent Dialogue System - Linly-Talker â 'Interactive Dialogue with Your Virtual Self'](#digital-human-intelligent-dialogue-system---linly-talker--interactive-dialogue-with-your-virtual-self)
- [Introduction](#introduction)
- [TO DO LIST](#to-do-list)
- [Example](#example)
- [Setup Environment](#setup-environment)
- [API Documentation](#api-documentation)
- [ASR - Speech Recognition](#asr---speech-recognition)
- [Whisper](#whisper)
- [FunASR](#funasr)
- [Coming Soon](#coming-soon)
- [TTS - Text To Speech](#tts---text-to-speech)
- [Edge TTS](#edge-tts)
- [PaddleTTS](#paddletts)
- [Coming Soon](#coming-soon-1)
- [Voice Clone](#voice-clone)
- [GPT-SoVITSï¼Recommendï¼](#gpt-sovitsrecommend)
- [XTTS](#xtts)
- [CosyVoice](#cosyvoice)
- [Coming Soon](#coming-soon-2)
- [THG - Avatar](#thg---avatar)
- [SadTalker](#sadtalker)
- [Wav2Lip](#wav2lip)
- [Wav2Lipv2](#wav2lipv2)
- [ER-NeRF](#er-nerf)
- [MuseTalk](#musetalk)
- [Coming Soon](#coming-soon-3)
- [LLM - Conversation](#llm---conversation)
- [Linly-AI](#linly-ai)
- [Qwen](#qwen)
- [Gemini-Pro](#gemini-pro)
- [ChatGPT](#chatgpt)
- [ChatGLM](#chatglm)
- [GPT4Free](#gpt4free)
- [LLM Multiple Model Selection](#llm-multiple-model-selection)
- [Coming Soon](#coming-soon-4)
- [Optimizations](#optimizations)
- [Gradio](#gradio)
- [Start WebUI](#start-webui)
- [WebUI](#webui)
- [Old Verison](#old-verison)
- [Folder structure](#folder-structure)
- [Reference](#reference)
- [License](#license)
- [Star History](#star-history)
<!-- /TOC -->
</details>
## Introduction
Linly-Talker is an innovative digital human conversation system that integrates the latest artificial intelligence technologies, including Large Language Models (LLM) ð¤, Automatic Speech Recognition (ASR) ðï¸, Text-to-Speech (TTS) ð£ï¸, and voice cloning technology ð¤. This system offers an interactive web interface through the Gradio platform ð, allowing users to upload images ð· and engage in personalized dialogues with AI ð¬.
The core features of the system include:
1. **Multi-Model Integration**: Linly-Talker combines major models such as Linly, GeminiPro, Qwen, as well as visual models like Whisper, SadTalker, to achieve high-quality dialogues and visual generation.
2. **Multi-Turn Conversational Ability**: Through the multi-turn dialogue system powered by GPT models, Linly-Talker can understand and maintain contextually relevant and coherent conversations, significantly enhancing the authenticity of the interaction.
3. **Voice Cloning**: Utilizing technologies like GPT-SoVITS, users can upload a one-minute voice sample for fine-tuning, and the system will clone the user's voice, enabling the digital human to converse in the user's voice.
4. **Real-Time Interaction**: The system supports real-time speech recognition and video captioning, allowing users to communicate naturally with the digital human via voice.
5. **Visual Enhancement**: With digital human generation technologies, Linly-Talker can create realistic digital human avatars, providing a more immersive experience.
The design philosophy of Linly-Talker is to create a new form of human-computer interaction that goes beyond simple Q&A. By integrating advanced technologies, it offers an intelligent digital human capable of understanding, responding to, and simulating human communication.

> [!NOTE]
>
> You can watch the demo video [here](https://www.bilibili.com/video/BV1rN4y1a76x/).
>
> I have recorded a series of videos on Bilibili, which also represent every step of my updates and methods of use. For detailed information, please refer to [Digital Human Dialogue System - Linly-Talker Collection](https://space.bilib

萧鼎
- 粉丝: 3w+
- 资源: 158
最新资源
- 【微信小程序源码】京东首页demo
- 《大闹天宫》动画美术风格中的中国传统元素分析_张星辉.caj
- VCU Simulink需求与功能开发文档:集成档位控制、ON Start启动、上下电管理、扭矩调控、能量优化与滑行回收的全方位控制系统需求说明,VCU Simulink需求与功能开发文档:集成档位控
- 基于COMSOL Multiphysics的三维岩石酸化过程模拟:探讨酸液在碳酸盐岩储层中的流动、传质与反应机制,利用COMSOL Multiphysics模拟三维岩石酸化过程:探讨酸液在碳酸盐岩储层
- 台达DVP PLC与西门子V20变频器通讯程序:可靠控制,自动化调整,接线与设置指南,台达DVP PLC与西门子V20变频器通讯程序:可靠控制,自动化调整,接线与设置指南,台达DVP PLC与3台西门
- 基于Python的Django-vue基于大数据的学习资源推送系统实现源码-说明文档-演示视频.zip
- PHP API 客户端,可让您与 deepseek API 进行交互 deepseek-php-client-2.0.3
- 【微信小程序源码】和茶网
- 自然启发MPPT优化技术,霜冰优化算法RIME在MPPT中对光伏局部遮阴情况的性能提升研究,霜冰算法RIME优化mppt,光伏mppt , 局部遮阴光伏mppt 2023年,H Su等人受到自然界霜冰
- 使用 PHP Deepseek 实现问答 ask-deepseek
- COMSOL Multiphysics中的comsol支架静态分析:基本原理、操作与结果分析,COMSOL Multiphysics中的comsol支架静态分析:基本原理、操作与结果分析,comsol
- 基于Python的Django-vue基于大数据的银行信用卡用户的数仓系统源码-说明文档-演示视频.zip
- 翱捷功能机常见空间问题的解决
- 西门子博途1500双驱同步编程实例分享,结构化编程、伺服同步运行、多用户权限登录,开发者必备的学习参考(版本v16),西门子博途V16全新双驱同步与三轴码垛程序:结构化编程框架,多用户权限控制,高值学
- DotSpatial库学习
- Delphi 12.5 控件之delphi实现腾讯签名算算法源代码.rar
资源上传下载、课程学习等过程中有任何疑问或建议,欢迎提出宝贵意见哦~我们会及时处理!
点击此处反馈


