
Linly-Talker is a digital human dialog system that combines Large Language Models (LLMs) with visual models to offer a new form of human-computer interaction. It integrates technologies such as Whisper, Linly, Microsoft Speech Services, and the SadTalker talking-head generation system to provide a realistic digital human conversation experience. Users can upload an image to converse with, and a multi-round dialog system makes the exchange more interactive and realistic. The project was developed by Kedreamix and is open-sourced on GitHub for developers and researchers to use and improve.
Linly-Talker Features
- Multi-Model Integration: Linly-Talker integrates large language models such as Linly, GeminiPro, and Qwen with speech and visual models such as Whisper and SadTalker, enabling high-quality dialog and visual generation.
- Multi-Round Dialog Capability: With a GPT-based multi-round dialog system, Linly-Talker keeps track of context and maintains coherent, contextually relevant conversations, which makes the interaction more realistic.
- Voice Cloning: Using technologies such as GPT-SoVITS, users can upload a one-minute voice sample for fine-tuning, and the system clones the user's voice so that the digital human can speak in it.
- Real-Time Interaction: The system supports real-time speech recognition and video captioning, so users can talk naturally with the digital human by voice.
- Visual Enhancement: Using digital human generation technology, Linly-Talker produces realistic digital human visuals for a more immersive experience.
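
To make the features above more concrete, here is a minimal Python sketch of how one turn of such a multi-round pipeline could be wired together: speech recognition, an LLM reply based on the conversation history, speech synthesis, and talking-head animation. This is not Linly-Talker's actual code or API. The `DialogSession` class, its field names, and the stand-in lambda stages are all hypothetical and exist only to show the data flow; in a real setup each stage would wrap a model such as Whisper, an LLM, a TTS engine, or SadTalker.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# NOTE: every name here is hypothetical and for illustration only;
# none of it is taken from the Linly-Talker code base.
Message = Tuple[str, str]  # (role, text)


@dataclass
class DialogSession:
    asr: Callable[[bytes], str]               # speech audio -> text
    llm: Callable[[List[Message]], str]       # full history -> reply text
    tts: Callable[[str], bytes]               # reply text -> speech audio
    animate: Callable[[bytes, bytes], bytes]  # (portrait, audio) -> video
    history: List[Message] = field(default_factory=list)

    def turn(self, portrait: bytes, user_audio: bytes) -> bytes:
        """Run one dialog round and return the generated video bytes."""
        user_text = self.asr(user_audio)
        self.history.append(("user", user_text))
        # The LLM sees the whole history, which is what keeps
        # multi-round conversations contextually coherent.
        reply = self.llm(self.history)
        self.history.append(("assistant", reply))
        speech = self.tts(reply)
        return self.animate(portrait, speech)


# Stand-in stages so the sketch runs without any models installed.
session = DialogSession(
    asr=lambda audio: audio.decode("utf-8"),
    llm=lambda history: f"reply to message {len(history)}",
    tts=lambda text: text.encode("utf-8"),
    animate=lambda image, audio: image + b"|" + audio,
)
video = session.turn(b"portrait", b"hello")
```

Because the history is stored on the session, a second call to `session.turn(...)` passes both earlier messages to the LLM stage along with the new one, which is the essence of the multi-round behavior described above.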
Official project page (GitHub): https://github.com/Kedreamix/Linly-Talker