Backend User Guide | Open LLM Vtuber

📄️ Backend Architecture Overview

Core Component Interaction Flow

📄️ Configuration file

This is the basic structure of the configuration file as of v1.0.0.

📄️ Speech Recognition (ASR)

Automatic Speech Recognition (ASR) converts the user's speech to text. This project supports multiple speech recognition model implementations.

📄️ Language Models (LLM)

This project supports multiple large language model backends and models.

📄️ Agent

An Agent is an LLM-based system that includes memory, tools, and a personality. The default option in the current version is basic_memory_agent.

📄️ Speech Synthesis (TTS)

After installing the required dependencies and filling in the engine's settings in conf.yaml, select the corresponding speech synthesis engine by setting the TTS_MODEL option in the same file.

📄️ Translation

Translation Feature

📄️ Docker Deployment

Due to significant refactoring, the Docker image has not yet been updated to v1.0.0, but an update is planned soon.

📄️ Character Settings & Prompts

The Open-LLM-VTuber project lets you modify the character's personality prompt and add multiple characters that can be switched in the frontend.

📄️ Remote Deployment, Non-Local Deployment, Mobile

If you are: