Backend Architecture Overview
Core Component Interaction Flow
Configuration File
This is the basic structure of the configuration file as of v1.0.0.
Speech Recognition (ASR)
Automatic Speech Recognition (ASR) converts the user's speech into text. This project supports multiple speech recognition models.
Language Models (LLM)
This project supports multiple large language model backends and models.
Agent
An Agent is an LLM system that includes memory, tools, and personality. The default option in the current version is basicmemoryagent.
Speech Synthesis (TTS)
After installing the required dependencies and configuring conf.yaml, enable the corresponding speech synthesis engine by modifying the TTS_MODEL option in conf.yaml.
Translation
Translation Feature
Docker Deployment
Due to significant refactoring, the Docker image has not yet been updated to v1.0.0, but it will be updated soon.