Use a single config.yaml file to configure Memobase Backend.
Full Explanation of config.yaml
We use a single config.yaml file as the source of configuration for Memobase Backend.
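To illustrate, here is a minimal sketch of a config.yaml, assembled from the defaults documented below. The only required field is `llm_api_key`; the key shown is a placeholder.

```yaml
# Minimal example config.yaml (values are the documented defaults,
# except llm_api_key, which you must replace with your own key)
llm_api_key: YOUR_LLM_API_KEY
best_llm_model: gpt-4o-mini
language: en
max_chat_blob_buffer_token_size: 1024
max_profile_subtopics: 15
enable_event_embedding: true
embedding_model: text-embedding-3-small
telemetry_deployment_environment: local
```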
Configuration Categories
Storage and Performance
- `persistent_chat_blobs`: boolean, default to `false`. If set to `true`, the chat blobs will be persisted in the database.
- `buffer_flush_interval`: int, default to `3600` (1 hour). Controls how frequently the chat buffer is flushed to persistent storage.
- `max_chat_blob_buffer_token_size`: int, default to `1024`. This is the parameter to control the buffer size of Memobase. Larger numbers lower your LLM cost but increase profile update lag.
- `max_profile_subtopics`: int, default to `15`. The maximum number of subtopics one topic can have. When a topic has more than this, it will trigger a re-organization.
- `max_pre_profile_token_size`: int, default to `128`. The maximum token size of one profile slot. When a profile slot is larger, it will trigger a re-summary.
- `cache_user_profiles_ttl`: int, default to `1200` (20 minutes). Time-to-live for cached user profiles, in seconds.
- `llm_tab_separator`: string, default to `"::"`. The separator used for tabs in LLM communications.
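In config.yaml, these settings would look like the following sketch (values shown are the documented defaults):

```yaml
# Storage and performance settings
persistent_chat_blobs: false
buffer_flush_interval: 3600            # seconds; flush the chat buffer hourly
max_chat_blob_buffer_token_size: 1024  # larger = lower LLM cost, more update lag
max_profile_subtopics: 15
max_pre_profile_token_size: 128
cache_user_profiles_ttl: 1200          # seconds; 20 minutes
llm_tab_separator: "::"
```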
Timezone Configuration
- `use_timezone`: string, default to `null`. Options include `"UTC"`, `"America/New_York"`, `"Europe/London"`, `"Asia/Tokyo"`, and `"Asia/Shanghai"`. If not set, the system's local timezone is used.
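For example, to pin Memobase to US Eastern time rather than the host's local timezone:

```yaml
use_timezone: "America/New_York"  # omit or set to null to use the system timezone
```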
LLM Configuration
- `language`: string, default to `"en"`, available options `{"en", "zh"}`. The prompt language of Memobase.
- `llm_style`: string, default to `"openai"`, available options `{"openai", "doubao_cache"}`. The LLM provider style.
- `llm_base_url`: string, default to `null`. The base URL of any OpenAI-compatible API.
- `llm_api_key`: string, required. Your LLM API key.
- `llm_openai_default_query`: dictionary, default to `null`. Default query parameters for OpenAI API calls.
- `llm_openai_default_header`: dictionary, default to `null`. Default headers for OpenAI API calls.
- `best_llm_model`: string, default to `"gpt-4o-mini"`. The AI model to use for primary functions.
- `summary_llm_model`: string, default to `null`. The AI model to use for summarization. If not specified, falls back to `best_llm_model`.
- `system_prompt`: string, default to `null`. Custom system prompt for the LLM.
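A sketch of an LLM section pointing at an OpenAI-compatible endpoint. The base URL and key below are placeholders, not defaults:

```yaml
# LLM settings; only llm_api_key is required
llm_style: openai
llm_base_url: https://api.openai.com/v1  # any OpenAI-compatible endpoint works
llm_api_key: YOUR_LLM_API_KEY
best_llm_model: gpt-4o-mini
summary_llm_model: gpt-4o-mini           # optional; falls back to best_llm_model
language: en
```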
Embedding Configuration
- `enable_event_embedding`: boolean, default to `true`. Whether to enable event embedding.
- `embedding_provider`: string, default to `"openai"`, available options `{"openai", "jina"}`. The embedding provider to use.
- `embedding_api_key`: string, default to `null`. If not specified and the provider is OpenAI, falls back to `llm_api_key`.
- `embedding_base_url`: string, default to `null`. For Jina, defaults to `"https://api.jina.ai/v1"` if not specified.
- `embedding_dim`: int, default to `1536`. The dimension size of the embeddings.
- `embedding_model`: string, default to `"text-embedding-3-small"`. For Jina, must be `"jina-embeddings-v3"`.
- `embedding_max_token_size`: int, default to `8192`. Maximum token size for text to be embedded.
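For instance, switching the embedding provider to Jina might look like this sketch (the API key is a placeholder, and the `embedding_dim` value is an assumption you should match to your model's actual output dimension):

```yaml
# Embedding settings for the Jina provider
enable_event_embedding: true
embedding_provider: jina
embedding_api_key: YOUR_JINA_API_KEY
embedding_model: jina-embeddings-v3  # required model name for Jina
embedding_dim: 1024                  # assumption: set to your model's dimension
```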
Profile Configuration
Check what a profile is in Memobase here.
- `additional_user_profiles`: list, default to `[]`. Add additional user profiles. Each profile should have a `topic` and a list of `sub_topics`.
  - For `topic`, it must have a `topic` field and optionally a `description` field.
  - For each `sub_topic`, it must have a `name` field (or just be a string) and optionally a `description` field.
- `overwrite_user_profiles`: list, default to `null`. Format is the same as `additional_user_profiles`. Memobase has built-in profile slots like `work_title`, `name`, etc. For full control of the slots, use this parameter: the final profile slots will be only those defined here.
- `profile_strict_mode`: boolean, default to `false`. Enforces strict validation of profile structure.
- `profile_validate_mode`: boolean, default to `true`. Enables validation of profile data.
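Putting the profile structure described above together, a sketch of a custom profile slot might look like this (the topic and sub-topic names are illustrative, not built-in slots):

```yaml
# Custom profile slots; topic/sub_topic names here are examples only
additional_user_profiles:
  - topic: "Gaming"
    description: "The user's gaming preferences"    # description is optional
    sub_topics:
      - name: "Favorite Games"
        description: "Games the user plays often"   # description is optional
      - "Play Style"                                # a sub_topic can be a plain string
profile_strict_mode: false
profile_validate_mode: true
```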
Summary Configuration
- `minimum_chats_token_size_for_event_summary`: int, default to `256`. Minimum token size required to trigger an event summary.
- `event_tags`: list, default to `[]`. Custom event tags for classification.
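In config.yaml, the summary settings look like this sketch (values shown are the documented defaults):

```yaml
# Summary settings
minimum_chats_token_size_for_event_summary: 256
event_tags: []  # add custom tags here to classify events
```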
Telemetry Configuration
- `telemetry_deployment_environment`: string, default to `"local"`. The deployment environment identifier for telemetry.
Environment Variable Overrides
All configuration values can be overridden using environment variables. The naming convention is to prefix the configuration field name with `MEMOBASE_` and convert it to uppercase.
For example, to override the llm_api_key configuration:
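Following the naming convention above (`MEMOBASE_` prefix, uppercased field name), a sketch of the override might be (the key value is a placeholder):

```shell
# Override llm_api_key from config.yaml with an environment variable
export MEMOBASE_LLM_API_KEY="sk-your-key-here"
```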
This is useful for:
- Keeping sensitive information like API keys out of configuration files
- Deploying to different environments (development, staging, production)
- Containerized deployments where environment variables are the preferred configuration method