openrag/docs/docs/reference/configuration.mdx
2025-09-30 18:16:12 -04:00


---
title: Environment variables and configuration values
slug: /reference/configuration
---
OpenRAG supports multiple configuration methods with the following priority, from highest to lowest:
1. [Configuration file (`config.yaml`)](#openrag-config-variables) - The `config.yaml` file is generated from the values entered during [application onboarding](/install#application-onboarding), and controls the OpenRAG configuration variables.
2. [Environment variables](#environment-variables) - Environment variables control how OpenRAG connects to services. Environment variables in the `.env` file control underlying services such as Langflow authentication, OAuth settings, and OpenSearch security.
3. [Langflow runtime overrides](#langflow-runtime-overrides)
4. [Default or fallback values](#default-values-and-fallbacks)
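The priority order above can be pictured as a layered lookup in which the first source that defines a key wins. The following is a minimal sketch, assuming each source has already been loaded into a plain dictionary. The `resolve_settings` helper and the sample values are hypothetical and are not taken from the OpenRAG code base.

```python
from collections import ChainMap


def resolve_settings(config_file, env_vars, runtime_overrides, defaults):
    """Return a read-through view where earlier sources shadow later ones.

    Hypothetical helper: the argument order mirrors the documented
    priority list, from highest to lowest.
    """
    return ChainMap(config_file, env_vars, runtime_overrides, defaults)


# Illustrative values only.
settings = resolve_settings(
    config_file={"LLM_MODEL": "gpt-4o"},
    env_vars={"LLM_MODEL": "gpt-4o-mini", "CHUNK_SIZE": "500"},
    runtime_overrides={},
    defaults={
        "LLM_MODEL": "gpt-4o-mini",
        "CHUNK_SIZE": "1000",
        "CHUNK_OVERLAP": "200",
    },
)

print(settings["LLM_MODEL"])      # gpt-4o (config file wins)
print(settings["CHUNK_SIZE"])     # 500 (environment variable wins)
print(settings["CHUNK_OVERLAP"])  # 200 (falls through to the default)
```

`ChainMap` searches its mappings from left to right, so a key only falls through to a lower-priority source when no higher-priority source defines it.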
## OpenRAG configuration variables {#openrag-config-variables}
These values control what the OpenRAG application does.
### Provider settings
| Variable | Description | Default |
| -------------------- | ---------------------------------------- | -------- |
| `MODEL_PROVIDER` | Model provider (openai, anthropic, etc.) | `openai` |
| `PROVIDER_API_KEY` | API key for the model provider. | |
| `PROVIDER_ENDPOINT` | Custom provider endpoint. Only used for IBM or Ollama providers. | |
| `PROVIDER_PROJECT_ID`| Project ID for providers. Only required for the IBM watsonx.ai provider. | |
| `OPENAI_API_KEY` | OpenAI API key. | |
### Knowledge settings
| Variable | Description | Default |
| ------------------------------ | --------------------------------------- | ------------------------ |
| `EMBEDDING_MODEL` | Embedding model for vector search. | `text-embedding-3-small` |
| `CHUNK_SIZE` | Text chunk size for document processing. | `1000` |
| `CHUNK_OVERLAP` | Overlap between chunks. | `200` |
| `OCR_ENABLED` | Enable OCR for image processing. | `true` |
| `PICTURE_DESCRIPTIONS_ENABLED` | Enable picture descriptions. | `false` |
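To illustrate how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, the sketch below splits text with a sliding window. It assumes both values are measured in characters, which this page does not state, and the `chunk_text` function is hypothetical rather than OpenRAG's actual splitter.

```python
def chunk_text(text, size=1000, overlap=200):
    """Split text into windows of `size` characters that share `overlap`
    characters with the previous window (assumed character-based)."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")

    step = size - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(chunk) for chunk in chunks])  # [1000, 1000, 900]
```

With the default values, each window advances by 800 characters, so the last 200 characters of one chunk are repeated at the start of the next.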
### Agent settings
| Variable | Description | Default |
| --------------- | --------------------------------- | ------------------------ |
| `LLM_MODEL` | Language model for the chat agent. | `gpt-4o-mini` |
| `SYSTEM_PROMPT` | System prompt for the agent. | "You are a helpful AI assistant with access to a knowledge base. Answer questions based on the provided context." |
## Environment variables
### Required variables
| Variable | Description |
| ----------------------------- | ------------------------------------------- |
| `OPENAI_API_KEY` | Your OpenAI API key |
| `OPENSEARCH_PASSWORD` | Password for OpenSearch admin user |
| `LANGFLOW_SUPERUSER` | Langflow admin username |
| `LANGFLOW_SUPERUSER_PASSWORD` | Langflow admin password |
| `LANGFLOW_CHAT_FLOW_ID` | This value is pre-filled. The default value is found in [.env.example](https://github.com/langflow-ai/openrag/blob/main/.env.example). |
| `LANGFLOW_INGEST_FLOW_ID` | This value is pre-filled. The default value is found in [.env.example](https://github.com/langflow-ai/openrag/blob/main/.env.example). |
| `NUDGES_FLOW_ID` | This value is pre-filled. The default value is found in [.env.example](https://github.com/langflow-ai/openrag/blob/main/.env.example). |
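A startup check can catch a missing required variable before any service is contacted. The sketch below is one way to do that, assuming the variables are read from the process environment. The `missing_required` helper is hypothetical and is not part of OpenRAG.

```python
import os

# Names taken from the required variables table.
REQUIRED_VARIABLES = (
    "OPENAI_API_KEY",
    "OPENSEARCH_PASSWORD",
    "LANGFLOW_SUPERUSER",
    "LANGFLOW_SUPERUSER_PASSWORD",
    "LANGFLOW_CHAT_FLOW_ID",
    "LANGFLOW_INGEST_FLOW_ID",
    "NUDGES_FLOW_ID",
)


def missing_required(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper; pass a dict to check something other than
    the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_required()
    if missing:
        print("Missing required variables: " + ", ".join(missing))
```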
### Ingestion configuration
| Variable | Description |
| ------------------------------ | ------------------------------------------------------ |
| `DISABLE_INGEST_WITH_LANGFLOW` | Disable Langflow ingestion pipeline. Default: `false`. |
- `false` or unset: Uses Langflow pipeline (upload → ingest → delete).
- `true`: Uses traditional OpenRAG processor for document ingestion.
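The flag behavior described above can be sketched as a small boolean parser. This is an illustration only: the set of spellings treated as true is an assumption, and the `langflow_ingest_enabled` helper is hypothetical.

```python
import os

# Assumption: these spellings count as "true"; OpenRAG may accept others.
TRUTHY = {"1", "true", "yes", "on"}


def langflow_ingest_enabled(env=None):
    """Return True when the Langflow pipeline should handle ingestion.

    The Langflow pipeline is used unless DISABLE_INGEST_WITH_LANGFLOW
    is set to a truthy value.
    """
    env = os.environ if env is None else env
    raw = env.get("DISABLE_INGEST_WITH_LANGFLOW", "false")
    return raw.strip().lower() not in TRUTHY


print(langflow_ingest_enabled({}))  # True (unset behaves like false)
print(langflow_ingest_enabled({"DISABLE_INGEST_WITH_LANGFLOW": "true"}))  # False
```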
### Optional variables
| Variable | Description |
| ------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| `OPENSEARCH_HOST` | OpenSearch host (default: `localhost`) |
| `OPENSEARCH_PORT` | OpenSearch port (default: `9200`) |
| `OPENSEARCH_USERNAME` | OpenSearch username (default: `admin`) |
| `LANGFLOW_URL` | Langflow URL (default: `http://localhost:7860`) |
| `LANGFLOW_PUBLIC_URL` | Public URL for Langflow (default: `http://localhost:7860`) |
| `GOOGLE_OAUTH_CLIENT_ID` / `GOOGLE_OAUTH_CLIENT_SECRET` | Client ID and secret for Google OAuth authentication |
| `MICROSOFT_GRAPH_OAUTH_CLIENT_ID` / `MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET` | Client ID and secret for Microsoft Graph OAuth authentication |
| `WEBHOOK_BASE_URL` | Base URL for webhook endpoints |
| `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` | Credentials for AWS integrations |
| `SESSION_SECRET` | Secret for session management (auto-generated by default; set your own value in production) |
| `LANGFLOW_KEY` | Explicit Langflow API key (auto-generated if not provided) |
| `LANGFLOW_SECRET_KEY` | Secret key for Langflow internal operations |
| `DOCLING_OCR_ENGINE` | OCR engine for document processing |
| `LANGFLOW_AUTO_LOGIN` | Enable auto-login for Langflow (default: `False`) |
| `LANGFLOW_NEW_USER_IS_ACTIVE` | New users are active by default (default: `False`) |
| `LANGFLOW_ENABLE_SUPERUSER_CLI` | Enable superuser CLI (default: `False`) |
| `OPENRAG_DOCUMENTS_PATHS` | Document paths for ingestion (default: `./documents`) |
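Optional variables fall back to the documented defaults when they are unset. The sketch below shows that pattern for the OpenSearch connection values. The `OpenSearchSettings` class is hypothetical and only mirrors the defaults listed in the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenSearchSettings:
    """Hypothetical container for the OpenSearch connection values."""

    host: str
    port: int
    username: str

    @classmethod
    def from_env(cls, env=None):
        """Read each value from the environment, using the documented
        default when the variable is unset."""
        env = os.environ if env is None else env
        return cls(
            host=env.get("OPENSEARCH_HOST", "localhost"),
            port=int(env.get("OPENSEARCH_PORT", "9200")),
            username=env.get("OPENSEARCH_USERNAME", "admin"),
        )


print(OpenSearchSettings.from_env({}))
print(OpenSearchSettings.from_env({"OPENSEARCH_HOST": "search.internal"}))
```

Converting the port to an `int` at load time surfaces a malformed value immediately instead of at connection time.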
## Langflow runtime overrides
Langflow runtime overrides allow you to modify component settings at runtime without changing the base configuration.
Runtime overrides are implemented through **tweaks** - parameter modifications that are passed to specific Langflow components during flow execution.
For more information on tweaks, see [Input schema (tweaks)](https://docs.langflow.org/concepts-publish#input-schema).
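As a rough illustration, a tweak is a nested mapping from a component ID to the parameters to override, sent alongside the input when a flow is run. The sketch below only builds such a request body and does not send it. The component ID `OpenAIModel-abc12` and the `temperature` parameter are hypothetical placeholders, and the exact fields accepted depend on your Langflow version and flow.

```python
import json


def build_run_payload(message, tweaks):
    """Build a request body for running a Langflow flow with tweaks.

    `tweaks` maps a component ID to a dict of parameter overrides.
    Field names follow the Langflow run API as the author understands
    it; verify them against the Langflow documentation.
    """
    return {
        "input_value": message,
        "input_type": "chat",
        "output_type": "chat",
        "tweaks": tweaks,
    }


payload = build_run_payload(
    "What is OpenRAG?",
    {"OpenAIModel-abc12": {"temperature": 0.2}},  # hypothetical component ID
)
print(json.dumps(payload, indent=2))
```

Because the overrides travel with each request, the flow's saved configuration stays unchanged.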
## Default values and fallbacks
When no environment variables or configuration file values are provided, OpenRAG uses default values.
These values can be found in the code base at the following locations.
### OpenRAG configuration defaults
These values are defined in `src/config/config_manager.py`.
### System configuration defaults
These fallback values are defined in `src/config/settings.py`.
### TUI default values
These values are defined in `src/tui/managers/env_manager.py`.
### Frontend default values
These values are defined in `frontend/src/lib/constants.ts`.
### Docling preset configurations
These values are defined in `src/api/settings.py`.