From f45441a9e5986c4ecbc2252387ed376b6f37426a Mon Sep 17 00:00:00 2001
From: Mendon Kissling <59585235+mendonk@users.noreply.github.com>
Date: Mon, 29 Sep 2025 17:09:53 -0400
Subject: [PATCH] slight-cleanup

---
 docs/docs/configure/configuration.md | 127 +++++++++++++++++----------
 1 file changed, 80 insertions(+), 47 deletions(-)

diff --git a/docs/docs/configure/configuration.md b/docs/docs/configure/configuration.md
index 2387c2ce..79a5226a 100644
--- a/docs/docs/configure/configuration.md
+++ b/docs/docs/configure/configuration.md
@@ -1,48 +1,20 @@
 ---
-title: Configuration
+title: Environment variables and configuration values
 slug: /configure/configuration
 ---
 
-# Configuration
-
 OpenRAG supports multiple configuration methods with the following priority:
 
 1. **Environment Variables** (highest priority)
 2. **Configuration File** (`config.yaml`)
-3. **Langflow Flow Settings** (runtime override)
-4. **Default Values** (fallback)
+3. **Default Values** (fallback)
 
-## Configuration File
+## Environment variables
 
-Create a `config.yaml` file in the project root to configure OpenRAG:
+Environment variables will override configuration file settings.
+You can create a `.env` file in the project root to set these variables.
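+
+For example, a minimal `.env` might look like the following sketch. The values are placeholders or the documented defaults; see the tables below for the full set of variables:
+
+```bash
+# Langflow flow IDs (see "Required variables")
+LANGFLOW_INGEST_FLOW_ID="replace-with-your-ingest-flow-id"
+NUDGES_FLOW_ID="replace-with-your-nudges-flow-id"
+
+# Connection settings (see "Optional variables"); the values shown are the defaults
+OPENSEARCH_HOST=localhost
+OPENSEARCH_PORT=9200
+LANGFLOW_URL=http://localhost:7860
+
+# Model provider settings (see "OpenRAG configuration variables")
+MODEL_PROVIDER=openai
+PROVIDER_API_KEY="replace-with-your-provider-api-key"
+```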
 
-```yaml
-# OpenRAG Configuration File
-provider:
-  model_provider: "openai" # openai, anthropic, azure, etc.
-  api_key: "your-api-key" # or use OPENAI_API_KEY env var
-
-knowledge:
-  embedding_model: "text-embedding-3-small"
-  chunk_size: 1000
-  chunk_overlap: 200
-  ocr: true
-  picture_descriptions: false
-
-agent:
-  llm_model: "gpt-4o-mini"
-  system_prompt: "You are a helpful AI assistant..."
-```
-
-## Environment Variables
-
-Environment variables will override configuration file settings. You can still use `.env` files:
-
-```bash
-cp .env.example .env
-```
-
-## Required Variables
+## Required variables
 
 | Variable | Description |
 | ----------------------------- | ------------------------------------------- |
@@ -54,7 +26,7 @@ cp .env.example .env
 | `LANGFLOW_INGEST_FLOW_ID` | ID of your Langflow ingestion flow |
 | `NUDGES_FLOW_ID` | ID of your Langflow nudges/suggestions flow |
 
-## Ingestion Configuration
+## Ingestion configuration
 
 | Variable | Description |
 | ------------------------------ | ------------------------------------------------------ |
@@ -63,10 +35,14 @@ cp .env.example .env
 - `false` or unset: Uses Langflow pipeline (upload → ingest → delete)
 - `true`: Uses traditional OpenRAG processor for document ingestion
 
-## Optional Variables
+## Optional variables
 
 | Variable | Description |
 | ------------------------------------------------------------------------- | ------------------------------------------------------------------ |
+| `OPENSEARCH_HOST` | OpenSearch host (default: `localhost`) |
+| `OPENSEARCH_PORT` | OpenSearch port (default: `9200`) |
+| `OPENSEARCH_USERNAME` | OpenSearch username (default: `admin`) |
+| `LANGFLOW_URL` | Langflow URL (default: `http://localhost:7860`) |
 | `LANGFLOW_PUBLIC_URL` | Public URL for Langflow (default: `http://localhost:7860`) |
 | `GOOGLE_OAUTH_CLIENT_ID` / `GOOGLE_OAUTH_CLIENT_SECRET` | Google OAuth authentication |
 | `MICROSOFT_GRAPH_OAUTH_CLIENT_ID` / `MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET` | Microsoft OAuth |
@@ -75,20 +51,27 @@ cp .env.example .env
 | `SESSION_SECRET` | Session management (default: auto-generated, change in production) |
 | `LANGFLOW_KEY` | Explicit Langflow API key (auto-generated if not provided) |
 | `LANGFLOW_SECRET_KEY` | Secret key for Langflow internal operations |
+| `DOCLING_OCR_ENGINE` | OCR engine for document processing |
+| `LANGFLOW_AUTO_LOGIN` | Enable auto-login for Langflow (default: `False`) |
+| `LANGFLOW_NEW_USER_IS_ACTIVE` | New users are active by default (default: `False`) |
+| `LANGFLOW_ENABLE_SUPERUSER_CLI` | Enable superuser CLI (default: `False`) |
+| `OPENRAG_DOCUMENTS_PATHS` | Document paths for ingestion (default: `./documents`) |
 
-## OpenRAG Configuration Variables
+## OpenRAG configuration variables
 
 These environment variables override settings in `config.yaml`:
 
-### Provider Settings
+### Provider settings
 
-| Variable | Description | Default |
-| ------------------ | ---------------------------------------- | -------- |
-| `MODEL_PROVIDER` | Model provider (openai, anthropic, etc.) | `openai` |
-| `PROVIDER_API_KEY` | API key for the model provider | |
-| `OPENAI_API_KEY` | OpenAI API key (backward compatibility) | |
+| Variable | Description | Default |
+| -------------------- | ---------------------------------------- | -------- |
+| `MODEL_PROVIDER` | Model provider (openai, anthropic, etc.) | `openai` |
+| `PROVIDER_API_KEY` | API key for the model provider | |
+| `PROVIDER_ENDPOINT` | Custom provider endpoint (e.g., Watson) | |
+| `PROVIDER_PROJECT_ID` | Project ID for providers (e.g., Watson) | |
+| `OPENAI_API_KEY` | OpenAI API key (backward compatibility) | |
 
-### Knowledge Settings
+### Knowledge settings
 
 | Variable | Description | Default |
 | ------------------------------ | --------------------------------------- | ------------------------ |
@@ -98,11 +81,61 @@ These environment variables override settings in `config.yaml`:
 | `OCR_ENABLED` | Enable OCR for image processing | `true` |
 | `PICTURE_DESCRIPTIONS_ENABLED` | Enable picture descriptions | `false` |
 
-### Agent Settings
+### Agent settings
 
 | Variable | Description | Default |
 | --------------- | --------------------------------- | ------------------------ |
 | `LLM_MODEL` | Language model for the chat agent | `gpt-4o-mini` |
-| `SYSTEM_PROMPT` | System prompt for the agent | Default assistant prompt |
+| `SYSTEM_PROMPT` | System prompt for the agent | "You are a helpful AI assistant with access to a knowledge base. Answer questions based on the provided context." |
 
-See `.env.example` for a complete list with descriptions, and `docker-compose*.yml` for runtime usage.
+See `docker-compose-*.yml` files for runtime usage examples.
+
+## Configuration file
+
+Create a `config.yaml` file in the project root to configure OpenRAG:
+
+```yaml
+# OpenRAG Configuration File
+provider:
+  model_provider: "openai" # openai, anthropic, azure, etc.
+  api_key: "your-api-key" # or use OPENAI_API_KEY env var
+  endpoint: "" # For custom provider endpoints (e.g., Watson/IBM)
+  project_id: "" # For providers that need project IDs (e.g., Watson/IBM)
+
+knowledge:
+  embedding_model: "text-embedding-3-small"
+  chunk_size: 1000
+  chunk_overlap: 200
+  doclingPresets: "standard" # standard, ocr, picture_description, VLM
+  ocr: true
+  picture_descriptions: false
+
+agent:
+  llm_model: "gpt-4o-mini"
+  system_prompt: "You are a helpful AI assistant with access to a knowledge base. Answer questions based on the provided context."
+```
+
+## Default values and fallbacks
+
+When no environment variables or configuration file values are provided, OpenRAG uses default values.
+These values can be found in the codebase at the following locations.
+
+### OpenRAG configuration defaults
+
+These values are defined in `src/config/config_manager.py`.
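+
+For example, with the built-in defaults above, the precedence order resolves like the following sketch (assuming the usual mapping between an environment variable and its `config.yaml` counterpart, such as `MODEL_PROVIDER` and `provider.model_provider`):
+
+```bash
+# 1. A value set in the environment wins.
+# 2. Otherwise, the value from config.yaml is used.
+# 3. Otherwise, the built-in default applies.
+export MODEL_PROVIDER=anthropic   # overrides provider.model_provider and the `openai` default
+export OCR_ENABLED=false          # overrides knowledge.ocr and the `true` default
+```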
+
+### System configuration defaults
+
+These fallback values are defined in `src/config/settings.py`.
+
+### TUI default values
+
+These values are defined in `src/tui/managers/env_manager.py`.
+
+### Frontend default values
+
+These values are defined in `frontend/src/lib/constants.ts`.
+
+### Docling preset configurations
+
+These values are defined in `src/api/settings.py`.
\ No newline at end of file