From f45441a9e5986c4ecbc2252387ed376b6f37426a Mon Sep 17 00:00:00 2001
From: Mendon Kissling <59585235+mendonk@users.noreply.github.com>
Date: Mon, 29 Sep 2025 17:09:53 -0400
Subject: [PATCH] slight-cleanup

---
 docs/docs/configure/configuration.md | 127 +++++++++++++++++----------
 1 file changed, 80 insertions(+), 47 deletions(-)

diff --git a/docs/docs/configure/configuration.md b/docs/docs/configure/configuration.md
index 2387c2ce..79a5226a 100644
--- a/docs/docs/configure/configuration.md
+++ b/docs/docs/configure/configuration.md
@@ -1,48 +1,20 @@
 ---
-title: Configuration
+title: Environment variables and configuration values
 slug: /configure/configuration
 ---
 
-# Configuration
-
 OpenRAG supports multiple configuration methods with the following priority:
 
 1. **Environment Variables** (highest priority)
 2. **Configuration File** (`config.yaml`)
-3. **Langflow Flow Settings** (runtime override)
-4. **Default Values** (fallback)
+3. **Default Values** (fallback)
 
-## Configuration File
+## Environment variables
 
-Create a `config.yaml` file in the project root to configure OpenRAG:
+Environment variables will override configuration file settings.
+You can create a `.env` file in the project root to set these variables.
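+
+For example, a minimal `.env` might look like the following sketch. The values are placeholders or the documented defaults; see the tables below for the full set of variables:
+
+```bash
+# Langflow flow IDs (see "Required variables")
+LANGFLOW_INGEST_FLOW_ID="replace-with-your-ingest-flow-id"
+NUDGES_FLOW_ID="replace-with-your-nudges-flow-id"
+
+# Connection settings (see "Optional variables"); the values shown are the defaults
+OPENSEARCH_HOST=localhost
+OPENSEARCH_PORT=9200
+LANGFLOW_URL=http://localhost:7860
+
+# Model provider settings (see "OpenRAG configuration variables")
+MODEL_PROVIDER=openai
+PROVIDER_API_KEY="replace-with-your-provider-api-key"
+```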
 
-```yaml
-# OpenRAG Configuration File
-provider:
-  model_provider: "openai" # openai, anthropic, azure, etc.
-  api_key: "your-api-key" # or use OPENAI_API_KEY env var
-
-knowledge:
-  embedding_model: "text-embedding-3-small"
-  chunk_size: 1000
-  chunk_overlap: 200
-  ocr: true
-  picture_descriptions: false
-
-agent:
-  llm_model: "gpt-4o-mini"
-  system_prompt: "You are a helpful AI assistant..."
-```
-
-## Environment Variables
-
-Environment variables will override configuration file settings. You can still use `.env` files:
-
-```bash
-cp .env.example .env
-```
-
-## Required Variables
+## Required variables
 
 | Variable | Description |
 | ----------------------------- | ------------------------------------------- |
@@ -54,7 +26,7 @@ cp .env.example .env
 | `LANGFLOW_INGEST_FLOW_ID` | ID of your Langflow ingestion flow |
 | `NUDGES_FLOW_ID` | ID of your Langflow nudges/suggestions flow |
 
-## Ingestion Configuration
+## Ingestion configuration
 
 | Variable | Description |
 | ------------------------------ | ------------------------------------------------------ |
@@ -63,10 +35,14 @@ cp .env.example .env
 - `false` or unset: Uses Langflow pipeline (upload → ingest → delete)
 - `true`: Uses traditional OpenRAG processor for document ingestion
 
-## Optional Variables
+## Optional variables
 
 | Variable | Description |
 | ------------------------------------------------------------------------- | ------------------------------------------------------------------ |
+| `OPENSEARCH_HOST` | OpenSearch host (default: `localhost`) |
+| `OPENSEARCH_PORT` | OpenSearch port (default: `9200`) |
+| `OPENSEARCH_USERNAME` | OpenSearch username (default: `admin`) |
+| `LANGFLOW_URL` | Langflow URL (default: `http://localhost:7860`) |
 | `LANGFLOW_PUBLIC_URL` | Public URL for Langflow (default: `http://localhost:7860`) |
 | `GOOGLE_OAUTH_CLIENT_ID` / `GOOGLE_OAUTH_CLIENT_SECRET` | Google OAuth authentication |
 | `MICROSOFT_GRAPH_OAUTH_CLIENT_ID` / `MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET` | Microsoft OAuth |
@@ -75,20 +51,27 @@ cp .env.example .env
 | `SESSION_SECRET` | Session management (default: auto-generated, change in production) |
 | `LANGFLOW_KEY` | Explicit Langflow API key (auto-generated if not provided) |
 | `LANGFLOW_SECRET_KEY` | Secret key for Langflow internal operations |
+| `DOCLING_OCR_ENGINE` | OCR engine for document processing |
+| `LANGFLOW_AUTO_LOGIN` | Enable auto-login for Langflow (default: `False`) |
+| `LANGFLOW_NEW_USER_IS_ACTIVE` | New users are active by default (default: `False`) |
+| `LANGFLOW_ENABLE_SUPERUSER_CLI` | Enable superuser CLI (default: `False`) |
+| `OPENRAG_DOCUMENTS_PATHS` | Document paths for ingestion (default: `./documents`) |
 
-## OpenRAG Configuration Variables
+## OpenRAG configuration variables
 
 These environment variables override settings in `config.yaml`:
 
-### Provider Settings
+### Provider settings
 
-| Variable | Description | Default |
-| ------------------ | ---------------------------------------- | -------- |
-| `MODEL_PROVIDER` | Model provider (openai, anthropic, etc.) | `openai` |
-| `PROVIDER_API_KEY` | API key for the model provider | |
-| `OPENAI_API_KEY` | OpenAI API key (backward compatibility) | |
+| Variable | Description | Default |
+| -------------------- | ---------------------------------------- | -------- |
+| `MODEL_PROVIDER` | Model provider (openai, anthropic, etc.) | `openai` |
+| `PROVIDER_API_KEY` | API key for the model provider | |
+| `PROVIDER_ENDPOINT` | Custom provider endpoint (e.g., Watson) | |
+| `PROVIDER_PROJECT_ID` | Project ID for providers (e.g., Watson) | |
+| `OPENAI_API_KEY` | OpenAI API key (backward compatibility) | |
 
-### Knowledge Settings
+### Knowledge settings
 
 | Variable | Description | Default |
 | ------------------------------ | --------------------------------------- | ------------------------ |
@@ -98,11 +81,61 @@ These environment variables override settings in `config.yaml`:
 | `OCR_ENABLED` | Enable OCR for image processing | `true` |
 | `PICTURE_DESCRIPTIONS_ENABLED` | Enable picture descriptions | `false` |
 
-### Agent Settings
+### Agent settings
 
 | Variable | Description | Default |
 | --------------- | --------------------------------- | ------------------------ |
 | `LLM_MODEL` | Language model for the chat agent | `gpt-4o-mini` |
-| `SYSTEM_PROMPT` | System prompt for the agent | Default assistant prompt |
+| `SYSTEM_PROMPT` | System prompt for the agent | "You are a helpful AI assistant with access to a knowledge base. Answer questions based on the provided context." |
 
-See `.env.example` for a complete list with descriptions, and `docker-compose*.yml` for runtime usage.
+See `docker-compose-*.yml` files for runtime usage examples.
+
+## Configuration file
+
+Create a `config.yaml` file in the project root to configure OpenRAG:
+
+```yaml
+# OpenRAG Configuration File
+provider:
+  model_provider: "openai" # openai, anthropic, azure, etc.
+  api_key: "your-api-key" # or use OPENAI_API_KEY env var
+  endpoint: "" # For custom provider endpoints (e.g., Watson/IBM)
+  project_id: "" # For providers that need project IDs (e.g., Watson/IBM)
+
+knowledge:
+  embedding_model: "text-embedding-3-small"
+  chunk_size: 1000
+  chunk_overlap: 200
+  doclingPresets: "standard" # standard, ocr, picture_description, VLM
+  ocr: true
+  picture_descriptions: false
+
+agent:
+  llm_model: "gpt-4o-mini"
+  system_prompt: "You are a helpful AI assistant with access to a knowledge base. Answer questions based on the provided context."
+```
+
+## Default values and fallbacks
+
+When no environment variables or configuration file values are provided, OpenRAG uses default values.
+These values can be found in the codebase at the following locations.
+
+### OpenRAG configuration defaults
+
+These values are defined in `src/config/config_manager.py`.
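+
+For example, with the built-in defaults above, the precedence order resolves like the following sketch (assuming the usual mapping between an environment variable and its `config.yaml` counterpart, such as `MODEL_PROVIDER` and `provider.model_provider`):
+
+```bash
+# 1. A value set in the environment wins.
+# 2. Otherwise, the value from config.yaml is used.
+# 3. Otherwise, the built-in default applies.
+export MODEL_PROVIDER=anthropic   # overrides provider.model_provider and the `openai` default
+export OCR_ENABLED=false          # overrides knowledge.ocr and the `true` default
+```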
+
+### System configuration defaults
+
+These fallback values are defined in `src/config/settings.py`.
+
+### TUI default values
+
+These values are defined in `src/tui/managers/env_manager.py`.
+
+### Frontend default values
+
+These values are defined in `frontend/src/lib/constants.ts`.
+
+### Docling preset configurations
+
+These values are defined in `src/api/settings.py`.
\ No newline at end of file