openrag/docs/docs/reference/configuration.mdx
2025-12-15 16:04:30 -08:00

178 lines
No EOL
13 KiB
Text

---
title: Environment variables
slug: /reference/configuration
---
import PartialDockerComposeUp from '@site/docs/_partial-docker-compose-up.mdx';
OpenRAG recognizes environment variables from the following sources:
* [Environment variables](#configure-environment-variables): Values set in the `.env` file in the OpenRAG installation directory.
* [Langflow runtime overrides](#langflow-runtime-overrides): Langflow components can set environment variables at runtime.
* [Default or fallback values](#default-values-and-fallbacks): These values are default or fallback values if OpenRAG doesn't find a value.
## Configure environment variables
Environment variables are set in a `.env` file in the root of your OpenRAG project directory.
For an example `.env` file, see [`.env.example` in the OpenRAG repository](https://github.com/langflow-ai/openrag/blob/main/.env.example).
The Docker Compose files are populated with values from your `.env`, so you don't need to edit the Docker Compose files manually.
Environment variables always take precedence over other variables.
### Set environment variables {#set-environment-variables}
Environment variables are either mutable or immutable.
If you edit mutable environment variables, you can apply the changes by stopping and restarting the OpenRAG services after editing the `.env` file:
1. [Stop the OpenRAG services](/manage-services).
2. Edit your `.env` file.
3. [Restart the OpenRAG services](/manage-services).
If you edit immutable environment variables, you must [redeploy OpenRAG](/reinstall) with your modified `.env` file.
For example, with self-managed services, do the following:
1. Stop the deployment:
```bash title="Docker"
docker compose down
```
```bash title="Podman"
podman compose down
```
2. Edit your `.env` file.
3. Redeploy OpenRAG:
<PartialDockerComposeUp />
4. Restart the Docling service.
5. Launch the OpenRAG app, and then repeat the [application onboarding process](/install#application-onboarding). The values in your `.env` file are automatically populated.
## Supported environment variables
All OpenRAG configuration can be controlled through environment variables.
### Model provider settings {#model-provider-settings}
Configure which models and providers OpenRAG uses to generate text and embeddings.
You only need to provide credentials for the providers you are using in OpenRAG.
These variables are initially set during the [application onboarding process](/install#application-onboarding).
Some of these variables are immutable and can only be changed by redeploying OpenRAG, as explained in [Set environment variables](#set-environment-variables).
| Variable | Default | Description |
|----------|---------|-------------|
| `EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model for generating vector embeddings for documents in the knowledge base and similarity search queries. Can be changed after the application onboarding process. Accepts one or more models. |
| `LLM_MODEL` | `gpt-4o-mini` | Language model for language processing and text generation in the **Chat** feature. |
| `MODEL_PROVIDER` | `openai` | Model provider, as one of `openai`, `watsonx`, `ollama`, or `anthropic`. |
| `ANTHROPIC_API_KEY` | Not set | API key for the Anthropic language model provider. |
| `OPENAI_API_KEY` | Not set | API key for the OpenAI model provider, which is also the default model provider. |
| `OLLAMA_ENDPOINT` | Not set | Custom provider endpoint for the Ollama model provider. |
| `WATSONX_API_KEY` | Not set | API key for the IBM watsonx.ai model provider. |
| `WATSONX_ENDPOINT` | Not set | Custom provider endpoint for the IBM watsonx.ai model provider. |
| `WATSONX_PROJECT_ID` | Not set | Project ID for the IBM watsonx.ai model provider. |
### Document processing settings {#document-processing-settings}
Control how OpenRAG [processes and ingests documents](/ingestion) into your knowledge base.
| Variable | Default | Description |
|----------|---------|-------------|
| `CHUNK_OVERLAP` | `200` | Overlap between chunks. |
| `CHUNK_SIZE` | `1000` | Text chunk size for document processing. |
| `DISABLE_INGEST_WITH_LANGFLOW` | `false` | Disable Langflow ingestion pipeline. |
| `DOCLING_OCR_ENGINE` | Set by OS | OCR engine for document processing. For macOS, `ocrmac`. For any other OS, `easyocr`. |
| `OCR_ENABLED` | `false` | Enable OCR for image processing. |
| `OPENRAG_DOCUMENTS_PATHS` | `./openrag-documents` | Document paths for ingestion. |
| `PICTURE_DESCRIPTIONS_ENABLED` | `false` | Enable picture descriptions. |
### Langflow settings {#langflow-settings}
Configure the OpenRAG Langflow server's authentication, contact point, and built-in flow definitions.
:::info
The `LANGFLOW_SUPERUSER_PASSWORD` is set in your `.env` file, and this value determines the default values for several other Langflow authentication variables.
If the `LANGFLOW_SUPERUSER_PASSWORD` variable isn't set, then the Langflow server starts _without_ authentication enabled.
For better security, it is recommended to set `LANGFLOW_SUPERUSER_PASSWORD` so the [Langflow server starts with authentication enabled](https://docs.langflow.org/api-keys-and-authentication#start-a-langflow-server-with-authentication-enabled).
:::
| Variable | Default | Description |
|----------|---------|-------------|
| `LANGFLOW_AUTO_LOGIN` | Determined by `LANGFLOW_SUPERUSER_PASSWORD` | Whether to enable [auto-login mode](https://docs.langflow.org/api-keys-and-authentication#langflow-auto-login) for the Langflow visual editor and CLI. If `LANGFLOW_SUPERUSER_PASSWORD` isn't set, then `LANGFLOW_AUTO_LOGIN` is `True` and auto-login mode is enabled. If `LANGFLOW_SUPERUSER_PASSWORD` is set, then `LANGFLOW_AUTO_LOGIN` is `False` and auto-login mode is disabled. Langflow API calls always require authentication with a Langflow API key regardless of the auto-login setting. |
| `LANGFLOW_ENABLE_SUPERUSER_CLI` | Determined by `LANGFLOW_SUPERUSER_PASSWORD` | Whether to enable the [Langflow CLI `langflow superuser` command](https://docs.langflow.org/api-keys-and-authentication#langflow-enable-superuser-cli). If `LANGFLOW_SUPERUSER_PASSWORD` isn't set, then `LANGFLOW_ENABLE_SUPERUSER_CLI` is `True` and superuser accounts can be created with the Langflow CLI. If `LANGFLOW_SUPERUSER_PASSWORD` is set, then `LANGFLOW_ENABLE_SUPERUSER_CLI` is `False` and the `langflow superuser` command is disabled. |
| `LANGFLOW_NEW_USER_IS_ACTIVE` | Determined by `LANGFLOW_SUPERUSER_PASSWORD` | Whether new [Langflow user accounts are active by default](https://docs.langflow.org/api-keys-and-authentication#langflow-new-user-is-active). If `LANGFLOW_SUPERUSER_PASSWORD` isn't set, then `LANGFLOW_NEW_USER_IS_ACTIVE` is `True` and new user accounts are active by default. If `LANGFLOW_SUPERUSER_PASSWORD` is set, then `LANGFLOW_NEW_USER_IS_ACTIVE` is `False` and new user accounts are inactive by default. |
| `LANGFLOW_PUBLIC_URL` | `http://localhost:7860` | Public URL for the Langflow instance. Forms the base URL for Langflow API calls and other interfaces with your OpenRAG Langflow instance. |
| `LANGFLOW_KEY` | Automatically generated | A Langflow API key to run flows with Langflow API calls. Because Langflow API keys are server-specific, allow OpenRAG to generate this key initially. You can create additional Langflow API keys after deploying OpenRAG. |
| `LANGFLOW_SECRET_KEY` | Automatically generated | Secret encryption key for Langflow internal operations. It is recommended to [generate your own Langflow secret key](https://docs.langflow.org/api-keys-and-authentication#langflow-secret-key) for this variable. If this variable isn't set, then Langflow generates a secret key automatically. |
| `LANGFLOW_SUPERUSER` | `admin` | Username for the Langflow administrator user. |
| `LANGFLOW_SUPERUSER_PASSWORD` | Not set | Langflow administrator password. If this variable isn't set, then the Langflow server starts _without_ authentication enabled. It is recommended to set `LANGFLOW_SUPERUSER_PASSWORD` so the [Langflow server starts with authentication enabled](https://docs.langflow.org/api-keys-and-authentication#start-a-langflow-server-with-authentication-enabled). |
| `LANGFLOW_URL` | `http://localhost:7860` | URL for the Langflow instance. |
| `LANGFLOW_CHAT_FLOW_ID`, `LANGFLOW_INGEST_FLOW_ID`, `NUDGES_FLOW_ID` | Built-in flow IDs | These variables are set automatically to the IDs of the chat, ingestion, and nudges [flows](/agents). The default values are found in [`.env.example`](https://github.com/langflow-ai/openrag/blob/main/.env.example). Only change these values if you want to replace a built-in flow with your own custom flow. The flow JSON must be present in your version of the OpenRAG codebase. For example, if you [deploy self-managed services](/docker), you can add the flow JSON to your local clone of the OpenRAG repository before deploying OpenRAG. |
| `SYSTEM_PROMPT` | `You are a helpful AI assistant with access to a knowledge base. Answer questions based on the provided context.` | System prompt instructions for the agent driving the **Chat** flow. |
### OAuth provider settings
Configure [OAuth providers](/ingestion#oauth-ingestion) and external service integrations.
| Variable | Default | Description |
|----------|---------|-------------|
| `AWS_ACCESS_KEY_ID`<br/>`AWS_SECRET_ACCESS_KEY` | Not set | Enable access to AWS S3 with an [AWS OAuth app](https://docs.aws.amazon.com/singlesignon/latest/userguide/manage-your-applications.html) integration. |
| `GOOGLE_OAUTH_CLIENT_ID`<br/>`GOOGLE_OAUTH_CLIENT_SECRET` | Not set | Enable the [Google OAuth client](https://developers.google.com/identity/protocols/oauth2) integration. You can generate these values in the [Google Cloud Console](https://console.cloud.google.com/apis/credentials). |
| `MICROSOFT_GRAPH_OAUTH_CLIENT_ID`<br/>`MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET` | Not set | Enable the [Microsoft Graph OAuth client](https://learn.microsoft.com/en-us/onedrive/developer/rest-api/getting-started/graph-oauth) integration by providing [Azure application registration credentials for SharePoint and OneDrive](https://learn.microsoft.com/en-us/onedrive/developer/rest-api/getting-started/app-registration?view=odsp-graph-online). |
| `WEBHOOK_BASE_URL` | Not set | Base URL for OAuth connector webhook endpoints. If this variable isn't set, a default base URL is used. |
### OpenSearch settings
Configure OpenSearch database authentication.
| Variable | Default | Description |
|----------|---------|-------------|
| `OPENSEARCH_HOST` | `localhost` | OpenSearch instance host. |
| `OPENSEARCH_PORT` | `9200` | OpenSearch instance port. |
| `OPENSEARCH_USERNAME` | `admin` | OpenSearch administrator username. |
| `OPENSEARCH_PASSWORD` | Must be set at start up | Required. OpenSearch administrator password. Must adhere to the [OpenSearch password complexity requirements](https://docs.opensearch.org/latest/security/configuration/demo-configuration/#setting-up-a-custom-admin-password). You must set this directly in the `.env` or in the TUI's [**Basic/Advanced Setup**](/install#setup). |
### System settings
Configure general system components, session management, and logging.
| Variable | Default | Description |
|----------|---------|-------------|
| `LANGFLOW_KEY_RETRIES` | `15` | Number of retries for Langflow key generation. |
| `LANGFLOW_KEY_RETRY_DELAY` | `2.0` | Delay between retries in seconds. |
| `LANGFLOW_VERSION` | `OPENRAG_VERSION` | Langflow Docker image version. By default, OpenRAG uses the `OPENRAG_VERSION` for the Langflow Docker image version. |
| `LOG_FORMAT` | Not set | Set to `json` to enabled JSON-formatted log output. If this variable isn't set, then the default logging format is used. |
| `LOG_LEVEL` | `INFO` | Logging level. Can be one of `DEBUG`, `INFO`, `WARNING`, or `ERROR`. `DEBUG` provides the most detailed logs but can impact performance. |
| `MAX_WORKERS` | `1` | Maximum number of workers for document processing. |
| `OPENRAG_VERSION` | `latest` | The version of the OpenRAG Docker images to run. For more information, see [Upgrade OpenRAG](/upgrade) |
| `SERVICE_NAME` | `openrag` | Service name for logging. |
| `SESSION_SECRET` | Automatically generated | Session management. |
## Langflow runtime overrides
You can modify [flow](/agents) settings at runtime without permanently changing the flow's configuration.
Runtime overrides are implemented through _tweaks_, which are one-time parameter modifications that are passed to specific Langflow components during flow execution.
For more information on tweaks, see the Langflow documentation on [Input schema (tweaks)](https://docs.langflow.org/concepts-publish#input-schema).
## Default values and fallbacks
If a variable isn't set by environment variables or a configuration file, OpenRAG can use a default value if one is defined in the codebase.
Default values can be found in the OpenRAG repository:
* OpenRAG configuration: [`config_manager.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/config_manager.py)
* System configuration: [`settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py)
* Logging configuration: [`logging_config.py`](https://github.com/langflow-ai/openrag/blob/main/src/utils/logging_config.py)