Install OpenRAG containers

OpenRAG has two Docker Compose files. Both files deploy the same applications and containers locally, but they are for different environments:

- docker-compose.yml is an OpenRAG deployment with GPU support for accelerated AI processing. This Docker Compose file requires an NVIDIA GPU with CUDA support.
- docker-compose-cpu.yml is a CPU-only version of OpenRAG for systems without NVIDIA GPU support. Use this Docker Compose file for environments where GPU drivers aren't available.

Prerequisites

Install the following:
- Python version 3.10 to 3.13.
- uv.
- Podman (recommended) or Docker.
- podman-compose or Docker Compose. To use Docker Compose with Podman, you must alias Docker Compose commands to Podman commands.
- Microsoft Windows only: To run OpenRAG on Windows, you must use the Windows Subsystem for Linux (WSL). To install WSL for OpenRAG, do the following:
1. Install WSL with the Ubuntu distribution using WSL 2:
  
  wsl --install -d Ubuntu
  
  For new installations, the wsl --install command uses WSL 2 and Ubuntu by default.
  
  For existing WSL installations, you can change the distribution and check the WSL version.
  
  Known limitation: OpenRAG isn't compatible with nested virtualization, which can cause networking issues. Don't install OpenRAG on a WSL distribution that is installed inside a Windows VM. Instead, install OpenRAG on your base OS or a non-nested Linux VM.
2. Start your WSL Ubuntu distribution if it doesn't start automatically.
3. Set up a username and password for your WSL distribution.
4. Install Docker Desktop for Windows with WSL 2. When you reach the Docker Desktop WSL integration settings, make sure your Ubuntu distribution is enabled, and then click Apply & Restart to enable Docker support in WSL.
5. Install and run OpenRAG from within your WSL Ubuntu distribution.
If you encounter issues with port forwarding or the Windows Firewall, you might need to adjust the Hyper-V firewall settings to allow communication between your WSL distribution and the Windows host. For more troubleshooting advice for networking issues, see Troubleshooting WSL common issues.
- Prepare model providers and credentials.

During Application Onboarding, you must select language model and embedding model providers. If your chosen provider offers both types, you can use the same provider for both selections. If your provider offers only one type, such as Anthropic, you must select two providers.

Gather the credentials and connection details for your chosen model providers before starting onboarding:
- OpenAI: Create an OpenAI API key.
- Anthropic language models: Create an Anthropic API key.
- IBM watsonx.ai: Get your watsonx.ai API endpoint, IBM project ID, and IBM API key from your watsonx deployment.
- Ollama: Use the Ollama documentation to set up your Ollama instance locally, in the cloud, or on a remote server, and then get your Ollama server's base URL.
- Optional: To use GPU acceleration, the OpenRAG host machine needs an NVIDIA GPU with CUDA support and compatible NVIDIA drivers. This is required to use the GPU-accelerated Docker Compose file. If you choose not to use GPU support, you must use the CPU-only Docker Compose file instead.
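
Before continuing, it can help to confirm the prerequisites from a terminal. The following is a minimal sketch, not part of OpenRAG: the helper names `have_cmd` and `python_version_ok` are hypothetical, and it assumes that `python3` is the Python interpreter on your PATH.

```shell
#!/bin/sh
# Hypothetical helpers for checking the prerequisites listed above.

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if a "major.minor" version string is within 3.10 to 3.13.
python_version_ok() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 13 ]
}

# Report on each tool; only one of podman or docker is needed.
for tool in python3 uv; do
  if have_cmd "$tool"; then echo "found:   $tool"; else echo "missing: $tool"; fi
done
if have_cmd podman || have_cmd docker; then
  echo "found:   container engine"
else
  echo "missing: podman or docker"
fi

# Compare the installed Python version with the supported range.
if have_cmd python3; then
  version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if python_version_ok "$version"; then
    echo "Python $version is supported"
  else
    echo "Python $version is outside 3.10 to 3.13"
  fi
fi
```

This only checks that the tools exist. It does not check the compose tool or GPU drivers.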

Install OpenRAG with Docker Compose

To install OpenRAG with Docker Compose, do the following:

Clone the OpenRAG repository.

git clone https://github.com/langflow-ai/openrag.git
cd openrag

Install dependencies.
```
uv sync
```
Copy the example .env file included in the repository root. The example file includes all environment variables with comments to guide you in finding and setting their values.
```
cp .env.example .env
```
Alternatively, create a new .env file in the repository root.
```
touch .env
```
The Docker Compose files are populated with the values from your .env file. The OPENSEARCH_PASSWORD value must be set. The TUI can generate OPENSEARCH_PASSWORD automatically, but for a Docker Compose installation, set it manually. To generate an OpenSearch admin password, see the OpenSearch documentation.
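
As one possible way to produce a value for OPENSEARCH_PASSWORD, the following sketch prints a random password. The helper name `gen_password` is hypothetical, and the character-class rule it satisfies (lowercase, uppercase, digit, and special character) is an assumption about the OpenSearch password policy, so check the OpenSearch documentation for the actual requirements.

```shell
#!/bin/sh
# Hypothetical helper: print a random password that always contains a
# lowercase letter, an uppercase letter, a digit, and a special character.
gen_password() {
  # Take 20 alphanumeric characters from 256 random bytes
  body=$(head -c 256 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-20)
  # Fixed suffix guarantees one character from each class (assumed policy)
  printf '%s%s\n' "$body" 'aZ9!'
}

# Print a line that can be pasted into .env
echo "OPENSEARCH_PASSWORD=$(gen_password)"
```

The fixed suffix adds no randomness. The strength comes from the 20 random characters before it.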

The following values are optional:
```
OPENAI_API_KEY=your_openai_api_key
LANGFLOW_SECRET_KEY=your_secret_key
```
You can provide OPENAI_API_KEY during Application Onboarding or choose a different model provider. To set it in your .env file, find your OpenAI API key in your OpenAI account.

If LANGFLOW_SECRET_KEY isn't set, Langflow generates it automatically. For more information, see the Langflow documentation.

The following Langflow configuration values are optional but important to consider:
```
LANGFLOW_SUPERUSER=admin
LANGFLOW_SUPERUSER_PASSWORD=your_langflow_password
```
LANGFLOW_SUPERUSER defaults to admin. You can omit it or set it to a different username. LANGFLOW_SUPERUSER_PASSWORD is optional. If omitted, Langflow runs in autologin mode with no password required. If set, Langflow requires password authentication.

For more information on configuring OpenRAG with environment variables, see Environment variables.
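
Because the deployment depends on OPENSEARCH_PASSWORD being set, a quick check of the .env file before starting the containers can catch a missing value. This is a minimal sketch: the helper name `env_has_value` and the `ENV_FILE` override variable are hypothetical, and the check only looks for a non-empty `KEY=value` line.

```shell
#!/bin/sh
# Hypothetical helper: succeed if KEY has a non-empty value in an env file.
env_has_value() {
  file=$1
  key=$2
  # Match "KEY=" at the start of a line followed by at least one character
  grep -Eq "^${key}=.+" "$file"
}

env_file=${ENV_FILE:-.env}
if [ ! -f "$env_file" ]; then
  echo "no $env_file found; create it first"
elif env_has_value "$env_file" OPENSEARCH_PASSWORD; then
  echo "OPENSEARCH_PASSWORD is set"
else
  echo "OPENSEARCH_PASSWORD is missing or empty in $env_file"
fi
```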
Start docling serve on the host machine. OpenRAG Docker installations require that docling serve is running on port 5001 on the host machine. This enables Mac MLX support for document processing.
```
uv run python scripts/docling_ctl.py start --port 5001
```

Confirm docling serve is running.

uv run python scripts/docling_ctl.py status

Make sure the response shows that docling serve is running, for example:

Status: running
Endpoint: http://127.0.0.1:5001
Docs: http://127.0.0.1:5001/docs
PID: 27746
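
To use this check in a script, you could test the status output for the running line. The sketch below assumes the output format matches the example above; the helper name `docling_is_running` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read status text on stdin and succeed only if it
# reports a running service, based on the sample output shown above.
docling_is_running() {
  grep -q '^Status: running'
}

# Example with the sample output; in practice, pipe in the real command:
#   uv run python scripts/docling_ctl.py status | docling_is_running
sample='Status: running
Endpoint: http://127.0.0.1:5001'
if printf '%s\n' "$sample" | docling_is_running; then
  echo "docling serve is running"
else
  echo "docling serve is not running"
fi
```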

Deploy OpenRAG locally with Docker Compose based on your deployment type.

For a GPU deployment with docker-compose.yml:

docker compose build
docker compose up -d

For a CPU-only deployment with docker-compose-cpu.yml:

docker compose -f docker-compose-cpu.yml up -d
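
If you script the deployment, one way to choose between the two files is to probe for a usable GPU. This is a sketch under assumptions: the helper name `pick_compose_file` is hypothetical, and a successful `nvidia-smi` call is treated as a stand-in for GPU support, which does not confirm that containers can access the GPU.

```shell
#!/bin/sh
# Hypothetical helper: print the compose file to use. The arguments form a
# probe command whose success is treated as "an NVIDIA GPU is usable".
pick_compose_file() {
  if "$@" >/dev/null 2>&1; then
    echo docker-compose.yml
  else
    echo docker-compose-cpu.yml
  fi
}

compose_file=$(pick_compose_file nvidia-smi)
echo "Using $compose_file"
# Then start the deployment, for example:
#   docker compose -f "$compose_file" up -d
```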

The OpenRAG Docker Compose file starts five containers:

| Container Name | Default Address | Purpose |
| --- | --- | --- |
| OpenRAG Backend | http://localhost:8000 | FastAPI server and core functionality. |
| OpenRAG Frontend | http://localhost:3000 | React web interface for users. |
| Langflow | http://localhost:7860 | AI workflow engine and flow management. |
| OpenSearch | http://localhost:9200 | Vector database for document storage. |
| OpenSearch Dashboards | http://localhost:5601 | Database administration interface. |

Verify the installation by confirming that all services are running.
```
docker compose ps
```
You can now access OpenRAG at the following endpoints:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- Langflow: http://localhost:7860
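
To check these endpoints from a terminal, a small loop over the URLs is one option. In this sketch the helper name `check_endpoints` is hypothetical, and the probe command is passed in as an argument. The suggested curl probe only confirms that something answers at each address, not that the service is healthy.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command against each URL and report the
# result. Returns non-zero if any probe fails.
check_endpoints() {
  probe=$1
  shift
  failed=0
  for url in "$@"; do
    # $probe is left unquoted on purpose so "curl -sS ..." splits into words
    if $probe "$url" >/dev/null 2>&1; then
      echo "ok   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}

# Example call (commented out so the sketch has no side effects):
#   check_endpoints "curl -sS -o /dev/null --max-time 5" \
#     http://localhost:3000 http://localhost:8000 http://localhost:7860
```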
Continue with Application Onboarding.

To stop docling serve when you're done with your OpenRAG deployment, run:

uv run python scripts/docling_ctl.py stop

Application onboarding

The first time you start OpenRAG, whether using the TUI or a .env file, you must complete application onboarding.

Warning: Most values from onboarding can be changed later in the OpenRAG Settings page, but there are important restrictions.

The language model provider and embeddings model provider can only be selected at onboarding. To change your provider selection later, you must reinstall OpenRAG.

You can use different providers for your language model and embedding model, such as Anthropic for the language model and OpenAI for the embeddings model.

Choose one LLM provider and complete these steps:

- Anthropic
- OpenAI
- IBM watsonx.ai
- Ollama

Anthropic

Info: Anthropic does not provide embedding models. If you select Anthropic for your language model, you must then select a different provider for embeddings.

Enable Use environment Anthropic API key to automatically use your key from the .env file. Alternatively, paste an Anthropic API key into the field.
Under Advanced settings, select your Language Model.
Click Complete.
In the second onboarding panel, select a provider for embeddings and select your Embedding Model.
To complete the onboarding tasks, click What is OpenRAG, and then click Add a Document. Alternatively, click Skip overview.
Continue with the Quickstart.

OpenAI

Enable Get API key from environment variable to automatically enter your key from your .env file. Alternatively, paste an OpenAI API key into the field.
Under Advanced settings, select your Language Model.
Click Complete.
In the second onboarding panel, select a provider for embeddings and select your Embedding Model.
To complete the onboarding tasks, click What is OpenRAG, and then click Add a Document. Alternatively, click Skip overview.
Continue with the Quickstart.

Ollama

Tip: Ollama is not included with OpenRAG. To install Ollama, see the Ollama documentation.

To connect to an Ollama server running on your local machine, enter your Ollama server's base URL address. The default Ollama server address is http://localhost:11434. OpenRAG connects to the Ollama server and populates the model lists with the server's available models.
Select the Embedding Model and Language Model your Ollama server is running.
Ollama model selection and external server configuration
Using Ollama as your OpenRAG language model provider offers more flexibility and configuration options, but it can be overwhelming at first. These recommendations are a reasonable starting point for users with at least one GPU and experience running LLMs locally.

For best performance, OpenRAG recommends OpenAI's gpt-oss:20b language model. However, this model uses 16 GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine if your local machine can't spare that much memory.

For generating embeddings, OpenRAG recommends the nomic-embed-text embedding model, which provides high-quality embeddings optimized for retrieval tasks.

To run models in Ollama Cloud, follow these steps:
1. Sign in to Ollama Cloud. In a terminal, enter ollama signin to connect your local environment with Ollama Cloud.
2. To run the model, select the gpt-oss:20b-cloud model in Ollama, or run ollama run gpt-oss:20b-cloud in a terminal. Ollama Cloud models are served from the same URL as your local Ollama server, http://localhost:11434, and are automatically offloaded to Ollama's cloud service.
3. Connect OpenRAG to the same local Ollama server as you would for local models in onboarding, using the default address of http://localhost:11434.
4. In the Language model field, select the gpt-oss:20b-cloud model.
To run models on a remote Ollama server, follow these steps:
1. Ensure your remote Ollama server is accessible from your OpenRAG instance.
2. In the Ollama Base URL field, enter your remote Ollama server's base URL, such as http://your-remote-server:11434. OpenRAG connects to the remote Ollama server and populates the lists with the server's available models.
3. Select your Embedding model and Language model from the available options.
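
To see which models an Ollama server offers before you enter its base URL, you can query the server's `/api/tags` endpoint, which lists its available models. The sketch below is a rough illustration: the helper name `ollama_model_names` is hypothetical, and the text-based extraction assumes each model entry has a `"name"` field. A JSON tool is more robust if you have one installed.

```shell
#!/bin/sh
# Hypothetical helper: extract model names from an Ollama /api/tags JSON
# response read on stdin.
ollama_model_names() {
  grep -o '"name" *: *"[^"]*"' | sed 's/^"name" *: *"//; s/"$//'
}

# Example call (commented out because it needs a reachable server):
#   curl -sS http://localhost:11434/api/tags | ollama_model_names
```

For a remote server, replace the address with that server's base URL.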
Click Complete.
To complete the onboarding tasks, click What is OpenRAG, and then click Add a Document.
Continue with the Quickstart.

Container management commands

Manage your OpenRAG containers with the following commands. These commands are also available in the TUI's Status menu.

Upgrade containers

Upgrade your containers to the latest version while preserving your data.

docker compose pull
docker compose up -d --force-recreate
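
Without the -f option, docker compose uses the default Compose file in the current directory. If you deployed with docker-compose-cpu.yml, you likely want to pass that file to the management commands as well. The following sketch, with a hypothetical helper named `compose_cmd`, prints the command prefix for each deployment type.

```shell
#!/bin/sh
# Hypothetical helper: print the compose command prefix for a deployment type.
compose_cmd() {
  case "$1" in
    gpu) echo "docker compose" ;;
    cpu) echo "docker compose -f docker-compose-cpu.yml" ;;
    *)   echo "usage: compose_cmd gpu|cpu" >&2; return 2 ;;
  esac
}

# Print the upgrade commands for a CPU-only deployment instead of running them
echo "$(compose_cmd cpu) pull"
echo "$(compose_cmd cpu) up -d --force-recreate"
```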

Rebuild containers (destructive)

Reset state by rebuilding all of your containers. Your OpenSearch and Langflow databases will be lost. Documents stored in the ./documents directory will persist, since the directory is mounted as a volume in the OpenRAG backend container.

docker compose up --build --force-recreate --remove-orphans

Remove all containers and data (destructive)

Completely remove your OpenRAG installation. This deletes all of your data, including OpenSearch data, uploaded documents, and authentication data.

docker compose down --volumes --remove-orphans --rmi local
docker system prune -f
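
Because these commands can't be undone, you might wrap them in a confirmation prompt when scripting. This is a minimal sketch with a hypothetical helper named `confirm_destroy`. It is not part of OpenRAG.

```shell
#!/bin/sh
# Hypothetical helper: read one line from stdin and succeed only on "yes".
confirm_destroy() {
  printf 'This deletes all OpenRAG data. Type "yes" to continue: ' >&2
  read -r answer || return 1
  [ "$answer" = "yes" ]
}

# Example use (commented out so the sketch itself deletes nothing):
#   confirm_destroy && docker compose down --volumes --remove-orphans --rmi local
```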
