diff --git a/docs/docs/_partial-ollama.mdx b/docs/docs/_partial-ollama.mdx deleted file mode 100644 index a5164e97..00000000 --- a/docs/docs/_partial-ollama.mdx +++ /dev/null @@ -1,24 +0,0 @@ -import Icon from "@site/src/components/icon/icon"; - -Using Ollama for your OpenRAG language model provider offers greater flexibility and configuration, but can also be overwhelming to start. -These recommendations are a reasonable starting point for users with at least one GPU and experience running LLMs locally. - -For best performance, OpenRAG recommends OpenAI's `gpt-oss:20b` language model. However, this model uses 16GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine. - -For generating embeddings, OpenRAG recommends the [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) embedding model, which provides high-quality embeddings optimized for retrieval tasks. - -To run models in [**Ollama Cloud**](https://docs.ollama.com/cloud), follow these steps: - - 1. Sign in to Ollama Cloud. - In a terminal, enter `ollama signin` to connect your local environment with Ollama Cloud. - 2. To run the model, in Ollama, select the `gpt-oss:20b-cloud` model, or run `ollama run gpt-oss:20b-cloud` in a terminal. - Ollama Cloud models are run at the same URL as your local Ollama server at `http://localhost:11434`, and automatically offloaded to Ollama's cloud service. - 3. Connect OpenRAG to the same local Ollama server as you would for local models in onboarding, using the default address of `http://localhost:11434`. - 4. In the **Language model** field, select the `gpt-oss:20b-cloud` model. -

-To run models on a **remote Ollama server**, follow these steps: - - 1. Ensure your remote Ollama server is accessible from your OpenRAG instance. - 2. In the **Ollama Base URL** field, enter your remote Ollama server's base URL, such as `http://your-remote-server:11434`. - OpenRAG connects to the remote Ollama server and populates the lists with the server's available models. - 3. Select your **Embedding model** and **Language model** from the available options. \ No newline at end of file diff --git a/docs/docs/_partial-onboarding.mdx b/docs/docs/_partial-onboarding.mdx index 4714e891..a8628648 100644 --- a/docs/docs/_partial-onboarding.mdx +++ b/docs/docs/_partial-onboarding.mdx @@ -1,7 +1,6 @@ import Icon from "@site/src/components/icon/icon"; import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; -import PartialOllama from '@site/docs/_partial-ollama.mdx'; ## Application onboarding @@ -11,65 +10,97 @@ Some of these variables, such as the embedding models, can be changed seamlessly Others are immutable and require you to destroy and recreate the OpenRAG containers. For more information, see [Environment variables](/reference/configuration). -You can use different providers for your language model and embedding model, such as Anthropic for the language model and OpenAI for the embeddings model. +You can use different providers for your language model and embedding model, such as Anthropic for the language model and OpenAI for the embedding model. Additionally, you can set multiple embedding models. You only need to complete onboarding for your preferred providers. - - + + - :::info - Anthropic doesn't provide embedding models. If you select Anthropic for your language model, you must select a different provider for embeddings. - ::: +:::info +Anthropic doesn't provide embedding models. If you select Anthropic for your language model, you must select a different provider for the embedding model. +::: - 1. Enable **Use environment Anthropic API key** to automatically use your key from the `.env` file. - Alternatively, paste an Anthropic API key into the field. - 2. Under **Advanced settings**, select your **Language Model**. - 3. Click **Complete**. - 4. In the second onboarding panel, select a provider for embeddings and select your **Embedding Model**. - 5. To complete the onboarding tasks, click **What is OpenRAG**, and then click **Add a Document**. - Alternatively, click - + If you haven't set `ANTHROPIC_API_KEY` in your `.env` file, you must enter the key manually. - 1. Enable **Get API key from environment variable** to automatically enter your key from the TUI-generated `.env` file. - Alternatively, paste an OpenAI API key into the field. - 2. Under **Advanced settings**, select your **Language Model**. - 3. Click **Complete**. - 4. In the second onboarding panel, select a provider for embeddings and select your **Embedding Model**. - 5. To complete the onboarding tasks, click **What is OpenRAG**, and then click **Add a Document**. - Alternatively, click - +3. Click **Complete**. - 1. Complete the fields for **watsonx.ai API Endpoint**, **IBM Project ID**, and **IBM API key**. - These values are found in your IBM watsonx deployment. - 2. Under **Advanced settings**, select your **Language Model**. - 3. Click **Complete**. - 4. In the second onboarding panel, select a provider for embeddings and select your **Embedding Model**. - 5. To complete the onboarding tasks, click **What is OpenRAG**, and then click **Add a Document**. - Alternatively, click - +5. 
Continue through the overview slides for a brief introduction to OpenRAG, or click + - 1. To connect to an Ollama server running on your local machine, enter your Ollama server's base URL address. - The default Ollama server address is `http://localhost:11434`. - OpenRAG connects to the Ollama server and populates the model lists with the server's available models. - 2. Select the **Embedding Model** and **Language Model** your Ollama server is running. -
- Ollama model selection and external server configuration - -
- 3. Click **Complete**. - 4. To complete the onboarding tasks, click **What is OpenRAG**, and then click **Add a Document**. +1. Use the values from your IBM watsonx deployment for the **watsonx.ai API Endpoint**, **IBM Project ID**, and **IBM API key** fields. -
-
\ No newline at end of file +2. Under **Advanced settings**, select the language model that you want to use. + +3. Click **Complete**. + +4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use. +For information about another provider's credentials and settings, see the instructions for your chosen provider. + +5. Continue through the overview slides for a brief introduction to OpenRAG, or click
+
+
+:::info
+Ollama isn't installed with OpenRAG. You must install it separately if you want to use Ollama as a model provider.
+:::
+
+Using Ollama as your language and embedding model provider offers greater flexibility and configuration options for hosting models, but it can be daunting for new users.
+The recommendations given here are a reasonable starting point for users with at least one GPU and experience running LLMs locally.
+
+The OpenRAG team recommends the OpenAI `gpt-oss:20b` language model and the [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) embedding model.
+However, `gpt-oss:20b` uses 16GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine.
+
+1. [Install Ollama locally or on a remote server](https://docs.ollama.com/) or [run models in Ollama Cloud](https://docs.ollama.com/cloud).
+
+   If you are running a remote server, it must be accessible from your OpenRAG deployment.
+
+2. In OpenRAG onboarding, connect to your Ollama server:
+
+   * **Local Ollama server**: Enter your Ollama server's base URL and port. The default Ollama server address is `http://localhost:11434`.
+   * **Ollama Cloud**: Because Ollama Cloud models run at the same address as a local Ollama server and automatically offload to Ollama's cloud service, you can use the same base URL and port as you would for a local Ollama server. The default address is `http://localhost:11434`.
+   * **Remote server**: Enter your remote Ollama server's base URL and port, such as `http://your-remote-server:11434`.
+
+   If the connection succeeds, OpenRAG populates the model lists with the server's available models.
+
+3. Select the **Embedding Model** and **Language Model** your Ollama server is running.
+
+   To use different providers for these models, you must configure both providers and select the relevant model for each.
+
+4. Click **Complete**.
+
+5. Continue through the overview slides for a brief introduction to OpenRAG, or click
+
+
+1. Enter your OpenAI API key, or enable **Get API key from environment variable** to pull the key from your OpenRAG `.env` file.
+
+   If you entered an OpenAI API key during setup, enable **Get API key from environment variable**. To confirm that your key is valid before onboarding, see the sketch after these steps.
+
+2. Under **Advanced settings**, select the language model that you want to use.
+
+3. Click **Complete**.
+
+4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.
+For information about another provider's credentials and settings, see the instructions for your chosen provider.
+
+5. Continue through the overview slides for a brief introduction to OpenRAG, or click
+
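+For a quick validity check of an OpenAI key, you can query the OpenAI models endpoint directly. This is an optional sketch, not an OpenRAG feature; it assumes `curl` is available and that `OPENAI_API_KEY` is exported in your shell.
+
+```bash
+# Lists the models the key can access; an HTTP 401 response means the key is invalid.
+curl https://api.openai.com/v1/models \
+  -H "Authorization: Bearer $OPENAI_API_KEY"
+```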
\ No newline at end of file
diff --git a/docs/docs/_partial-prereq-no-script.mdx b/docs/docs/_partial-prereq-no-script.mdx
index 02e36824..83c65495 100644
--- a/docs/docs/_partial-prereq-no-script.mdx
+++ b/docs/docs/_partial-prereq-no-script.mdx
@@ -2,5 +2,5 @@
 - Install [Podman](https://podman.io/docs/installation) (recommended) or [Docker](https://docs.docker.com/get-docker/).
-- Install [Podman Compose](https://docs.podman.io/en/latest/markdown/podman-compose.1.html) or [Docker Compose](https://docs.docker.com/compose/install/).
-To use Docker Compose with Podman, you must alias Docker Compose commands to Podman commands.
\ No newline at end of file
+- Install [`podman-compose`](https://docs.podman.io/en/latest/markdown/podman-compose.1.html) or [Docker Compose](https://docs.docker.com/compose/install/).
+To use Docker Compose with Podman, you must alias Docker Compose commands to Podman commands.
\ No newline at end of file
diff --git a/docs/docs/get-started/docker.mdx b/docs/docs/get-started/docker.mdx
index 98910af8..b6700d5a 100644
--- a/docs/docs/get-started/docker.mdx
+++ b/docs/docs/get-started/docker.mdx
@@ -102,44 +102,63 @@ To install OpenRAG with Docker Compose, do the following:
 7. Deploy OpenRAG locally with the appropriate Docker Compose file for your environment.
    Both files deploy the same services.
-   - [`docker-compose.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose.yml) is an OpenRAG deployment with GPU support for accelerated AI processing. This Docker Compose file requires an NVIDIA GPU with [CUDA](https://docs.nvidia.com/cuda/) support.
+   * [`docker-compose.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose.yml) is an OpenRAG deployment with GPU support for accelerated AI processing. This Docker Compose file requires an NVIDIA GPU with [CUDA](https://docs.nvidia.com/cuda/) support.
-     ```bash
-     docker compose build
-     docker compose up -d
-     ```
+     * Docker:
-   - [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without NVIDIA GPU support. Use this Docker Compose file for environments where GPU drivers aren't available.
+       ```bash
+       docker compose build
+       docker compose up -d
+       ```
-     ```bash
-     docker compose -f docker-compose-cpu.yml up -d
-     ```
+     * Podman Compose:
-
+       ```bash
+       podman compose build
+       podman compose up -d
+       ```
+
+   * [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without NVIDIA GPU support. Use this Docker Compose file for environments where GPU drivers aren't available.
+
+     * Docker:
+
+       ```bash
+       docker compose -f docker-compose-cpu.yml up -d
+       ```
+
+     * Podman Compose:
+
+       ```bash
+       podman compose -f docker-compose-cpu.yml up -d
+       ```
 
   The OpenRAG Docker Compose file starts five containers:
-   | Container Name | Default Address | Purpose |
+
+   | Container name | Default address | Purpose |
    |---|---|---|
    | OpenRAG Backend | http://localhost:8000 | FastAPI server and core functionality. |
-   | OpenRAG Frontend | http://localhost:3000 | React web interface for users. |
-   | Langflow | http://localhost:7860 | AI workflow engine and flow management. |
-   | OpenSearch | http://localhost:9200 | Vector database for document storage. |
-   | OpenSearch Dashboards | http://localhost:5601 | Database administration interface. |
+   | OpenRAG Frontend | http://localhost:3000 | React web interface for user interaction. |
+   | Langflow | http://localhost:7860 | [AI workflow engine](/agents). |
+   | OpenSearch | http://localhost:9200 | Datastore for [knowledge](/knowledge). |
+   | OpenSearch Dashboards | http://localhost:5601 | OpenSearch database administration interface. |
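+
+   As a quick sanity check of the addresses above, you can probe the services with `curl` once the containers are up. This is an informal sketch, not an official OpenRAG health check; the `/docs` path is an assumption based on FastAPI defaults.
+
+   ```bash
+   # HEAD requests confirm each service answers on its default port.
+   curl -I http://localhost:3000        # OpenRAG frontend
+   curl -I http://localhost:8000/docs   # FastAPI servers typically expose interactive docs here
+   curl -I http://localhost:7860        # Langflow
+   ```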
 
-8. Verify installation by confirming all services are running.
+8. Wait while the containers start, and then confirm all containers are running:
-
-   ```bash
-   docker compose ps
-   ```
+   * Docker Compose:
-   You can now access OpenRAG at the following endpoints:
+     ```bash
+     docker compose ps
+     ```
-   - **Frontend**: http://localhost:3000
-   - **Backend API**: http://localhost:8000
-   - **Langflow**: http://localhost:7860
+   * Podman Compose:
-9. Access the OpenRAG frontend to continue with [application onboarding](#application-onboarding).
+     ```bash
+     podman compose ps
+     ```
+
+   If all containers are running, you can access your OpenRAG services at their addresses.
+
+9. Access the OpenRAG frontend at `http://localhost:3000` to continue with [application onboarding](#application-onboarding).
diff --git a/docs/docs/reference/configuration.mdx b/docs/docs/reference/configuration.mdx
index 097eb9d9..7b62bd36 100644
--- a/docs/docs/reference/configuration.mdx
+++ b/docs/docs/reference/configuration.mdx
@@ -56,23 +56,27 @@ For example, with self-managed services, do the following:
 All OpenRAG configuration can be controlled through environment variables.
 
-### AI provider settings
+### Model provider settings {#model-provider-settings}
 
 Configure which models and providers OpenRAG uses to generate text and embeddings.
-These are initially set during [application onboarding](/install#application-onboarding).
-Some values are immutable and can only be changed by redeploying OpenRAG, as explained in [Set environment variables](#set-environment-variables).
+You only need to provide credentials for the providers you use in OpenRAG.
+
+These variables are initially set during [application onboarding](/install#application-onboarding).
+Some of these variables are immutable and can only be changed by redeploying OpenRAG, as explained in [Set environment variables](#set-environment-variables).
 
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model for generating vector embeddings for documents in the knowledge base and similarity search queries. Can be changed after application onboarding. Accepts one or more models. |
 | `LLM_MODEL` | `gpt-4o-mini` | Language model for text processing and generation in the **Chat** feature. |
-| `MODEL_PROVIDER` | `openai` | Model provider, such as OpenAI or IBM watsonx.ai. |
-| `OPENAI_API_KEY` | Not set | Optional OpenAI API key for the default model. For other providers, use `PROVIDER_API_KEY`. |
-| `PROVIDER_API_KEY` | Not set | API key for the model provider. |
-| `PROVIDER_ENDPOINT` | Not set | Custom provider endpoint for the IBM and Ollama model providers. Leave unset for other model providers. |
-| `PROVIDER_PROJECT_ID` | Not set | Project ID for the IBM watsonx.ai model provider only. Leave unset for other model providers. |
+| `MODEL_PROVIDER` | `openai` | Model provider, as one of `openai`, `watsonx`, `ollama`, or `anthropic`. |
+| `ANTHROPIC_API_KEY` | Not set | API key for the Anthropic language model provider. |
+| `OPENAI_API_KEY` | Not set | API key for the OpenAI model provider, which is the default model provider. |
+| `OLLAMA_ENDPOINT` | Not set | Custom provider endpoint for the Ollama model provider. |
+| `WATSONX_API_KEY` | Not set | API key for the IBM watsonx.ai model provider. |
+| `WATSONX_ENDPOINT` | Not set | Custom provider endpoint for the IBM watsonx.ai model provider. |
+| `WATSONX_PROJECT_ID` | Not set | Project ID for the IBM watsonx.ai model provider. |
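+
+For example, a `.env` fragment for an Ollama-backed deployment might combine these variables as follows. This is an illustrative sketch; the model names are the ones recommended during onboarding, not required values.
+
+```bash
+MODEL_PROVIDER=ollama
+OLLAMA_ENDPOINT=http://localhost:11434
+LLM_MODEL=gpt-oss:20b
+EMBEDDING_MODEL=nomic-embed-text
+```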
 
-### Document processing
+### Document processing settings
 
 Control how OpenRAG [processes and ingests documents](/ingestion) into your knowledge base.