revise onboarding partial

April M 2025-12-05 07:02:33 -08:00
parent dc724fe8e7
commit bfb028808f
5 changed files with 139 additions and 109 deletions


@ -1,24 +0,0 @@
import Icon from "@site/src/components/icon/icon";
Using Ollama as your OpenRAG language model provider offers greater flexibility and more configuration options, but it can also be overwhelming at first.
These recommendations are a reasonable starting point for users with at least one GPU and experience running LLMs locally.
For best performance, OpenRAG recommends OpenAI's `gpt-oss:20b` language model. However, this model uses 16GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine.
For generating embeddings, OpenRAG recommends the [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) embedding model, which provides high-quality embeddings optimized for retrieval tasks.
To run models in [**Ollama Cloud**](https://docs.ollama.com/cloud), follow these steps:
1. Sign in to Ollama Cloud.
In a terminal, enter `ollama signin` to connect your local environment with Ollama Cloud.
2. In Ollama, select the `gpt-oss:20b-cloud` model, or run `ollama run gpt-oss:20b-cloud` in a terminal.
Ollama Cloud models are served at the same URL as your local Ollama server, `http://localhost:11434`, and are automatically offloaded to Ollama's cloud service.
3. In onboarding, connect OpenRAG to the same local Ollama server as you would for local models, using the default address of `http://localhost:11434`.
4. In the **Language model** field, select the `gpt-oss:20b-cloud` model.
<br></br>
To run models on a **remote Ollama server**, follow these steps:
1. Ensure your remote Ollama server is accessible from your OpenRAG instance.
2. In the **Ollama Base URL** field, enter your remote Ollama server's base URL, such as `http://your-remote-server:11434`.
OpenRAG connects to the remote Ollama server and populates the lists with the server's available models.
3. Select your **Embedding model** and **Language model** from the available options.


@ -1,7 +1,6 @@
import Icon from "@site/src/components/icon/icon";
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Application onboarding
@ -11,65 +10,97 @@ Some of these variables, such as the embedding models, can be changed seamlessly.
Others are immutable and require you to destroy and recreate the OpenRAG containers.
For more information, see [Environment variables](/reference/configuration).
You can use different providers for your language model and embedding model, such as Anthropic for the language model and OpenAI for the embedding model.
Additionally, you can set multiple embedding models.
You only need to complete onboarding for your preferred providers.
<Tabs groupId="Provider">
<TabItem value="Anthropic" label="Anthropic" default>

:::info
Anthropic doesn't provide embedding models. If you select Anthropic for your language model, you must select a different provider for the embedding model.
:::
1. Enter your Anthropic API key, or enable **Get API key from environment variable** to pull the key from your OpenRAG `.env` file, as sketched after these steps.
If you haven't set `ANTHROPIC_API_KEY` in your `.env` file, you must enter the key manually.
2. Under **Advanced settings**, select the language model that you want to use.
3. Click **Complete**.
4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.
For information about another provider's credentials and settings, see the instructions for your chosen provider.
5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.
The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.
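If you use the `.env` route in step 1, the relevant line looks like the following sketch, where the value is a placeholder:

```bash
# In your OpenRAG `.env` file. The value below is a placeholder, not a real key.
ANTHROPIC_API_KEY=<your-anthropic-api-key>
```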
</TabItem>
<TabItem value="IBM watsonx.ai" label="IBM watsonx.ai">

1. Use the values from your IBM watsonx deployment for the **watsonx.ai API Endpoint**, **IBM Project ID**, and **IBM API key** fields.
These fields map to the `.env` variables sketched after these steps.
2. Under **Advanced settings**, select the language model that you want to use.
3. Click **Complete**.
4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.
For information about another provider's credentials and settings, see the instructions for your chosen provider.
5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.
The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.
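The watsonx fields in step 1 correspond to the following `.env` variables, shown here as a sketch with placeholder values:

```bash
# In your OpenRAG `.env` file. All values below are placeholders from your IBM watsonx deployment.
WATSONX_API_KEY=<your-ibm-api-key>
WATSONX_ENDPOINT=<your-watsonx-api-endpoint>
WATSONX_PROJECT_ID=<your-ibm-project-id>
```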
</TabItem>
<TabItem value="Ollama" label="Ollama">
:::info
Ollama isn't installed with OpenRAG. You must install it separately if you want to use Ollama as a model provider.
:::
Using Ollama as your language and embedding model provider offers greater flexibility and more configuration options for hosting models, but it can be challenging for new users.
The recommendations given here are a reasonable starting point for users with at least one GPU and experience running LLMs locally.
The OpenRAG team recommends the OpenAI `gpt-oss:20b` language model and the [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) embedding model.
However, `gpt-oss:20b` uses 16GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine.
1. [Install Ollama locally or on a remote server](https://docs.ollama.com/) or [run models in Ollama Cloud](https://docs.ollama.com/cloud).
If you are running a remote server, it must be accessible from your OpenRAG deployment.
2. In OpenRAG onboarding, connect to your Ollama server:
* **Local Ollama server**: Enter your Ollama server's base URL and port. The default Ollama server address is `http://localhost:11434`.
* **Ollama Cloud**: Because Ollama Cloud models run at the same address as a local Ollama server and automatically offload to Ollama's cloud service, you can use the same base URL and port as you would for a local Ollama server. The default address is `http://localhost:11434`.
* **Remote server**: Enter your remote Ollama server's base URL and port, such as `http://your-remote-server:11434`.
If the connection succeeds, OpenRAG populates the model lists with the server's available models.
3. Select the **Embedding Model** and **Language Model** that your Ollama server is running.
To use different providers for these models, you must configure both providers and select the relevant model for each provider.
A quick way to check which models your server is running is shown after these steps.
4. Click **Complete**.
5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.
The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.
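At any point during these steps, you can check your Ollama server from a terminal. A minimal sketch, assuming the default local address and the recommended models; `ollama pull` and the `/api/tags` endpoint are standard Ollama tooling:

```bash
# Pull the recommended models so they appear in OpenRAG's model lists.
ollama pull gpt-oss:20b
ollama pull nomic-embed-text

# List the models that the Ollama server can serve.
curl http://localhost:11434/api/tags
```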
</TabItem>
<TabItem value="OpenAI" label="OpenAI (default)">
1. Enter your OpenAI API key, or enable **Get API key from environment variable** to pull the key from your OpenRAG `.env` file.
If you entered an OpenAI API key during setup, enable **Get API key from environment variable**.
2. Under **Advanced settings**, select the language model that you want to use.
3. Click **Complete**.
4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.
For information about another provider's credentials and settings, see the instructions for your chosen provider.
5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.
The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.
</TabItem>
</Tabs>


@ -2,5 +2,5 @@
- Install [Podman](https://podman.io/docs/installation) (recommended) or [Docker](https://docs.docker.com/get-docker/).
- Install [`podman-compose`](https://docs.podman.io/en/latest/markdown/podman-compose.1.html) or [Docker Compose](https://docs.docker.com/compose/install/).
To use Docker Compose with Podman, you must alias Docker Compose commands to Podman commands.
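For example, one common approach is to alias the `docker` CLI to Podman in your shell profile so that `docker compose` commands run through Podman. A sketch, assuming a POSIX shell:

```bash
# In your shell profile (for example, ~/.bashrc):
# route `docker` invocations, including `docker compose`, to Podman.
alias docker=podman
```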


@ -102,44 +102,63 @@ To install OpenRAG with Docker Compose, do the following:
7. Deploy OpenRAG locally with the appropriate Docker Compose file for your environment.
Both files deploy the same services.
* [`docker-compose.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose.yml) is an OpenRAG deployment with GPU support for accelerated AI processing. This Docker Compose file requires an NVIDIA GPU with [CUDA](https://docs.nvidia.com/cuda/) support.
  * Docker:
    ```bash
    docker compose build
    docker compose up -d
    ```
  * Podman Compose:
    ```bash
    podman compose build
    podman compose up -d
    ```
* [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without NVIDIA GPU support. Use this Docker Compose file for environments where GPU drivers aren't available.
  * Docker:
    ```bash
    docker compose -f docker-compose-cpu.yml up -d
    ```
  * Podman Compose:
    ```bash
    podman compose -f docker-compose-cpu.yml up -d
    ```
The OpenRAG Docker Compose file starts five containers:
| Container Name | Default address | Purpose |
|---|---|---|
| OpenRAG Backend | http://localhost:8000 | FastAPI server and core functionality. |
| OpenRAG Frontend | http://localhost:3000 | React web interface for user interaction. |
| Langflow | http://localhost:7860 | [AI workflow engine](/agents). |
| OpenSearch | http://localhost:9200 | Datastore for [knowledge](/knowledge). |
| OpenSearch Dashboards | http://localhost:5601 | OpenSearch database administration interface. |
8. Wait while the containers start, and then confirm all containers are running:
   * Docker Compose:
     ```bash
     docker compose ps
     ```
   * Podman Compose:
     ```bash
     podman compose ps
     ```
If all containers are running, you can access your OpenRAG services at their addresses.
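For example, you can spot-check each service from a terminal. A minimal sketch, assuming the default addresses from the container table above; it only prints each service's HTTP status code:

```bash
# Print the HTTP status code returned by each OpenRAG service.
for url in http://localhost:3000 http://localhost:8000 http://localhost:7860; do
  echo "$url -> HTTP $(curl --silent --output /dev/null --write-out '%{http_code}' --max-time 5 "$url")"
done
```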
9. Access the OpenRAG frontend at `http://localhost:3000` to continue with [application onboarding](#application-onboarding).
<PartialOnboarding />


@ -56,23 +56,27 @@ For example, with self-managed services, do the following:
All OpenRAG configuration can be controlled through environment variables.
### Model provider settings {#model-provider-settings}
Configure which models and providers OpenRAG uses to generate text and embeddings.
You only need to provide credentials for the providers you are using in OpenRAG.
These variables are initially set during [application onboarding](/install#application-onboarding).
Some of these variables are immutable and can only be changed by redeploying OpenRAG, as explained in [Set environment variables](#set-environment-variables).
| Variable | Default | Description |
|----------|---------|-------------|
| `EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model for generating vector embeddings for documents in the knowledge base and similarity search queries. Can be changed after application onboarding. Accepts one or more models. |
| `LLM_MODEL` | `gpt-4o-mini` | Language model for language processing and text generation in the **Chat** feature. |
| `MODEL_PROVIDER` | `openai` | Model provider. One of `openai`, `watsonx`, `ollama`, or `anthropic`. |
| `ANTHROPIC_API_KEY` | Not set | API key for the Anthropic language model provider. |
| `OPENAI_API_KEY` | Not set | API key for the OpenAI model provider, which is also the default model provider. |
| `OLLAMA_ENDPOINT` | Not set | Custom provider endpoint for the Ollama model provider. |
| `WATSONX_API_KEY` | Not set | API key for the IBM watsonx.ai model provider. |
| `WATSONX_ENDPOINT` | Not set | Custom provider endpoint for the IBM watsonx.ai model provider. |
| `WATSONX_PROJECT_ID` | Not set | Project ID for the IBM watsonx.ai model provider. |
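For example, a hypothetical `.env` excerpt for the default OpenAI provider, using the default values from the table above; the key is a placeholder:

```bash
# Example model provider settings in the OpenRAG `.env` file.
MODEL_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=<your-openai-api-key>
```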
### Document processing settings
Control how OpenRAG [processes and ingests documents](/ingestion) into your knowledge base.