revise onboarding partial

parent dc724fe8e7
commit bfb028808f

5 changed files with 139 additions and 109 deletions

@@ -1,24 +0,0 @@

import Icon from "@site/src/components/icon/icon";

Using Ollama as your OpenRAG language model provider offers greater flexibility and more configuration options, but it can also be overwhelming at first.
These recommendations are a reasonable starting point for users with at least one GPU and experience running LLMs locally.

For best performance, OpenRAG recommends OpenAI's `gpt-oss:20b` language model. However, this model requires 16 GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine.

For generating embeddings, OpenRAG recommends the [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) embedding model, which provides high-quality embeddings optimized for retrieval tasks.

To run models in [**Ollama Cloud**](https://docs.ollama.com/cloud), follow these steps:

1. Sign in to Ollama Cloud.

   In a terminal, enter `ollama signin` to connect your local environment with Ollama Cloud.

2. To run the model, in Ollama, select the `gpt-oss:20b-cloud` model, or run `ollama run gpt-oss:20b-cloud` in a terminal.

   Ollama Cloud models are served at the same URL as your local Ollama server (`http://localhost:11434`) and are automatically offloaded to Ollama's cloud service.

3. Connect OpenRAG to the same local Ollama server as you would for local models in onboarding, using the default address of `http://localhost:11434`.

4. In the **Language model** field, select the `gpt-oss:20b-cloud` model.
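
For reference, the terminal commands from these steps in one place, assuming Ollama is installed locally:

```bash
# Connect your local environment with Ollama Cloud
ollama signin

# Run the cloud model; inference is offloaded to Ollama's cloud service
ollama run gpt-oss:20b-cloud
```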

<br></br>

To run models on a **remote Ollama server**, follow these steps:

1. Ensure your remote Ollama server is accessible from your OpenRAG instance.

2. In the **Ollama Base URL** field, enter your remote Ollama server's base URL, such as `http://your-remote-server:11434`.

   OpenRAG connects to the remote Ollama server and populates the lists with the server's available models.

3. Select your **Embedding model** and **Language model** from the available options.
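
To confirm the server is reachable before onboarding, you can query it from the machine running OpenRAG. A minimal sketch, assuming the default Ollama port and no authentication proxy in front of the server:

```bash
# A running Ollama server answers a plain GET on its base URL
curl http://your-remote-server:11434

# List the models available on the server
curl http://your-remote-server:11434/api/tags
```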

@@ -1,7 +1,6 @@

import Icon from "@site/src/components/icon/icon";
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import PartialOllama from '@site/docs/_partial-ollama.mdx';

## Application onboarding

@@ -11,65 +10,97 @@ Some of these variables, such as the embedding models, can be changed seamlessly.
Others are immutable and require you to destroy and recreate the OpenRAG containers.
For more information, see [Environment variables](/reference/configuration).

You can use different providers for your language model and embedding model, such as Anthropic for the language model and OpenAI for the embedding model.
Additionally, you can set multiple embedding models.

You only need to complete onboarding for your preferred providers.

<Tabs groupId="Provider">
<TabItem value="Anthropic" label="Anthropic" default>

:::info
Anthropic doesn't provide embedding models. If you select Anthropic for your language model, you must select a different provider for the embedding model.
:::

1. Enter your Anthropic API key, or enable **Get API key from environment variable** to pull the key from your OpenRAG `.env` file.

   If you haven't set `ANTHROPIC_API_KEY` in your `.env` file, you must enter the key manually.

2. Under **Advanced settings**, select the language model that you want to use.

3. Click **Complete**.

4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.

   For information about another provider's credentials and settings, see the instructions for your chosen provider.

5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.

   The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.

</TabItem>
<TabItem value="IBM watsonx.ai" label="IBM watsonx.ai">

1. Use the values from your IBM watsonx deployment for the **watsonx.ai API Endpoint**, **IBM Project ID**, and **IBM API key** fields.

2. Under **Advanced settings**, select the language model that you want to use.

3. Click **Complete**.

4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.

   For information about another provider's credentials and settings, see the instructions for your chosen provider.

5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.

   The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.

</TabItem>
<TabItem value="Ollama" label="Ollama">

:::info
Ollama isn't installed with OpenRAG. You must install it separately if you want to use Ollama as a model provider.
:::

Using Ollama as your language and embedding model provider offers greater flexibility and configuration options for hosting models, but it can be challenging for new users.
The recommendations given here are a reasonable starting point for users with at least one GPU and experience running LLMs locally.

The OpenRAG team recommends the OpenAI `gpt-oss:20b` language model and the [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) embedding model.
However, `gpt-oss:20b` requires 16 GB of RAM, so consider using Ollama Cloud or running Ollama on a remote machine.
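
To stage these models before onboarding, you can pull them from a terminal. A minimal sketch, assuming a local Ollama installation with enough disk space for both models:

```bash
# Download the recommended language model (needs roughly 16 GB of RAM to run)
ollama pull gpt-oss:20b

# Download the recommended embedding model
ollama pull nomic-embed-text
```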

1. [Install Ollama locally or on a remote server](https://docs.ollama.com/) or [run models in Ollama Cloud](https://docs.ollama.com/cloud).

   If you are running a remote server, it must be accessible from your OpenRAG deployment.

2. In OpenRAG onboarding, connect to your Ollama server:

   * **Local Ollama server**: Enter your Ollama server's base URL and port. The default Ollama server address is `http://localhost:11434`.
   * **Ollama Cloud**: Because Ollama Cloud models run at the same address as a local Ollama server and automatically offload to Ollama's cloud service, you can use the same base URL and port as you would for a local Ollama server. The default address is `http://localhost:11434`.
   * **Remote server**: Enter your remote Ollama server's base URL and port, such as `http://your-remote-server:11434`.

   If the connection succeeds, OpenRAG populates the model lists with the server's available models.

3. Select the **Embedding Model** and **Language Model** your Ollama server is running.

   To use different providers for these models, you must configure both providers and select the relevant model for each provider.

4. Click **Complete**.

5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.

   The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.

</TabItem>
<TabItem value="OpenAI" label="OpenAI (default)">

1. Enter your OpenAI API key, or enable **Get API key from environment variable** to pull the key from your OpenRAG `.env` file.

   If you entered an OpenAI API key during setup, enable **Get API key from environment variable**.

2. Under **Advanced settings**, select the language model that you want to use.

3. Click **Complete**.

4. Select a provider for embeddings, provide the required information, and then select the embedding model you want to use.

   For information about another provider's credentials and settings, see the instructions for your chosen provider.

5. Continue through the overview slides for a brief introduction to OpenRAG, or click <Icon name="ArrowRight" aria-hidden="true"/> **Skip overview**.

   The overview demonstrates some basic functionality that is covered in the [quickstart](/quickstart#chat-with-documents) and in other parts of the OpenRAG documentation.

</TabItem>
</Tabs>
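
If you want to sanity-check a provider API key before onboarding, you can call the provider's model-listing endpoint. For example, for OpenAI, a minimal sketch assuming your key is exported as `OPENAI_API_KEY` in your shell:

```bash
# Lists the models your key can access; an authentication error means the key is invalid
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" | head
```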

@@ -2,5 +2,5 @@

- Install [Podman](https://podman.io/docs/installation) (recommended) or [Docker](https://docs.docker.com/get-docker/).

- Install [`podman-compose`](https://docs.podman.io/en/latest/markdown/podman-compose.1.html) or [Docker Compose](https://docs.docker.com/compose/install/).

  To use Docker Compose with Podman, you must alias Docker Compose commands to Podman commands.
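
One common way to set up the alias, as a minimal sketch assuming a Bash shell with Podman already installed:

```bash
# Make `docker ...` commands (including `docker compose`) resolve to Podman
alias docker=podman

# Persist the alias for future shells
echo 'alias docker=podman' >> ~/.bashrc
```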

@@ -102,44 +102,63 @@ To install OpenRAG with Docker Compose, do the following:

7. Deploy OpenRAG locally with the appropriate Docker Compose file for your environment.
   Both files deploy the same services.

   * [`docker-compose.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose.yml) is an OpenRAG deployment with GPU support for accelerated AI processing. This Docker Compose file requires an NVIDIA GPU with [CUDA](https://docs.nvidia.com/cuda/) support.

     * Docker:

       ```bash
       docker compose build
       docker compose up -d
       ```

     * Podman Compose:

       ```bash
       podman compose build
       podman compose up -d
       ```

   * [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without NVIDIA GPU support. Use this Docker Compose file for environments where GPU drivers aren't available.

     * Docker:

       ```bash
       docker compose -f docker-compose-cpu.yml up -d
       ```

     * Podman Compose:

       ```bash
       podman compose -f docker-compose-cpu.yml up -d
       ```
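
If you're not sure whether your system qualifies for the GPU file, a quick check, assuming the NVIDIA drivers are installed on the host:

```bash
# Prints GPU, driver, and CUDA version information if an NVIDIA GPU is available
nvidia-smi
```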

   The OpenRAG Docker Compose file starts five containers:

   | Container Name | Default address | Purpose |
   |---|---|---|
   | OpenRAG Backend | http://localhost:8000 | FastAPI server and core functionality. |
   | OpenRAG Frontend | http://localhost:3000 | React web interface for user interaction. |
   | Langflow | http://localhost:7860 | [AI workflow engine](/agents). |
   | OpenSearch | http://localhost:9200 | Datastore for [knowledge](/knowledge). |
   | OpenSearch Dashboards | http://localhost:5601 | OpenSearch database administration interface. |

8. Wait while the containers start, and then confirm all containers are running:

   * Docker Compose:

     ```bash
     docker compose ps
     ```

   * Podman Compose:

     ```bash
     podman compose ps
     ```

   If all containers are running, you can access your OpenRAG services at their addresses.

9. Access the OpenRAG frontend at `http://localhost:3000` to continue with [application onboarding](#application-onboarding).
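
Optionally, you can spot-check each service from a terminal before opening the frontend. A minimal sketch, assuming the default addresses from the table above:

```bash
# Print the HTTP status code for each OpenRAG service at its default address
for url in http://localhost:3000 http://localhost:8000 http://localhost:7860 http://localhost:9200 http://localhost:5601; do
  curl -s -o /dev/null -w "%{http_code}  $url\n" "$url"
done
```
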
<PartialOnboarding />

@@ -56,23 +56,27 @@ For example, with self-managed services, do the following:

All OpenRAG configuration can be controlled through environment variables.

### Model provider settings {#model-provider-settings}

Configure which models and providers OpenRAG uses to generate text and embeddings.
You only need to provide credentials for the providers you are using in OpenRAG.

These variables are initially set during [application onboarding](/install#application-onboarding).
Some of these variables are immutable and can only be changed by redeploying OpenRAG, as explained in [Set environment variables](#set-environment-variables).

| Variable | Default | Description |
|----------|---------|-------------|
| `EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model for generating vector embeddings for documents in the knowledge base and similarity search queries. Can be changed after application onboarding. Accepts one or more models. |
| `LLM_MODEL` | `gpt-4o-mini` | Language model for language processing and text generation in the **Chat** feature. |
| `MODEL_PROVIDER` | `openai` | Model provider, as one of `openai`, `watsonx`, `ollama`, or `anthropic`. |
| `ANTHROPIC_API_KEY` | Not set | API key for the Anthropic language model provider. |
| `OPENAI_API_KEY` | Not set | API key for the OpenAI model provider, which is also the default model provider. |
| `OLLAMA_ENDPOINT` | Not set | Custom provider endpoint for the Ollama model provider. |
| `WATSONX_API_KEY` | Not set | API key for the IBM watsonx.ai model provider. |
| `WATSONX_ENDPOINT` | Not set | Custom provider endpoint for the IBM watsonx.ai model provider. |
| `WATSONX_PROJECT_ID` | Not set | Project ID for the IBM watsonx.ai model provider. |
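
For illustration only, a minimal `.env` sketch for the default OpenAI provider, using the default model values from this table (the API key value is a placeholder):

```bash
MODEL_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key
LLM_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
```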

### Document processing settings

Control how OpenRAG [processes and ingests documents](/ingestion) into your knowledge base.