Merge remote-tracking branch 'origin/main' into lfx-openrag-update-flows

This commit is contained in:
Lucas Oliveira 2025-09-30 16:28:38 -03:00
commit 7d53242eaa
16 changed files with 553 additions and 382 deletions


@ -0,0 +1,59 @@
name: Build Langflow Responses Multi-Arch
on:
workflow_dispatch:
jobs:
build:
strategy:
fail-fast: false
matrix:
include:
- platform: linux/amd64
arch: amd64
runs-on: ubuntu-latest
- platform: linux/arm64
arch: arm64
runs-on: [self-hosted, linux, ARM64, langflow-ai-arm64-2]
runs-on: ${{ matrix.runs-on }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Build and push langflow (${{ matrix.arch }})
uses: docker/build-push-action@v5
with:
context: .
file: ./Dockerfile.langflow
platforms: ${{ matrix.platform }}
push: true
tags: phact/langflow:responses-${{ matrix.arch }}
cache-from: type=gha,scope=langflow-responses-${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=langflow-responses-${{ matrix.arch }}
manifest:
needs: build
runs-on: ubuntu-latest
steps:
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Create and push multi-arch manifest
run: |
docker buildx imagetools create -t phact/langflow:responses \
phact/langflow:responses-amd64 \
phact/langflow:responses-arm64


@ -138,7 +138,7 @@ podman machine start
### Common Issues
See common issues and fixes: [docs/reference/troubleshooting.mdx](docs/docs/reference/troubleshooting.mdx)
See common issues and fixes: [docs/support/troubleshoot.mdx](docs/docs/support/troubleshoot.mdx)


@ -1,40 +1,88 @@
---
title: Docker Deployment
title: Docker deployment
slug: /get-started/docker
---
# Docker Deployment
There are two different Docker Compose files.
They deploy the same applications and containers, but to different environments.
## Standard Deployment
- [`docker-compose.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose.yml) is an OpenRAG deployment with GPU support for accelerated AI processing.
```bash
# Build and start all services
docker compose build
docker compose up -d
```
- [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without GPU support. Use this Docker Compose file for environments where GPU drivers aren't available.
## CPU-Only Deployment
To install OpenRAG with Docker Compose:
For environments without GPU support:
1. Clone the OpenRAG repository.
```bash
git clone https://github.com/langflow-ai/openrag.git
cd openrag
```
```bash
docker compose -f docker-compose-cpu.yml up -d
```
2. Copy the example `.env` file that is included in the repository root.
The example file includes all environment variables with comments to guide you in finding and setting their values.
```bash
cp .env.example .env
```
## Force Rebuild
Alternatively, create a new `.env` file in the repository root.
```
touch .env
```
If you need to reset state or rebuild everything:
3. Set environment variables. The Docker Compose files are populated with values from your `.env` file, so the following values **must** be set:
```bash
OPENSEARCH_PASSWORD=your_secure_password
OPENAI_API_KEY=your_openai_api_key
LANGFLOW_SUPERUSER=admin
LANGFLOW_SUPERUSER_PASSWORD=your_langflow_password
LANGFLOW_SECRET_KEY=your_secret_key
```
For more information on configuring OpenRAG with environment variables, see [Environment variables](/configure/configuration).
For additional configuration values, including `config.yaml`, see [Configuration](/configure/configuration).
4. Deploy OpenRAG with Docker Compose based on your deployment type.
For GPU-enabled systems, run the following command:
```bash
docker compose up -d
```
For CPU-only systems, run the following command:
```bash
docker compose -f docker-compose-cpu.yml up -d
```
The OpenRAG Docker Compose file starts five containers:
| Container Name | Default Address | Purpose |
|---|---|---|
| OpenRAG Backend | http://localhost:8000 | FastAPI server and core functionality. |
| OpenRAG Frontend | http://localhost:3000 | React web interface for users. |
| Langflow | http://localhost:7860 | AI workflow engine and flow management. |
| OpenSearch | http://localhost:9200 | Vector database for document storage. |
| OpenSearch Dashboards | http://localhost:5601 | Database administration interface. |
5. Verify installation by confirming all services are running.
```bash
docker compose ps
```
You can now access the application at:
- **Frontend**: http://localhost:3000
- **Backend API**: http://localhost:8000
- **Langflow**: http://localhost:7860
Continue with the [Quickstart](/quickstart).
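If `docker compose up` fails with missing configuration, you can check your `.env` against the required variables from step 3. The following is a minimal sketch of such a check (a hypothetical helper, not part of the OpenRAG repository):

```python
from pathlib import Path

# Variables the Docker Compose files require (see step 3).
REQUIRED_KEYS = [
    "OPENSEARCH_PASSWORD",
    "OPENAI_API_KEY",
    "LANGFLOW_SUPERUSER",
    "LANGFLOW_SUPERUSER_PASSWORD",
    "LANGFLOW_SECRET_KEY",
]

def missing_env_keys(env_text: str) -> list[str]:
    """Return required keys that are absent or empty in .env-style text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that isn't KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in REQUIRED_KEYS if not values.get(key)]

# Example: report missing keys for the .env in the current directory.
if Path(".env").exists():
    print(missing_env_keys(Path(".env").read_text()))
```

An empty list means every required variable is present and non-empty.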
## Rebuild all Docker containers
If you need to reset state and rebuild all of your containers, run the following command.
Your OpenSearch and Langflow databases will be lost.
Documents stored in the `./documents` directory will persist, since the directory is mounted as a volume in the OpenRAG backend container.
```bash
docker compose up --build --force-recreate --remove-orphans
```
## Service URLs
After deployment, services are available at:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- Langflow: http://localhost:7860
- OpenSearch: http://localhost:9200
- OpenSearch Dashboards: http://localhost:5601


@ -10,7 +10,7 @@ OpenRAG can be installed in multiple ways:
* [**Python wheel**](#install-python-wheel): Install the OpenRAG Python wheel and use the [OpenRAG Terminal User Interface (TUI)](/get-started/tui) to install, run, and configure your OpenRAG deployment without running Docker commands.
* [**Docker Compose**](#install-and-run-docker): Clone the OpenRAG repository and deploy OpenRAG with Docker Compose, including all services and dependencies.
* [**Docker Compose**](/get-started/docker): Clone the OpenRAG repository and deploy OpenRAG with Docker Compose, including all services and dependencies.
## Prerequisites
@ -138,80 +138,4 @@ The `LANGFLOW_PUBLIC_URL` controls where the Langflow web interface can be acces
The `WEBHOOK_BASE_URL` controls where the endpoint for `/connectors/CONNECTOR_TYPE/webhook` will be available.
This connection enables real-time document synchronization with external services.
For example, for Google Drive file synchronization the webhook URL is `/connectors/google_drive/webhook`.
## Docker {#install-and-run-docker}
There are two different Docker Compose files.
They deploy the same applications and containers, but to different environments.
- [`docker-compose.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose.yml) is an OpenRAG deployment with GPU support for accelerated AI processing.
- [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without GPU support. Use this Docker compose file for environments where GPU drivers aren't available.
To install OpenRAG with Docker Compose:
1. Clone the OpenRAG repository.
```bash
git clone https://github.com/langflow-ai/openrag.git
cd openrag
```
2. Copy the example `.env` file that is included in the repository root.
The example file includes all environment variables with comments to guide you in finding and setting their values.
```bash
cp .env.example .env
```
Alternatively, create a new `.env` file in the repository root.
```
touch .env
```
3. Set environment variables. The Docker Compose files are populated with values from your `.env`, so the following values are **required** to be set:
```bash
OPENSEARCH_PASSWORD=your_secure_password
OPENAI_API_KEY=your_openai_api_key
LANGFLOW_SUPERUSER=admin
LANGFLOW_SUPERUSER_PASSWORD=your_langflow_password
LANGFLOW_SECRET_KEY=your_secret_key
```
For more information on configuring OpenRAG with environment variables, see [Environment variables](/configure/configuration).
For additional configuration values, including `config.yaml`, see [Configuration](/configure/configuration).
4. Deploy OpenRAG with Docker Compose based on your deployment type.
For GPU-enabled systems, run the following command:
```bash
docker compose up -d
```
For CPU-only systems, run the following command:
```bash
docker compose -f docker-compose-cpu.yml up -d
```
The OpenRAG Docker Compose file starts five containers:
| Container Name | Default Address | Purpose |
|---|---|---|
| OpenRAG Backend | http://localhost:8000 | FastAPI server and core functionality. |
| OpenRAG Frontend | http://localhost:3000 | React web interface for users. |
| Langflow | http://localhost:7860 | AI workflow engine and flow management. |
| OpenSearch | http://localhost:9200 | Vector database for document storage. |
| OpenSearch Dashboards | http://localhost:5601 | Database administration interface. |
5. Verify installation by confirming all services are running.
```bash
docker compose ps
```
You can now access the application at:
- **Frontend**: http://localhost:3000
- **Backend API**: http://localhost:8000
- **Langflow**: http://localhost:7860
Continue with the Quickstart.
For example, for Google Drive file synchronization the webhook URL is `/connectors/google_drive/webhook`.


@ -1,66 +1,94 @@
---
title: Terminal Interface (TUI)
title: Terminal User Interface (TUI) commands
slug: /get-started/tui
---
# OpenRAG TUI Guide
The OpenRAG Terminal User Interface (TUI) provides a streamlined way to set up, configure, and monitor your OpenRAG deployment directly from the terminal.
The OpenRAG Terminal User Interface (TUI) provides a streamlined way to set up, configure, and monitor your OpenRAG deployment directly from the terminal, on any operating system.
![OpenRAG TUI Interface](@site/static/img/OpenRAG_TUI_2025-09-10T13_04_11_757637.svg)
## Launch
The TUI offers an easier way to use OpenRAG without sacrificing control.
Instead of starting OpenRAG using Docker commands and manually editing values in the `.env` file, the TUI walks you through the setup. It prompts for variables where required, creates a `.env` file for you, and then starts OpenRAG.
Once OpenRAG is running, use the TUI to monitor your application, control your containers, and retrieve logs.
## Start the TUI
To start the TUI, run the following commands from the directory where you installed OpenRAG.
For more information, see [Install OpenRAG](/install).
```bash
uv sync
uv run openrag
```
## Features
### Welcome Screen
- Quick setup options: basic (no auth) or advanced (OAuth)
- Service monitoring: container status at a glance
- Quick actions: diagnostics, logs, configuration
### Configuration Screen
- Environment variables: guided forms for required settings
- API keys: secure input with validation
- OAuth setup: Google and Microsoft
- Document paths: configure ingestion directories
- Auto-save: generates and updates `.env`
### Service Monitor
- Container status: real-time state of services
- Resource usage: CPU, memory, network
- Service control: start/stop/restart
- Health checks: health indicators for all components
### Log Viewer
- Live logs: stream logs across services
- Filtering: by service (backend, frontend, Langflow, OpenSearch)
- Levels: DEBUG/INFO/WARNING/ERROR
- Export: save logs for later analysis
### Diagnostics
- System checks: Docker/Podman availability and configuration
- Environment validation: verify required variables
- Network tests: connectivity between services
- Performance metrics: system capacity and recommendations
The TUI Welcome Screen offers basic and advanced setup options.
For more information on setup values during installation, see [Install OpenRAG](/install).
## Navigation
- Arrow keys: move between options
- Tab/Shift+Tab: switch fields and buttons
- Enter: select/confirm
- Escape: back
- Q: quit
- Number keys (1-4): quick access to main screens
## Benefits
1. Simplified setup without manual file edits
2. Clear visual feedback and error messages
3. Integrated monitoring and control
4. Cross-platform: Linux, macOS, Windows
5. Fully terminal-based; no browser required
The TUI accepts mouse input or keyboard commands.
- <kbd>Arrow keys</kbd>: move between options
- <kbd>Tab</kbd>/<kbd>Shift+Tab</kbd>: switch fields and buttons
- <kbd>Enter</kbd>: select/confirm
- <kbd>Escape</kbd>: back
- <kbd>Q</kbd>: quit
- <kbd>Number keys (1-4)</kbd>: quick access to main screens
## Container management
The TUI can deploy, manage, and upgrade your OpenRAG containers.
### Start container services
Click **Start Container Services** to start the OpenRAG containers.
The TUI automatically detects your container runtime, then determines whether your machine has compatible GPU support by checking for `CUDA`, `NVIDIA_SMI`, and Docker/Podman runtime support. The result determines which Docker Compose file OpenRAG uses.
The TUI then pulls the images and deploys the containers with the following command.
```bash
docker compose up -d
```
If images are missing, the TUI runs `docker compose pull`, then runs `docker compose up -d`.
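As a rough illustration of that decision (a sketch of the idea, not the TUI's actual implementation), the check might look like the following, assuming that a working `nvidia-smi` on PATH is taken as evidence of usable GPU support:

```python
import platform
import shutil

def pick_compose_file() -> str:
    """Choose a Compose file based on a rough GPU capability check.

    Illustrative only: treats the presence of nvidia-smi as evidence of a
    CUDA-capable GPU. macOS gets the CPU-only file, since the GPU images
    target CUDA.
    """
    if platform.system() == "Darwin":
        return "docker-compose-cpu.yml"
    if shutil.which("nvidia-smi"):
        return "docker-compose.yml"
    return "docker-compose-cpu.yml"

print(pick_compose_file())
```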
### Start native services
A "native" service in OpenRAG is a service that runs directly on your machine rather than inside a container.
The `docling-serve` document processing service is native: it runs on your local machine and is controlled separately from the containers.
To start or stop `docling-serve` or any other native services, in the TUI main menu, click **Start Native Services** or **Stop Native Services**.
To view the status, port, or PID of a native service, in the TUI main menu, click [Status](#status).
### Status
The **Status** menu displays information on your container deployment.
Here you can check container health, find your service ports, view logs, and upgrade your containers.
To view streaming logs, select the container you want to view, and press <kbd>l</kbd>.
To copy your logs, click **Copy to Clipboard**.
To **upgrade** your containers, click **Upgrade**.
**Upgrade** runs `docker compose pull` and then `docker compose up -d --force-recreate`.
The first command pulls the latest images of OpenRAG.
The second command recreates the containers with your data persisted.
To **reset** your containers, click **Reset**.
Reset gives you a completely fresh start: it deletes all of your data, including OpenSearch data, uploaded documents, and authentication.
**Reset** runs two commands.
It first stops and removes all containers, volumes, and local images.
```
docker compose down --volumes --remove-orphans --rmi local
```
When the first command is complete, OpenRAG removes any additional Docker objects with `prune`.
```
docker system prune -f
```
## Diagnostics
The **Diagnostics** menu provides health monitoring for your container runtimes and your OpenSearch security configuration.


@ -1,24 +0,0 @@
---
title: Troubleshooting
slug: /reference/troubleshooting
---
# Troubleshooting
## Podman on macOS
If using Podman on macOS, you may need to increase VM memory:
```bash
podman machine stop
podman machine rm
podman machine init --memory 8192 # 8 GB example
podman machine start
```
## Common Issues
1. OpenSearch fails to start: Check that `OPENSEARCH_PASSWORD` is set and meets requirements
2. Langflow connection issues: Verify `LANGFLOW_SUPERUSER` credentials are correct
3. Out of memory errors: Increase Docker memory allocation or use CPU-only mode
4. Port conflicts: Ensure ports 3000, 7860, 8000, 9200, 5601 are available


@ -0,0 +1,107 @@
---
title: Troubleshoot
slug: /support/troubleshoot
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
This page provides troubleshooting advice for issues you might encounter when using OpenRAG or contributing to OpenRAG.
## OpenSearch fails to start
Check that `OPENSEARCH_PASSWORD` is set and meets requirements.
The password must be at least 8 characters long and contain at least one uppercase letter, one lowercase letter, one digit, and one special character.
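These rules can be checked mechanically before deployment. A minimal validator sketch (hypothetical, not shipped with OpenRAG):

```python
import string

def is_valid_opensearch_password(password: str) -> bool:
    """Check the documented rules: at least 8 characters, with at least one
    uppercase letter, one lowercase letter, one digit, and one special
    character."""
    return (
        len(password) >= 8
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )
```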
## Langflow connection issues
Verify the `LANGFLOW_SUPERUSER` credentials are correct.
## Memory errors
### Container out of memory errors
Increase Docker memory allocation or use [docker-compose-cpu.yml](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) to deploy OpenRAG.
### Podman on macOS memory issues
If you're using Podman on macOS, you may need to increase VM memory on your Podman machine.
This example increases the machine size to 8 GB of RAM, which should be sufficient to run OpenRAG.
```bash
podman machine stop
podman machine rm
podman machine init --memory 8192 # 8 GB example
podman machine start
```
## Port conflicts
Ensure ports 3000, 7860, 8000, 9200, 5601 are available.
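To find which of these ports is already taken, you can use `lsof -i :<port>` or `ss -ltn`, or a short script like this sketch (a hypothetical helper, assuming the default localhost bindings):

```python
import socket

# Default OpenRAG service ports.
OPENRAG_PORTS = [3000, 5601, 7860, 8000, 9200]

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0

def busy_ports() -> list[int]:
    """List OpenRAG ports that are already occupied on this machine."""
    return [port for port in OPENRAG_PORTS if port_in_use(port)]

print(busy_ports())
```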
## Langflow container already exists
If you run other versions of Langflow containers on your machine, Docker or Podman may report that a Langflow container already exists.
Remove just the problem container, or clean up all containers and start fresh.
To reset your local containers and pull new images, do the following:
1. Stop your containers and completely remove them.
<Tabs groupId="Container software">
<TabItem value="Docker" label="Docker" default>
```bash
# Stop all running containers
docker stop $(docker ps -q)
# Remove all containers (including stopped ones)
docker rm --force $(docker ps -aq)
# Remove all images
docker rmi --force $(docker images -q)
# Remove all volumes
docker volume prune --force
# Remove all networks (except default)
docker network prune --force
# Clean up any leftover data
docker system prune --all --force --volumes
```
</TabItem>
<TabItem value="Podman" label="Podman">
```bash
# Stop all running containers
podman stop --all
# Remove all containers (including stopped ones)
podman rm --all --force
# Remove all images
podman rmi --all --force
# Remove all volumes
podman volume prune --force
# Remove all networks (except default)
podman network prune --force
# Clean up any leftover data
podman system prune --all --force --volumes
```
</TabItem>
</Tabs>
2. Restart OpenRAG and upgrade to get the latest images for your containers.
```bash
uv run openrag
```
3. In the OpenRAG TUI, click **Status**, and then click **Upgrade**.
When the **Close** button is active, the upgrade is complete.
Close the window and open the OpenRAG application.


@ -76,12 +76,12 @@ const sidebars = {
},
{
type: "category",
label: "Reference",
label: "Support",
items: [
{
type: "doc",
id: "reference/troubleshooting",
label: "Troubleshooting"
id: "support/troubleshoot",
label: "Troubleshoot"
},
],
},


@ -1,29 +1,29 @@
"use client"
"use client";
import * as React from "react"
import * as SwitchPrimitives from "@radix-ui/react-switch"
import * as SwitchPrimitives from "@radix-ui/react-switch";
import * as React from "react";
import { cn } from "@/lib/utils"
import { cn } from "@/lib/utils";
const Switch = React.forwardRef<
React.ElementRef<typeof SwitchPrimitives.Root>,
React.ComponentPropsWithoutRef<typeof SwitchPrimitives.Root>
React.ElementRef<typeof SwitchPrimitives.Root>,
React.ComponentPropsWithoutRef<typeof SwitchPrimitives.Root>
>(({ className, ...props }, ref) => (
<SwitchPrimitives.Root
className={cn(
"peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-input",
className
)}
{...props}
ref={ref}
>
<SwitchPrimitives.Thumb
className={cn(
"pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-transform data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0"
)}
/>
</SwitchPrimitives.Root>
))
Switch.displayName = SwitchPrimitives.Root.displayName
<SwitchPrimitives.Root
className={cn(
"peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-muted",
className,
)}
{...props}
ref={ref}
>
<SwitchPrimitives.Thumb
className={cn(
"pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-transform data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0 data-[state=unchecked]:bg-primary",
)}
/>
</SwitchPrimitives.Root>
));
Switch.displayName = SwitchPrimitives.Root.displayName;
export { Switch }
export { Switch };


@ -8,7 +8,9 @@ interface UpdateFlowSettingVariables {
llm_model?: string;
system_prompt?: string;
embedding_model?: string;
doclingPresets?: string;
table_structure?: boolean;
ocr?: boolean;
picture_descriptions?: boolean;
chunk_size?: number;
chunk_overlap?: number;
}


@ -13,7 +13,9 @@ export interface KnowledgeSettings {
embedding_model?: string;
chunk_size?: number;
chunk_overlap?: number;
doclingPresets?: string;
table_structure?: boolean;
ocr?: boolean;
picture_descriptions?: boolean;
}
export interface Settings {


@ -1,10 +1,5 @@
"use client";
import {
Tooltip,
TooltipContent,
TooltipTrigger,
} from "@radix-ui/react-tooltip";
import { ArrowUpRight, Loader2, PlugZap, RefreshCw } from "lucide-react";
import { useSearchParams } from "next/navigation";
import { Suspense, useCallback, useEffect, useState } from "react";
@ -30,13 +25,13 @@ import {
import { Checkbox } from "@/components/ui/checkbox";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { RadioGroup, RadioGroupItem } from "@/components/ui/radio-group";
import {
Select,
SelectContent,
SelectTrigger,
SelectValue,
} from "@/components/ui/select";
import { Switch } from "@/components/ui/switch";
import { Textarea } from "@/components/ui/textarea";
import { useAuth } from "@/contexts/auth-context";
import { useTask } from "@/contexts/task-context";
@ -116,7 +111,10 @@ function KnowledgeSourcesPage() {
const [systemPrompt, setSystemPrompt] = useState<string>("");
const [chunkSize, setChunkSize] = useState<number>(1024);
const [chunkOverlap, setChunkOverlap] = useState<number>(50);
const [processingMode, setProcessingMode] = useState<string>("standard");
const [tableStructure, setTableStructure] = useState<boolean>(false);
const [ocr, setOcr] = useState<boolean>(false);
const [pictureDescriptions, setPictureDescriptions] =
useState<boolean>(false);
// Fetch settings using React Query
const { data: settings = {} } = useGetSettingsQuery({
@ -200,12 +198,24 @@ function KnowledgeSourcesPage() {
}
}, [settings.knowledge?.chunk_overlap]);
// Sync processing mode with settings data
// Sync docling settings with settings data
useEffect(() => {
if (settings.knowledge?.doclingPresets) {
setProcessingMode(settings.knowledge.doclingPresets);
if (settings.knowledge?.table_structure !== undefined) {
setTableStructure(settings.knowledge.table_structure);
}
}, [settings.knowledge?.doclingPresets]);
}, [settings.knowledge?.table_structure]);
useEffect(() => {
if (settings.knowledge?.ocr !== undefined) {
setOcr(settings.knowledge.ocr);
}
}, [settings.knowledge?.ocr]);
useEffect(() => {
if (settings.knowledge?.picture_descriptions !== undefined) {
setPictureDescriptions(settings.knowledge.picture_descriptions);
}
}, [settings.knowledge?.picture_descriptions]);
// Update model selection immediately
const handleModelChange = (newModel: string) => {
@ -236,11 +246,20 @@ function KnowledgeSourcesPage() {
debouncedUpdate({ chunk_overlap: numValue });
};
// Update processing mode
const handleProcessingModeChange = (mode: string) => {
setProcessingMode(mode);
// Update the configuration setting (backend will also update the flow automatically)
debouncedUpdate({ doclingPresets: mode });
// Update docling settings
const handleTableStructureChange = (checked: boolean) => {
setTableStructure(checked);
updateFlowSettingMutation.mutate({ table_structure: checked });
};
const handleOcrChange = (checked: boolean) => {
setOcr(checked);
updateFlowSettingMutation.mutate({ ocr: checked });
};
const handlePictureDescriptionsChange = (checked: boolean) => {
setPictureDescriptions(checked);
updateFlowSettingMutation.mutate({ picture_descriptions: checked });
};
// Helper function to get connector icon
@ -574,7 +593,9 @@ function KnowledgeSourcesPage() {
// Only reset form values if the API call was successful
setChunkSize(DEFAULT_KNOWLEDGE_SETTINGS.chunk_size);
setChunkOverlap(DEFAULT_KNOWLEDGE_SETTINGS.chunk_overlap);
setProcessingMode(DEFAULT_KNOWLEDGE_SETTINGS.processing_mode);
setTableStructure(false);
setOcr(false);
setPictureDescriptions(false);
closeDialog(); // Close after successful completion
})
.catch((error) => {
@ -1068,76 +1089,62 @@ function KnowledgeSourcesPage() {
</div>
</div>
</div>
<div className="space-y-3">
<Label className="text-base font-medium">Ingestion presets</Label>
<RadioGroup
value={processingMode}
onValueChange={handleProcessingModeChange}
className="space-y-3"
>
<div className="flex items-center space-x-3">
<RadioGroupItem value="standard" id="standard" />
<div className="flex-1">
<Label
htmlFor="standard"
className="text-base font-medium cursor-pointer"
>
No OCR
</Label>
<div className="text-sm text-muted-foreground">
Fast ingest for documents with selectable text. Images are
ignored.
</div>
<div className="">
<div className="flex items-center justify-between py-3 border-b border-border">
<div className="flex-1">
<Label
htmlFor="table-structure"
className="text-base font-medium cursor-pointer pb-3"
>
Table Structure
</Label>
<div className="text-sm text-muted-foreground">
Capture table structure during ingest.
</div>
</div>
<div className="flex items-center space-x-3">
<RadioGroupItem value="ocr" id="ocr" />
<div className="flex-1">
<Label
htmlFor="ocr"
className="text-base font-medium cursor-pointer"
>
OCR
</Label>
<div className="text-sm text-muted-foreground">
Extracts text from images and scanned pages.
</div>
<Switch
id="table-structure"
checked={tableStructure}
onCheckedChange={handleTableStructureChange}
/>
</div>
<div className="flex items-center justify-between py-3 border-b border-border">
<div className="flex-1">
<Label
htmlFor="ocr"
className="text-base font-medium cursor-pointer pb-3"
>
OCR
</Label>
<div className="text-sm text-muted-foreground">
Extracts text from images/PDFs. Ingest is slower when
enabled.
</div>
</div>
<div className="flex items-center space-x-3">
<RadioGroupItem
value="picture_description"
id="picture_description"
/>
<div className="flex-1">
<Label
htmlFor="picture_description"
className="text-base font-medium cursor-pointer"
>
OCR + Captions
</Label>
<div className="text-sm text-muted-foreground">
Extracts text from images and scanned pages. Generates
short image captions.
</div>
<Switch
id="ocr"
checked={ocr}
onCheckedChange={handleOcrChange}
/>
</div>
<div className="flex items-center justify-between py-3">
<div className="flex-1">
<Label
htmlFor="picture-descriptions"
className="text-base font-medium cursor-pointer pb-3"
>
Picture Descriptions
</Label>
<div className="text-sm text-muted-foreground">
Adds captions for images. Ingest is slower when enabled.
</div>
</div>
<div className="flex items-center space-x-3">
<RadioGroupItem value="VLM" id="VLM" />
<div className="flex-1">
<Label
htmlFor="VLM"
className="text-base font-medium cursor-pointer"
>
VLM
</Label>
<div className="text-sm text-muted-foreground">
Extracts text from layout-aware parsing of text, tables,
and sections.
</div>
</div>
</div>
</RadioGroup>
<Switch
id="picture-descriptions"
checked={pictureDescriptions}
onCheckedChange={handlePictureDescriptionsChange}
/>
</div>
</div>
</div>
</CardContent>


@ -1,29 +0,0 @@
"use client"
import * as React from "react"
import * as SwitchPrimitives from "@radix-ui/react-switch"
import { cn } from "@/lib/utils"
const Switch = React.forwardRef<
React.ElementRef<typeof SwitchPrimitives.Root>,
React.ComponentPropsWithoutRef<typeof SwitchPrimitives.Root>
>(({ className, ...props }, ref) => (
<SwitchPrimitives.Root
className={cn(
"peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-input",
className
)}
{...props}
ref={ref}
>
<SwitchPrimitives.Thumb
className={cn(
"pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-transform data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0"
)}
/>
</SwitchPrimitives.Root>
))
Switch.displayName = SwitchPrimitives.Root.displayName
export { Switch }


@ -12,7 +12,9 @@ export const DEFAULT_AGENT_SETTINGS = {
export const DEFAULT_KNOWLEDGE_SETTINGS = {
chunk_size: 1000,
chunk_overlap: 200,
processing_mode: "standard"
table_structure: false,
ocr: false,
picture_descriptions: false
} as const;
/**


@ -16,35 +16,30 @@ logger = get_logger(__name__)
# Docling preset configurations
def get_docling_preset_configs():
"""Get docling preset configurations with platform-specific settings"""
def get_docling_preset_configs(table_structure=False, ocr=False, picture_descriptions=False):
"""Get docling preset configurations based on toggle settings
Args:
table_structure: Enable table structure parsing (default: False)
ocr: Enable OCR for text extraction from images (default: False)
picture_descriptions: Enable picture descriptions/captions (default: False)
"""
is_macos = platform.system() == "Darwin"
return {
"standard": {"do_ocr": False},
"ocr": {"do_ocr": True, "ocr_engine": "ocrmac" if is_macos else "easyocr"},
"picture_description": {
"do_ocr": True,
"ocr_engine": "ocrmac" if is_macos else "easyocr",
"do_picture_classification": True,
"do_picture_description": True,
"picture_description_local": {
"repo_id": "HuggingFaceTB/SmolVLM-256M-Instruct",
"prompt": "Describe this image in a few sentences.",
},
},
"VLM": {
"pipeline": "vlm",
"vlm_pipeline_model_local": {
"repo_id": "ds4sd/SmolDocling-256M-preview-mlx-bf16"
if is_macos
else "ds4sd/SmolDocling-256M-preview",
"response_format": "doctags",
"inference_framework": "mlx",
},
},
config = {
"do_ocr": ocr,
"ocr_engine": "ocrmac" if is_macos else "easyocr",
"do_table_structure": table_structure,
"do_picture_classification": picture_descriptions,
"do_picture_description": picture_descriptions,
"picture_description_local": {
"repo_id": "HuggingFaceTB/SmolVLM-256M-Instruct",
"prompt": "Describe this image in a few sentences.",
}
}
return config
async def get_settings(request, session_manager):
"""Get application settings"""
@ -70,7 +65,9 @@ async def get_settings(request, session_manager):
"embedding_model": knowledge_config.embedding_model,
"chunk_size": knowledge_config.chunk_size,
"chunk_overlap": knowledge_config.chunk_overlap,
"doclingPresets": knowledge_config.doclingPresets,
"table_structure": knowledge_config.table_structure,
"ocr": knowledge_config.ocr,
"picture_descriptions": knowledge_config.picture_descriptions,
},
"agent": {
"llm_model": agent_config.llm_model,
@ -177,7 +174,9 @@ async def update_settings(request, session_manager):
"system_prompt",
"chunk_size",
"chunk_overlap",
"doclingPresets",
"table_structure",
"ocr",
"picture_descriptions",
"embedding_model",
}
@ -256,32 +255,68 @@ async def update_settings(request, session_manager):
                # Don't fail the entire settings update if flow update fails
                # The config will still be saved

        if "table_structure" in body:
            if not isinstance(body["table_structure"], bool):
                return JSONResponse(
                    {"error": "table_structure must be a boolean"}, status_code=400
                )
            current_config.knowledge.table_structure = body["table_structure"]
            config_updated = True

            # Also update the flow with the new docling settings
            try:
                flows_service = _get_flows_service()
                preset_config = get_docling_preset_configs(
                    table_structure=body["table_structure"],
                    ocr=current_config.knowledge.ocr,
                    picture_descriptions=current_config.knowledge.picture_descriptions,
                )
                await flows_service.update_flow_docling_preset("custom", preset_config)
                logger.info("Successfully updated table_structure setting in flow")
            except Exception as e:
                # Don't fail the entire settings update if flow update fails
                # The config will still be saved
                logger.error(f"Failed to update docling settings in flow: {str(e)}")

        if "ocr" in body:
            if not isinstance(body["ocr"], bool):
                return JSONResponse({"error": "ocr must be a boolean"}, status_code=400)
            current_config.knowledge.ocr = body["ocr"]
            config_updated = True

            # Also update the flow with the new docling settings
            try:
                flows_service = _get_flows_service()
                preset_config = get_docling_preset_configs(
                    table_structure=current_config.knowledge.table_structure,
                    ocr=body["ocr"],
                    picture_descriptions=current_config.knowledge.picture_descriptions,
                )
                await flows_service.update_flow_docling_preset("custom", preset_config)
                logger.info("Successfully updated ocr setting in flow")
            except Exception as e:
                logger.error(f"Failed to update docling settings in flow: {str(e)}")

        if "picture_descriptions" in body:
            if not isinstance(body["picture_descriptions"], bool):
                return JSONResponse(
                    {"error": "picture_descriptions must be a boolean"}, status_code=400
                )
            current_config.knowledge.picture_descriptions = body["picture_descriptions"]
            config_updated = True

            # Also update the flow with the new docling settings
            try:
                flows_service = _get_flows_service()
                preset_config = get_docling_preset_configs(
                    table_structure=current_config.knowledge.table_structure,
                    ocr=current_config.knowledge.ocr,
                    picture_descriptions=body["picture_descriptions"],
                )
                await flows_service.update_flow_docling_preset("custom", preset_config)
                logger.info("Successfully updated picture_descriptions setting in flow")
            except Exception as e:
                logger.error(f"Failed to update docling settings in flow: {str(e)}")

        if "chunk_size" in body:
            if not isinstance(body["chunk_size"], int) or body["chunk_size"] <= 0:
@@ -625,48 +660,56 @@ def _get_flows_service():
async def update_docling_preset(request, session_manager):
    """Update docling settings in the ingest flow - deprecated endpoint, use /settings instead"""
    try:
        # Parse request body
        body = await request.json()

        # Support old preset-based API for backwards compatibility
        if "preset" in body:
            # Map old presets to new toggle settings
            preset_map = {
                "standard": {"table_structure": False, "ocr": False, "picture_descriptions": False},
                "ocr": {"table_structure": False, "ocr": True, "picture_descriptions": False},
                "picture_description": {"table_structure": False, "ocr": True, "picture_descriptions": True},
                "VLM": {"table_structure": False, "ocr": False, "picture_descriptions": False},
            }

            preset = body["preset"]
            if preset not in preset_map:
                return JSONResponse(
                    {"error": f"Invalid preset '{preset}'. Valid presets: {', '.join(preset_map.keys())}"},
                    status_code=400,
                )
            settings = preset_map[preset]
        else:
            # Support new toggle-based API
            settings = {
                "table_structure": body.get("table_structure", False),
                "ocr": body.get("ocr", False),
                "picture_descriptions": body.get("picture_descriptions", False),
            }

        preset_config = get_docling_preset_configs(**settings)

        # Use the helper function to update the flow
        flows_service = _get_flows_service()
        await flows_service.update_flow_docling_preset("custom", preset_config)

        logger.info("Successfully updated docling settings in ingest flow")
        return JSONResponse(
            {
                "message": "Successfully updated docling settings",
                "settings": settings,
                "preset_config": preset_config,
            }
        )
    except Exception as e:
        logger.error("Failed to update docling settings", error=str(e))
        return JSONResponse(
            {"error": f"Failed to update docling settings: {str(e)}"}, status_code=500
        )
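The backwards-compatibility behavior of the deprecated endpoint reduces to a small pure function: a legacy `preset` name is translated into the three toggles, while toggle-based bodies pass through with `False` defaults. A standalone sketch of that resolution logic (illustrative only; `resolve_settings` is a hypothetical name, not a function in this codebase):

```python
# Legacy preset names mapped to the new boolean toggles, mirroring the
# backwards-compatibility table in the deprecated endpoint.
PRESET_MAP = {
    "standard": {"table_structure": False, "ocr": False, "picture_descriptions": False},
    "ocr": {"table_structure": False, "ocr": True, "picture_descriptions": False},
    "picture_description": {"table_structure": False, "ocr": True, "picture_descriptions": True},
    "VLM": {"table_structure": False, "ocr": False, "picture_descriptions": False},
}

def resolve_settings(body: dict) -> dict:
    """Resolve a request body (old preset-based or new toggle-based) to toggles."""
    if "preset" in body:
        if body["preset"] not in PRESET_MAP:
            raise ValueError(f"Invalid preset '{body['preset']}'")
        return PRESET_MAP[body["preset"]]
    # Toggle-based bodies: missing keys default to False.
    return {
        "table_structure": body.get("table_structure", False),
        "ocr": body.get("ocr", False),
        "picture_descriptions": body.get("picture_descriptions", False),
    }

settings = resolve_settings({"preset": "picture_description"})
```

Note that an unknown preset raises here, whereas the endpoint returns a 400 response; the mapping itself is the same.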

View file

@@ -27,7 +27,9 @@ class KnowledgeConfig:
    embedding_model: str = "text-embedding-3-small"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    table_structure: bool = False
    ocr: bool = False
    picture_descriptions: bool = False
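The shape of the updated config can be sketched as a self-contained dataclass. This shows only the fields visible in the hunk above (the real `KnowledgeConfig` has more); the point is that the single `doclingPresets` string is replaced by three independent boolean toggles with safe defaults:

```python
from dataclasses import dataclass

@dataclass
class KnowledgeConfig:
    # Assumed subset of the real class: only the fields shown in this hunk.
    embedding_model: str = "text-embedding-3-small"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    table_structure: bool = False
    ocr: bool = False
    picture_descriptions: bool = False

# Each toggle can be flipped independently; unset toggles stay False.
cfg = KnowledgeConfig(ocr=True)
```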
@dataclass