Merge branch 'main' into duplicate_dev

This commit is contained in:
FloretKu 2025-10-27 15:58:36 +08:00 committed by GitHub
commit 42f4eeb39b
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
225 changed files with 14455 additions and 7108 deletions

207
.clinerules/01-basic.md Normal file

@ -0,0 +1,207 @@
# LightRAG Project Intelligence (.clinerules)
## Project Overview
LightRAG is a mature, production-ready Retrieval-Augmented Generation (RAG) system with comprehensive knowledge graph capabilities. The system has evolved from experimental to production-ready status with extensive functionality across all major components.
## Current System State (August 15, 2025)
- **Status**: Production Ready - Stable and Mature
- **Configuration**: Gemini 2.5 Flash + BAAI/bge-m3 embeddings via custom endpoints
- **Storage**: Default in-memory with file persistence (JsonKVStorage, NetworkXStorage, NanoVectorDBStorage)
- **Language**: Chinese for summaries
- **Workspace**: `space1` for data isolation
- **Authentication**: JWT-based with admin/user accounts
## Critical Implementation Patterns
### 1. Embedding Format Compatibility (CRITICAL)
**Pattern**: Always handle both base64 and raw array embedding formats
**Location**: `lightrag/llm/openai.py` - `openai_embed` function
**Issue**: Custom OpenAI-compatible endpoints return embeddings as raw arrays, not base64 strings
**Solution**:
```python
np.array(dp.embedding, dtype=np.float32) if isinstance(dp.embedding, list)
else np.frombuffer(base64.b64decode(dp.embedding), dtype=np.float32)
```
**Impact**: Document processing fails completely without this dual format support
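A minimal sketch of the dual-format handling described above; the response shape and helper name are illustrative rather than the exact `openai_embed` signature:
```python
import base64

import numpy as np


def decode_embeddings(response_data):
    """Convert each embedding to float32, accepting raw lists or base64 strings."""
    vectors = []
    for dp in response_data:
        if isinstance(dp.embedding, list):
            # Custom OpenAI-compatible endpoints often return raw float arrays
            vectors.append(np.array(dp.embedding, dtype=np.float32))
        else:
            # Base64-encoded float32 buffers (the format originally assumed)
            vectors.append(np.frombuffer(base64.b64decode(dp.embedding), dtype=np.float32))
    return np.stack(vectors)
```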
### 2. Async Pattern Consistency (CRITICAL)
**Pattern**: Always await coroutines before calling methods on the result
**Common Error**: `coroutine.method()` instead of `(await coroutine).method()`
**Locations**: MongoDB implementations, Neo4j operations
**Example**: `await self._data.list_indexes()` then `await cursor.to_list()`
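A hedged before/after illustration of this rule, using a hypothetical async collection object (real driver APIs differ in the details):
```python
async def fetch_index_names(collection):
    # Wrong: calling a method on the coroutine object itself.
    #   names = collection.list_indexes().to_list(None)  # AttributeError / never awaited

    # Right: await the coroutine, then call methods on the awaited result.
    cursor = await collection.list_indexes()
    indexes = await cursor.to_list(length=None)
    return [idx.get("name") for idx in indexes]
```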
### 3. Storage Layer Data Compatibility (CRITICAL)
**Pattern**: Always filter deprecated/incompatible fields during deserialization
**Common Fields to Remove**: `content`, `_id` (MongoDB), database-specific fields
**Implementation**: `data.pop('field_name', None)` before creating dataclass objects
**Locations**: All storage implementations (JSON, Redis, MongoDB, PostgreSQL)
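A small sketch of this filtering step, with a hypothetical dataclass standing in for the real storage models:
```python
from dataclasses import dataclass


@dataclass
class DocStatusRecord:
    doc_id: str
    status: str
    updated_at: str


def from_storage(raw: dict) -> DocStatusRecord:
    data = dict(raw)
    # Drop fields that older versions or specific backends persist
    # but the current dataclass no longer declares.
    for deprecated in ("content", "_id"):
        data.pop(deprecated, None)
    return DocStatusRecord(**data)


record = from_storage(
    {"doc_id": "d1", "status": "processed", "updated_at": "2025-08-15", "content": "legacy"}
)
```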
### 4. Lock Key Generation (CRITICAL)
**Pattern**: Always sort relationship pairs for consistent lock keys
**Implementation**: `sorted_key_parts = sorted([src, tgt])` then `f"{sorted_key_parts[0]}-{sorted_key_parts[1]}"`
**Impact**: Prevents deadlocks in concurrent relationship processing
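A compact sketch of the key construction (the exact key format in the codebase may differ):
```python
def relation_lock_key(src: str, tgt: str) -> str:
    # Sort the endpoints so (A, B) and (B, A) map to the same lock key,
    # preventing two workers from acquiring the pair's locks in opposite orders.
    sorted_key_parts = sorted([src, tgt])
    return f"{sorted_key_parts[0]}-{sorted_key_parts[1]}"


assert relation_lock_key("Bob", "Alice") == relation_lock_key("Alice", "Bob")
```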
### 5. Event Loop Management (CRITICAL)
**Pattern**: Handle event loop mismatches during shutdown gracefully
**Implementation**: Timeout + specific RuntimeError handling for "attached to a different loop"
**Location**: Neo4j storage finalization
**Impact**: Prevents application shutdown failures
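A hedged sketch of the shutdown guard, assuming a driver object with an async `close()`; the actual Neo4j finalization code is more involved:
```python
import asyncio


async def close_driver_safely(driver, timeout: float = 5.0) -> None:
    try:
        await asyncio.wait_for(driver.close(), timeout=timeout)
    except asyncio.TimeoutError:
        pass  # do not let a stuck connection block application shutdown
    except RuntimeError as exc:
        # Swallow only the known event-loop mismatch raised when the driver
        # was created on a different loop than the one shutting down.
        if "attached to a different loop" not in str(exc):
            raise
```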
## Architecture Patterns
### 1. Dependency Injection
**Pattern**: Pass configuration through object constructors, not direct imports
**Example**: OllamaAPI receives configuration through LightRAG object
**Benefit**: Better testability and modularity
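A toy illustration of the injection style (class and attribute names are simplified):
```python
class OllamaAPI:
    def __init__(self, rag):
        # Configuration arrives through the injected LightRAG instance rather
        # than module-level imports, so the class is easy to construct in tests.
        self.rag = rag
        self.model_name = rag.llm_model_name


class StubRAG:
    llm_model_name = "test-model"


api = OllamaAPI(StubRAG())
```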
### 2. Memory Bank Documentation
**Pattern**: Maintain comprehensive memory bank for development continuity
**Structure**: Core files (projectbrief.md, activeContext.md, progress.md, etc.)
**Purpose**: Essential for context preservation across development sessions
### 3. Configuration Management
**Pattern**: Centralize defaults in constants.py, use environment variables for runtime config
**Implementation**: Default values in constants, override via .env file
**Benefit**: Consistent configuration across components
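A minimal sketch of the constants-plus-environment pattern; the variable names are examples, not the project's actual constants:
```python
import os

# constants.py: centralised defaults
DEFAULT_TOP_K = 40
DEFAULT_SUMMARY_LANGUAGE = "English"

# runtime: values from the environment (e.g. loaded from .env) override the defaults
TOP_K = int(os.getenv("TOP_K", DEFAULT_TOP_K))
SUMMARY_LANGUAGE = os.getenv("SUMMARY_LANGUAGE", DEFAULT_SUMMARY_LANGUAGE)
```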
## Development Workflow Patterns
### 1. Frontend Development (CRITICAL)
**Package Manager**: **ALWAYS USE BUN** - Never use npm or yarn unless Bun is unavailable
**Commands**:
- `bun install` - Install dependencies
- `bun run dev` - Start development server
- `bun run build` - Build for production
- `bun run lint` - Run linting
- `bun test` - Run tests
- `bun run preview` - Preview production build
**Pattern**: All frontend operations must use Bun commands
**Fallback**: Only use npm/yarn if Bun installation fails
**Testing**: Use `bun test` for all frontend testing
### 2. Bug Fix Approach
1. **Identify root cause** - Don't just fix symptoms
2. **Implement robust solution** - Handle edge cases and format variations
3. **Maintain backward compatibility** - Preserve existing functionality
4. **Add comprehensive error handling** - Graceful degradation
5. **Document the fix** - Update memory bank with technical details
### 3. Feature Implementation
1. **Follow existing patterns** - Maintain architectural consistency
2. **Use dependency injection** - Avoid direct imports between modules
3. **Implement comprehensive error handling** - Handle all failure modes
4. **Add proper logging** - Debug and warning messages
5. **Update documentation** - Memory bank and code comments
6. **Comment Language** - Use English for comments and documentation
### 4. Performance Optimization
1. **Profile before optimizing** - Identify actual bottlenecks
2. **Maintain algorithmic correctness** - Don't sacrifice functionality for speed
3. **Use appropriate data structures** - Match structure to access patterns
4. **Implement caching strategically** - Cache expensive operations
5. **Monitor memory usage** - Prevent memory leaks
## Technology Stack Intelligence
### 1. LLM Integration
- **Primary**: Gemini 2.5 Flash via custom endpoint
- **Embedding**: BAAI/bge-m3 via custom endpoint
- **Reranking**: BAAI/bge-reranker-v2-m3
- **Pattern**: Always handle multiple provider formats
### 2. Storage Backends
- **Default**: In-memory with file persistence
- **Production Options**: PostgreSQL, MongoDB, Redis, Neo4j
- **Pattern**: Abstract storage interface with multiple implementations
### 3. API Architecture
- **Framework**: FastAPI with Gunicorn for production
- **Authentication**: JWT-based with role support
- **Compatibility**: Ollama-compatible endpoints for easy integration
### 4. Frontend
- **Framework**: React with TypeScript
- **Package Manager**: **BUN (REQUIRED)** - Always use Bun for all frontend operations
- **Build Tool**: Vite with Bun runtime
- **Visualization**: Sigma.js for graph rendering
- **State Management**: React hooks with context
- **Internationalization**: i18next for multi-language support
## Common Pitfalls and Solutions
### 1. Embedding Format Issues
**Pitfall**: Assuming all endpoints return base64-encoded embeddings
**Solution**: Always check format and handle both base64 and raw arrays
### 2. Async/Await Patterns
**Pitfall**: Calling methods on coroutines instead of awaited results
**Solution**: Always await coroutines before accessing their methods
### 3. Data Model Evolution
**Pitfall**: Breaking changes when removing fields from dataclasses
**Solution**: Filter deprecated fields during deserialization, don't break storage
### 4. Concurrency Issues
**Pitfall**: Inconsistent lock key generation causing deadlocks
**Solution**: Always sort keys for deterministic lock ordering
### 5. Event Loop Management
**Pitfall**: Event loop mismatches during shutdown
**Solution**: Implement timeout and specific error handling for loop issues
## Performance Considerations
### 1. Query Context Building
- **Algorithm**: Linear gradient weighted polling for fair resource allocation
- **Optimization**: Round-robin merging to eliminate mode bias
- **Pattern**: Smart chunk selection based on cross-entity occurrence
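The round-robin merging idea can be shown with a small interleaving helper; this is a simplification of the actual context-building logic:
```python
from itertools import zip_longest


def round_robin_merge(*ranked_lists):
    """Interleave ranked lists so no single retrieval mode dominates the context."""
    merged, seen = [], set()
    for group in zip_longest(*ranked_lists):
        for item in group:
            if item is not None and item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# e.g. chunk ids ranked by different retrieval modes
print(round_robin_merge(["c1", "c2"], ["c3", "c1"], ["c4"]))  # ['c1', 'c3', 'c4', 'c2']
```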
### 2. Graph Operations
- **Optimization**: Batch operations where possible
- **Pattern**: Use appropriate indexing for large datasets
- **Consideration**: Memory usage with large graphs
### 3. LLM Request Management
- **Pattern**: Priority-based queue for request ordering
- **Optimization**: Connection pooling and retry mechanisms
- **Consideration**: Rate limiting and cost management
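As a rough sketch, priority ordering can be modelled with `asyncio.PriorityQueue`; the priorities and worker shape here are illustrative only:
```python
import asyncio


async def llm_worker(queue: asyncio.PriorityQueue) -> None:
    while True:
        priority, prompt = await queue.get()
        # send `prompt` to the LLM here; lower numbers are served first
        queue.task_done()


async def enqueue_requests(queue: asyncio.PriorityQueue) -> None:
    await queue.put((0, "user-facing query"))          # highest priority
    await queue.put((5, "background entity summary"))  # deferred work
```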
## Security Patterns
### 1. Authentication
- **Implementation**: JWT tokens with role-based access
- **Pattern**: Stateless authentication with configurable expiration
- **Security**: Proper token validation and refresh mechanisms
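A compact sketch of stateless issuance and validation with PyJWT; the claim names and secret handling are assumptions, not the project's exact auth module:
```python
import datetime

import jwt  # PyJWT

SECRET = "change-me"  # in practice, load from environment/config


def issue_token(username: str, role: str, ttl_minutes: int = 60) -> str:
    payload = {
        "sub": username,
        "role": role,
        "exp": datetime.datetime.now(datetime.timezone.utc)
        + datetime.timedelta(minutes=ttl_minutes),
    }
    return jwt.encode(payload, SECRET, algorithm="HS256")


def validate_token(token: str) -> dict:
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on bad tokens.
    return jwt.decode(token, SECRET, algorithms=["HS256"])
```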
### 2. API Security
- **Pattern**: Input validation and sanitization
- **Implementation**: FastAPI dependency injection for auth
- **Consideration**: Rate limiting and abuse prevention
## Maintenance Guidelines
### 1. Memory Bank Updates
- **Trigger**: After significant changes or bug fixes
- **Pattern**: Update activeContext.md and progress.md
- **Purpose**: Maintain development continuity
### 2. Configuration Management
- **Pattern**: Environment-based configuration with sensible defaults
- **Implementation**: .env files with example templates
- **Consideration**: Security for production deployments
### 3. Error Handling
- **Pattern**: Comprehensive logging with appropriate levels
- **Implementation**: Graceful degradation where possible
- **Consideration**: User-friendly error messages
## Project Evolution Notes
The project has evolved from experimental to production-ready status. Key milestones:
- **Early 2025**: Basic RAG implementation
- **Mid 2025**: Multiple storage backends and LLM providers
- **July 2025**: Major query optimization and algorithm improvements
- **August 2025**: Production-ready stable state
The system now supports enterprise-level deployments with comprehensive functionality across all components.


@ -28,6 +28,12 @@ Makefile
# Exclude other projects
/tests
/scripts
/data
/dickens
/reproduce
/output_complete
/rag_storage
/inputs
# Python version manager file
.python-version

84
.github/workflows/docker-build-lite.yml vendored Normal file

@ -0,0 +1,84 @@
name: Build Lite Docker Image
on:
workflow_dispatch:
inputs:
_notes_:
description: '⚠️ Create lite Docker images only after non-trivial version releases.'
required: false
type: boolean
default: false
permissions:
contents: read
packages: write
jobs:
build-and-push-lite:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Get latest tag
id: get_tag
run: |
LATEST_TAG=$(git describe --tags --abbrev=0 2>/dev/null || echo "")
if [ -z "$LATEST_TAG" ]; then
LATEST_TAG="sha-$(git rev-parse --short HEAD)"
echo "No tags found, using commit SHA: $LATEST_TAG"
else
echo "Latest tag found: $LATEST_TAG"
fi
echo "tag=$LATEST_TAG" >> $GITHUB_OUTPUT
- name: Prepare lite tag
id: lite_tag
run: |
LITE_TAG="${{ steps.get_tag.outputs.tag }}-lite"
echo "Lite image tag: $LITE_TAG"
echo "lite_tag=$LITE_TAG" >> $GITHUB_OUTPUT
- name: Update version in __init__.py
run: |
sed -i "s/__version__ = \".*\"/__version__ = \"${{ steps.get_tag.outputs.tag }}\"/" lightrag/__init__.py
cat lightrag/__init__.py | grep __version__
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata for Docker
id: meta
uses: docker/metadata-action@v5
with:
images: ghcr.io/${{ github.repository }}
tags: |
type=raw,value=${{ steps.lite_tag.outputs.lite_tag }}
type=raw,value=lite
- name: Build and push lite Docker image
uses: docker/build-push-action@v5
with:
context: .
file: ./Dockerfile.lite
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=min
- name: Output image details
run: |
echo "Lite Docker image built and pushed successfully!"
echo "Image tag: ghcr.io/${{ github.repository }}:${{ steps.lite_tag.outputs.lite_tag }}"
echo "Base Git tag used: ${{ steps.get_tag.outputs.tag }}"


@ -2,6 +2,12 @@ name: Build Test Docker Image manually
on:
workflow_dispatch:
inputs:
_notes_:
description: '⚠️ Please create a new git tag before building the docker image.'
required: false
type: boolean
default: false
permissions:
contents: read
@ -58,6 +64,7 @@ jobs:
uses: docker/build-push-action@v5
with:
context: .
file: ./Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ steps.meta.outputs.tags }}


@ -35,6 +35,18 @@ jobs:
echo "Found tag: $TAG"
echo "tag=$TAG" >> $GITHUB_OUTPUT
- name: Check if pre-release
id: check_prerelease
run: |
TAG="${{ steps.get_tag.outputs.tag }}"
if [[ "$TAG" == *"rc"* ]] || [[ "$TAG" == *"dev"* ]]; then
echo "is_prerelease=true" >> $GITHUB_OUTPUT
echo "This is a pre-release version: $TAG"
else
echo "is_prerelease=false" >> $GITHUB_OUTPUT
echo "This is a stable release: $TAG"
fi
- name: Update version in __init__.py
run: |
sed -i "s/__version__ = \".*\"/__version__ = \"${{ steps.get_tag.outputs.tag }}\"/" lightrag/__init__.py
@ -48,12 +60,13 @@ jobs:
images: ghcr.io/${{ github.repository }}
tags: |
type=raw,value=${{ steps.get_tag.outputs.tag }}
type=raw,value=latest
type=raw,value=latest,enable=${{ steps.check_prerelease.outputs.is_prerelease == 'false' }}
- name: Build and push Docker image
uses: docker/build-push-action@v5
with:
context: .
file: ./Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ steps.meta.outputs.tags }}


@ -17,6 +17,29 @@ jobs:
with:
fetch-depth: 0 # Fetch all history for tags
# Build frontend WebUI
- name: Setup Bun
uses: oven-sh/setup-bun@v1
with:
bun-version: latest
- name: Build Frontend WebUI
run: |
cd lightrag_webui
bun install --frozen-lockfile
bun run build
cd ..
- name: Verify Frontend Build
run: |
if [ ! -f "lightrag/api/webui/index.html" ]; then
echo "❌ Error: Frontend build failed - index.html not found"
exit 1
fi
echo "✅ Frontend build verified"
echo "Frontend files:"
ls -lh lightrag/api/webui/ | head -10
- uses: actions/setup-python@v5
with:
python-version: "3.x"

11
.gitignore vendored

@ -9,10 +9,10 @@ __pycache__/
# Virtual Environment
.venv/
env/
venv/
*.env*
.env_example
# Environment Variable Files
.env
# Build / Distribution
dist/
@ -66,10 +66,11 @@ download_models_hf.py
lightrag-dev/
gui/
# Frontend build output (built during PyPI release)
lightrag/api/webui/
# unit-test files
test_*
# Cline files
memory-bank
memory-bank/
.clinerules

39
AGENTS.md Normal file

@ -0,0 +1,39 @@
# Repository Guidelines
LightRAG is an advanced Retrieval-Augmented Generation (RAG) framework designed to enhance information retrieval and generation through graph-based knowledge representation.
## Project Structure & Module Organization
- `lightrag/`: Core Python package with orchestrators (`lightrag/lightrag.py`), storage adapters in `kg/`, LLM bindings in `llm/`, and helpers such as `operate.py` and `utils_*.py`.
- `lightrag/api/`: FastAPI service (`lightrag_server.py`) with routers under `routers/` and Gunicorn launcher `run_with_gunicorn.py`.
- `lightrag_webui/`: React 19 + TypeScript client driven by Bun + Vite; UI components live in `src/`.
- Tests live in `tests/` and root-level `test_*.py`. Working datasets stay in `inputs/`, `rag_storage/`, `temp/`; deployment collateral lives in `docs/`, `k8s-deploy/`, and `docker-compose.yml`.
## Build, Test, and Development Commands
- `python -m venv .venv && source .venv/bin/activate`: set up the Python runtime.
- `pip install -e .` / `pip install -e .[api]`: install the package and API extras in editable mode.
- `lightrag-server` or `uvicorn lightrag.api.lightrag_server:app --reload`: start the API locally; ensure `.env` is present.
- `python -m pytest tests` or `python test_graph_storage.py`: run the full suite or a targeted script.
- `ruff check .`: lint Python sources before committing.
- `bun install`, `bun run dev`, `bun run build`, `bun test`: manage the web UI workflow (Bun is mandatory).
## Coding Style & Naming Conventions
- Backend code follows PEP 8 with four-space indentation; annotate functions and reach for dataclasses when modelling state.
- Use `lightrag.utils.logger` instead of `print`; respect logger configuration flags.
- Extend storage or pipeline abstractions via `lightrag.base` and keep reusable helpers in the existing `utils_*.py`.
- Python modules remain lowercase with underscores; React components use `PascalCase.tsx` and hooks-first patterns.
- Front-end code should remain in TypeScript with two-space indentation, rely on functional React components with hooks, and follow Tailwind utility style.
## Testing Guidelines
- Add pytest cases beside the affected module or the relevant `test_*.py`; functions should start with `test_`.
- Export required `LIGHTRAG_*` environment variables before running integration or storage tests.
- For UI updates, pair code with Vitest specs and run `bun test`.
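For example, a minimal pytest case following these conventions might look like this (the helper is a stand-in, not an actual LightRAG import):
```python
# tests/test_lock_keys.py
def normalize_pair(src: str, tgt: str) -> tuple[str, str]:
    """Stand-in for the real helper; sorts an entity pair deterministically."""
    return tuple(sorted((src, tgt)))


def test_normalize_pair_is_order_independent():
    assert normalize_pair("B", "A") == normalize_pair("A", "B")
```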
## Commit & Pull Request Guidelines
- Use concise, imperative commit subjects (e.g., `Fix lock key normalization`) and add body context only when necessary.
- PRs should include a summary, operational impact, linked issues, and screenshots or API samples for user-facing work.
- Verify `ruff check .`, `python -m pytest`, and affected Bun commands succeed before requesting review; note the runs in the PR text.
## Security & Configuration Tips
- Copy `.env.example` and `config.ini.example`; never commit secrets or real connection strings.
- Configure storage backends through `LIGHTRAG_*` variables and validate them with `docker-compose` services when needed.
- Treat `lightrag.log*` as local artefacts; purge sensitive information before sharing logs or outputs.


@ -1,63 +1,101 @@
# Build stage
FROM python:3.12-slim AS builder
# Frontend build stage
FROM oven/bun:1 AS frontend-builder
WORKDIR /app
# Upgrade pip, setuptools and wheel to the latest version
RUN pip install --upgrade pip setuptools wheel
# Copy frontend source code
COPY lightrag_webui/ ./lightrag_webui/
# Install Rust and required build dependencies
RUN apt-get update && apt-get install -y \
curl \
build-essential \
pkg-config \
# Build frontend assets for inclusion in the API package
RUN cd lightrag_webui \
&& bun install --frozen-lockfile \
&& bun run build
# Python build stage - using uv for faster package installation
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder
ENV DEBIAN_FRONTEND=noninteractive
ENV UV_SYSTEM_PYTHON=1
ENV UV_COMPILE_BYTECODE=1
WORKDIR /app
# Install system deps (Rust is required by some wheels)
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
curl \
build-essential \
pkg-config \
&& rm -rf /var/lib/apt/lists/* \
&& curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y \
&& . $HOME/.cargo/env
&& curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
# Copy pyproject.toml and source code for dependency installation
ENV PATH="/root/.cargo/bin:/root/.local/bin:${PATH}"
# Ensure shared data directory exists for uv caches
RUN mkdir -p /root/.local/share/uv
# Copy project metadata and sources
COPY pyproject.toml .
COPY setup.py .
COPY uv.lock .
# Install base, API, and offline extras without the project to improve caching
RUN uv sync --frozen --no-dev --extra api --extra offline --no-install-project --no-editable
# Copy project sources after dependency layer
COPY lightrag/ ./lightrag/
# Install dependencies
ENV PATH="/root/.cargo/bin:${PATH}"
RUN pip install --user --no-cache-dir --use-pep517 .
RUN pip install --user --no-cache-dir --use-pep517 .[api]
# Include pre-built frontend assets from the previous stage
COPY --from=frontend-builder /app/lightrag/api/webui ./lightrag/api/webui
# Install dependencies for default storage
RUN pip install --user --no-cache-dir nano-vectordb networkx
# Install dependencies for default LLM
RUN pip install --user --no-cache-dir openai ollama tiktoken
# Install dependencies for default document loader
RUN pip install --user --no-cache-dir pypdf2 python-docx python-pptx openpyxl
# Sync project in non-editable mode and ensure pip is available for runtime installs
RUN uv sync --frozen --no-dev --extra api --extra offline --no-editable \
&& /app/.venv/bin/python -m ensurepip --upgrade
# Prepare offline cache directory and pre-populate tiktoken data
# Use uv run to execute commands from the virtual environment
RUN mkdir -p /app/data/tiktoken \
&& uv run lightrag-download-cache --cache-dir /app/data/tiktoken || status=$?; \
if [ -n "${status:-}" ] && [ "$status" -ne 0 ] && [ "$status" -ne 2 ]; then exit "$status"; fi
# Final stage
FROM python:3.12-slim
WORKDIR /app
# Upgrade pip and setuptools
RUN pip install --upgrade pip setuptools wheel
# Install uv for package management
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Copy only necessary files from builder
ENV UV_SYSTEM_PYTHON=1
# Copy installed packages and application code
COPY --from=builder /root/.local /root/.local
COPY ./lightrag ./lightrag
COPY --from=builder /app/.venv /app/.venv
COPY --from=builder /app/lightrag ./lightrag
COPY pyproject.toml .
COPY setup.py .
COPY uv.lock .
RUN pip install --use-pep517 ".[api]"
# Make sure scripts in .local are usable
ENV PATH=/root/.local/bin:$PATH
# Ensure the installed scripts are on PATH
ENV PATH=/app/.venv/bin:/root/.local/bin:$PATH
# Create necessary directories
RUN mkdir -p /app/data/rag_storage /app/data/inputs
# Install dependencies with uv sync (uses locked versions from uv.lock)
# And ensure pip is available for runtime installs
RUN uv sync --frozen --no-dev --extra api --extra offline --no-editable \
&& /app/.venv/bin/python -m ensurepip --upgrade
# Docker data directories
# Create persistent data directories AFTER package installation
RUN mkdir -p /app/data/rag_storage /app/data/inputs /app/data/tiktoken
# Copy offline cache into the newly created directory
COPY --from=builder /app/data/tiktoken /app/data/tiktoken
# Point to the prepared cache
ENV TIKTOKEN_CACHE_DIR=/app/data/tiktoken
ENV WORKING_DIR=/app/data/rag_storage
ENV INPUT_DIR=/app/data/inputs
# Expose the default port
# Expose API port
EXPOSE 9621
# Set entrypoint
ENTRYPOINT ["python", "-m", "lightrag.api.lightrag_server"]

102
Dockerfile.lite Normal file

@ -0,0 +1,102 @@
# Frontend build stage
FROM oven/bun:1 AS frontend-builder
WORKDIR /app
# Copy frontend source code
COPY lightrag_webui/ ./lightrag_webui/
# Build frontend assets for inclusion in the API package
RUN cd lightrag_webui \
&& bun install --frozen-lockfile \
&& bun run build
# Python build stage - using uv for package installation
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder
ENV DEBIAN_FRONTEND=noninteractive
ENV UV_SYSTEM_PYTHON=1
ENV UV_COMPILE_BYTECODE=1
WORKDIR /app
# Install system dependencies required by some wheels
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
curl \
build-essential \
pkg-config \
&& rm -rf /var/lib/apt/lists/* \
&& curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:/root/.local/bin:${PATH}"
# Ensure shared data directory exists for uv caches
RUN mkdir -p /root/.local/share/uv
# Copy project metadata and sources
COPY pyproject.toml .
COPY setup.py .
COPY uv.lock .
# Install project dependencies (base + API extras) without the project to improve caching
RUN uv sync --frozen --no-dev --extra api --no-install-project --no-editable
# Copy project sources after dependency layer
COPY lightrag/ ./lightrag/
# Include pre-built frontend assets from the previous stage
COPY --from=frontend-builder /app/lightrag/api/webui ./lightrag/api/webui
# Sync project in non-editable mode and ensure pip is available for runtime installs
RUN uv sync --frozen --no-dev --extra api --no-editable \
&& /app/.venv/bin/python -m ensurepip --upgrade
# Prepare tiktoken cache directory and pre-populate tokenizer data
# Ignore exit code 2 which indicates assets already cached
RUN mkdir -p /app/data/tiktoken \
&& uv run lightrag-download-cache --cache-dir /app/data/tiktoken || status=$?; \
if [ -n "${status:-}" ] && [ "$status" -ne 0 ] && [ "$status" -ne 2 ]; then exit "$status"; fi
# Final stage
FROM python:3.12-slim
WORKDIR /app
# Install uv for package management
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
ENV UV_SYSTEM_PYTHON=1
# Copy installed packages and application code
COPY --from=builder /root/.local /root/.local
COPY --from=builder /app/.venv /app/.venv
COPY --from=builder /app/lightrag ./lightrag
COPY pyproject.toml .
COPY setup.py .
COPY uv.lock .
# Ensure the installed scripts are on PATH
ENV PATH=/app/.venv/bin:/root/.local/bin:$PATH
# Sync dependencies inside the final image using uv
# And ensure pip is available for runtime installs
RUN uv sync --frozen --no-dev --extra api --no-editable \
&& /app/.venv/bin/python -m ensurepip --upgrade
# Create persistent data directories
RUN mkdir -p /app/data/rag_storage /app/data/inputs /app/data/tiktoken
# Copy cached tokenizer assets prepared in the builder stage
COPY --from=builder /app/data/tiktoken /app/data/tiktoken
# Docker data directories
ENV TIKTOKEN_CACHE_DIR=/app/data/tiktoken
ENV WORKING_DIR=/app/data/rag_storage
ENV INPUT_DIR=/app/data/inputs
# Expose API port
EXPOSE 9621
# Set entrypoint
ENTRYPOINT ["python", "-m", "lightrag.api.lightrag_server"]


@ -352,7 +352,8 @@ class QueryParam:
user_prompt: str | None = None
"""User-provided prompt for the query.
If proivded, this will be use instead of the default vaulue from prompt template.
Additional instructions for the LLM. If provided, they will be injected into the prompt template.
Their purpose is to let the user customize how the LLM generates the response.
"""
enable_rerank: bool = True
@ -895,6 +896,10 @@ maxclients 500
To maintain compatibility with legacy data, when no workspace is configured the workspace for PostgreSQL non-graph storage is `default`, the workspace for PostgreSQL AGE graph storage is empty, and the default workspace for Neo4j graph storage is `base`. For all external storages, the system provides dedicated workspace environment variables to override the common `WORKSPACE` environment variable. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`.
### AGENTS.md: Guidance File for Coding Agents
AGENTS.md is a simple, open format for guiding coding agents (https://agents.md/). It gives the LightRAG project a dedicated, predictable place for the context and instructions that help AI coding agents do their work. Different AI coding agents should not each maintain a separate guidance file. If an agent cannot automatically recognize AGENTS.md, a symbolic link can be used as a workaround; after creating the symlink, configure your local `.gitignore_global` so the link is not committed to the Git repository.
## Editing Entities and Relations
LightRAG now supports comprehensive knowledge graph management, allowing you to create, edit, and delete entities and relations within the knowledge graph.


@ -84,6 +84,8 @@
## Installation
> **📦 Offline Deployment**: For offline or air-gapped environments, see the [Offline Deployment Guide](./docs/OfflineDeployment.md) for instructions on pre-installing all dependencies and cache files.
### Install LightRAG Server
The LightRAG Server is designed to provide Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. The LightRAG Server also provides Ollama-compatible interfaces, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bots, such as Open WebUI, to access LightRAG easily.
@ -353,7 +355,8 @@ class QueryParam:
user_prompt: str | None = None
"""User-provided prompt for the query.
If proivded, this will be use instead of the default vaulue from prompt template.
Additional instructions for the LLM. If provided, they will be injected into the prompt template.
Their purpose is to let the user customize how the LLM generates the response.
"""
enable_rerank: bool = True
@ -936,6 +939,10 @@ The `workspace` parameter ensures data isolation between different LightRAG inst
To maintain compatibility with legacy data, when no workspace is configured the default workspace is `default` for PostgreSQL non-graph storage, null for PostgreSQL AGE graph storage, and `base` for Neo4j graph storage. For all external storages, the system provides dedicated workspace environment variables to override the common `WORKSPACE` environment variable configuration. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`.
### AGENTS.md -- Guiding Coding Agents
AGENTS.md is a simple, open format for guiding coding agents (https://agents.md/). It is a dedicated, predictable place to provide the context and instructions to help AI coding agents work on LightRAG project. Different AI coders should not maintain separate guidance files individually. If any AI coder cannot automatically recognize AGENTS.md, symbolic links can be used as a solution. After establishing symbolic links, you can prevent them from being committed to the Git repository by configuring your local `.gitignore_global`.
## Edit Entities and Relations
LightRAG now supports comprehensive knowledge graph management capabilities, allowing you to create, edit, and delete entities and relationships within your knowledge graph.

77
docker-build-push.sh Executable file

@ -0,0 +1,77 @@
#!/bin/bash
set -e
# Configuration
IMAGE_NAME="ghcr.io/hkuds/lightrag"
DOCKERFILE="Dockerfile"
TAG="latest"
# Get version from git tags
VERSION=$(git describe --tags --abbrev=0 2>/dev/null || echo "dev")
echo "=================================="
echo " Multi-Architecture Docker Build"
echo "=================================="
echo "Image: ${IMAGE_NAME}:${TAG}"
echo "Version: ${VERSION}"
echo "Platforms: linux/amd64, linux/arm64"
echo "=================================="
echo ""
# Check Docker login status (skip if CR_PAT is set for CI/CD)
if [ -z "$CR_PAT" ]; then
if ! docker info 2>/dev/null | grep -q "Username"; then
echo "⚠️ Warning: Not logged in to Docker registry"
echo "Please login first: docker login ghcr.io"
echo "Or set CR_PAT environment variable for automated login"
echo ""
read -p "Continue anyway? (y/n) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
exit 1
fi
fi
else
echo "Using CR_PAT environment variable for authentication"
fi
# Check if buildx builder exists, create if not
if ! docker buildx ls | grep -q "desktop-linux"; then
echo "Creating buildx builder..."
docker buildx create --name desktop-linux --use
docker buildx inspect --bootstrap
else
echo "Using existing buildx builder: desktop-linux"
docker buildx use desktop-linux
fi
echo ""
echo "Building and pushing multi-architecture image..."
echo ""
# Build and push
docker buildx build \
--platform linux/amd64,linux/arm64 \
--file ${DOCKERFILE} \
--tag ${IMAGE_NAME}:${TAG} \
--tag ${IMAGE_NAME}:${VERSION} \
--push \
.
echo ""
echo "✓ Build and push complete!"
echo ""
echo "Images pushed:"
echo " - ${IMAGE_NAME}:${TAG}"
echo " - ${IMAGE_NAME}:${VERSION}"
echo ""
echo "Verifying multi-architecture manifest..."
echo ""
# Verify
docker buildx imagetools inspect ${IMAGE_NAME}:${TAG}
echo ""
echo "✓ Verification complete!"
echo ""
echo "Pull with: docker pull ${IMAGE_NAME}:${TAG}"


@ -12,13 +12,10 @@ services:
volumes:
- ./data/rag_storage:/app/data/rag_storage
- ./data/inputs:/app/data/inputs
- ./data/tiktoken:/app/data/tiktoken
- ./config.ini:/app/config.ini
- ./.env:/app/.env
env_file:
- .env
environment:
- TIKTOKEN_CACHE_DIR=/app/data/tiktoken
restart: unless-stopped
extra_hosts:
- "host.docker.internal:host-gateway"


@ -1,17 +1,11 @@
# LightRAG
# LightRAG Docker Deployment
A lightweight Knowledge Graph Retrieval-Augmented Generation system with multiple LLM backend support.
## 🚀 Installation
## 🚀 Preparation
### Prerequisites
- Python 3.10+
- Git
- Docker (optional for Docker deployment)
### Clone the repository:
### Native Installation
1. Clone the repository:
```bash
# Linux/MacOS
git clone https://github.com/HKUDS/LightRAG.git
@ -23,7 +17,8 @@ git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
```
2. Configure your environment:
### Configure your environment:
```bash
# Linux/MacOS
cp .env.example .env
@ -35,141 +30,92 @@ Copy-Item .env.example .env
# Edit .env with your preferred configuration
```
3. Create and activate virtual environment:
```bash
# Linux/MacOS
python -m venv venv
source venv/bin/activate
```
```powershell
# Windows PowerShell
python -m venv venv
.\venv\Scripts\Activate
```
LightRAG can be configured using environment variables in the `.env` file:
4. Install dependencies:
```bash
# Both platforms
pip install -r requirements.txt
```
**Server Configuration**
- `HOST`: Server host (default: 0.0.0.0)
- `PORT`: Server port (default: 9621)
**LLM Configuration**
- `LLM_BINDING`: LLM backend to use (lollms/ollama/openai)
- `LLM_BINDING_HOST`: LLM server host URL
- `LLM_MODEL`: Model name to use
**Embedding Configuration**
- `EMBEDDING_BINDING`: Embedding backend (lollms/ollama/openai)
- `EMBEDDING_BINDING_HOST`: Embedding server host URL
- `EMBEDDING_MODEL`: Embedding model name
**RAG Configuration**
- `MAX_ASYNC`: Maximum async operations
- `MAX_TOKENS`: Maximum token size
- `EMBEDDING_DIM`: Embedding dimensions
## 🐳 Docker Deployment
Docker instructions work the same on all platforms with Docker Desktop installed.
1. Build and start the container:
### Start LightRAG server:
```bash
docker-compose up -d
```
### Configuration Options
LightRAG Server uses the following paths for data storage:
LightRAG can be configured using environment variables in the `.env` file:
#### Server Configuration
- `HOST`: Server host (default: 0.0.0.0)
- `PORT`: Server port (default: 9621)
#### LLM Configuration
- `LLM_BINDING`: LLM backend to use (lollms/ollama/openai)
- `LLM_BINDING_HOST`: LLM server host URL
- `LLM_MODEL`: Model name to use
#### Embedding Configuration
- `EMBEDDING_BINDING`: Embedding backend (lollms/ollama/openai)
- `EMBEDDING_BINDING_HOST`: Embedding server host URL
- `EMBEDDING_MODEL`: Embedding model name
#### RAG Configuration
- `MAX_ASYNC`: Maximum async operations
- `MAX_TOKENS`: Maximum token size
- `EMBEDDING_DIM`: Embedding dimensions
#### Security
- `LIGHTRAG_API_KEY`: API key for authentication
### Data Storage Paths
The system uses the following paths for data storage:
```
data/
├── rag_storage/ # RAG data persistence
└── inputs/ # Input documents
```
### Example Deployments
1. Using with Ollama:
```env
LLM_BINDING=ollama
LLM_BINDING_HOST=http://host.docker.internal:11434
LLM_MODEL=mistral
EMBEDDING_BINDING=ollama
EMBEDDING_BINDING_HOST=http://host.docker.internal:11434
EMBEDDING_MODEL=bge-m3
```
You can't just use `localhost` from inside Docker; that's why you need `host.docker.internal`, which is defined in the docker compose file and allows the container to reach services running on the host.
2. Using with OpenAI:
```env
LLM_BINDING=openai
LLM_MODEL=gpt-3.5-turbo
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=text-embedding-ada-002
OPENAI_API_KEY=your-api-key
```
### API Usage
Once deployed, you can interact with the API at `http://localhost:9621`
Example query using PowerShell:
```powershell
$headers = @{
"X-API-Key" = "your-api-key"
"Content-Type" = "application/json"
}
$body = @{
query = "your question here"
} | ConvertTo-Json
Invoke-RestMethod -Uri "http://localhost:9621/query" -Method Post -Headers $headers -Body $body
```
Example query using curl:
```bash
curl -X POST "http://localhost:9621/query" \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{"query": "your question here"}'
```
## 🔒 Security
Remember to:
1. Set a strong API key in production
2. Use SSL in production environments
3. Configure proper network security
## 📦 Updates
### Updates
To update the Docker container:
```bash
docker-compose pull
docker-compose up -d --build
docker-compose down
docker-compose up
```
To update native installation:
### Offline deployment
Software packages requiring `transformers`, `torch`, or `cuda` are not preinstalled in the Docker images. Consequently, document extraction tools such as Docling, as well as local LLM models like Hugging Face and LMDeploy, cannot be used in an offline environment. These compute-intensive services should not be integrated into LightRAG; Docling will be decoupled and deployed as a standalone service.
## 📦 Build Docker Images
### For local development and testing
```bash
# Linux/MacOS
git pull
source venv/bin/activate
pip install -r requirements.txt
# Build and run with docker-compose
docker compose up --build
```
```powershell
# Windows PowerShell
git pull
.\venv\Scripts\Activate
pip install -r requirements.txt
### For production release
**multi-architecture build and push**:
```bash
# Use the provided build script
./docker-build-push.sh
```
**The build script will**:
- Check Docker registry login status
- Create/use buildx builder automatically
- Build for both AMD64 and ARM64 architectures
- Push to GitHub Container Registry (ghcr.io)
- Verify the multi-architecture manifest
**Prerequisites**:
Before building multi-architecture images, ensure you have:
- Docker 20.10+ with Buildx support
- Sufficient disk space (20GB+ recommended for offline image)
- Registry access credentials (if pushing images)

207
docs/FrontendBuildGuide.md Normal file

@ -0,0 +1,207 @@
# Frontend Build Guide
## Overview
The LightRAG project includes a React-based WebUI frontend. This guide explains how frontend building works in different scenarios.
## Key Principle
- **Git Repository**: Frontend build results are **NOT** included (kept clean)
- **PyPI Package**: Frontend build results **ARE** included (ready to use)
- **Build Tool**: Uses **Bun** (not npm/yarn)
## Installation Scenarios
### 1. End Users (From PyPI) ✨
**Command:**
```bash
pip install lightrag-hku[api]
```
**What happens:**
- Frontend is already built and included in the package
- No additional steps needed
- Web interface works immediately
---
### 2. Development Mode (Recommended for Contributors) 🔧
**Command:**
```bash
# Clone the repository
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
# Install in editable mode (no frontend build required yet)
pip install -e ".[api]"
# Build frontend when needed (can be done anytime)
cd lightrag_webui
bun install --frozen-lockfile
bun run build
cd ..
```
**Advantages:**
- Install first, build later (flexible workflow)
- Changes take effect immediately (symlink mode)
- Frontend can be rebuilt anytime without reinstalling
**How it works:**
- Creates symlinks to source directory
- Frontend build output goes to `lightrag/api/webui/`
- Changes are immediately visible in installed package
---
### 3. Normal Installation (Testing Package Build) 📦
**Command:**
```bash
# Clone the repository
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
# ⚠️ MUST build frontend FIRST
cd lightrag_webui
bun install --frozen-lockfile
bun run build
cd ..
# Now install
pip install ".[api]"
```
**What happens:**
- Frontend files are **copied** to site-packages
- Post-build modifications won't affect installed package
- Requires rebuild + reinstall to update
**When to use:**
- Testing complete installation process
- Verifying package configuration
- Simulating PyPI user experience
---
### 4. Creating Distribution Package 🚀
**Command:**
```bash
# Build frontend first
cd lightrag_webui
bun install --frozen-lockfile --production
bun run build
cd ..
# Create distribution packages
python -m build
# Output: dist/lightrag_hku-*.whl and dist/lightrag_hku-*.tar.gz
```
**What happens:**
- `setup.py` checks if frontend is built
- If missing, installation fails with helpful error message
- Generated package includes all frontend files
---
## GitHub Actions (Automated Release)
When creating a release on GitHub:
1. **Automatically builds frontend** using Bun
2. **Verifies** build completed successfully
3. **Creates Python package** with frontend included
4. **Publishes to PyPI** using existing trusted publisher setup
**No manual intervention required!**
---
## Quick Reference
| Scenario | Command | Frontend Required | Can Build After |
|----------|---------|-------------------|-----------------|
| From PyPI | `pip install lightrag-hku[api]` | Included | No (already installed) |
| Development | `pip install -e ".[api]"` | No | ✅ Yes (anytime) |
| Normal Install | `pip install ".[api]"` | ✅ Yes (before) | No (must reinstall) |
| Create Package | `python -m build` | ✅ Yes (before) | N/A |
---
## Bun Installation
If you don't have Bun installed:
```bash
# macOS/Linux
curl -fsSL https://bun.sh/install | bash
# Windows
powershell -c "irm bun.sh/install.ps1 | iex"
```
Official documentation: https://bun.sh
---
## File Structure
```
LightRAG/
├── lightrag_webui/ # Frontend source code
│ ├── src/ # React components
│ ├── package.json # Dependencies
│ └── vite.config.ts # Build configuration
│ └── outDir: ../lightrag/api/webui # Build output
├── lightrag/
│ └── api/
│ └── webui/ # Frontend build output (gitignored)
│ ├── index.html # Built files (after running bun run build)
│ └── assets/ # Built assets
├── setup.py # Build checks
├── pyproject.toml # Package configuration
└── .gitignore # Excludes lightrag/api/webui/* (except .gitkeep)
```
---
## Troubleshooting
### Q: I installed in development mode but the web interface doesn't work
**A:** Build the frontend:
```bash
cd lightrag_webui && bun run build
```
### Q: I built the frontend but it's not in my installed package
**A:** You probably used `pip install .` after building. Either:
- Use `pip install -e ".[api]"` for development
- Or reinstall: `pip uninstall lightrag-hku && pip install ".[api]"`
### Q: Where are the built frontend files?
**A:** In `lightrag/api/webui/` after running `bun run build`
### Q: Can I use npm or yarn instead of Bun?
**A:** The project is configured for Bun. While npm/yarn might work, Bun is recommended per project standards.
---
## Summary
**PyPI users**: No action needed, frontend included
**Developers**: Use `pip install -e ".[api]"`, build frontend when needed
**CI/CD**: Automatic build in GitHub Actions
**Git**: Frontend build output never committed
For questions or issues, please open a GitHub issue.

317
docs/OfflineDeployment.md Normal file

@ -0,0 +1,317 @@
# LightRAG Offline Deployment Guide
This guide provides comprehensive instructions for deploying LightRAG in offline environments where internet access is limited or unavailable.
If you deploy LightRAG using Docker, there is no need to refer to this document, as the LightRAG Docker image is pre-configured for offline operation.
> Software packages requiring `transformers`, `torch`, or `cuda` will not be included in the offline dependency group. Consequently, document extraction tools such as Docling, as well as local LLM models like Hugging Face and LMDeploy, are outside the scope of offline installation support. These high-compute-resource-demanding services should not be integrated into LightRAG. Docling will be decoupled and deployed as a standalone service.
## Table of Contents
- [Overview](#overview)
- [Quick Start](#quick-start)
- [Layered Dependencies](#layered-dependencies)
- [Tiktoken Cache Management](#tiktoken-cache-management)
- [Complete Offline Deployment Workflow](#complete-offline-deployment-workflow)
- [Troubleshooting](#troubleshooting)
## Overview
LightRAG uses dynamic package installation (`pipmaster`) for optional features based on file types and configurations. In offline environments, these dynamic installations will fail. This guide shows you how to pre-install all necessary dependencies and cache files.
### What Gets Dynamically Installed?
LightRAG dynamically installs packages for:
- **Document Processing**: `docling`, `pypdf2`, `python-docx`, `python-pptx`, `openpyxl`
- **Storage Backends**: `redis`, `neo4j`, `pymilvus`, `pymongo`, `asyncpg`, `qdrant-client`
- **LLM Providers**: `openai`, `anthropic`, `ollama`, `zhipuai`, `aioboto3`, `voyageai`, `llama-index`, `lmdeploy`, `transformers`, `torch`
- **Tiktoken Models**: BPE encoding models downloaded from the OpenAI CDN
## Quick Start
### Option 1: Using pip with Offline Extras
```bash
# Online environment: Install all offline dependencies
pip install lightrag-hku[offline]
# Download tiktoken cache
lightrag-download-cache
# Create offline package
pip download lightrag-hku[offline] -d ./offline-packages
tar -czf lightrag-offline.tar.gz ./offline-packages ~/.tiktoken_cache
# Transfer to offline server
scp lightrag-offline.tar.gz user@offline-server:/path/to/
# Offline environment: Install
tar -xzf lightrag-offline.tar.gz
pip install --no-index --find-links=./offline-packages lightrag-hku[offline]
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache
```
### Option 2: Using Requirements Files
```bash
# Online environment: Download packages
pip download -r requirements-offline.txt -d ./packages
# Transfer to offline server
tar -czf packages.tar.gz ./packages
scp packages.tar.gz user@offline-server:/path/to/
# Offline environment: Install
tar -xzf packages.tar.gz
pip install --no-index --find-links=./packages -r requirements-offline.txt
```
## Layered Dependencies
LightRAG provides flexible dependency groups for different use cases:
### Available Dependency Groups
| Group | Description | Use Case |
|-------|-------------|----------|
| `offline-docs` | Document processing | PDF, DOCX, PPTX, XLSX files |
| `offline-storage` | Storage backends | Redis, Neo4j, MongoDB, PostgreSQL, etc. |
| `offline-llm` | LLM providers | OpenAI, Anthropic, Ollama, etc. |
| `offline` | All of the above | Complete offline deployment |
> Software packages requiring `transformers`, `torch`, or `cuda` will not be included in the offline dependency group.
### Installation Examples
```bash
# Install only document processing dependencies
pip install lightrag-hku[offline-docs]
# Install document processing and storage backends
pip install lightrag-hku[offline-docs,offline-storage]
# Install all offline dependencies
pip install lightrag-hku[offline]
```
### Using Individual Requirements Files
```bash
# Document processing only
pip install -r requirements-offline-docs.txt
# Storage backends only
pip install -r requirements-offline-storage.txt
# LLM providers only
pip install -r requirements-offline-llm.txt
# All offline dependencies
pip install -r requirements-offline.txt
```
## Tiktoken Cache Management
Tiktoken downloads BPE encoding models on first use. In offline environments, you must pre-download these models.
### Using the CLI Command
After installing LightRAG, use the built-in command:
```bash
# Download to default location (~/.tiktoken_cache)
lightrag-download-cache
# Download to specific directory
lightrag-download-cache --cache-dir ./tiktoken_cache
# Download specific models only
lightrag-download-cache --models gpt-4o-mini gpt-4
```
### Default Models Downloaded
- `gpt-4o-mini` (LightRAG default)
- `gpt-4o`
- `gpt-4`
- `gpt-3.5-turbo`
- `text-embedding-ada-002`
- `text-embedding-3-small`
- `text-embedding-3-large`
### Setting Cache Location in Offline Environment
```bash
# Option 1: Environment variable (temporary)
export TIKTOKEN_CACHE_DIR=/path/to/tiktoken_cache
# Option 2: Add to ~/.bashrc or ~/.zshrc (persistent)
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
source ~/.bashrc
# Option 3: Copy to default location
cp -r /path/to/tiktoken_cache ~/.tiktoken_cache/
```
## Complete Offline Deployment Workflow
### Step 1: Prepare in Online Environment
```bash
# 1. Install LightRAG with offline dependencies
pip install lightrag-hku[offline]
# 2. Download tiktoken cache
lightrag-download-cache --cache-dir ./offline_cache/tiktoken
# 3. Download all Python packages
pip download lightrag-hku[offline] -d ./offline_cache/packages
# 4. Create archive for transfer
tar -czf lightrag-offline-complete.tar.gz ./offline_cache
# 5. Verify contents
tar -tzf lightrag-offline-complete.tar.gz | head -20
```
### Step 2: Transfer to Offline Environment
```bash
# Using scp
scp lightrag-offline-complete.tar.gz user@offline-server:/tmp/
# Or using USB/physical media
# Copy lightrag-offline-complete.tar.gz to USB drive
```
### Step 3: Install in Offline Environment
```bash
# 1. Extract archive
cd /tmp
tar -xzf lightrag-offline-complete.tar.gz
# 2. Install Python packages
pip install --no-index \
--find-links=/tmp/offline_cache/packages \
lightrag-hku[offline]
# 3. Set up tiktoken cache
mkdir -p ~/.tiktoken_cache
cp -r /tmp/offline_cache/tiktoken/* ~/.tiktoken_cache/
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache
# 4. Add to shell profile for persistence
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
```
### Step 4: Verify Installation
```bash
# Test Python import
python -c "from lightrag import LightRAG; print('✓ LightRAG imported')"
# Test tiktoken
python -c "from lightrag.utils import TiktokenTokenizer; t = TiktokenTokenizer(); print('✓ Tiktoken working')"
# Test optional dependencies (if installed)
python -c "import docling; print('✓ Docling available')"
python -c "import redis; print('✓ Redis available')"
```
## Troubleshooting
### Issue: Tiktoken fails with network error
**Problem**: `Unable to load tokenizer for model gpt-4o-mini`
**Solution**:
```bash
# Ensure TIKTOKEN_CACHE_DIR is set
echo $TIKTOKEN_CACHE_DIR
# Verify cache files exist
ls -la ~/.tiktoken_cache/
# If empty, you need to download cache in online environment first
```
### Issue: Dynamic package installation fails
**Problem**: `Error installing package xxx`
**Solution**:
```bash
# Pre-install the specific package you need
# For document processing:
pip install lightrag-hku[offline-docs]
# For storage backends:
pip install lightrag-hku[offline-storage]
# For LLM providers:
pip install lightrag-hku[offline-llm]
```
### Issue: Missing dependencies at runtime
**Problem**: `ModuleNotFoundError: No module named 'xxx'`
**Solution**:
```bash
# Check what you have installed
pip list | grep -i xxx
# Install missing component
pip install lightrag-hku[offline] # Install all offline deps
```
### Issue: Permission denied on tiktoken cache
**Problem**: `PermissionError: [Errno 13] Permission denied`
**Solution**:
```bash
# Ensure cache directory has correct permissions
chmod 755 ~/.tiktoken_cache
chmod 644 ~/.tiktoken_cache/*
# Or use a user-writable directory
export TIKTOKEN_CACHE_DIR=~/my_tiktoken_cache
mkdir -p ~/my_tiktoken_cache
```
## Best Practices
1. **Test in Online Environment First**: Always test your complete setup in an online environment before going offline.
2. **Keep Cache Updated**: Periodically update your offline cache when new models are released.
3. **Document Your Setup**: Keep notes on which optional dependencies you actually need.
4. **Version Pinning**: Consider pinning specific versions in production:
```bash
pip freeze > requirements-production.txt
```
5. **Minimal Installation**: Only install what you need:
```bash
# If you only process PDFs with OpenAI
pip install lightrag-hku[offline-docs]
# Then manually add: pip install openai
```
## Additional Resources
- [LightRAG GitHub Repository](https://github.com/HKUDS/LightRAG)
- [Docker Deployment Guide](./DockerDeployment.md)
- [API Documentation](../lightrag/api/README.md)
## Support
If you encounter issues not covered in this guide:
1. Check the [GitHub Issues](https://github.com/HKUDS/LightRAG/issues)
2. Review the [project documentation](../README.md)
3. Create a new issue with your offline deployment details

170
docs/UV_LOCK_GUIDE.md Normal file

@ -0,0 +1,170 @@
# uv.lock Update Guide
## What is uv.lock?
`uv.lock` is uv's lock file. It captures the exact version of every dependency, including transitive ones, much like:
- Node.js `package-lock.json`
- Rust `Cargo.lock`
- Python Poetry `poetry.lock`
Keeping `uv.lock` in version control guarantees that everyone installs the same dependency set.
## When does uv.lock change?
### Situations where it does *not* change automatically
- Running `uv sync --frozen`
- Building Docker images that call `uv sync --frozen`
- Editing source code without touching dependency metadata
### Situations where it will change
1. **`uv lock` or `uv lock --upgrade`**
```bash
uv lock # Resolve according to current constraints
uv lock --upgrade # Re-resolve and upgrade to the newest compatible releases
```
Use these commands after modifying `pyproject.toml`, when you want fresh dependency versions, or if the lock file was deleted or corrupted.
2. **`uv add`**
```bash
uv add requests # Adds the dependency and updates both files
uv add --dev pytest # Adds a dev dependency
```
`uv add` edits `pyproject.toml` and refreshes `uv.lock` in one step.
3. **`uv remove`**
```bash
uv remove requests
```
This removes the dependency from `pyproject.toml` and rewrites `uv.lock`.
4. **`uv sync` without `--frozen`**
```bash
uv sync
```
Normally this only installs what is already locked. However, if `pyproject.toml` and `uv.lock` disagree or the lock file is missing, uv will regenerate and update `uv.lock`. In CI and production builds you should prefer `uv sync --frozen` to prevent unintended updates.
## Example workflows
### Scenario 1: Add a new dependency
```bash
# Recommended: let uv handle both files
uv add fastapi
git add pyproject.toml uv.lock
git commit -m "Add fastapi dependency"
# Manual alternative
# 1. Edit pyproject.toml
# 2. Regenerate the lock file
uv lock
git add pyproject.toml uv.lock
git commit -m "Add fastapi dependency"
```
### Scenario 2: Relax or tighten a version constraint
```bash
# 1. Edit the requirement in pyproject.toml,
# e.g. openai>=1.0.0,<2.0.0 -> openai>=1.5.0,<2.0.0
# 2. Re-resolve the lock file
uv lock
# 3. Commit both files
git add pyproject.toml uv.lock
git commit -m "Update openai to >=1.5.0"
```
### Scenario 3: Upgrade everything to the newest compatible versions
```bash
uv lock --upgrade
git diff uv.lock
git add uv.lock
git commit -m "Upgrade dependencies to latest compatible versions"
```
### Scenario 4: Teammate syncing the project
```bash
git pull # Fetch latest code and lock file
uv sync --frozen # Install exactly what uv.lock specifies
```
## Using uv.lock in Docker
```dockerfile
RUN uv sync --frozen --no-dev --extra api
```
`--frozen` guarantees reproducible builds because uv will refuse to deviate from the locked versions.
`--extra api` installs the API server extras.
## Generating a lock file that includes offline dependencies
If you need `uv.lock` to capture the optional offline stacks, regenerate it with the relevant extras enabled:
```bash
uv lock --extra api --extra offline
```
This command resolves the base project requirements plus both the `api` and `offline` optional dependency sets, ensuring downstream `uv sync --frozen --extra api --extra offline` installs work without further resolution.
## Frequently asked questions
- **`uv.lock` is almost 1MB. Does that matter?**
No. The file is only read during dependency resolution.
- **Should we commit `uv.lock`?**
Yes. Commit it so collaborators and CI jobs share the same dependency graph.
- **Deleted the lock file by accident?**
Run `uv lock` to regenerate it from `pyproject.toml`.
- **Can `uv.lock` and `requirements.txt` coexist?**
They can, but maintaining both is redundant. Prefer relying on `uv.lock` alone whenever possible.
- **How do I inspect locked versions?**
```bash
uv tree
grep -A5 'name = "openai"' uv.lock
```
## Best practices
### Recommended
1. Commit `uv.lock` alongside `pyproject.toml`.
2. Use `uv sync --frozen` in CI, Docker, and other reproducible environments.
3. Use plain `uv sync` during local development if you want uv to reconcile the lock for you.
4. Run `uv lock --upgrade` periodically to pick up the latest compatible releases.
5. Regenerate the lock file immediately after changing dependency constraints.
### Avoid
1. Running `uv sync` without `--frozen` in CI or production pipelines.
2. Editing `uv.lock` by hand—uv will overwrite manual edits.
3. Ignoring lock file diffs in code reviews—unexpected dependency changes can break builds.
## Summary
| Command | Updates `uv.lock` | Typical use |
|-----------------------|-------------------|-------------------------------------------|
| `uv lock` | ✅ Yes | After editing constraints |
| `uv lock --upgrade` | ✅ Yes | Upgrade to the newest compatible versions |
| `uv add <pkg>` | ✅ Yes | Add a dependency |
| `uv remove <pkg>` | ✅ Yes | Remove a dependency |
| `uv sync` | ⚠️ Maybe | Local development; can regenerate the lock |
| `uv sync --frozen` | ❌ No | CI/CD, Docker, reproducible builds |
Remember: `uv.lock` only changes when you run a command that tells it to. Keep it in sync with your project and commit it whenever it changes.


@ -23,13 +23,13 @@ WEBUI_DESCRIPTION="Simple and Fast Graph Based RAG System"
# WORKING_DIR=<absolute_path_for_working_dir>
### Tiktoken cache directory (Store cached files in this folder for offline deployment)
# TIKTOKEN_CACHE_DIR=./temp/tiktoken
# TIKTOKEN_CACHE_DIR=/app/data/tiktoken
### Ollama Emulating Model and Tag
# OLLAMA_EMULATING_MODEL_NAME=lightrag
OLLAMA_EMULATING_MODEL_TAG=latest
### Max nodes return from grap retrieval in webui
### Max nodes return from graph retrieval in webui
# MAX_GRAPH_NODES=1000
### Logging level
@ -56,29 +56,24 @@ OLLAMA_EMULATING_MODEL_TAG=latest
######################################################################################
### Query Configuration
###
### How to control the context lenght sent to LLM:
### How to control the context length sent to LLM:
### MAX_ENTITY_TOKENS + MAX_RELATION_TOKENS < MAX_TOTAL_TOKENS
### Chunk_Tokens = MAX_TOTAL_TOKENS - Actual_Entity_Tokens - Actual_Reation_Tokens
### Chunk_Tokens = MAX_TOTAL_TOKENS - Actual_Entity_Tokens - Actual_Relation_Tokens
######################################################################################
# LLM responde cache for query (Not valid for streaming response)
# LLM response cache for query (Not valid for streaming response)
ENABLE_LLM_CACHE=true
# COSINE_THRESHOLD=0.2
### Number of entities or relations retrieved from KG
# TOP_K=40
### Maxmium number or chunks for naive vector search
### Maximum number or chunks for naive vector search
# CHUNK_TOP_K=20
### control the actual enties send to LLM
### control the actual entities send to LLM
# MAX_ENTITY_TOKENS=6000
### control the actual relations send to LLM
# MAX_RELATION_TOKENS=8000
### control the maximum tokens send to LLM (include entities, raltions and chunks)
### control the maximum tokens send to LLM (include entities, relations and chunks)
# MAX_TOTAL_TOKENS=30000
### maximum number of related chunks per source entity or relation
### The chunk picker uses this value to determine the total number of chunks selected from KG(knowledge graph)
### Higher values increase re-ranking time
# RELATED_CHUNK_NUMBER=5
### chunk selection strategies
### VECTOR: Pick KG chunks by vector similarity, delivered chunks to the LLM aligning more closely with naive retrieval
### WEIGHT: Pick KG chunks by entity and chunk weight, delivered more solely KG related chunks to the LLM
@ -93,7 +88,7 @@ ENABLE_LLM_CACHE=true
RERANK_BINDING=null
### Enable rerank by default in query params when RERANK_BINDING is not null
# RERANK_BY_DEFAULT=True
### rerank score chunk filter(set to 0.0 to keep all chunks, 0.6 or above if LLM is not strong enought)
### rerank score chunk filter(set to 0.0 to keep all chunks, 0.6 or above if LLM is not strong enough)
# MIN_RERANK_SCORE=0.0
### For local deployment with vLLM
@ -131,7 +126,7 @@ SUMMARY_LANGUAGE=English
# CHUNK_SIZE=1200
# CHUNK_OVERLAP_SIZE=100
### Number of summary semgments or tokens to trigger LLM summary on entity/relation merge (at least 3 is recommented)
### Number of summary segments or tokens to trigger LLM summary on entity/relation merge (at least 3 is recommended)
# FORCE_LLM_SUMMARY_ON_MERGE=8
### Max description token size to trigger LLM summary
# SUMMARY_MAX_TOKENS = 1200
@ -140,6 +135,22 @@ SUMMARY_LANGUAGE=English
### Maximum context size sent to LLM for description summary
# SUMMARY_CONTEXT_SIZE=12000
### control the maximum chunk_ids stored in vector and graph db
# MAX_SOURCE_IDS_PER_ENTITY=300
# MAX_SOURCE_IDS_PER_RELATION=300
### control chunk_ids limitation method: FIFO, KEEP
### FIFO: First in first out
### KEEP: Keep oldest (fewer merge actions and faster)
# SOURCE_IDS_LIMIT_METHOD=FIFO
# Maximum number of file paths stored in entity/relation file_path field (For display only, does not affect query performance)
# MAX_FILE_PATHS=100
### maximum number of related chunks per source entity or relation
### The chunk picker uses this value to determine the total number of chunks selected from KG(knowledge graph)
### Higher values increase re-ranking time
# RELATED_CHUNK_NUMBER=5
###############################
### Concurrency Configuration
###############################
@ -179,7 +190,7 @@ LLM_BINDING_API_KEY=your_api_key
# OPENAI_LLM_TEMPERATURE=0.9
### Set the max_tokens to mitigate endless output of some LLM (less than LLM_TIMEOUT * llm_output_tokens/second, i.e. 9000 = 180s * 50 tokens/s)
### Typically, max_tokens does not include prompt content, though some models, such as Gemini Models, are exceptions
### For vLLM/SGLang doployed models, or most of OpenAI compatible API provider
### For vLLM/SGLang deployed models, or most of OpenAI compatible API provider
# OPENAI_LLM_MAX_TOKENS=9000
### For OpenAI o1-mini or newer modles
OPENAI_LLM_MAX_COMPLETION_TOKENS=9000
@ -193,10 +204,11 @@ OPENAI_LLM_MAX_COMPLETION_TOKENS=9000
# OPENAI_LLM_REASONING_EFFORT=minimal
### OpenRouter Specific Parameters
# OPENAI_LLM_EXTRA_BODY='{"reasoning": {"enabled": false}}'
### Qwen3 Specific Parameters depoly by vLLM
### Qwen3 Specific Parameters deploy by vLLM
# OPENAI_LLM_EXTRA_BODY='{"chat_template_kwargs": {"enable_thinking": false}}'
### use the following command to see all support options for Ollama LLM
### If LightRAG is deployed in Docker, use host.docker.internal instead of localhost in LLM_BINDING_HOST
### lightrag-server --llm-binding ollama --help
### Ollama Server Specific Parameters
### OLLAMA_LLM_NUM_CTX must be provided, and should at least larger than MAX_TOTAL_TOKENS + 2000
@ -218,7 +230,7 @@ EMBEDDING_BINDING=ollama
EMBEDDING_MODEL=bge-m3:latest
EMBEDDING_DIM=1024
EMBEDDING_BINDING_API_KEY=your_api_key
# If the embedding service is deployed within the same Docker stack, use host.docker.internal instead of localhost
# If LightRAG is deployed in Docker, use host.docker.internal instead of localhost
EMBEDDING_BINDING_HOST=http://localhost:11434
### OpenAI compatible (VoyageAI embedding openai compatible)
@ -247,8 +259,8 @@ OLLAMA_EMBEDDING_NUM_CTX=8192
### lightrag-server --embedding-binding ollama --help
####################################################################
### WORKSPACE setting workspace name for all storage types
### in the purpose of isolating data from LightRAG instances.
### WORKSPACE sets workspace name for all storage types
### for the purpose of isolating data from LightRAG instances.
### Valid workspace name constraints: a-z, A-Z, 0-9, and _
####################################################################
# WORKSPACE=space1
@ -303,6 +315,16 @@ POSTGRES_HNSW_M=16
POSTGRES_HNSW_EF=200
POSTGRES_IVFFLAT_LISTS=100
### PostgreSQL Connection Retry Configuration (Network Robustness)
### Number of retry attempts (1-10, default: 3)
### Initial retry backoff in seconds (0.1-5.0, default: 0.5)
### Maximum retry backoff in seconds (backoff-60.0, default: 5.0)
### Connection pool close timeout in seconds (1.0-30.0, default: 5.0)
# POSTGRES_CONNECTION_RETRIES=3
# POSTGRES_CONNECTION_RETRY_BACKOFF=0.5
# POSTGRES_CONNECTION_RETRY_BACKOFF_MAX=5.0
# POSTGRES_POOL_CLOSE_TIMEOUT=5.0
### PostgreSQL SSL Configuration (Optional)
# POSTGRES_SSL_MODE=require
# POSTGRES_SSL_CERT=/path/to/client-cert.pem
@ -310,6 +332,14 @@ POSTGRES_IVFFLAT_LISTS=100
# POSTGRES_SSL_ROOT_CERT=/path/to/ca-cert.pem
# POSTGRES_SSL_CRL=/path/to/crl.pem
### PostgreSQL Server Settings (for Supabase Supavisor)
# Use this to pass extra options to the PostgreSQL connection string.
# For Supabase, you might need to set it like this:
# POSTGRES_SERVER_SETTINGS="options=reference%3D[project-ref]"
# Default is 100; set to 0 to disable
# POSTGRES_STATEMENT_CACHE_SIZE=100
### Neo4j Configuration
NEO4J_URI=neo4j+s://xxxxxxxx.databases.neo4j.io
NEO4J_USERNAME=neo4j

View file

@ -2,7 +2,7 @@ apiVersion: v2
name: lightrag
description: A Helm chart for LightRAG, an efficient and lightweight RAG system
type: application
version: 0.1.0
version: 0.1.1
appVersion: "1.0.0"
maintainers:
- name: LightRAG Team

View file

@ -43,6 +43,22 @@ spec:
- name: env-file
mountPath: /app/.env
subPath: .env
{{- $envFrom := default (dict) .Values.envFrom }}
{{- $envFromEntries := list }}
{{- range (default (list) (index $envFrom "secrets")) }}
{{- $envFromEntries = append $envFromEntries (dict "secretRef" (dict "name" .name)) }}
{{- end }}
{{- range (default (list) (index $envFrom "configmaps")) }}
{{- $envFromEntries = append $envFromEntries (dict "configMapRef" (dict "name" .name)) }}
{{- end }}
{{- if gt (len $envFromEntries) 0 }}
envFrom:
{{- toYaml $envFromEntries | nindent 12 }}
{{- end }}
{{- with .Values.image.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
volumes:
- name: env-file
secret:
@ -60,3 +76,6 @@ spec:
- name: inputs
emptyDir: {}
{{- end }}
strategy:
{{- toYaml .Values.updateStrategy | nindent 4 }}

View file

@ -3,6 +3,23 @@ replicaCount: 1
image:
repository: ghcr.io/hkuds/lightrag
tag: latest
# Optionally specify imagePullSecrets if your image is in a private registry
# example:
# imagePullSecrets:
# - name: my-registry-secret
imagePullSecrets: []
# Specify a deployment strategy
# example:
# updateStrategy:
# type: RollingUpdate
# rollingUpdate:
# maxUnavailable: 25%
# maxSurge: 25%
# Default for now should be Recreate as any RollingUpdate will cause issues with
# multiple instances trying to access the same persistent storage if not using RWX volumes.
updateStrategy:
type: Recreate
service:
type: ClusterIP
@ -23,6 +40,13 @@ persistence:
inputs:
size: 5Gi
# Allow specifying additional environment variables from ConfigMaps or Secrets created outside of this chart
envFrom:
configmaps: []
# - name: my-shiny-configmap-1
secrets: []
# - name: my-shiny-secret-1
env:
HOST: 0.0.0.0
PORT: 9621
@ -38,8 +62,8 @@ env:
EMBEDDING_BINDING_API_KEY:
LIGHTRAG_KV_STORAGE: PGKVStorage
LIGHTRAG_VECTOR_STORAGE: PGVectorStorage
# LIGHTRAG_KV_STORAGE: RedisKVStorage
# LIGHTRAG_VECTOR_STORAGE: QdrantVectorDBStorage
# LIGHTRAG_KV_STORAGE: RedisKVStorage
# LIGHTRAG_VECTOR_STORAGE: QdrantVectorDBStorage
LIGHTRAG_GRAPH_STORAGE: Neo4JStorage
LIGHTRAG_DOC_STATUS_STORAGE: PGDocStatusStorage
# Replace with your POSTGRES credentials

View file

@ -1,5 +1,5 @@
from .lightrag import LightRAG as LightRAG, QueryParam as QueryParam
__version__ = "1.4.9"
__version__ = "1.4.9.5"
__author__ = "Zirui Guo"
__url__ = "https://github.com/HKUDS/LightRAG"

View file

@ -21,15 +21,24 @@ pip install "lightrag-hku[api]"
* Install from source
```bash
# 克隆仓库
# Clone the repository
git clone https://github.com/HKUDS/lightrag.git
# 切换到仓库目录
# Change to the repository directory
cd lightrag
# 如有必要,创建 Python 虚拟环境
# 以可编辑模式安装并支持 API
# Create a Python virtual environment
uv venv --seed --python 3.12
source .venv/bin/activate
# Install in editable mode with API support
pip install -e ".[api]"
# Build front-end artifacts
cd lightrag_webui
bun install --frozen-lockfile
bun run build
cd ..
```
### Preparation Before Starting the LightRAG Server
@ -109,28 +118,10 @@ lightrag-gunicorn --workers 4
### Launching LightRAG Server with Docker
* Configure the .env file:
Create a personalized .env file by copying the sample file [`env.example`](env.example), and set the LLM and Embedding parameters according to your actual needs.
* Create a file named docker-compose.yml:
```yaml
services:
lightrag:
container_name: lightrag
image: ghcr.io/hkuds/lightrag:latest
ports:
- "${PORT:-9621}:9621"
volumes:
- ./data/rag_storage:/app/data/rag_storage
- ./data/inputs:/app/data/inputs
- ./config.ini:/app/config.ini
- ./.env:/app/.env
env_file:
- .env
restart: unless-stopped
extra_hosts:
- "host.docker.internal:host-gateway"
```
Using Docker Compose is the most convenient way to deploy and run the LightRAG Server.
- Create a project directory.
- Copy the `docker-compose.yml` file from the LightRAG repository into your project directory.
- Prepare the `.env` file: copy the sample file [`env.example`](https://ai.znipower.com:5013/c/env.example) to create a customized `.env` file, and configure the LLM and embedding parameters according to your specific requirements.
* Start the LightRAG Server with the following command:
@ -138,7 +129,11 @@ services:
docker compose up
# To run the program in the background after startup, add the -d parameter at the end of the command
```
> The official docker compose file can be obtained here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG Docker images, visit: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
> The official docker compose file can be obtained here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG Docker images, visit: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag). For more information about Docker deployment, refer to [DockerDeployment.md](./../../docs/DockerDeployment.md).
### Offline Deployment
Official LightRAG Docker images are fully compatible with offline or air-gapped environments. To set up your own offline deployment environment, refer to the [Offline Deployment Guide](./../../docs/OfflineDeployment.md).
### Starting Multiple LightRAG Instances
@ -278,7 +273,17 @@ LIGHTRAG_API_KEY=your-secure-api-key-here
WHITELIST_PATHS=/health,/api/*
```
> Health check and Ollama emulation endpoints are exempt from API key checks by default.
> Health check and Ollama emulation endpoints are exempt from API key checks by default. For security reasons, remove `/api/*` from WHITELIST_PATHS if the Ollama service is not needed.
The API key is passed in the `X-API-Key` request header. Below is an example of accessing the LightRAG Server via the API:
```
curl -X 'POST' \
'http://localhost:9621/documents/scan' \
-H 'accept: application/json' \
-H 'X-API-Key: your-secure-api-key-here-123' \
-d ''
```
* Account credentials (the Web UI requires login before access)

View file

@ -27,9 +27,18 @@ git clone https://github.com/HKUDS/lightrag.git
# Change to the repository directory
cd lightrag
# create a Python virtual environment if necessary
# Create a Python virtual environment
uv venv --seed --python 3.12
source .venv/bin/activate
# Install in editable mode with API support
pip install -e ".[api]"
# Build front-end artifacts
cd lightrag_webui
bun install --frozen-lockfile
bun run build
cd ..
```
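After the editable install, a quick sanity check is to invoke the console script; this is a minimal sketch and assumes the virtual environment created above is still active:
```bash
# Should print the server's command-line options if the [api] extras installed correctly
lightrag-server --help
```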
### Before Starting LightRAG Server
@ -110,29 +119,13 @@ During startup, configurations in the `.env` file can be overridden by command-l
### Launching LightRAG Server with Docker
* Prepare the .env file:
Create a personalized .env file by copying the sample file [`env.example`](env.example). Configure the LLM and embedding parameters according to your requirements.
Using Docker Compose is the most convenient way to deploy and run the LightRAG Server.
* Create a file named `docker-compose.yml`:
* Create a project directory.
```yaml
services:
lightrag:
container_name: lightrag
image: ghcr.io/hkuds/lightrag:latest
ports:
- "${PORT:-9621}:9621"
volumes:
- ./data/rag_storage:/app/data/rag_storage
- ./data/inputs:/app/data/inputs
- ./config.ini:/app/config.ini
- ./.env:/app/.env
env_file:
- .env
restart: unless-stopped
extra_hosts:
- "host.docker.internal:host-gateway"
```
* Copy the `docker-compose.yml` file from the LightRAG repository into your project directory.
* Prepare the `.env` file: Duplicate the sample file [`env.example`](https://ai.znipower.com:5013/c/env.example) to create a customized `.env` file, and configure the LLM and embedding parameters according to your specific requirements.
* Start the LightRAG Server with the following command:
@ -141,7 +134,11 @@ docker compose up
# If you want the program to run in the background after startup, add the -d parameter at the end of the command.
```
> You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag). For more details about docker deployment, please refer to [DockerDeployment.md](./../../docs/DockerDeployment.md).
### Offline Deployment
Official LightRAG Docker images are fully compatible with offline or air-gapped environments. If you want to build your own offline environment, please refer to the [Offline Deployment Guide](./../../docs/OfflineDeployment.md).
### Starting Multiple LightRAG Instances
@ -280,7 +277,17 @@ LIGHTRAG_API_KEY=your-secure-api-key-here
WHITELIST_PATHS=/health,/api/*
```
> Health check and Ollama emulation endpoints are excluded from API Key check by default.
> Health check and Ollama emulation endpoints are excluded from API Key check by default. For security reasons, remove `/api/*` from `WHITELIST_PATHS` if the Ollama service is not required.
The API key is passed using the request header `X-API-Key`. Below is an example of accessing the LightRAG Server via API:
```
curl -X 'POST' \
'http://localhost:9621/documents/scan' \
-H 'accept: application/json' \
-H 'X-API-Key: your-secure-api-key-here-123' \
-d ''
```
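By contrast, paths listed in `WHITELIST_PATHS` can be called without the key. A minimal sketch, assuming the default port:
```bash
# /health is whitelisted by default, so no X-API-Key header is required
curl http://localhost:9621/health
```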
* Account credentials (the Web UI requires login before access can be granted):

View file

@ -1 +1 @@
__api_version__ = "0230"
__api_version__ = "0245"

View file

@ -145,7 +145,129 @@ class LLMConfigCache:
self.ollama_embedding_options = {}
def check_frontend_build():
"""Check if frontend is built and optionally check if source is up-to-date"""
webui_dir = Path(__file__).parent / "webui"
index_html = webui_dir / "index.html"
# 1. Check if build files exist (required)
if not index_html.exists():
ASCIIColors.red("\n" + "=" * 80)
ASCIIColors.red("ERROR: Frontend Not Built")
ASCIIColors.red("=" * 80)
ASCIIColors.yellow("The WebUI frontend has not been built yet.")
ASCIIColors.yellow(
"Please build the frontend code first using the following commands:\n"
)
ASCIIColors.cyan(" cd lightrag_webui")
ASCIIColors.cyan(" bun install --frozen-lockfile")
ASCIIColors.cyan(" bun run build")
ASCIIColors.cyan(" cd ..")
ASCIIColors.yellow("\nThen restart the service.\n")
ASCIIColors.cyan(
"Note: Make sure you have Bun installed. Visit https://bun.sh for installation."
)
ASCIIColors.red("=" * 80 + "\n")
sys.exit(1) # Exit immediately
# 2. Check if this is a development environment (source directory exists)
try:
source_dir = Path(__file__).parent.parent.parent / "lightrag_webui"
src_dir = source_dir / "src"
# Determine if this is a development environment: source directory exists and contains src directory
if not source_dir.exists() or not src_dir.exists():
# Production environment, skip source code check
logger.debug(
"Production environment detected, skipping source freshness check"
)
return
# Development environment, perform source code timestamp check
logger.debug("Development environment detected, checking source freshness")
# Source code file extensions (files to check)
source_extensions = {
".ts",
".tsx",
".js",
".jsx",
".mjs",
".cjs", # TypeScript/JavaScript
".css",
".scss",
".sass",
".less", # Style files
".json",
".jsonc", # Configuration/data files
".html",
".htm", # Template files
".md",
".mdx", # Markdown
}
# Key configuration files (in lightrag_webui root directory)
key_files = [
source_dir / "package.json",
source_dir / "bun.lock",
source_dir / "vite.config.ts",
source_dir / "tsconfig.json",
source_dir / "tailwind.config.js",
source_dir / "index.html",
]
# Get the latest modification time of source code
latest_source_time = 0
# Check source code files in src directory
for file_path in src_dir.rglob("*"):
if file_path.is_file():
# Only check source code files, ignore temporary files and logs
if file_path.suffix.lower() in source_extensions:
mtime = file_path.stat().st_mtime
latest_source_time = max(latest_source_time, mtime)
# Check key configuration files
for key_file in key_files:
if key_file.exists():
mtime = key_file.stat().st_mtime
latest_source_time = max(latest_source_time, mtime)
# Get build time
build_time = index_html.stat().st_mtime
# Compare timestamps (5 second tolerance to avoid file system time precision issues)
if latest_source_time > build_time + 5:
ASCIIColors.yellow("\n" + "=" * 80)
ASCIIColors.yellow("WARNING: Frontend Source Code Has Been Updated")
ASCIIColors.yellow("=" * 80)
ASCIIColors.yellow(
"The frontend source code is newer than the current build."
)
ASCIIColors.yellow(
"This might happen after 'git pull' or manual code changes.\n"
)
ASCIIColors.cyan(
"Recommended: Rebuild the frontend to use the latest changes:"
)
ASCIIColors.cyan(" cd lightrag_webui")
ASCIIColors.cyan(" bun install --frozen-lockfile")
ASCIIColors.cyan(" bun run build")
ASCIIColors.cyan(" cd ..")
ASCIIColors.yellow("\nThe server will continue with the current build.")
ASCIIColors.yellow("=" * 80 + "\n")
else:
logger.info("Frontend build is up-to-date")
except Exception as e:
# If check fails, log warning but don't affect startup
logger.warning(f"Failed to check frontend source freshness: {e}")
def create_app(args):
# Check frontend build first
check_frontend_build()
# Setup logging
logger.setLevel(args.log_level)
set_verbose_debug(args.verbose)
@ -223,14 +345,17 @@ def create_app(args):
finalize_share_data()
# Initialize FastAPI
base_description = (
"Providing API for LightRAG core, Web UI and Ollama Model Emulation"
)
swagger_description = (
base_description
+ (" (API-Key Enabled)" if api_key else "")
+ "\n\n[View ReDoc documentation](/redoc)"
)
app_kwargs = {
"title": "LightRAG Server API",
"description": (
"Providing API for LightRAG core, Web UI and Ollama Model Emulation"
+ "(With authentication)"
if api_key
else ""
),
"description": swagger_description,
"version": __api_version__,
"openapi_url": "/openapi.json", # Explicitly set OpenAPI schema URL
"docs_url": "/docs", # Explicitly set docs URL
@ -786,7 +911,9 @@ def create_app(args):
async def get_response(self, path: str, scope):
response = await super().get_response(path, scope)
if path.endswith(".html"):
is_html = path.endswith(".html") or response.media_type == "text/html"
if is_html:
response.headers["Cache-Control"] = (
"no-cache, no-store, must-revalidate"
)

View file

@ -134,6 +134,55 @@ class ScanResponse(BaseModel):
}
class ReprocessResponse(BaseModel):
"""Response model for reprocessing failed documents operation
Attributes:
status: Status of the reprocessing operation
message: Message describing the operation result
track_id: Tracking ID for monitoring reprocessing progress
"""
status: Literal["reprocessing_started"] = Field(
description="Status of the reprocessing operation"
)
message: str = Field(description="Human-readable message describing the operation")
track_id: str = Field(
description="Tracking ID for monitoring reprocessing progress"
)
class Config:
json_schema_extra = {
"example": {
"status": "reprocessing_started",
"message": "Reprocessing of failed documents has been initiated in background",
"track_id": "retry_20250729_170612_def456",
}
}
class CancelPipelineResponse(BaseModel):
"""Response model for pipeline cancellation operation
Attributes:
status: Status of the cancellation request
message: Message describing the operation result
"""
status: Literal["cancellation_requested", "not_busy"] = Field(
description="Status of the cancellation request"
)
message: str = Field(description="Human-readable message describing the operation")
class Config:
json_schema_extra = {
"example": {
"status": "cancellation_requested",
"message": "Pipeline cancellation has been requested. Documents will be marked as FAILED.",
}
}
class InsertTextRequest(BaseModel):
"""Request model for inserting a single text document
@ -309,6 +358,10 @@ class DeleteDocRequest(BaseModel):
default=False,
description="Whether to delete the corresponding file in the upload directory.",
)
delete_llm_cache: bool = Field(
default=False,
description="Whether to delete cached LLM extraction results for the documents.",
)
@field_validator("doc_ids", mode="after")
@classmethod
@ -379,7 +432,7 @@ class DocStatusResponse(BaseModel):
"id": "doc_123456",
"content_summary": "Research paper on machine learning",
"content_length": 15240,
"status": "PROCESSED",
"status": "processed",
"created_at": "2025-03-31T12:34:56",
"updated_at": "2025-03-31T12:35:30",
"track_id": "upload_20250729_170612_abc123",
@ -412,7 +465,7 @@ class DocsStatusesResponse(BaseModel):
"id": "doc_123",
"content_summary": "Pending document",
"content_length": 5000,
"status": "PENDING",
"status": "pending",
"created_at": "2025-03-31T10:00:00",
"updated_at": "2025-03-31T10:00:00",
"track_id": "upload_20250331_100000_abc123",
@ -422,12 +475,27 @@ class DocsStatusesResponse(BaseModel):
"file_path": "pending_doc.pdf",
}
],
"PREPROCESSED": [
{
"id": "doc_789",
"content_summary": "Document pending final indexing",
"content_length": 7200,
"status": "preprocessed",
"created_at": "2025-03-31T09:30:00",
"updated_at": "2025-03-31T09:35:00",
"track_id": "upload_20250331_093000_xyz789",
"chunks_count": 10,
"error": None,
"metadata": None,
"file_path": "preprocessed_doc.pdf",
}
],
"PROCESSED": [
{
"id": "doc_456",
"content_summary": "Processed document",
"content_length": 8000,
"status": "PROCESSED",
"status": "processed",
"created_at": "2025-03-31T09:00:00",
"updated_at": "2025-03-31T09:05:00",
"track_id": "insert_20250331_090000_def456",
@ -599,6 +667,7 @@ class PaginatedDocsResponse(BaseModel):
"status_counts": {
"PENDING": 10,
"PROCESSING": 5,
"PREPROCESSED": 5,
"PROCESSED": 130,
"FAILED": 5,
},
@ -621,6 +690,7 @@ class StatusCountsResponse(BaseModel):
"status_counts": {
"PENDING": 10,
"PROCESSING": 5,
"PREPROCESSED": 5,
"PROCESSED": 130,
"FAILED": 5,
}
@ -1443,6 +1513,7 @@ async def background_delete_documents(
doc_manager: DocumentManager,
doc_ids: List[str],
delete_file: bool = False,
delete_llm_cache: bool = False,
):
"""Background task to delete multiple documents"""
from lightrag.kg.shared_storage import (
@ -1477,11 +1548,27 @@ async def background_delete_documents(
)
# Use slice assignment to clear the list in place
pipeline_status["history_messages"][:] = ["Starting document deletion process"]
if delete_llm_cache:
pipeline_status["history_messages"].append(
"LLM cache cleanup requested for this deletion job"
)
try:
# Loop through each document ID and delete them one by one
for i, doc_id in enumerate(doc_ids, 1):
# Check for cancellation at the start of each document deletion
async with pipeline_status_lock:
if pipeline_status.get("cancellation_requested", False):
cancel_msg = f"Deletion cancelled by user at document {i}/{total_docs}. {len(successful_deletions)} deleted, {total_docs - i + 1} remaining."
logger.info(cancel_msg)
pipeline_status["latest_message"] = cancel_msg
pipeline_status["history_messages"].append(cancel_msg)
# Add remaining documents to failed list with cancellation reason
failed_deletions.extend(
doc_ids[i - 1 :]
) # i-1 because enumerate starts at 1
break # Exit the loop, remaining documents unchanged
start_msg = f"Deleting document {i}/{total_docs}: {doc_id}"
logger.info(start_msg)
pipeline_status["cur_batch"] = i
@ -1490,7 +1577,9 @@ async def background_delete_documents(
file_path = "#"
try:
result = await rag.adelete_by_doc_id(doc_id)
result = await rag.adelete_by_doc_id(
doc_id, delete_llm_cache=delete_llm_cache
)
file_path = (
getattr(result, "file_path", "-") if "result" in locals() else "-"
)
@ -1642,6 +1731,10 @@ async def background_delete_documents(
# Final summary and check for pending requests
async with pipeline_status_lock:
pipeline_status["busy"] = False
pipeline_status["pending_requests"] = False # Reset pending requests flag
pipeline_status["cancellation_requested"] = (
False # Always reset cancellation flag
)
completion_msg = f"Deletion completed: {len(successful_deletions)} successful, {len(failed_deletions)} failed"
pipeline_status["latest_message"] = completion_msg
pipeline_status["history_messages"].append(completion_msg)
@ -1959,6 +2052,8 @@ def create_document_routes(
rag.full_docs,
rag.full_entities,
rag.full_relations,
rag.entity_chunks,
rag.relation_chunks,
rag.entities_vdb,
rag.relationships_vdb,
rag.chunks_vdb,
@ -2173,20 +2268,24 @@ def create_document_routes(
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
# TODO: Deprecated, use /documents/paginated instead
@router.get(
"", response_model=DocsStatusesResponse, dependencies=[Depends(combined_auth)]
)
async def documents() -> DocsStatusesResponse:
"""
Get the status of all documents in the system.
Get the status of all documents in the system. This endpoint is deprecated; use /documents/paginated instead.
To prevent excessive resource consumption, a maximum of 1,000 records is returned.
This endpoint retrieves the current status of all documents, grouped by their
processing status (PENDING, PROCESSING, PROCESSED, FAILED).
processing status (PENDING, PROCESSING, PREPROCESSED, PROCESSED, FAILED). The results are
limited to 1000 total documents with fair distribution across all statuses.
Returns:
DocsStatusesResponse: A response object containing a dictionary where keys are
DocStatus values and values are lists of DocStatusResponse
objects representing documents in each status category.
Maximum 1000 documents total will be returned.
Raises:
HTTPException: If an error occurs while retrieving document statuses (500).
@ -2195,6 +2294,7 @@ def create_document_routes(
statuses = (
DocStatus.PENDING,
DocStatus.PROCESSING,
DocStatus.PREPROCESSED,
DocStatus.PROCESSED,
DocStatus.FAILED,
)
@ -2203,12 +2303,45 @@ def create_document_routes(
results: List[Dict[str, DocProcessingStatus]] = await asyncio.gather(*tasks)
response = DocsStatusesResponse()
total_documents = 0
max_documents = 1000
# Convert results to lists for easier processing
status_documents = []
for idx, result in enumerate(results):
status = statuses[idx]
docs_list = []
for doc_id, doc_status in result.items():
docs_list.append((doc_id, doc_status))
status_documents.append((status, docs_list))
# Fair distribution: round-robin across statuses
status_indices = [0] * len(
status_documents
) # Track current index for each status
current_status_idx = 0
while total_documents < max_documents:
# Check if we have any documents left to process
has_remaining = False
for status_idx, (status, docs_list) in enumerate(status_documents):
if status_indices[status_idx] < len(docs_list):
has_remaining = True
break
if not has_remaining:
break
# Try to get a document from the current status
status, docs_list = status_documents[current_status_idx]
current_index = status_indices[current_status_idx]
if current_index < len(docs_list):
doc_id, doc_status = docs_list[current_index]
if status not in response.statuses:
response.statuses[status] = []
response.statuses[status].append(
DocStatusResponse(
id=doc_id,
@ -2224,6 +2357,13 @@ def create_document_routes(
file_path=doc_status.file_path,
)
)
status_indices[current_status_idx] += 1
total_documents += 1
# Move to next status (round-robin)
current_status_idx = (current_status_idx + 1) % len(status_documents)
return response
except Exception as e:
logger.error(f"Error GET /documents: {str(e)}")
@ -2253,21 +2393,20 @@ def create_document_routes(
Delete documents and all their associated data by their IDs using background processing.
Deletes specific documents and all their associated data, including their status,
text chunks, vector embeddings, and any related graph data.
text chunks, vector embeddings, and any related graph data. When requested,
cached LLM extraction responses are removed after graph deletion/rebuild completes.
The deletion process runs in the background to avoid blocking the client connection.
It is disabled when llm cache for entity extraction is disabled.
This operation is irreversible and will interact with the pipeline status.
Args:
delete_request (DeleteDocRequest): The request containing the document IDs and delete_file options.
delete_request (DeleteDocRequest): The request containing the document IDs and deletion options.
background_tasks: FastAPI BackgroundTasks for async processing
Returns:
DeleteDocByIdResponse: The result of the deletion operation.
- status="deletion_started": The document deletion has been initiated in the background.
- status="busy": The pipeline is busy with another operation.
- status="not_allowed": Operation not allowed when LLM cache for entity extraction is disabled.
Raises:
HTTPException:
@ -2275,15 +2414,6 @@ def create_document_routes(
"""
doc_ids = delete_request.doc_ids
# The rag object is initialized from the server startup args,
# so we can access its properties here.
if not rag.enable_llm_cache_for_entity_extract:
return DeleteDocByIdResponse(
status="not_allowed",
message="Operation not allowed when LLM cache for entity extraction is disabled.",
doc_id=", ".join(delete_request.doc_ids),
)
try:
from lightrag.kg.shared_storage import get_namespace_data
@ -2304,6 +2434,7 @@ def create_document_routes(
doc_manager,
doc_ids,
delete_request.delete_file,
delete_request.delete_llm_cache,
)
return DeleteDocByIdResponse(
@ -2613,4 +2744,111 @@ def create_document_routes(
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
@router.post(
"/reprocess_failed",
response_model=ReprocessResponse,
dependencies=[Depends(combined_auth)],
)
async def reprocess_failed_documents(background_tasks: BackgroundTasks):
"""
Reprocess failed and pending documents.
This endpoint triggers the document processing pipeline which automatically
picks up and reprocesses documents in the following statuses:
- FAILED: Documents that failed during previous processing attempts
- PENDING: Documents waiting to be processed
- PROCESSING: Documents with abnormally terminated processing (e.g., server crashes)
This is useful for recovering from server crashes, network errors, LLM service
outages, or other temporary failures that caused document processing to fail.
The processing happens in the background and can be monitored using the
returned track_id or by checking the pipeline status.
Returns:
ReprocessResponse: Response with status, message, and track_id
Raises:
HTTPException: If an error occurs while initiating reprocessing (500).
"""
try:
# Generate track_id with "retry" prefix for retry operation
track_id = generate_track_id("retry")
# Start the reprocessing in the background
background_tasks.add_task(rag.apipeline_process_enqueue_documents)
logger.info(
f"Reprocessing of failed documents initiated with track_id: {track_id}"
)
return ReprocessResponse(
status="reprocessing_started",
message="Reprocessing of failed documents has been initiated in background",
track_id=track_id,
)
except Exception as e:
logger.error(f"Error initiating reprocessing of failed documents: {str(e)}")
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
@router.post(
"/cancel_pipeline",
response_model=CancelPipelineResponse,
dependencies=[Depends(combined_auth)],
)
async def cancel_pipeline():
"""
Request cancellation of the currently running pipeline.
This endpoint sets a cancellation flag in the pipeline status. The pipeline will:
1. Check this flag at key processing points
2. Stop processing new documents
3. Cancel all running document processing tasks
4. Mark all PROCESSING documents as FAILED with reason "User cancelled"
The cancellation is graceful and ensures data consistency. Documents that have
completed processing will remain in PROCESSED status.
Returns:
CancelPipelineResponse: Response with status and message
- status="cancellation_requested": Cancellation flag has been set
- status="not_busy": Pipeline is not currently running
Raises:
HTTPException: If an error occurs while setting cancellation flag (500).
"""
try:
from lightrag.kg.shared_storage import (
get_namespace_data,
get_pipeline_status_lock,
)
pipeline_status = await get_namespace_data("pipeline_status")
pipeline_status_lock = get_pipeline_status_lock()
async with pipeline_status_lock:
if not pipeline_status.get("busy", False):
return CancelPipelineResponse(
status="not_busy",
message="Pipeline is not currently running. No cancellation needed.",
)
# Set cancellation flag
pipeline_status["cancellation_requested"] = True
cancel_msg = "Pipeline cancellation requested by user"
logger.info(cancel_msg)
pipeline_status["latest_message"] = cancel_msg
pipeline_status["history_messages"].append(cancel_msg)
return CancelPipelineResponse(
status="cancellation_requested",
message="Pipeline cancellation has been requested. Documents will be marked as FAILED.",
)
except Exception as e:
logger.error(f"Error requesting pipeline cancellation: {str(e)}")
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
return router
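For reference, the two new maintenance endpoints added above can be exercised directly over HTTP. This is a hedged sketch: it assumes the router keeps its usual `/documents` prefix, the default port, and an API key configured as described in the README:
```bash
# Re-queue FAILED / PENDING / stalled PROCESSING documents for another pass
curl -X POST 'http://localhost:9621/documents/reprocess_failed' \
  -H 'X-API-Key: your-secure-api-key-here'

# Ask a busy pipeline to stop; in-flight documents are marked FAILED
curl -X POST 'http://localhost:9621/documents/cancel_pipeline' \
  -H 'X-API-Key: your-secure-api-key-here'
```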

View file

@ -5,7 +5,7 @@ This module contains all graph-related routes for the LightRAG API.
from typing import Optional, Dict, Any
import traceback
from fastapi import APIRouter, Depends, Query, HTTPException
from pydantic import BaseModel
from pydantic import BaseModel, Field
from lightrag.utils import logger
from ..utils_api import get_combined_auth_dependency
@ -25,6 +25,66 @@ class RelationUpdateRequest(BaseModel):
updated_data: Dict[str, Any]
class EntityMergeRequest(BaseModel):
entities_to_change: list[str] = Field(
...,
description="List of entity names to be merged and deleted. These are typically duplicate or misspelled entities.",
min_length=1,
examples=[["Elon Msk", "Ellon Musk"]],
)
entity_to_change_into: str = Field(
...,
description="Target entity name that will receive all relationships from the source entities. This entity will be preserved.",
min_length=1,
examples=["Elon Musk"],
)
class EntityCreateRequest(BaseModel):
entity_name: str = Field(
...,
description="Unique name for the new entity",
min_length=1,
examples=["Tesla"],
)
entity_data: Dict[str, Any] = Field(
...,
description="Dictionary containing entity properties. Common fields include 'description' and 'entity_type'.",
examples=[
{
"description": "Electric vehicle manufacturer",
"entity_type": "ORGANIZATION",
}
],
)
class RelationCreateRequest(BaseModel):
source_entity: str = Field(
...,
description="Name of the source entity. This entity must already exist in the knowledge graph.",
min_length=1,
examples=["Elon Musk"],
)
target_entity: str = Field(
...,
description="Name of the target entity. This entity must already exist in the knowledge graph.",
min_length=1,
examples=["Tesla"],
)
relation_data: Dict[str, Any] = Field(
...,
description="Dictionary containing relationship properties. Common fields include 'description', 'keywords', and 'weight'.",
examples=[
{
"description": "Elon Musk is the CEO of Tesla",
"keywords": "CEO, founder",
"weight": 1.0,
}
],
)
def create_graph_routes(rag, api_key: Optional[str] = None):
combined_auth = get_combined_auth_dependency(api_key)
@ -225,4 +285,247 @@ def create_graph_routes(rag, api_key: Optional[str] = None):
status_code=500, detail=f"Error updating relation: {str(e)}"
)
@router.post("/graph/entity/create", dependencies=[Depends(combined_auth)])
async def create_entity(request: EntityCreateRequest):
"""
Create a new entity in the knowledge graph
This endpoint creates a new entity node in the knowledge graph with the specified
properties. The system automatically generates vector embeddings for the entity
to enable semantic search and retrieval.
Request Body:
entity_name (str): Unique name identifier for the entity
entity_data (dict): Entity properties including:
- description (str): Textual description of the entity
- entity_type (str): Category/type of the entity (e.g., PERSON, ORGANIZATION, LOCATION)
- source_id (str): Related chunk_id from which the description originates
- Additional custom properties as needed
Response Schema:
{
"status": "success",
"message": "Entity 'Tesla' created successfully",
"data": {
"entity_name": "Tesla",
"description": "Electric vehicle manufacturer",
"entity_type": "ORGANIZATION",
"source_id": "chunk-123<SEP>chunk-456"
... (other entity properties)
}
}
HTTP Status Codes:
200: Entity created successfully
400: Invalid request (e.g., missing required fields, duplicate entity)
500: Internal server error
Example Request:
POST /graph/entity/create
{
"entity_name": "Tesla",
"entity_data": {
"description": "Electric vehicle manufacturer",
"entity_type": "ORGANIZATION"
}
}
"""
try:
# Use the proper acreate_entity method which handles:
# - Graph lock for concurrency
# - Vector embedding creation in entities_vdb
# - Metadata population and defaults
# - Index consistency via _edit_entity_done
result = await rag.acreate_entity(
entity_name=request.entity_name,
entity_data=request.entity_data,
)
return {
"status": "success",
"message": f"Entity '{request.entity_name}' created successfully",
"data": result,
}
except ValueError as ve:
logger.error(
f"Validation error creating entity '{request.entity_name}': {str(ve)}"
)
raise HTTPException(status_code=400, detail=str(ve))
except Exception as e:
logger.error(f"Error creating entity '{request.entity_name}': {str(e)}")
logger.error(traceback.format_exc())
raise HTTPException(
status_code=500, detail=f"Error creating entity: {str(e)}"
)
@router.post("/graph/relation/create", dependencies=[Depends(combined_auth)])
async def create_relation(request: RelationCreateRequest):
"""
Create a new relationship between two entities in the knowledge graph
This endpoint establishes an undirected relationship between two existing entities.
The provided source/target order is accepted for convenience, but the backend
stored edge is undirected and may be returned with the entities swapped.
Both entities must already exist in the knowledge graph. The system automatically
generates vector embeddings for the relationship to enable semantic search and graph traversal.
Prerequisites:
- Both source_entity and target_entity must exist in the knowledge graph
- Use /graph/entity/create to create entities first if they don't exist
Request Body:
source_entity (str): Name of the source entity (relationship origin)
target_entity (str): Name of the target entity (relationship destination)
relation_data (dict): Relationship properties including:
- description (str): Textual description of the relationship
- keywords (str): Comma-separated keywords describing the relationship type
- source_id (str): Related chunk_id from which the description originates
- weight (float): Relationship strength/importance (default: 1.0)
- Additional custom properties as needed
Response Schema:
{
"status": "success",
"message": "Relation created successfully between 'Elon Musk' and 'Tesla'",
"data": {
"src_id": "Elon Musk",
"tgt_id": "Tesla",
"description": "Elon Musk is the CEO of Tesla",
"keywords": "CEO, founder",
"source_id": "chunk-123<SEP>chunk-456"
"weight": 1.0,
... (other relationship properties)
}
}
HTTP Status Codes:
200: Relationship created successfully
400: Invalid request (e.g., missing entities, invalid data, duplicate relationship)
500: Internal server error
Example Request:
POST /graph/relation/create
{
"source_entity": "Elon Musk",
"target_entity": "Tesla",
"relation_data": {
"description": "Elon Musk is the CEO of Tesla",
"keywords": "CEO, founder",
"weight": 1.0
}
}
"""
try:
# Use the proper acreate_relation method which handles:
# - Graph lock for concurrency
# - Entity existence validation
# - Duplicate relation checks
# - Vector embedding creation in relationships_vdb
# - Index consistency via _edit_relation_done
result = await rag.acreate_relation(
source_entity=request.source_entity,
target_entity=request.target_entity,
relation_data=request.relation_data,
)
return {
"status": "success",
"message": f"Relation created successfully between '{request.source_entity}' and '{request.target_entity}'",
"data": result,
}
except ValueError as ve:
logger.error(
f"Validation error creating relation between '{request.source_entity}' and '{request.target_entity}': {str(ve)}"
)
raise HTTPException(status_code=400, detail=str(ve))
except Exception as e:
logger.error(
f"Error creating relation between '{request.source_entity}' and '{request.target_entity}': {str(e)}"
)
logger.error(traceback.format_exc())
raise HTTPException(
status_code=500, detail=f"Error creating relation: {str(e)}"
)
@router.post("/graph/entities/merge", dependencies=[Depends(combined_auth)])
async def merge_entities(request: EntityMergeRequest):
"""
Merge multiple entities into a single entity, preserving all relationships
This endpoint consolidates duplicate or misspelled entities while preserving the entire
graph structure. It's particularly useful for cleaning up knowledge graphs after document
processing or correcting entity name variations.
What the Merge Operation Does:
1. Deletes the specified source entities from the knowledge graph
2. Transfers all relationships from source entities to the target entity
3. Intelligently merges duplicate relationships (if multiple sources have the same relationship)
4. Updates vector embeddings for accurate retrieval and search
5. Preserves the complete graph structure and connectivity
6. Maintains relationship properties and metadata
Use Cases:
- Fixing spelling errors in entity names (e.g., "Elon Msk" -> "Elon Musk")
- Consolidating duplicate entities discovered after document processing
- Merging name variations (e.g., "NY", "New York", "New York City")
- Cleaning up the knowledge graph for better query performance
- Standardizing entity names across the knowledge base
Request Body:
entities_to_change (list[str]): List of entity names to be merged and deleted
entity_to_change_into (str): Target entity that will receive all relationships
Response Schema:
{
"status": "success",
"message": "Successfully merged 2 entities into 'Elon Musk'",
"data": {
"merged_entity": "Elon Musk",
"deleted_entities": ["Elon Msk", "Ellon Musk"],
"relationships_transferred": 15,
... (merge operation details)
}
}
HTTP Status Codes:
200: Entities merged successfully
400: Invalid request (e.g., empty entity list, target entity doesn't exist)
500: Internal server error
Example Request:
POST /graph/entities/merge
{
"entities_to_change": ["Elon Msk", "Ellon Musk"],
"entity_to_change_into": "Elon Musk"
}
Note:
- The target entity (entity_to_change_into) must exist in the knowledge graph
- Source entities will be permanently deleted after the merge
- This operation cannot be undone, so verify entity names before merging
"""
try:
result = await rag.amerge_entities(
source_entities=request.entities_to_change,
target_entity=request.entity_to_change_into,
)
return {
"status": "success",
"message": f"Successfully merged {len(request.entities_to_change)} entities into '{request.entity_to_change_into}'",
"data": result,
}
except ValueError as ve:
logger.error(
f"Validation error merging entities {request.entities_to_change} into '{request.entity_to_change_into}': {str(ve)}"
)
raise HTTPException(status_code=400, detail=str(ve))
except Exception as e:
logger.error(
f"Error merging entities {request.entities_to_change} into '{request.entity_to_change_into}': {str(e)}"
)
logger.error(traceback.format_exc())
raise HTTPException(
status_code=500, detail=f"Error merging entities: {str(e)}"
)
return router
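Similarly, the new graph endpoints can be driven over plain HTTP. The payloads below are taken from the docstring examples; the port, the absence of a route prefix, and the omission of the `X-API-Key` header are assumptions to adjust for your deployment:
```bash
# 1. Create the entity first (relations require both endpoints to exist already)
curl -X POST 'http://localhost:9621/graph/entity/create' \
  -H 'Content-Type: application/json' \
  -d '{"entity_name": "Tesla", "entity_data": {"description": "Electric vehicle manufacturer", "entity_type": "ORGANIZATION"}}'

# 2. Then link two existing entities with a relationship
curl -X POST 'http://localhost:9621/graph/relation/create' \
  -H 'Content-Type: application/json' \
  -d '{"source_entity": "Elon Musk", "target_entity": "Tesla", "relation_data": {"description": "Elon Musk is the CEO of Tesla", "keywords": "CEO, founder", "weight": 1.0}}'

# 3. Merge duplicate or misspelled entities into a canonical one
curl -X POST 'http://localhost:9621/graph/entities/merge' \
  -H 'Content-Type: application/json' \
  -d '{"entities_to_change": ["Elon Msk", "Ellon Musk"], "entity_to_change_into": "Elon Musk"}'
```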

View file

@ -483,6 +483,12 @@ class OllamaAPI:
if not messages:
raise HTTPException(status_code=400, detail="No messages provided")
# Validate that the last message is from a user
if messages[-1].role != "user":
raise HTTPException(
status_code=400, detail="Last message must be from user role"
)
# Get the last message as query and previous messages as history
query = messages[-1].content
# Convert OllamaMessage objects to dictionaries
@ -499,7 +505,7 @@ class OllamaAPI:
prompt_tokens = estimate_tokens(cleaned_query)
param_dict = {
"mode": mode,
"mode": mode.value,
"stream": request.stream,
"only_need_context": only_need_context,
"conversation_history": conversation_history,

View file

@ -73,6 +73,16 @@ class QueryRequest(BaseModel):
ge=1,
)
hl_keywords: list[str] = Field(
default_factory=list,
description="List of high-level keywords to prioritize in retrieval. Leave empty to use the LLM to generate the keywords.",
)
ll_keywords: list[str] = Field(
default_factory=list,
description="List of low-level keywords to refine retrieval focus. Leave empty to use the LLM to generate the keywords.",
)
conversation_history: Optional[List[Dict[str, Any]]] = Field(
default=None,
description="Stores past conversation history to maintain context. Format: [{'role': 'user/assistant', 'content': 'message'}].",
@ -88,6 +98,16 @@ class QueryRequest(BaseModel):
description="Enable reranking for retrieved text chunks. If True but no rerank model is configured, a warning will be issued. Default is True.",
)
include_references: Optional[bool] = Field(
default=True,
description="If True, includes reference list in responses. Affects /query and /query/stream endpoints. /query/data always includes references.",
)
stream: Optional[bool] = Field(
default=True,
description="If True, enables streaming output for real-time responses. Only affects /query/stream endpoint.",
)
@field_validator("query", mode="after")
@classmethod
def query_strip_after(cls, query: str) -> str:
@ -101,10 +121,10 @@ class QueryRequest(BaseModel):
if conversation_history is None:
return None
for msg in conversation_history:
if "role" not in msg or msg["role"] not in {"user", "assistant"}:
raise ValueError(
"Each message must have a 'role' key with value 'user' or 'assistant'."
)
if "role" not in msg:
raise ValueError("Each message must have a 'role' key.")
if not isinstance(msg["role"], str) or not msg["role"].strip():
raise ValueError("Each message 'role' must be a non-empty string.")
return conversation_history
def to_query_params(self, is_stream: bool) -> "QueryParam":
@ -122,6 +142,10 @@ class QueryResponse(BaseModel):
response: str = Field(
description="The generated response",
)
references: Optional[List[Dict[str, str]]] = Field(
default=None,
description="Reference list (Disabled when include_references=False, /query/data always includes references.)",
)
class QueryDataResponse(BaseModel):
@ -135,78 +159,473 @@ class QueryDataResponse(BaseModel):
)
class StreamChunkResponse(BaseModel):
"""Response model for streaming chunks in NDJSON format"""
references: Optional[List[Dict[str, str]]] = Field(
default=None,
description="Reference list (only in first chunk when include_references=True)",
)
response: Optional[str] = Field(
default=None, description="Response content chunk or complete response"
)
error: Optional[str] = Field(
default=None, description="Error message if processing fails"
)
def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60):
combined_auth = get_combined_auth_dependency(api_key)
@router.post(
"/query", response_model=QueryResponse, dependencies=[Depends(combined_auth)]
"/query",
response_model=QueryResponse,
dependencies=[Depends(combined_auth)],
responses={
200: {
"description": "Successful RAG query response",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"response": {
"type": "string",
"description": "The generated response from the RAG system",
},
"references": {
"type": "array",
"items": {
"type": "object",
"properties": {
"reference_id": {"type": "string"},
"file_path": {"type": "string"},
},
},
"description": "Reference list (only included when include_references=True)",
},
},
"required": ["response"],
},
"examples": {
"with_references": {
"summary": "Response with references",
"description": "Example response when include_references=True",
"value": {
"response": "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence, such as learning, reasoning, and problem-solving.",
"references": [
{
"reference_id": "1",
"file_path": "/documents/ai_overview.pdf",
},
{
"reference_id": "2",
"file_path": "/documents/machine_learning.txt",
},
],
},
},
"without_references": {
"summary": "Response without references",
"description": "Example response when include_references=False",
"value": {
"response": "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence, such as learning, reasoning, and problem-solving."
},
},
"different_modes": {
"summary": "Different query modes",
"description": "Examples of responses from different query modes",
"value": {
"local_mode": "Focuses on specific entities and their relationships",
"global_mode": "Provides broader context from relationship patterns",
"hybrid_mode": "Combines local and global approaches",
"naive_mode": "Simple vector similarity search",
"mix_mode": "Integrates knowledge graph and vector retrieval",
},
},
},
}
},
},
400: {
"description": "Bad Request - Invalid input parameters",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {"detail": {"type": "string"}},
},
"example": {
"detail": "Query text must be at least 3 characters long"
},
}
},
},
500: {
"description": "Internal Server Error - Query processing failed",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {"detail": {"type": "string"}},
},
"example": {
"detail": "Failed to process query: LLM service unavailable"
},
}
},
},
},
)
async def query_text(request: QueryRequest):
"""
Handle a POST request at the /query endpoint to process user queries using RAG capabilities.
Comprehensive RAG query endpoint with non-streaming response. Parameter "stream" is ignored.
This endpoint performs Retrieval-Augmented Generation (RAG) queries using various modes
to provide intelligent responses based on your knowledge base.
**Query Modes:**
- **local**: Focuses on specific entities and their direct relationships
- **global**: Analyzes broader patterns and relationships across the knowledge graph
- **hybrid**: Combines local and global approaches for comprehensive results
- **naive**: Simple vector similarity search without knowledge graph
- **mix**: Integrates knowledge graph retrieval with vector search (recommended)
- **bypass**: Direct LLM query without knowledge retrieval
The conversation_history parameter is sent to the LLM only; it does not affect retrieval results.
**Usage Examples:**
Basic query:
```json
{
"query": "What is machine learning?",
"mode": "mix"
}
```
Bypass initial LLM call by providing high-level and low-level keywords:
```json
{
"query": "What is Retrieval-Augmented-Generation?",
"hl_keywords": ["machine learning", "information retrieval", "natural language processing"],
"ll_keywords": ["retrieval augmented generation", "RAG", "knowledge base"],
"mode": "mix"
}
```
Advanced query with references:
```json
{
"query": "Explain neural networks",
"mode": "hybrid",
"include_references": true,
"response_type": "Multiple Paragraphs",
"top_k": 10
}
```
Conversation with history:
```json
{
"query": "Can you give me more details?",
"conversation_history": [
{"role": "user", "content": "What is AI?"},
{"role": "assistant", "content": "AI is artificial intelligence..."}
]
}
```
Args:
request (QueryRequest): The request object containing query parameters:
- **query**: The question or prompt to process (min 3 characters)
- **mode**: Query strategy - "mix" recommended for best results
- **include_references**: Whether to include source citations
- **response_type**: Format preference (e.g., "Multiple Paragraphs")
- **top_k**: Number of top entities/relations to retrieve
- **conversation_history**: Previous dialogue context
- **max_total_tokens**: Token budget for the entire response
Parameters:
request (QueryRequest): The request object containing the query parameters.
Returns:
QueryResponse: A Pydantic model containing the result of the query processing.
If a string is returned (e.g., cache hit), it's directly returned.
Otherwise, an async generator may be used to build the response.
QueryResponse: JSON response containing:
- **response**: The generated answer to your query
- **references**: Source citations (if include_references=True)
Raises:
HTTPException: Raised when an error occurs during the request handling process,
with status code 500 and detail containing the exception message.
HTTPException:
- 400: Invalid input parameters (e.g., query too short)
- 500: Internal processing error (e.g., LLM service unavailable)
"""
try:
param = request.to_query_params(False)
response = await rag.aquery(request.query, param=param)
param = request.to_query_params(
False
) # Ensure stream=False for non-streaming endpoint
# Force stream=False for /query endpoint regardless of include_references setting
param.stream = False
# If response is a string (e.g. cache hit), return directly
if isinstance(response, str):
return QueryResponse(response=response)
# Unified approach: always use aquery_llm for both cases
result = await rag.aquery_llm(request.query, param=param)
if isinstance(response, dict):
result = json.dumps(response, indent=2)
return QueryResponse(response=result)
# Extract LLM response and references from unified result
llm_response = result.get("llm_response", {})
references = result.get("data", {}).get("references", [])
# Get the non-streaming response content
response_content = llm_response.get("content", "")
if not response_content:
response_content = "No relevant context found for the query."
# Return response with or without references based on request
if request.include_references:
return QueryResponse(response=response_content, references=references)
else:
return QueryResponse(response=str(response))
return QueryResponse(response=response_content, references=None)
except Exception as e:
trace_exception(e)
raise HTTPException(status_code=500, detail=str(e))
@router.post("/query/stream", dependencies=[Depends(combined_auth)])
@router.post(
"/query/stream",
dependencies=[Depends(combined_auth)],
responses={
200: {
"description": "Flexible RAG query response - format depends on stream parameter",
"content": {
"application/x-ndjson": {
"schema": {
"type": "string",
"format": "ndjson",
"description": "Newline-delimited JSON (NDJSON) format used for both streaming and non-streaming responses. For streaming: multiple lines with separate JSON objects. For non-streaming: single line with complete JSON object.",
"example": '{"references": [{"reference_id": "1", "file_path": "/documents/ai.pdf"}]}\n{"response": "Artificial Intelligence is"}\n{"response": " a field of computer science"}\n{"response": " that focuses on creating intelligent machines."}',
},
"examples": {
"streaming_with_references": {
"summary": "Streaming mode with references (stream=true)",
"description": "Multiple NDJSON lines when stream=True and include_references=True. First line contains references, subsequent lines contain response chunks.",
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/ai_overview.pdf"}, {"reference_id": "2", "file_path": "/documents/ml_basics.txt"}]}\n{"response": "Artificial Intelligence (AI) is a branch of computer science"}\n{"response": " that aims to create intelligent machines capable of performing"}\n{"response": " tasks that typically require human intelligence, such as learning,"}\n{"response": " reasoning, and problem-solving."}',
},
"streaming_without_references": {
"summary": "Streaming mode without references (stream=true)",
"description": "Multiple NDJSON lines when stream=True and include_references=False. Only response chunks are sent.",
"value": '{"response": "Machine learning is a subset of artificial intelligence"}\n{"response": " that enables computers to learn and improve from experience"}\n{"response": " without being explicitly programmed for every task."}',
},
"non_streaming_with_references": {
"summary": "Non-streaming mode with references (stream=false)",
"description": "Single NDJSON line when stream=False and include_references=True. Complete response with references in one message.",
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/neural_networks.pdf"}], "response": "Neural networks are computational models inspired by biological neural networks that consist of interconnected nodes (neurons) organized in layers. They are fundamental to deep learning and can learn complex patterns from data through training processes."}',
},
"non_streaming_without_references": {
"summary": "Non-streaming mode without references (stream=false)",
"description": "Single NDJSON line when stream=False and include_references=False. Complete response only.",
"value": '{"response": "Deep learning is a subset of machine learning that uses neural networks with multiple layers (hence deep) to model and understand complex patterns in data. It has revolutionized fields like computer vision, natural language processing, and speech recognition."}',
},
"error_response": {
"summary": "Error during streaming",
"description": "Error handling in NDJSON format when an error occurs during processing.",
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/ai.pdf"}]}\n{"response": "Artificial Intelligence is"}\n{"error": "LLM service temporarily unavailable"}',
},
},
}
},
},
400: {
"description": "Bad Request - Invalid input parameters",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {"detail": {"type": "string"}},
},
"example": {
"detail": "Query text must be at least 3 characters long"
},
}
},
},
500: {
"description": "Internal Server Error - Query processing failed",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {"detail": {"type": "string"}},
},
"example": {
"detail": "Failed to process streaming query: Knowledge graph unavailable"
},
}
},
},
},
)
async def query_text_stream(request: QueryRequest):
"""
Advanced RAG query endpoint that performs retrieval-augmented generation with a flexible streaming response.
This endpoint provides the most flexible querying experience, supporting both real-time streaming
and complete response delivery based on your integration needs.
**Response Modes:**
- Real-time response delivery as content is generated
- NDJSON format: each line is a separate JSON object
- First line: `{"references": [...]}` (if include_references=True)
- Subsequent lines: `{"response": "content chunk"}`
- Error handling: `{"error": "error message"}`
> If the stream parameter is False, or the query hits the LLM cache, the complete response is delivered in a single streaming message.
**Response Format Details**
- **Content-Type**: `application/x-ndjson` (Newline-Delimited JSON)
- **Structure**: Each line is an independent, valid JSON object
- **Parsing**: Process line-by-line, each line is self-contained
- **Headers**: Includes cache control and connection management
**Query Modes (same as /query endpoint)**
- **local**: Entity-focused retrieval with direct relationships
- **global**: Pattern analysis across the knowledge graph
- **hybrid**: Combined local and global strategies
- **naive**: Vector similarity search only
- **mix**: Integrated knowledge graph + vector retrieval (recommended)
- **bypass**: Direct LLM query without knowledge retrieval
The conversation_history parameter is sent to the LLM only; it does not affect retrieval results.
**Usage Examples**
Real-time streaming query:
```json
{
"query": "Explain machine learning algorithms",
"mode": "mix",
"stream": true,
"include_references": true
}
```
Bypass initial LLM call by providing high-level and low-level keywords:
```json
{
"query": "What is Retrieval-Augmented-Generation?",
"hl_keywords": ["machine learning", "information retrieval", "natural language processing"],
"ll_keywords": ["retrieval augmented generation", "RAG", "knowledge base"],
"mode": "mix"
}
```
Complete response query:
```json
{
"query": "What is deep learning?",
"mode": "hybrid",
"stream": false,
"response_type": "Multiple Paragraphs"
}
```
Conversation with context:
```json
{
"query": "Can you elaborate on that?",
"stream": true,
"conversation_history": [
{"role": "user", "content": "What is neural network?"},
{"role": "assistant", "content": "A neural network is..."}
]
}
```
**Response Processing:**
```python
async for line in response.iter_lines():
data = json.loads(line)
if "references" in data:
# Handle references (first message)
references = data["references"]
if "response" in data:
# Handle content chunk
content_chunk = data["response"]
if "error" in data:
# Handle error
error_message = data["error"]
```
**Error Handling:**
- Streaming errors are delivered as `{"error": "message"}` lines
- Non-streaming errors raise HTTP exceptions
- Partial responses may be delivered before errors in streaming mode
- Always check for error objects when processing streaming responses
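**End-to-End Streaming Client (illustrative sketch)**
The snippet below shows one way to consume this endpoint with httpx, following the NDJSON conventions above; the base URL is an assumption for a local deployment and authentication is omitted.
```python
import asyncio
import json

import httpx


async def stream_query() -> None:
    # Hypothetical local deployment; adjust the URL to match your server.
    payload = {
        "query": "Explain machine learning algorithms",
        "mode": "mix",
        "stream": True,
        "include_references": True,
    }
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream(
            "POST", "http://localhost:9621/query/stream", json=payload
        ) as resp:
            resp.raise_for_status()
            async for line in resp.aiter_lines():
                if not line:
                    continue
                data = json.loads(line)
                if "references" in data:
                    print("references:", data["references"])
                if "response" in data:
                    print(data["response"], end="", flush=True)
                if "error" in data:
                    raise RuntimeError(data["error"])


asyncio.run(stream_query())
```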
Args:
request (QueryRequest): The request object containing query parameters:
- **query**: The question or prompt to process (min 3 characters)
- **mode**: Query strategy - "mix" recommended for best results
- **stream**: Enable streaming (True) or complete response (False)
- **include_references**: Whether to include source citations
- **response_type**: Format preference (e.g., "Multiple Paragraphs")
- **top_k**: Number of top entities/relations to retrieve
- **conversation_history**: Previous dialogue context for multi-turn conversations
- **max_total_tokens**: Token budget for the entire response
Returns:
StreamingResponse: NDJSON streaming response containing:
- **Streaming mode**: Multiple JSON objects, one per line
- References object (if requested): `{"references": [...]}`
- Content chunks: `{"response": "chunk content"}`
- Error objects: `{"error": "error message"}`
- **Non-streaming mode**: Single JSON object
- Complete response: `{"references": [...], "response": "complete content"}`
Raises:
HTTPException:
- 400: Invalid input parameters (e.g., query too short, invalid mode)
- 500: Internal processing error (e.g., LLM service unavailable)
Note:
This endpoint is ideal for applications requiring flexible response delivery.
Use streaming mode for real-time interfaces and non-streaming for batch processing.
"""
try:
    # Use the stream parameter from the request, defaulting to True if not specified
    stream_mode = request.stream if request.stream is not None else True
    param = request.to_query_params(stream_mode)
    from fastapi.responses import StreamingResponse

    # Unified approach: always use aquery_llm for all cases
    result = await rag.aquery_llm(request.query, param=param)

    async def stream_generator():
        # Extract references and LLM response from unified result
        references = result.get("data", {}).get("references", [])
        llm_response = result.get("llm_response", {})

        if llm_response.get("is_streaming"):
            # Streaming mode: send references first, then stream response chunks
            if request.include_references:
                yield f"{json.dumps({'references': references})}\n"
            response_stream = llm_response.get("response_iterator")
            if response_stream:
                try:
                    async for chunk in response_stream:
                        if chunk:  # Only send non-empty content
                            yield f"{json.dumps({'response': chunk})}\n"
                except Exception as e:
                    logging.error(f"Streaming error: {str(e)}")
                    yield f"{json.dumps({'error': str(e)})}\n"
        else:
            # Non-streaming mode: send complete response in one message
            response_content = llm_response.get("content", "")
            if not response_content:
                response_content = "No relevant context found for the query."
            # Create complete response object
            complete_response = {"response": response_content}
            if request.include_references:
                complete_response["references"] = references
            yield f"{json.dumps(complete_response)}\n"
return StreamingResponse(
stream_generator(),
@@ -226,26 +645,400 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60):
"/query/data",
response_model=QueryDataResponse,
dependencies=[Depends(combined_auth)],
responses={
200: {
"description": "Successful data retrieval response with structured RAG data",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"status": {
"type": "string",
"enum": ["success", "failure"],
"description": "Query execution status",
},
"message": {
"type": "string",
"description": "Status message describing the result",
},
"data": {
"type": "object",
"properties": {
"entities": {
"type": "array",
"items": {
"type": "object",
"properties": {
"entity_name": {"type": "string"},
"entity_type": {"type": "string"},
"description": {"type": "string"},
"source_id": {"type": "string"},
"file_path": {"type": "string"},
"reference_id": {"type": "string"},
},
},
"description": "Retrieved entities from knowledge graph",
},
"relationships": {
"type": "array",
"items": {
"type": "object",
"properties": {
"src_id": {"type": "string"},
"tgt_id": {"type": "string"},
"description": {"type": "string"},
"keywords": {"type": "string"},
"weight": {"type": "number"},
"source_id": {"type": "string"},
"file_path": {"type": "string"},
"reference_id": {"type": "string"},
},
},
"description": "Retrieved relationships from knowledge graph",
},
"chunks": {
"type": "array",
"items": {
"type": "object",
"properties": {
"content": {"type": "string"},
"file_path": {"type": "string"},
"chunk_id": {"type": "string"},
"reference_id": {"type": "string"},
},
},
"description": "Retrieved text chunks from vector database",
},
"references": {
"type": "array",
"items": {
"type": "object",
"properties": {
"reference_id": {"type": "string"},
"file_path": {"type": "string"},
},
},
"description": "Reference list for citation purposes",
},
},
"description": "Structured retrieval data containing entities, relationships, chunks, and references",
},
"metadata": {
"type": "object",
"properties": {
"query_mode": {"type": "string"},
"keywords": {
"type": "object",
"properties": {
"high_level": {
"type": "array",
"items": {"type": "string"},
},
"low_level": {
"type": "array",
"items": {"type": "string"},
},
},
},
"processing_info": {
"type": "object",
"properties": {
"total_entities_found": {
"type": "integer"
},
"total_relations_found": {
"type": "integer"
},
"entities_after_truncation": {
"type": "integer"
},
"relations_after_truncation": {
"type": "integer"
},
"final_chunks_count": {
"type": "integer"
},
},
},
},
"description": "Query metadata including mode, keywords, and processing information",
},
},
"required": ["status", "message", "data", "metadata"],
},
"examples": {
"successful_local_mode": {
"summary": "Local mode data retrieval",
"description": "Example of structured data from local mode query focusing on specific entities",
"value": {
"status": "success",
"message": "Query executed successfully",
"data": {
"entities": [
{
"entity_name": "Neural Networks",
"entity_type": "CONCEPT",
"description": "Computational models inspired by biological neural networks",
"source_id": "chunk-123",
"file_path": "/documents/ai_basics.pdf",
"reference_id": "1",
}
],
"relationships": [
{
"src_id": "Neural Networks",
"tgt_id": "Machine Learning",
"description": "Neural networks are a subset of machine learning algorithms",
"keywords": "subset, algorithm, learning",
"weight": 0.85,
"source_id": "chunk-123",
"file_path": "/documents/ai_basics.pdf",
"reference_id": "1",
}
],
"chunks": [
{
"content": "Neural networks are computational models that mimic the way biological neural networks work...",
"file_path": "/documents/ai_basics.pdf",
"chunk_id": "chunk-123",
"reference_id": "1",
}
],
"references": [
{
"reference_id": "1",
"file_path": "/documents/ai_basics.pdf",
}
],
},
"metadata": {
"query_mode": "local",
"keywords": {
"high_level": ["neural", "networks"],
"low_level": [
"computation",
"model",
"algorithm",
],
},
"processing_info": {
"total_entities_found": 5,
"total_relations_found": 3,
"entities_after_truncation": 1,
"relations_after_truncation": 1,
"final_chunks_count": 1,
},
},
},
},
"global_mode": {
"summary": "Global mode data retrieval",
"description": "Example of structured data from global mode query analyzing broader patterns",
"value": {
"status": "success",
"message": "Query executed successfully",
"data": {
"entities": [],
"relationships": [
{
"src_id": "Artificial Intelligence",
"tgt_id": "Machine Learning",
"description": "AI encompasses machine learning as a core component",
"keywords": "encompasses, component, field",
"weight": 0.92,
"source_id": "chunk-456",
"file_path": "/documents/ai_overview.pdf",
"reference_id": "2",
}
],
"chunks": [],
"references": [
{
"reference_id": "2",
"file_path": "/documents/ai_overview.pdf",
}
],
},
"metadata": {
"query_mode": "global",
"keywords": {
"high_level": [
"artificial",
"intelligence",
"overview",
],
"low_level": [],
},
},
},
},
"naive_mode": {
"summary": "Naive mode data retrieval",
"description": "Example of structured data from naive mode using only vector search",
"value": {
"status": "success",
"message": "Query executed successfully",
"data": {
"entities": [],
"relationships": [],
"chunks": [
{
"content": "Deep learning is a subset of machine learning that uses neural networks with multiple layers...",
"file_path": "/documents/deep_learning.pdf",
"chunk_id": "chunk-789",
"reference_id": "3",
}
],
"references": [
{
"reference_id": "3",
"file_path": "/documents/deep_learning.pdf",
}
],
},
"metadata": {
"query_mode": "naive",
"keywords": {"high_level": [], "low_level": []},
},
},
},
},
}
},
},
400: {
"description": "Bad Request - Invalid input parameters",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {"detail": {"type": "string"}},
},
"example": {
"detail": "Query text must be at least 3 characters long"
},
}
},
},
500: {
"description": "Internal Server Error - Data retrieval failed",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {"detail": {"type": "string"}},
},
"example": {
"detail": "Failed to retrieve data: Knowledge graph unavailable"
},
}
},
},
},
)
async def query_data(request: QueryRequest):
"""
Retrieve structured data without LLM generation.
This endpoint returns raw retrieval results, including the entities, relationships,
and text chunks that would be used for RAG, but without generating a final response.
All parameters are compatible with the regular /query endpoint. It is well suited for:
- **Data Analysis**: Examine what information would be used for RAG
- **System Integration**: Get structured data for custom processing
- **Debugging**: Understand retrieval behavior and quality
- **Research**: Analyze knowledge graph structure and relationships
**Key Features:**
- No LLM generation - pure data retrieval
- Complete structured output with entities, relationships, and chunks
- Always includes references for citation
- Detailed metadata about processing and keywords
- Compatible with all query modes and parameters
**Query Mode Behaviors:**
- **local**: Returns entities and their direct relationships + related chunks
- **global**: Returns relationship patterns across the knowledge graph
- **hybrid**: Combines local and global retrieval strategies
- **naive**: Returns only vector-retrieved text chunks (no knowledge graph)
- **mix**: Integrates knowledge graph data with vector-retrieved chunks
- **bypass**: Returns empty data arrays (used for direct LLM queries)
**Data Structure:**
- **entities**: Knowledge graph entities with descriptions and metadata
- **relationships**: Connections between entities with weights and descriptions
- **chunks**: Text segments from documents with source information
- **references**: Citation information mapping reference IDs to file paths
- **metadata**: Processing information, keywords, and query statistics
**Usage Examples:**
Analyze entity relationships:
```json
{
"query": "machine learning algorithms",
"mode": "local",
"top_k": 10
}
```
Explore global patterns:
```json
{
"query": "artificial intelligence trends",
"mode": "global",
"max_relation_tokens": 2000
}
```
Vector similarity search:
```json
{
"query": "neural network architectures",
"mode": "naive",
"chunk_top_k": 5
}
```
Bypass initial LLM call by providing high-level and low-level keywords:
```json
{
"query": "What is Retrieval-Augmented-Generation?",
"hl_keywords": ["machine learning", "information retrieval", "natural language processing"],
"ll_keywords": ["retrieval augmented generation", "RAG", "knowledge base"],
"mode": "mix"
}
```
**Response Analysis:**
- **Empty arrays**: Normal for certain modes (e.g., naive mode has no entities/relationships)
- **Processing info**: Shows retrieval statistics and token usage
- **Keywords**: High-level and low-level keywords extracted from query
- **Reference mapping**: Links all data back to source documents
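**Post-Processing Example (illustrative)**
A short sketch of retrieving and inspecting the structured payload with httpx; the base URL is an assumption for a local deployment and the printed fields mirror the response schema above.
```python
import httpx

# Hypothetical local deployment; adjust the URL to match your server.
payload = {"query": "machine learning algorithms", "mode": "local", "top_k": 10}
resp = httpx.post("http://localhost:9621/query/data", json=payload, timeout=120)
resp.raise_for_status()
result = resp.json()

data = result.get("data", {})
print("entities:", len(data.get("entities", [])))
print("relationships:", len(data.get("relationships", [])))
for chunk in data.get("chunks", []):
    print(chunk["reference_id"], chunk["file_path"])
print("keywords:", result.get("metadata", {}).get("keywords"))
```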
Args:
request (QueryRequest): The request object containing query parameters:
- **query**: The search query to analyze (min 3 characters)
- **mode**: Retrieval strategy affecting data types returned
- **top_k**: Number of top entities/relationships to retrieve
- **chunk_top_k**: Number of text chunks to retrieve
- **max_entity_tokens**: Token limit for entity context
- **max_relation_tokens**: Token limit for relationship context
- **max_total_tokens**: Overall token budget for retrieval
Returns:
QueryDataResponse: Structured JSON response containing:
- **status**: "success" or "failure"
- **message**: Human-readable status description
- **data**: Complete retrieval results with entities, relationships, chunks, references
- **metadata**: Query processing information and statistics
Raises:
HTTPException:
- 400: Invalid input parameters (e.g., query too short, invalid mode)
- 500: Internal processing error (e.g., knowledge graph unavailable)
Note:
This endpoint always includes references regardless of the include_references parameter,
as structured data analysis typically requires source attribution.
"""
try:
param = request.to_query_params(False) # No streaming for data endpoint
