clssck da9070ecf7 refactor: remove legacy storage implementations and k8s deployment
Remove deprecated storage backends and Kubernetes deployment configuration:
- Delete unused storage implementations: FAISS, JSON, Memgraph, Milvus, MongoDB, Nano Vector DB, Neo4j, NetworkX, Qdrant, Redis
- Remove Kubernetes deployment manifests and installation scripts
- Delete legacy examples for deprecated backends
- Consolidate to PostgreSQL-only storage backend
Streamline dependencies and add new capabilities:
- Remove deprecated code documentation and migration guides
- Add full-text search caching layer with FTS cache module
- Implement metrics collection and monitoring pipeline
- Add explain and metrics API routes
- Simplify configuration with PostgreSQL-focused setup
Update documentation and configuration:
- Rewrite README to focus on supported features
- Update environment and configuration examples
- Remove Kubernetes-specific documentation
- Add new utility scripts for PDF uploads and pipeline monitoring
2025-12-09 14:02:00 +01:00
.clinerules Add testing workflow guidelines to basic development rules 2025-11-18 11:54:19 +08:00
.github chore: sync with upstream (#4) 2025-12-03 13:16:28 +01:00
assets Update logo.png 2025-05-12 14:56:46 +08:00
docker/postgres-age-vector feat: add automatic entity resolution with 3-layer matching 2025-11-27 15:35:02 +01:00
docs refactor: move document deps to api group, remove dynamic imports 2025-11-17 12:54:32 +08:00
examples refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
lightrag refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
lightrag_webui feat(lightrag,lightrag_webui): add S3 storage integration and UI 2025-12-07 11:04:38 +01:00
README.assets Add chinese version of README 2025-03-25 12:51:05 +08:00
reproduce test(lightrag,api): add comprehensive test coverage and S3 support 2025-12-05 23:13:39 +01:00
tests refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
.dockerignore test: fix env handling, add type hints, improve docs 2025-12-03 15:02:11 +01:00
.gitattributes Update .gitattributes for webui files 2025-03-20 13:45:15 +08:00
.gitignore test(lightrag,examples,api): comprehensive ruff formatting and type hints 2025-12-05 15:17:06 +01:00
.pre-commit-config.yaml Revert "Add --show-diff-on-failure to ruff args in pre-commit config." 2025-03-25 13:10:59 +08:00
AGENTS.md refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
config.ini.example refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
docker-build-push.sh remove inherited workflows, keep only docker-publish 2025-11-28 09:10:38 +00:00
docker-compose.test.yml test(lightrag,api): add comprehensive test coverage and S3 support 2025-12-05 23:13:39 +01:00
docker-compose.yml Change default docker image to offline version 2025-10-16 16:52:01 +08:00
Dockerfile feat(postgres): add bulk operations and health check 2025-12-03 18:19:26 +00:00
Dockerfile.lite Add BuildKit cache mounts to optimize Docker build performance 2025-11-03 12:40:30 +08:00
env.example refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
LICENSE Update LICENSE 2025-04-16 15:50:53 +08:00
lightrag.service.example Refactor systemd service config to use environment variables 2025-10-29 20:14:17 +08:00
MANIFEST.in Include static files in package distribution 2025-10-30 10:50:28 +08:00
monitor_pipeline.py refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
pyproject.toml refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
pyrightconfig.json refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
README-zh.md refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
README.md refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
requirements-offline-llm.txt Update OpenAI client to use stable API and bump minimum version to 2.0.0 2025-11-21 12:55:44 +08:00
requirements-offline-storage.txt Update qdrant-client minimum version from 1.7.0 to 1.11.0 2025-11-10 11:54:48 +08:00
requirements-offline.txt Update OpenAI client to use stable API and bump minimum version to 2.0.0 2025-11-21 12:55:44 +08:00
ruff.toml test(lightrag,examples,api): comprehensive ruff formatting and type hints 2025-12-05 15:17:06 +01:00
SECURITY.md Fix linting 2025-05-12 23:27:41 +08:00
setup.py Refactor setup.py to utilize pyproject.toml for project installation. 2025-07-05 11:19:00 +08:00
upload_pdfs.py refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00
uv.lock refactor: remove legacy storage implementations and k8s deployment 2025-12-09 14:02:00 +01:00


🚀 LightRAG: Specialized Production Fork

A production-ready fork of LightRAG featuring S3 storage integration, a modernized Web UI, and a robust API.

🔱 About This Fork

This repository is a specialized fork of LightRAG, designed to bridge the gap between research and production. While preserving the core "Simple and Fast" philosophy, we have added critical infrastructure components:

  • ☁️ S3 Storage Integration: Native support for S3-compatible object storage (AWS, MinIO, Cloudflare R2) for scalable document and artifact management.
  • 🖥️ Modern Web UI: A completely redesigned interface featuring:
    • S3 Browser: Integrated file management system.
    • File Viewers: Built-in PDF and text viewers.
    • Enhanced Layout: Resizable panes and improved UX.
  • 🔌 Robust API: Expanded REST endpoints supporting multipart uploads, bulk operations, and advanced search parameters.
  • 🛡️ Code Quality: Comprehensive type hinting (Pyright strict), Ruff formatting, and extensive test coverage for critical paths.

📖 Introduction to LightRAG

LightRAG incorporates graph structures into text indexing and retrieval. It uses a dual-level retrieval system: low-level retrieval targets specific entities, while high-level retrieval targets broader topics, so a single query can draw on both.

Algorithm Flowchart

Figure 1: LightRAG Indexing Flowchart (Source)
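
To make the dual-level idea concrete, here is a toy sketch. It is not LightRAG's implementation: the data, the `Relation` class, and the `low_level`, `high_level`, and `dual_level` helpers are all hypothetical, and real retrieval relies on embeddings and an LLM rather than exact string matching.

```python
# Toy illustration only: hypothetical data and helper names, not LightRAG's code.
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    themes: frozenset  # broad topics attached to the edge


ENTITIES = {
    "LightRAG": "A retrieval-augmented generation framework.",
    "PostgreSQL": "A relational database.",
    "pgvector": "A PostgreSQL extension for vector similarity search.",
}

RELATIONS = [
    Relation("LightRAG", "PostgreSQL", frozenset({"storage", "infrastructure"})),
    Relation("PostgreSQL", "pgvector", frozenset({"vector search", "storage"})),
]


def low_level(keywords: set) -> list:
    """Return entities whose names match a specific keyword (case-insensitive)."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name in ENTITIES if name.lower() in wanted)


def high_level(themes: set) -> list:
    """Return relations that share at least one broad theme with the query."""
    wanted = {t.lower() for t in themes}
    return [r for r in RELATIONS if r.themes & wanted]


def dual_level(keywords: set, themes: set) -> dict:
    """Combine both levels, loosely mirroring what a combined query draws on."""
    return {"entities": low_level(keywords), "relations": high_level(themes)}


result = dual_level({"pgvector"}, {"storage"})
print(result["entities"])
print([(r.source, r.target) for r in result["relations"]])
```

The point of the sketch is only the split: one lookup keyed on specific entity names, one keyed on broader themes attached to relationships, with the two result sets merged afterwards.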

Quick Start

1. Installation

This project uses uv for fast and reliable package management.

Option A: Install from PyPI

uv pip install "lightrag-hku[api]"

Option B: Install from Source (Recommended for this Fork)

git clone https://github.com/YourUsername/LightRAG.git
cd LightRAG
uv sync --extra api
source .venv/bin/activate

2. Running the Server (UI + API)

The easiest way to experience the enhancements in this fork is via the LightRAG Server.

  1. Configure Environment:

    cp env.example .env
    # Edit .env to add your API keys (OpenAI/Azure/etc.) and S3 credentials
    
  2. Start the Server:

    lightrag-server
    
  3. Access the UI: Open http://localhost:9600 to view the Knowledge Graph, upload files via the S3 browser, and perform queries.
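
When the server is started from a script, it can help to wait until the port accepts connections before opening the UI or sending requests. The sketch below uses only the standard library and assumes nothing about the server beyond the host and port given above; `wait_for_port` is a hypothetical helper name, not part of this project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet (or unreachable): back off briefly and retry.
            time.sleep(0.2)
    return False
```

For example, `wait_for_port("localhost", 9600)` returns `True` once something is listening on port 9600. It only confirms that the port is open, not that the application has finished initializing.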

3. Python API Example

You can also use LightRAG directly in your Python code:

import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

WORKING_DIR = "./rag_storage"
# Create the working directory (and any missing parents) if it does not exist
os.makedirs(WORKING_DIR, exist_ok=True)

async def main():
    # Initialize LightRAG
    rag = LightRAG(
        working_dir=WORKING_DIR,
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()

    # Insert Document
    await rag.ainsert("LightRAG is a retrieval-augmented generation framework.")

    # Query
    print(await rag.aquery(
        "What is LightRAG?",
        param=QueryParam(mode="hybrid")
    ))

if __name__ == "__main__":
    asyncio.run(main())

📦 Features & Architecture

Storage Backends

LightRAG now targets a single, production-grade stack: PostgreSQL with pgvector and AGE-compatible graph support. Object storage remains pluggable (S3 or local).

Type            Implementations
KV Storage      PGKVStorage
Vector Storage  PGVectorStorage (pgvector)
Graph Storage   PGGraphStorage (AGE/PG)
Object Storage  S3Storage (New), LocalFileStorage
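
As a hedged sketch of how these names might be wired in: upstream LightRAG accepts storage class names as strings through the `kv_storage`, `vector_storage`, and `graph_storage` constructor arguments. Whether this fork still needs them, or already defaults to PostgreSQL, is an assumption to verify against `env.example` and the code.

```python
# Assumption: the constructor takes storage class names as strings, as upstream
# LightRAG does. This fork may already default to these values.
PG_STORAGE_KWARGS = {
    "kv_storage": "PGKVStorage",
    "vector_storage": "PGVectorStorage",
    "graph_storage": "PGGraphStorage",
}

# Hypothetical usage (requires lightrag and a reachable PostgreSQL instance):
#   rag = LightRAG(working_dir=WORKING_DIR, embedding_func=openai_embed,
#                  llm_model_func=gpt_4o_mini_complete, **PG_STORAGE_KWARGS)
for argument, class_name in PG_STORAGE_KWARGS.items():
    print(f"{argument} -> {class_name}")
```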

Specialized API Routes

This fork exposes additional endpoints:

  • POST /documents/upload: Multipart file upload (supports PDF, TXT, MD).
  • GET /storage/list: List files in S3/Local storage.
  • GET /storage/content: Retrieve file content.
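
A minimal sketch of calling the upload endpoint with only the standard library follows. Several details are assumptions not stated above: that the API is served on the same host and port as the UI (`BASE_URL`), that the form field is named `file`, and that no authentication header is required. `build_multipart` and `upload_document` are hypothetical helper names.

```python
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:9600"  # assumption: API shares the UI address


def build_multipart(field: str, filename: str, content: bytes,
                    content_type: str = "application/octet-stream"):
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload_document(path: str, field: str = "file") -> bytes:
    """POST a local file to /documents/upload and return the raw response body."""
    with open(path, "rb") as fh:
        body, header = build_multipart(field, os.path.basename(path), fh.read())
    request = urllib.request.Request(
        f"{BASE_URL}/documents/upload",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    # Raises urllib.error.HTTPError on a 4xx/5xx response.
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the server running, `upload_document("paper.pdf")` would send the file. The query parameters of the two `/storage` routes are not documented here, so they are left out of the sketch.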

🛠️ Configuration

See env.example for a complete list of configuration options. Key variables for this fork:

# S3 Configuration (Optional)
S3_ENDPOINT_URL=https://<accountid>.r2.cloudflarestorage.com
S3_ACCESS_KEY_ID=<your_access_key>
S3_SECRET_ACCESS_KEY=<your_secret_key>
S3_BUCKET_NAME=lightrag-docs
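
The variables above can be validated before startup so that a half-filled configuration fails early. This is an illustrative sketch, not the fork's actual loader: `S3Settings` and `load_s3_settings` are hypothetical names, and treating the four variables as all-or-nothing is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class S3Settings:
    endpoint_url: str
    access_key_id: str
    secret_access_key: str
    bucket_name: str


# Maps dataclass fields to the environment variable names shown above.
_VARS = {
    "endpoint_url": "S3_ENDPOINT_URL",
    "access_key_id": "S3_ACCESS_KEY_ID",
    "secret_access_key": "S3_SECRET_ACCESS_KEY",
    "bucket_name": "S3_BUCKET_NAME",
}


def load_s3_settings(env: Mapping[str, str] = os.environ) -> Optional[S3Settings]:
    """Return settings, None if S3 is not configured, or raise if only partly set."""
    values = {field: env.get(var, "").strip() for field, var in _VARS.items()}
    if not any(values.values()):
        return None  # S3 is optional: nothing set means "use local storage"
    missing = [_VARS[field] for field, value in values.items() if not value]
    if missing:
        raise ValueError("Incomplete S3 configuration, missing: " + ", ".join(missing))
    return S3Settings(**values)
```

Passing a plain dictionary instead of `os.environ` makes the function easy to test without touching the real environment.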

📚 Documentation

🤝 Contribution

Contributions are welcome! Please ensure you:

  1. Install development dependencies: uv sync --extra test
  2. Run tests before submitting: pytest tests/
  3. Format code: ruff format .

📜 License

This project is licensed under the MIT License. See the LICENSE file for details.


Fork maintained by clssck. Based on the excellent work by the HKUDS team.