Merge branch 'main' into merge_lock_with_key

This commit is contained in:
yangdx 2025-07-08 12:46:31 +08:00
commit 56d43de58a
218 changed files with 11853 additions and 3160 deletions


@ -3,6 +3,7 @@ name: Build and Push Docker Image
on:
release:
types: [published]
workflow_dispatch:
permissions:
contents: read
@ -38,6 +39,7 @@ jobs:
uses: docker/build-push-action@v5
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}

.gitignore

@ -49,6 +49,7 @@ inputs/
rag_storage/
examples/input/
examples/output/
output*/
# Miscellaneous
.DS_Store
@ -59,6 +60,8 @@ ignore_this.txt
# Project-specific files
dickens*/
book.txt
LightRAG.pdf
download_models_hf.py
lightrag-dev/
gui/


@ -1,9 +1,56 @@
# LightRAG: Simple and Fast Retrieval-Augmented Generation
<div align="center">
<img src="./README.assets/b2aaf634151b4706892693ffb43d9093.png" width="800" alt="LightRAG Diagram">
<div style="margin: 20px 0;">
<img src="./assets/logo.png" width="120" height="120" alt="LightRAG Logo" style="border-radius: 20px; box-shadow: 0 8px 32px rgba(0, 217, 255, 0.3);">
</div>
# 🚀 LightRAG: Simple and Fast Retrieval-Augmented Generation
<div align="center">
<a href="https://trendshift.io/repositories/13043" target="_blank"><img src="https://trendshift.io/api/badge/repositories/13043" alt="HKUDS%2FLightRAG | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
<div align="center">
<div style="width: 100%; height: 2px; margin: 20px 0; background: linear-gradient(90deg, transparent, #00d9ff, transparent);"></div>
</div>
<div align="center">
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); border-radius: 15px; padding: 25px; text-align: center;">
<p>
<a href='https://github.com/HKUDS/LightRAG'><img src='https://img.shields.io/badge/🔥项目-主页-00d9ff?style=for-the-badge&logo=github&logoColor=white&labelColor=1a1a2e'></a>
<a href='https://arxiv.org/abs/2410.05779'><img src='https://img.shields.io/badge/📄arXiv-2410.05779-ff6b6b?style=for-the-badge&logo=arxiv&logoColor=white&labelColor=1a1a2e'></a>
<a href="https://github.com/HKUDS/LightRAG/stargazers"><img src='https://img.shields.io/github/stars/HKUDS/LightRAG?color=00d9ff&style=for-the-badge&logo=star&logoColor=white&labelColor=1a1a2e' /></a>
</p>
<p>
<img src="https://img.shields.io/badge/🐍Python-3.10-4ecdc4?style=for-the-badge&logo=python&logoColor=white&labelColor=1a1a2e">
<a href="https://pypi.org/project/lightrag-hku/"><img src="https://img.shields.io/pypi/v/lightrag-hku.svg?style=for-the-badge&logo=pypi&logoColor=white&labelColor=1a1a2e&color=ff6b6b"></a>
</p>
<p>
<a href="https://discord.gg/yF2MmDJyGJ"><img src="https://img.shields.io/badge/💬Discord-社区-7289da?style=for-the-badge&logo=discord&logoColor=white&labelColor=1a1a2e"></a>
<a href="https://github.com/HKUDS/LightRAG/issues/285"><img src="https://img.shields.io/badge/💬微信群-交流-07c160?style=for-the-badge&logo=wechat&logoColor=white&labelColor=1a1a2e"></a>
</p>
<p>
<a href="README_zh.md"><img src="https://img.shields.io/badge/🇨🇳中文版-1a1a2e?style=for-the-badge"></a>
<a href="README.md"><img src="https://img.shields.io/badge/🇺🇸English-1a1a2e?style=for-the-badge"></a>
</p>
</div>
</div>
</div>
<div align="center" style="margin: 30px 0;">
<img src="https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif" width="800">
</div>
<div align="center" style="margin: 30px 0;">
<img src="./README.assets/b2aaf634151b4706892693ffb43d9093.png" width="800" alt="LightRAG Diagram">
</div>
---
## 🎉 News
- [X] [2025.06.05]🎯📢LightRAG now integrates with RAG-Anything, providing comprehensive multimodal document parsing and RAG capabilities (PDFs, images, Office documents, tables, formulas, and more). See the [multimodal processing section](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#多态文档处理rag-anything集成) below for details.
- [X] [2025.03.18]🎯📢LightRAG now supports citation functionality.
- [X] [2025.02.05]🎯📢Our team has released [VideoRAG](https://github.com/HKUDS/VideoRAG) for understanding extremely long-context videos.
- [X] [2025.01.13]🎯📢Our team has released [MiniRAG](https://github.com/HKUDS/MiniRAG), making RAG simpler with small models.
@ -43,6 +90,8 @@ The LightRAG Server is designed to provide Web UI and API support. The Web UI facilitates document indexing, knowledge
```bash
pip install "lightrag-hku[api]"
cp env.example .env
lightrag-server
```
* Installation from Source
@ -53,6 +102,8 @@ cd LightRAG
# Create a Python virtual environment if necessary
# Install in editable mode with API support
pip install -e ".[api]"
cp env.example .env
lightrag-server
```
* Launching the LightRAG Server with Docker Compose
@ -111,6 +162,8 @@ python examples/lightrag_openai_demo.py
## Programming with LightRAG Core
> If you would like to integrate LightRAG into your project, we recommend using the REST API provided by the LightRAG Server. LightRAG Core is typically intended for embedded applications or for researchers who wish to conduct studies and evaluations.
### A Simple Program
The following Python snippet demonstrates how to initialize LightRAG, insert text, and perform queries:
@ -708,6 +761,8 @@ async def initialize_rag():
<details>
<summary> <b>Using Faiss for Storage</b> </summary>
Before using the Faiss vector database, manually install `faiss-cpu` or `faiss-gpu`.
- Install the required dependencies:
@ -929,6 +984,94 @@ rag.insert_custom_kg(custom_kg)
</details>
## Delete Functions
LightRAG provides comprehensive deletion capabilities, allowing you to delete documents, entities, and relationships.
<details>
<summary> <b>Delete Entities</b> </summary>
You can delete an entity and all of its associated relationships by entity name:
```python
# Delete entity and all its relationships (synchronous version)
rag.delete_by_entity("Google")
# Asynchronous version
await rag.adelete_by_entity("Google")
```
When deleting an entity:
- Removes the entity node from the knowledge graph
- Deletes all associated relationships
- Removes related embedding vectors from the vector database
- Maintains knowledge graph integrity
</details>
<details>
<summary> <b>Delete Relations</b> </summary>
You can delete the relationship between two specific entities:
```python
# Delete the relationship between two entities (synchronous version)
rag.delete_by_relation("Google", "Gmail")
# Asynchronous version
await rag.adelete_by_relation("Google", "Gmail")
```
When deleting a relationship:
- Removes the specified relationship edge
- Deletes the relationship's embedding vector from the vector database
- Preserves both entity nodes and their other relationships
</details>
<details>
<summary> <b>Delete by Document ID</b> </summary>
You can delete an entire document and all of its associated knowledge by document ID:
```python
# Delete by document ID (asynchronous version)
await rag.adelete_by_doc_id("doc-12345")
```
Optimized processing when deleting by document ID:
- **Smart Cleanup**: Automatically identifies and removes entities and relationships that belong only to this document
- **Preserve Shared Knowledge**: Entities and relationships that also appear in other documents are preserved and their descriptions are rebuilt
- **Cache Optimization**: Clears related LLM cache to reduce storage overhead
- **Incremental Rebuilding**: Reconstructs affected entity and relationship descriptions from the remaining documents
The deletion process includes:
1. Delete all text chunks related to the document
2. Identify and delete entities and relationships that belong only to this document
3. Rebuild entities and relationships that still exist in other documents
4. Update all related vector indexes
5. Clean up document status records
Note: Deletion by document ID is an asynchronous operation, as it involves complex knowledge graph reconstruction.
</details>
<details>
<summary> <b>Deletion Considerations</b> </summary>
**Important Reminders:**
1. **Irreversible Operations**: All deletion operations are irreversible; use them with caution
2. **Performance Considerations**: Deleting large amounts of data may take some time, especially deletion by document ID
3. **Data Consistency**: Deletion operations automatically maintain consistency between the knowledge graph and the vector database
4. **Backup Recommendation**: Consider backing up your data before performing important deletion operations
**Batch Deletion Recommendations:**
- For batch deletions, prefer the asynchronous methods for better performance
- For large-scale deletions, process in batches to avoid excessive system load
</details>
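The batch-deletion advice above can be sketched with `asyncio`. This is a minimal illustration, not part of LightRAG itself; the helper name `delete_entities_in_batches` and the entity names are placeholders, while `adelete_by_entity` is the async API documented above:

```python
import asyncio

async def delete_entities_in_batches(rag, entity_names, batch_size=10):
    """Delete entities in fixed-size batches to keep concurrent load bounded."""
    for i in range(0, len(entity_names), batch_size):
        batch = entity_names[i : i + batch_size]
        # Run one batch concurrently using the async deletion API
        await asyncio.gather(*(rag.adelete_by_entity(name) for name in batch))
```

Processing one batch at a time avoids launching thousands of concurrent graph updates at once.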
## Entity Merging
<details>
@ -1000,6 +1143,120 @@ rag.merge_entities(
</details>
## Multimodal Document Processing (RAG-Anything Integration)
LightRAG now seamlessly integrates with [RAG-Anything](https://github.com/HKUDS/RAG-Anything), a comprehensive **All-in-One Multimodal Document Processing RAG system** built specifically for LightRAG. RAG-Anything provides advanced parsing and retrieval-augmented generation (RAG) capabilities, allowing you to seamlessly process multimodal documents and extract structured content, including text, images, tables, and formulas, from various document formats for integration into your RAG pipeline.
**Key Features:**
- **End-to-End Multimodal Pipeline**: Complete workflow from document ingestion and parsing to intelligent multimodal question answering
- **Universal Document Support**: Seamless processing of PDFs, Office documents (DOC/DOCX/PPT/PPTX/XLS/XLSX), images, and other file formats
- **Specialized Content Analysis**: Dedicated processors for images, tables, mathematical formulas, and heterogeneous content types
- **Multimodal Knowledge Graph**: Automatic entity extraction and cross-modal relationship discovery for enhanced understanding
- **Hybrid Intelligent Retrieval**: Advanced search across textual and multimodal content with contextual understanding
**Quick Start:**
1. Install RAG-Anything:
```bash
pip install raganything
```
2. Process multimodal documents:
<details>
<summary> <b> RAGAnything Usage Example </b></summary>
```python
import asyncio
import os

from raganything import RAGAnything
from lightrag import LightRAG
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc


async def load_existing_lightrag():
    # First, create or load an existing LightRAG instance
    lightrag_working_dir = "./existing_lightrag_storage"

    # Check whether a previous LightRAG instance exists
    if os.path.exists(lightrag_working_dir) and os.listdir(lightrag_working_dir):
        print("✅ Found existing LightRAG instance, loading...")
    else:
        print("❌ No existing LightRAG instance found, will create new one")

    # Create/load the LightRAG instance with your configuration
    lightrag_instance = LightRAG(
        working_dir=lightrag_working_dir,
        llm_model_func=lambda prompt, system_prompt=None, history_messages=[], **kwargs: openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key="your-api-key",
            **kwargs,
        ),
        embedding_func=EmbeddingFunc(
            embedding_dim=3072,
            max_token_size=8192,
            func=lambda texts: openai_embed(
                texts,
                model="text-embedding-3-large",
                api_key="your-api-key",
                # base_url="your-base-url",  # optional custom endpoint
            ),
        ),
    )

    # Initialize storages (this will load existing data if available)
    await lightrag_instance.initialize_storages()

    # Now initialize RAGAnything with the existing LightRAG instance
    rag = RAGAnything(
        lightrag=lightrag_instance,  # Pass the existing LightRAG instance
        # Only a vision model is needed for multimodal processing
        vision_model_func=lambda prompt, system_prompt=None, history_messages=[], image_data=None, **kwargs: openai_complete_if_cache(
            "gpt-4o",
            "",
            system_prompt=None,
            history_messages=[],
            messages=[
                m for m in [
                    {"role": "system", "content": system_prompt} if system_prompt else None,
                    {"role": "user", "content": [
                        {"type": "text", "text": prompt},
                        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
                    ]} if image_data else {"role": "user", "content": prompt},
                ] if m is not None  # drop the placeholder when there is no system prompt
            ],
            api_key="your-api-key",
            **kwargs,
        ) if image_data else openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key="your-api-key",
            **kwargs,
        ),
        # Note: working_dir, llm_model_func, embedding_func, etc. are inherited from lightrag_instance
    )

    # Query the existing knowledge base
    result = await rag.query_with_multimodal(
        "What data has been processed in this LightRAG instance?",
        mode="hybrid",
    )
    print("Query result:", result)

    # Add new multimodal documents to the existing LightRAG instance
    await rag.process_document_complete(
        file_path="path/to/new/multimodal_document.pdf",
        output_dir="./output",
    )


if __name__ == "__main__":
    asyncio.run(load_existing_lightrag())
```
</details>
For detailed documentation and advanced usage, please refer to the [RAG-Anything repository](https://github.com/HKUDS/RAG-Anything).
## Token Usage Tracking
<details>

README.md

@ -1,45 +1,55 @@
<center><h2>🚀 LightRAG: Simple and Fast Retrieval-Augmented Generation</h2></center>
<div align="center">
<table border="0" width="100%">
<tr>
<td width="100" align="center">
<img src="./assets/logo.png" width="80" height="80" alt="lightrag">
</td>
<td>
<div>
<p>
<a href='https://lightrag.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
<a href='https://youtu.be/oageL-1I0GE'><img src='https://badges.aleen42.com/src/youtube.svg'></a>
<a href='https://arxiv.org/abs/2410.05779'><img src='https://img.shields.io/badge/arXiv-2410.05779-b31b1b'></a>
<a href='https://learnopencv.com/lightrag'><img src='https://img.shields.io/badge/LearnOpenCV-blue'></a>
</p>
<p>
<img src='https://img.shields.io/github/stars/hkuds/lightrag?color=green&style=social' />
<img src="https://img.shields.io/badge/python-3.10-blue">
<a href="https://pypi.org/project/lightrag-hku/"><img src="https://img.shields.io/pypi/v/lightrag-hku.svg"></a>
<a href="https://pepy.tech/project/lightrag-hku"><img src="https://static.pepy.tech/badge/lightrag-hku/month"></a>
</p>
<p>
<a href='https://discord.gg/yF2MmDJyGJ'><img src='https://discordapp.com/api/guilds/1296348098003734629/widget.png?style=shield'></a>
<a href='https://github.com/HKUDS/LightRAG/issues/285'><img src='https://img.shields.io/badge/群聊-wechat-green'></a>
</p>
<div style="margin: 20px 0;">
<img src="./assets/logo.png" width="120" height="120" alt="LightRAG Logo" style="border-radius: 20px; box-shadow: 0 8px 32px rgba(0, 217, 255, 0.3);">
</div>
</td>
</tr>
</table>
<img src="./README.assets/b2aaf634151b4706892693ffb43d9093.png" width="800" alt="LightRAG Diagram">
</div>
# 🚀 LightRAG: Simple and Fast Retrieval-Augmented Generation
<div align="center">
<a href="https://trendshift.io/repositories/13043" target="_blank"><img src="https://trendshift.io/api/badge/repositories/13043" alt="HKUDS%2FLightRAG | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
## 🎉 News
<div align="center">
<div style="width: 100%; height: 2px; margin: 20px 0; background: linear-gradient(90deg, transparent, #00d9ff, transparent);"></div>
</div>
<div align="center">
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); border-radius: 15px; padding: 25px; text-align: center;">
<p>
<a href='https://github.com/HKUDS/LightRAG'><img src='https://img.shields.io/badge/🔥Project-Page-00d9ff?style=for-the-badge&logo=github&logoColor=white&labelColor=1a1a2e'></a>
<a href='https://arxiv.org/abs/2410.05779'><img src='https://img.shields.io/badge/📄arXiv-2410.05779-ff6b6b?style=for-the-badge&logo=arxiv&logoColor=white&labelColor=1a1a2e'></a>
<a href="https://github.com/HKUDS/LightRAG/stargazers"><img src='https://img.shields.io/github/stars/HKUDS/LightRAG?color=00d9ff&style=for-the-badge&logo=star&logoColor=white&labelColor=1a1a2e' /></a>
</p>
<p>
<img src="https://img.shields.io/badge/🐍Python-3.10-4ecdc4?style=for-the-badge&logo=python&logoColor=white&labelColor=1a1a2e">
<a href="https://pypi.org/project/lightrag-hku/"><img src="https://img.shields.io/pypi/v/lightrag-hku.svg?style=for-the-badge&logo=pypi&logoColor=white&labelColor=1a1a2e&color=ff6b6b"></a>
</p>
<p>
<a href="https://discord.gg/yF2MmDJyGJ"><img src="https://img.shields.io/badge/💬Discord-Community-7289da?style=for-the-badge&logo=discord&logoColor=white&labelColor=1a1a2e"></a>
<a href="https://github.com/HKUDS/LightRAG/issues/285"><img src="https://img.shields.io/badge/💬WeChat-Group-07c160?style=for-the-badge&logo=wechat&logoColor=white&labelColor=1a1a2e"></a>
</p>
<p>
<a href="README-zh.md"><img src="https://img.shields.io/badge/🇨🇳中文版-1a1a2e?style=for-the-badge"></a>
<a href="README.md"><img src="https://img.shields.io/badge/🇺🇸English-1a1a2e?style=for-the-badge"></a>
</p>
</div>
</div>
</div>
<div align="center" style="margin: 30px 0;">
<img src="https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif" width="800">
</div>
<div align="center" style="margin: 30px 0;">
<img src="./README.assets/b2aaf634151b4706892693ffb43d9093.png" width="800" alt="LightRAG Diagram">
</div>
---
## 🎉 News
- [X] [2025.06.16]🎯📢Our team has released [RAG-Anything](https://github.com/HKUDS/RAG-Anything), an All-in-One Multimodal RAG System for seamless text, image, table, and equation processing.
- [X] [2025.06.05]🎯📢LightRAG now supports comprehensive multimodal data handling through [RAG-Anything](https://github.com/HKUDS/RAG-Anything) integration, enabling seamless document parsing and RAG capabilities across diverse formats including PDFs, images, Office documents, tables, and formulas. Please refer to the new [multimodal section](https://github.com/HKUDS/LightRAG/?tab=readme-ov-file#multimodal-document-processing-rag-anything-integration) for details.
- [X] [2025.03.18]🎯📢LightRAG now supports citation functionality, enabling proper source attribution.
- [X] [2025.02.05]🎯📢Our team has released [VideoRAG](https://github.com/HKUDS/VideoRAG) for understanding extremely long-context videos.
- [X] [2025.01.13]🎯📢Our team has released [MiniRAG](https://github.com/HKUDS/MiniRAG), making RAG simpler with small models.
@ -79,6 +89,8 @@ The LightRAG Server is designed to provide Web UI and API support. The Web UI fa
```bash
pip install "lightrag-hku[api]"
cp env.example .env
lightrag-server
```
* Installation from Source
@ -89,6 +101,8 @@ cd LightRAG
# create a Python virtual environment if necessary
# Install in editable mode with API support
pip install -e ".[api]"
cp env.example .env
lightrag-server
```
* Launching the LightRAG Server with Docker Compose
@ -147,6 +161,14 @@ For a streaming response implementation example, please see `examples/lightrag_o
## Programming with LightRAG Core
> If you would like to integrate LightRAG into your project, we recommend utilizing the REST API provided by the LightRAG Server. LightRAG Core is typically intended for embedded applications or for researchers who wish to conduct studies and evaluations.
### ⚠️ Important: Initialization Requirements
**LightRAG requires explicit initialization before use.** You must call both `await rag.initialize_storages()` and `await initialize_pipeline_status()` after creating a LightRAG instance, otherwise you will encounter errors like:
- `AttributeError: __aenter__` - if storages are not initialized
- `KeyError: 'history_messages'` - if pipeline status is not initialized
### A Simple Program
Use the below Python snippet to initialize LightRAG, insert text to it, and perform queries:
@ -171,8 +193,9 @@ async def initialize_rag():
embedding_func=openai_embed,
llm_model_func=gpt_4o_mini_complete,
)
await rag.initialize_storages()
await initialize_pipeline_status()
# IMPORTANT: Both initialization calls are required!
await rag.initialize_storages() # Initialize storage backends
await initialize_pipeline_status() # Initialize processing pipeline
return rag
async def main():
@ -182,7 +205,7 @@ async def main():
rag.insert("Your text")
# Perform hybrid search
mode="hybrid"
mode = "hybrid"
print(
await rag.query(
"What are the top themes in this story?",
@ -800,6 +823,8 @@ For production level scenarios you will most likely want to leverage an enterpri
<details>
<summary> <b>Using Faiss for Storage</b> </summary>
You must manually install faiss-cpu or faiss-gpu before using FAISS vector db.
Manually install `faiss-cpu` or `faiss-gpu` before using FAISS vector db.
- Install the required dependencies:
@ -898,59 +923,66 @@ All operations are available in both synchronous and asynchronous versions. The
```python
custom_kg = {
"chunks": [
{
"content": "Alice and Bob are collaborating on quantum computing research.",
"source_id": "doc-1"
}
],
"entities": [
{
"entity_name": "Alice",
"entity_type": "person",
"description": "Alice is a researcher specializing in quantum physics.",
"source_id": "doc-1"
},
{
"entity_name": "Bob",
"entity_type": "person",
"description": "Bob is a mathematician.",
"source_id": "doc-1"
},
{
"entity_name": "Quantum Computing",
"entity_type": "technology",
"description": "Quantum computing utilizes quantum mechanical phenomena for computation.",
"source_id": "doc-1"
}
],
"relationships": [
{
"src_id": "Alice",
"tgt_id": "Bob",
"description": "Alice and Bob are research partners.",
"keywords": "collaboration research",
"weight": 1.0,
"source_id": "doc-1"
},
{
"src_id": "Alice",
"tgt_id": "Quantum Computing",
"description": "Alice conducts research on quantum computing.",
"keywords": "research expertise",
"weight": 1.0,
"source_id": "doc-1"
},
{
"src_id": "Bob",
"tgt_id": "Quantum Computing",
"description": "Bob researches quantum computing.",
"keywords": "research application",
"weight": 1.0,
"source_id": "doc-1"
}
]
}
"chunks": [
{
"content": "Alice and Bob are collaborating on quantum computing research.",
"source_id": "doc-1",
"file_path": "test_file",
}
],
"entities": [
{
"entity_name": "Alice",
"entity_type": "person",
"description": "Alice is a researcher specializing in quantum physics.",
"source_id": "doc-1",
"file_path": "test_file"
},
{
"entity_name": "Bob",
"entity_type": "person",
"description": "Bob is a mathematician.",
"source_id": "doc-1",
"file_path": "test_file"
},
{
"entity_name": "Quantum Computing",
"entity_type": "technology",
"description": "Quantum computing utilizes quantum mechanical phenomena for computation.",
"source_id": "doc-1",
"file_path": "test_file"
}
],
"relationships": [
{
"src_id": "Alice",
"tgt_id": "Bob",
"description": "Alice and Bob are research partners.",
"keywords": "collaboration research",
"weight": 1.0,
"source_id": "doc-1",
"file_path": "test_file"
},
{
"src_id": "Alice",
"tgt_id": "Quantum Computing",
"description": "Alice conducts research on quantum computing.",
"keywords": "research expertise",
"weight": 1.0,
"source_id": "doc-1",
"file_path": "test_file"
},
{
"src_id": "Bob",
"tgt_id": "Quantum Computing",
"description": "Bob researches quantum computing.",
"keywords": "research application",
"weight": 1.0,
"source_id": "doc-1",
"file_path": "test_file"
}
]
}
rag.insert_custom_kg(custom_kg)
```
@ -971,6 +1003,89 @@ These operations maintain data consistency across both the graph database and ve
</details>
## Delete Functions
LightRAG provides comprehensive deletion capabilities, allowing you to delete documents, entities, and relationships.
<details>
<summary> <b>Delete Entities</b> </summary>
You can delete entities by their name along with all associated relationships:
```python
# Delete entity and all its relationships (synchronous version)
rag.delete_by_entity("Google")
# Asynchronous version
await rag.adelete_by_entity("Google")
```
When deleting an entity:
- Removes the entity node from the knowledge graph
- Deletes all associated relationships
- Removes related embedding vectors from the vector database
- Maintains knowledge graph integrity
</details>
<details>
<summary> <b>Delete Relations</b> </summary>
You can delete relationships between two specific entities:
```python
# Delete relationship between two entities (synchronous version)
rag.delete_by_relation("Google", "Gmail")
# Asynchronous version
await rag.adelete_by_relation("Google", "Gmail")
```
When deleting a relationship:
- Removes the specified relationship edge
- Deletes the relationship's embedding vector from the vector database
- Preserves both entity nodes and their other relationships
</details>
<details>
<summary> <b>Delete by Document ID</b> </summary>
You can delete an entire document and all its related knowledge through document ID:
```python
# Delete by document ID (asynchronous version)
await rag.adelete_by_doc_id("doc-12345")
```
Optimized processing when deleting by document ID:
- **Smart Cleanup**: Automatically identifies and removes entities and relationships that belong only to this document
- **Preserve Shared Knowledge**: If entities or relationships exist in other documents, they are preserved and their descriptions are rebuilt
- **Cache Optimization**: Clears related LLM cache to reduce storage overhead
- **Incremental Rebuilding**: Reconstructs affected entity and relationship descriptions from remaining documents
The deletion process includes:
1. Delete all text chunks related to the document
2. Identify and delete entities and relationships that belong only to this document
3. Rebuild entities and relationships that still exist in other documents
4. Update all related vector indexes
5. Clean up document status records
Note: Deletion by document ID is an asynchronous operation as it involves complex knowledge graph reconstruction processes.
</details>
**Important Reminders:**
1. **Irreversible Operations**: All deletion operations are irreversible, please use with caution
2. **Performance Considerations**: Deleting large amounts of data may take some time, especially deletion by document ID
3. **Data Consistency**: Deletion operations automatically maintain consistency between the knowledge graph and vector database
4. **Backup Recommendations**: Consider backing up data before performing important deletion operations
**Batch Deletion Recommendations:**
- For batch deletion operations, consider using asynchronous methods for better performance
- For large-scale deletions, consider processing in batches to avoid excessive system load
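The two recommendations above, async methods and bounded batches, can be combined in a short helper. This is an illustrative sketch, not part of LightRAG; the function name `delete_docs_in_batches` is a placeholder, while `adelete_by_doc_id` is the async API shown earlier:

```python
import asyncio

async def delete_docs_in_batches(rag, doc_ids, batch_size=5):
    """Delete documents in small batches to avoid excessive system load."""
    for i in range(0, len(doc_ids), batch_size):
        batch = doc_ids[i : i + batch_size]
        # Document deletion is async-only, so gather one batch at a time
        await asyncio.gather(*(rag.adelete_by_doc_id(doc_id) for doc_id in batch))
```

Because each document deletion may trigger knowledge graph reconstruction, a small `batch_size` keeps memory and LLM load predictable.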
## Entity Merging
<details>
@ -1042,6 +1157,119 @@ When merging entities:
</details>
## Multimodal Document Processing (RAG-Anything Integration)
LightRAG now seamlessly integrates with [RAG-Anything](https://github.com/HKUDS/RAG-Anything), a comprehensive **All-in-One Multimodal Document Processing RAG system** built specifically for LightRAG. RAG-Anything enables advanced parsing and retrieval-augmented generation (RAG) capabilities, allowing you to handle multimodal documents seamlessly and extract structured content—including text, images, tables, and formulas—from various document formats for integration into your RAG pipeline.
**Key Features:**
- **End-to-End Multimodal Pipeline**: Complete workflow from document ingestion and parsing to intelligent multimodal query answering
- **Universal Document Support**: Seamless processing of PDFs, Office documents (DOC/DOCX/PPT/PPTX/XLS/XLSX), images, and diverse file formats
- **Specialized Content Analysis**: Dedicated processors for images, tables, mathematical equations, and heterogeneous content types
- **Multimodal Knowledge Graph**: Automatic entity extraction and cross-modal relationship discovery for enhanced understanding
- **Hybrid Intelligent Retrieval**: Advanced search capabilities spanning textual and multimodal content with contextual understanding
**Quick Start:**
1. Install RAG-Anything:
```bash
pip install raganything
```
2. Process multimodal documents:
<details>
<summary> <b> RAGAnything Usage Example </b></summary>
```python
import asyncio
import os

from raganything import RAGAnything
from lightrag import LightRAG
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc


async def load_existing_lightrag():
    # First, create or load an existing LightRAG instance
    lightrag_working_dir = "./existing_lightrag_storage"

    # Check if previous LightRAG instance exists
    if os.path.exists(lightrag_working_dir) and os.listdir(lightrag_working_dir):
        print("✅ Found existing LightRAG instance, loading...")
    else:
        print("❌ No existing LightRAG instance found, will create new one")

    # Create/Load LightRAG instance with your configurations
    lightrag_instance = LightRAG(
        working_dir=lightrag_working_dir,
        llm_model_func=lambda prompt, system_prompt=None, history_messages=[], **kwargs: openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key="your-api-key",
            **kwargs,
        ),
        embedding_func=EmbeddingFunc(
            embedding_dim=3072,
            max_token_size=8192,
            func=lambda texts: openai_embed(
                texts,
                model="text-embedding-3-large",
                api_key="your-api-key",
                # base_url="your-base-url",  # optional custom endpoint
            ),
        ),
    )

    # Initialize storage (this will load existing data if available)
    await lightrag_instance.initialize_storages()

    # Now initialize RAGAnything with the existing LightRAG instance
    rag = RAGAnything(
        lightrag=lightrag_instance,  # Pass the existing LightRAG instance
        # Only need vision model for multimodal processing
        vision_model_func=lambda prompt, system_prompt=None, history_messages=[], image_data=None, **kwargs: openai_complete_if_cache(
            "gpt-4o",
            "",
            system_prompt=None,
            history_messages=[],
            messages=[
                m for m in [
                    {"role": "system", "content": system_prompt} if system_prompt else None,
                    {"role": "user", "content": [
                        {"type": "text", "text": prompt},
                        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
                    ]} if image_data else {"role": "user", "content": prompt},
                ] if m is not None  # drop the placeholder when there is no system prompt
            ],
            api_key="your-api-key",
            **kwargs,
        ) if image_data else openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key="your-api-key",
            **kwargs,
        ),
        # Note: working_dir, llm_model_func, embedding_func, etc. are inherited from lightrag_instance
    )

    # Query the existing knowledge base
    result = await rag.query_with_multimodal(
        "What data has been processed in this LightRAG instance?",
        mode="hybrid",
    )
    print("Query result:", result)

    # Add new multimodal documents to the existing LightRAG instance
    await rag.process_document_complete(
        file_path="path/to/new/multimodal_document.pdf",
        output_dir="./output",
    )


if __name__ == "__main__":
    asyncio.run(load_existing_lightrag())
```
</details>
For detailed documentation and advanced usage, please refer to the [RAG-Anything repository](https://github.com/HKUDS/RAG-Anything).
## Token Usage Tracking
<details>
@ -1183,6 +1411,33 @@ Valid modes are:
</details>
## Troubleshooting
### Common Initialization Errors
If you encounter these errors when using LightRAG:
1. **`AttributeError: __aenter__`**
- **Cause**: Storage backends not initialized
- **Solution**: Call `await rag.initialize_storages()` after creating the LightRAG instance
2. **`KeyError: 'history_messages'`**
- **Cause**: Pipeline status not initialized
- **Solution**: Call `await initialize_pipeline_status()` after initializing storages
3. **Both errors in sequence**
- **Cause**: Neither initialization method was called
- **Solution**: Always follow this pattern:
```python
rag = LightRAG(...)
await rag.initialize_storages()
await initialize_pipeline_status()
```
### Model Switching Issues
When switching between different embedding models, you must clear the data directory to avoid errors. The only file you may want to preserve is `kv_store_llm_response_cache.json` if you wish to retain the LLM cache.
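The cleanup described above can be scripted. This is a minimal sketch under the assumption that the storage files sit directly in the working directory (subdirectories are left untouched); the helper name `clear_working_dir` is a placeholder, while `kv_store_llm_response_cache.json` is the cache file named in the text:

```python
import os

def clear_working_dir(working_dir, keep_llm_cache=True):
    """Remove LightRAG storage files so a new embedding model starts clean."""
    keep = {"kv_store_llm_response_cache.json"} if keep_llm_cache else set()
    for name in os.listdir(working_dir):
        if name in keep:
            continue
        path = os.path.join(working_dir, name)
        if os.path.isfile(path):  # only flat files; subdirectories are kept
            os.remove(path)
```

Run this before re-indexing with the new embedding model to avoid dimension-mismatch errors from stale vector files.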
## LightRAG API
The LightRAG Server is designed to provide Web UI and API support. **For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).**
@ -1234,7 +1489,7 @@ Output the results in the following structure:
### Batch Eval
To evaluate the performance of two RAG systems on high-level queries, LightRAG uses the following prompt, with the specific code available in `example/batch_eval.py`.
To evaluate the performance of two RAG systems on high-level queries, LightRAG uses the following prompt, with the specific code available in `reproduce/batch_eval.py`.
<details>
<summary> Prompt </summary>
@ -1448,7 +1703,47 @@ def extract_queries(file_path):
</details>
## Star History
## 🔗 Related Projects
*Ecosystem & Extensions*
<div align="center">
<table>
<tr>
<td align="center">
<a href="https://github.com/HKUDS/RAG-Anything">
<div style="width: 100px; height: 100px; background: linear-gradient(135deg, rgba(0, 217, 255, 0.1) 0%, rgba(0, 217, 255, 0.05) 100%); border-radius: 15px; border: 1px solid rgba(0, 217, 255, 0.2); display: flex; align-items: center; justify-content: center; margin-bottom: 10px;">
<span style="font-size: 32px;">📸</span>
</div>
<b>RAG-Anything</b><br>
<sub>Multimodal RAG</sub>
</a>
</td>
<td align="center">
<a href="https://github.com/HKUDS/VideoRAG">
<div style="width: 100px; height: 100px; background: linear-gradient(135deg, rgba(0, 217, 255, 0.1) 0%, rgba(0, 217, 255, 0.05) 100%); border-radius: 15px; border: 1px solid rgba(0, 217, 255, 0.2); display: flex; align-items: center; justify-content: center; margin-bottom: 10px;">
<span style="font-size: 32px;">🎥</span>
</div>
<b>VideoRAG</b><br>
<sub>Extreme Long-Context Video RAG</sub>
</a>
</td>
<td align="center">
<a href="https://github.com/HKUDS/MiniRAG">
<div style="width: 100px; height: 100px; background: linear-gradient(135deg, rgba(0, 217, 255, 0.1) 0%, rgba(0, 217, 255, 0.05) 100%); border-radius: 15px; border: 1px solid rgba(0, 217, 255, 0.2); display: flex; align-items: center; justify-content: center; margin-bottom: 10px;">
<span style="font-size: 32px;"></span>
</div>
<b>MiniRAG</b><br>
<sub>Extremely Simple RAG</sub>
</a>
</td>
</tr>
</table>
</div>
---
## ⭐ Star History
<a href="https://star-history.com/#HKUDS/LightRAG&Date">
<picture>
@ -1458,15 +1753,22 @@ def extract_queries(file_path):
</picture>
</a>
## Contribution
## 🤝 Contribution
Thank you to all our contributors!
<div align="center">
We thank all our contributors for their valuable contributions.
</div>
<a href="https://github.com/HKUDS/LightRAG/graphs/contributors">
<img src="https://contrib.rocks/image?repo=HKUDS/LightRAG" />
</a>
<div align="center">
<a href="https://github.com/HKUDS/LightRAG/graphs/contributors">
<img src="https://contrib.rocks/image?repo=HKUDS/LightRAG" style="border-radius: 15px; box-shadow: 0 0 20px rgba(0, 217, 255, 0.3);" />
</a>
</div>
## 🌟Citation
---
## 📖 Citation
```python
@article{guo2024lightrag,
@ -1479,4 +1781,31 @@ primaryClass={cs.IR}
}
```
**Thank you for your interest in our work!**
---
<div align="center" style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); border-radius: 15px; padding: 30px; margin: 30px 0;">
<div>
<img src="https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif" width="500">
</div>
<div style="margin-top: 20px;">
<a href="https://github.com/HKUDS/LightRAG" style="text-decoration: none;">
<img src="https://img.shields.io/badge/⭐%20Star%20us%20on%20GitHub-1a1a2e?style=for-the-badge&logo=github&logoColor=white">
</a>
<a href="https://github.com/HKUDS/LightRAG/issues" style="text-decoration: none;">
<img src="https://img.shields.io/badge/🐛%20Report%20Issues-ff6b6b?style=for-the-badge&logo=github&logoColor=white">
</a>
<a href="https://github.com/HKUDS/LightRAG/discussions" style="text-decoration: none;">
<img src="https://img.shields.io/badge/💬%20Discussions-4ecdc4?style=for-the-badge&logo=github&logoColor=white">
</a>
</div>
</div>
<div align="center">
<div style="width: 100%; max-width: 600px; margin: 20px auto; padding: 20px; background: linear-gradient(135deg, rgba(0, 217, 255, 0.1) 0%, rgba(0, 217, 255, 0.05) 100%); border-radius: 15px; border: 1px solid rgba(0, 217, 255, 0.2);">
<div style="display: flex; justify-content: center; align-items: center; gap: 15px;">
<span style="font-size: 24px;"></span>
<span style="color: #00d9ff; font-size: 18px;">Thank you for visiting LightRAG!</span>
<span style="font-size: 24px;"></span>
</div>
</div>
</div>

View file

@ -1,10 +1,12 @@
services:
lightrag:
container_name: lightrag
image: ghcr.io/hkuds/lightrag:latest
build:
context: .
dockerfile: Dockerfile
tags:
- lightrag:latest
- ghcr.io/hkuds/lightrag:latest
ports:
- "${PORT:-9621}:9621"
volumes:

View file

@ -0,0 +1,281 @@
# LightRAG Multi-Document Processing: Concurrency Control Strategy Analysis

LightRAG employs a multi-layered concurrency control strategy when processing multiple documents. This article provides an in-depth analysis of the control mechanisms at the document, chunk, and LLM-request levels, helping you understand why specific concurrency behaviors occur.

## Overview

LightRAG's concurrency control is divided into three layers:
1. **Document-level concurrency**: Controls the number of documents processed simultaneously
2. **Chunk-level concurrency**: Controls the number of chunks processed simultaneously within a single document
3. **LLM request-level concurrency**: Controls the global concurrent number of LLM requests
## 1. Document-Level Concurrent Control
**Control Parameter**: `max_parallel_insert`
Document-level concurrency is controlled by the `max_parallel_insert` parameter, with a default value of 2.
```python
# lightrag/lightrag.py
max_parallel_insert: int = field(default=int(os.getenv("MAX_PARALLEL_INSERT", 2)))
```
### Implementation Mechanism
In the `apipeline_process_enqueue_documents` method, a semaphore is used to control document concurrency:
```python
# lightrag/lightrag.py - apipeline_process_enqueue_documents method
async def process_document(
doc_id: str,
status_doc: DocProcessingStatus,
split_by_character: str | None,
split_by_character_only: bool,
pipeline_status: dict,
pipeline_status_lock: asyncio.Lock,
semaphore: asyncio.Semaphore, # Document-level semaphore
) -> None:
"""Process single document"""
async with semaphore: # 🔥 Document-level concurrent control
# ... Process all chunks of a single document
# Create document-level semaphore
semaphore = asyncio.Semaphore(self.max_parallel_insert) # Default 2
# Create processing tasks for each document
doc_tasks = []
for doc_id, status_doc in to_process_docs.items():
doc_tasks.append(
process_document(
doc_id, status_doc, split_by_character, split_by_character_only,
pipeline_status, pipeline_status_lock, semaphore
)
)
# Wait for all documents to complete processing
await asyncio.gather(*doc_tasks)
```
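The semaphore pattern above can be reduced to a self-contained sketch (simulated documents and sleep-based work in place of real chunking and extraction) that confirms at most `max_parallel_insert` documents are ever in flight:

```python
import asyncio

async def process_document(doc_id: int, semaphore: asyncio.Semaphore, stats: dict) -> None:
    async with semaphore:  # document-level concurrency control
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for chunking + extraction work
        stats["active"] -= 1

async def main() -> int:
    semaphore = asyncio.Semaphore(2)  # max_parallel_insert = 2
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(process_document(i, semaphore, stats) for i in range(6)))
    return stats["peak"]

if __name__ == "__main__":
    print(asyncio.run(main()))  # peak concurrent documents: 2
```

Six documents are submitted at once, yet the observed peak never exceeds the semaphore size.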
## 2. Chunk-Level Concurrent Control
**Control Parameter**: `llm_model_max_async`
**Key Point**: Each document independently creates its own chunk semaphore!
```python
# lightrag/lightrag.py
llm_model_max_async: int = field(default=int(os.getenv("MAX_ASYNC", 4)))
```
### Implementation Mechanism
In the `extract_entities` function, **each document independently creates** its own chunk semaphore:
```python
# lightrag/operate.py - extract_entities function
async def extract_entities(chunks: dict[str, TextChunkSchema], global_config: dict[str, str], ...):
# 🔥 Key: Each document independently creates this semaphore!
llm_model_max_async = global_config.get("llm_model_max_async", 4)
semaphore = asyncio.Semaphore(llm_model_max_async) # Chunk semaphore for each document
async def _process_with_semaphore(chunk):
async with semaphore: # 🔥 Chunk concurrent control within document
return await _process_single_content(chunk)
# Create tasks for each chunk
tasks = []
for c in ordered_chunks:
task = asyncio.create_task(_process_with_semaphore(c))
tasks.append(task)
# Wait for all chunks to complete processing
done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
chunk_results = [task.result() for task in tasks]
return chunk_results
```
### Important Inference: System Overall Chunk Concurrency
Since each document independently creates chunk semaphores, the theoretical chunk concurrency of the system is:
**Theoretical Chunk Concurrency = max_parallel_insert × llm_model_max_async**
For example:
- `max_parallel_insert = 2` (process 2 documents simultaneously)
- `llm_model_max_async = 4` (maximum 4 chunk concurrency per document)
- **Theoretical result**: Maximum 2 × 4 = 8 chunks simultaneously in "processing" state
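Because each document constructs a fresh chunk semaphore, the two limits multiply. The following runnable sketch (simulated chunks instead of LLM calls; parameter values match the defaults above) demonstrates the 2 × 4 = 8 peak:

```python
import asyncio

MAX_PARALLEL_INSERT = 2   # document-level limit
LLM_MODEL_MAX_ASYNC = 4   # per-document chunk limit

async def process_chunk(chunk_sem: asyncio.Semaphore, stats: dict) -> None:
    async with chunk_sem:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.02)  # stand-in for entity extraction
        stats["active"] -= 1

async def process_document(doc_sem: asyncio.Semaphore, stats: dict) -> None:
    async with doc_sem:
        # Each document creates its *own* chunk semaphore, as extract_entities does
        chunk_sem = asyncio.Semaphore(LLM_MODEL_MAX_ASYNC)
        await asyncio.gather(*(process_chunk(chunk_sem, stats) for _ in range(10)))

async def main() -> int:
    doc_sem = asyncio.Semaphore(MAX_PARALLEL_INSERT)
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(process_document(doc_sem, stats) for _ in range(3)))
    return stats["peak"]

if __name__ == "__main__":
    print(asyncio.run(main()))  # peak "processing" chunks: 2 × 4 = 8
```

Three documents with ten chunks each still settle at a peak of eight "processing" chunks.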
## 3. LLM Request-Level Concurrent Control (The Real Bottleneck)
**Control Parameter**: `llm_model_max_async` (globally shared)
**Key**: Although there might be 8 chunks "in processing", all LLM requests share the same global priority queue!
```python
# lightrag/lightrag.py - __post_init__ method
self.llm_model_func = priority_limit_async_func_call(self.llm_model_max_async)(
partial(
self.llm_model_func,
hashing_kv=hashing_kv,
**self.llm_model_kwargs,
)
)
# 🔥 Global LLM queue size = llm_model_max_async = 4
```
### Priority Queue Implementation
```python
# lightrag/utils.py - priority_limit_async_func_call function
def priority_limit_async_func_call(max_size: int, max_queue_size: int = 1000):
def final_decro(func):
queue = asyncio.PriorityQueue(maxsize=max_queue_size)
tasks = set()
async def worker():
"""Worker that processes tasks in the priority queue"""
while not shutdown_event.is_set():
try:
priority, count, future, args, kwargs = await asyncio.wait_for(queue.get(), timeout=1.0)
result = await func(*args, **kwargs) # 🔥 Actual LLM call
if not future.done():
future.set_result(result)
except Exception as e:
# Error handling...
finally:
queue.task_done()
# 🔥 Create fixed number of workers (max_size), this is the real concurrency limit
for _ in range(max_size):
task = asyncio.create_task(worker())
tasks.add(task)
```
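A stripped-down version of this worker pool (a fake LLM call in place of `func`, and unique integer priorities so the queued tuples stay comparable) shows that the worker count, not the number of queued requests, bounds true concurrency:

```python
import asyncio

async def main() -> int:
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    stats = {"active": 0, "peak": 0}

    async def fake_llm_call() -> None:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.02)  # stand-in for LLM latency
        stats["active"] -= 1

    async def worker() -> None:
        while True:
            priority, future = await queue.get()
            try:
                await fake_llm_call()
                future.set_result(priority)
            finally:
                queue.task_done()

    workers = [asyncio.create_task(worker()) for _ in range(4)]  # max_size = 4

    loop = asyncio.get_running_loop()
    futures = [loop.create_future() for _ in range(12)]
    for priority, fut in enumerate(futures):
        queue.put_nowait((priority, fut))  # 12 requests queued, only 4 workers

    await queue.join()
    for w in workers:
        w.cancel()
    return stats["peak"]

if __name__ == "__main__":
    print(asyncio.run(main()))  # concurrently executing "LLM calls": 4
```

Twelve requests are queued, but only four ever execute at the same time — exactly the fixed worker count.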
## 4. Chunk Internal Processing Mechanism (Serial)
### Why Serial?
Internal processing of each chunk strictly follows this serial execution order:
```python
# lightrag/operate.py - _process_single_content function
async def _process_single_content(chunk_key_dp: tuple[str, TextChunkSchema]):
# Step 1: Initial entity extraction
hint_prompt = entity_extract_prompt.format(**{**context_base, "input_text": content})
final_result = await use_llm_func_with_cache(hint_prompt, use_llm_func, ...)
# Process initial extraction results
maybe_nodes, maybe_edges = await _process_extraction_result(final_result, chunk_key, file_path)
# Step 2: Gleaning phase
for now_glean_index in range(entity_extract_max_gleaning):
# 🔥 Serial wait for gleaning results
glean_result = await use_llm_func_with_cache(
continue_prompt, use_llm_func,
llm_response_cache=llm_response_cache,
history_messages=history, cache_type="extract"
)
# Process gleaning results
glean_nodes, glean_edges = await _process_extraction_result(glean_result, chunk_key, file_path)
# Merge results...
# Step 3: Determine whether to continue loop
if now_glean_index == entity_extract_max_gleaning - 1:
break
# 🔥 Serial wait for loop decision results
if_loop_result = await use_llm_func_with_cache(
if_loop_prompt, use_llm_func,
llm_response_cache=llm_response_cache,
history_messages=history, cache_type="extract"
)
if if_loop_result.strip().strip('"').strip("'").lower() != "yes":
break
return maybe_nodes, maybe_edges
```
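The serial cost is easy to see in isolation. This sketch (fixed fake latency and hypothetical call names; `max_gleaning=2` stands in for `entity_extract_max_gleaning`) replays the await chain above and shows that per-chunk latency grows with every serial step:

```python
import asyncio
import time

LLM_LATENCY = 0.05  # simulated per-request latency

async def fake_llm(name: str) -> str:
    await asyncio.sleep(LLM_LATENCY)
    return name

async def process_single_chunk(max_gleaning: int = 2) -> list[str]:
    calls = [await fake_llm("initial_extract")]       # step 1: must finish first
    for i in range(max_gleaning):
        calls.append(await fake_llm(f"glean_{i}"))    # step 2: serial gleaning
        if i == max_gleaning - 1:
            break
        calls.append(await fake_llm(f"if_loop_{i}"))  # step 3: serial loop decision
    return calls

if __name__ == "__main__":
    start = time.perf_counter()
    calls = asyncio.run(process_single_chunk())
    elapsed = time.perf_counter() - start
    print(calls)                        # 4 serial LLM round-trips
    print(elapsed >= 4 * LLM_LATENCY)   # latencies add up; nothing overlaps
```

Setting `entity_extract_max_gleaning=0` would collapse this chain to a single round-trip, which is why the optimization section below suggests reducing it.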
## 5. Complete Concurrent Hierarchy Diagram
![lightrag_indexing.png](assets/lightrag_indexing.png)
### Chunk Internal Processing (Serial)
```
Initial Extraction → Gleaning → Loop Decision → Complete
```
## 6. Real-World Scenario Analysis
### Scenario 1: Single Document with Multiple Chunks
Assume 1 document with 6 chunks:
- **Document level**: Only 1 document, not limited by `max_parallel_insert`
- **Chunk level**: Maximum 4 chunks processed simultaneously (limited by `llm_model_max_async=4`)
- **LLM level**: Global maximum 4 LLM requests concurrent
**Expected behavior**: 4 chunks process concurrently, remaining 2 chunks wait.
### Scenario 2: Multiple Documents with Multiple Chunks
Assume 3 documents, each with 10 chunks:
- **Document level**: Maximum 2 documents processed simultaneously
- **Chunk level**: Maximum 4 chunks per document processed simultaneously
- **Theoretical Chunk concurrency**: 2 × 4 = 8 chunks processed simultaneously
- **Actual LLM concurrency**: Only 4 LLM requests actually execute
**Actual state distribution**:
```
# Possible system state:
Document 1: 4 chunks "processing" (2 executing LLM, 2 waiting for LLM response)
Document 2: 4 chunks "processing" (2 executing LLM, 2 waiting for LLM response)
Document 3: Waiting for document-level semaphore
Total:
- 8 chunks in "processing" state
- 4 LLM requests actually executing
- 4 chunks waiting for LLM response
```
## 7. Performance Optimization Recommendations
### Understanding the Bottleneck
The real bottleneck is the global LLM queue, not the chunk semaphores!
### Adjustment Strategies
**Strategy 1: Increase LLM Concurrent Capacity**
```bash
# Environment variable configuration
export MAX_PARALLEL_INSERT=2 # Keep document concurrency
export MAX_ASYNC=8 # 🔥 Increase LLM request concurrency
```
**Strategy 2: Balance Document and LLM Concurrency**
```python
rag = LightRAG(
max_parallel_insert=3, # Moderately increase document concurrency
llm_model_max_async=12, # Significantly increase LLM concurrency
entity_extract_max_gleaning=0, # Reduce serial steps within chunks
)
```
## 8. Summary
Key characteristics of LightRAG's multi-document concurrent processing mechanism:
### Concurrent Layers
1. **Inter-document competition**: Controlled by `max_parallel_insert`, default 2 documents concurrent
2. **Theoretical Chunk concurrency**: Each document independently creates semaphores, total = max_parallel_insert × llm_model_max_async
3. **Actual LLM concurrency**: All chunks share global LLM queue, controlled by `llm_model_max_async`
4. **Intra-chunk serial**: Multiple LLM requests within each chunk execute strictly serially
### Key Insights
- **Theoretical vs Actual**: System may have many chunks "in processing", but only few are actually executing LLM requests
- **Real Bottleneck**: Global LLM request queue is the performance bottleneck, not chunk semaphores
- **Optimization Focus**: Increasing `llm_model_max_async` is more effective than increasing `max_parallel_insert`

Binary file not shown (new image, 183 KiB)
View file

@ -0,0 +1,277 @@
# LightRAG Multi-Document Concurrency Control Explained

When processing multiple documents, LightRAG employs a multi-layered concurrency control strategy. This article analyzes in depth the control mechanisms at the document, chunk, and LLM-request levels, helping you understand why specific concurrency behaviors occur.

## Overview

LightRAG's concurrency control is divided into three layers:

1. Document-level concurrency: controls the number of documents processed simultaneously
2. Chunk-level concurrency: controls the number of chunks processed simultaneously within a single document
3. LLM request-level concurrency: controls the global number of concurrent LLM requests

## 1. Document-Level Concurrency Control

**Control parameter**: `max_parallel_insert`

Document-level concurrency is controlled by the `max_parallel_insert` parameter, with a default value of 2.

```python
# lightrag/lightrag.py
max_parallel_insert: int = field(default=int(os.getenv("MAX_PARALLEL_INSERT", 2)))
```

### Implementation Mechanism

In the `apipeline_process_enqueue_documents` method, a semaphore controls document concurrency:

```python
# lightrag/lightrag.py - apipeline_process_enqueue_documents method
async def process_document(
    doc_id: str,
    status_doc: DocProcessingStatus,
    split_by_character: str | None,
    split_by_character_only: bool,
    pipeline_status: dict,
    pipeline_status_lock: asyncio.Lock,
    semaphore: asyncio.Semaphore,  # Document-level semaphore
) -> None:
    """Process single document"""
    async with semaphore:  # 🔥 Document-level concurrency control
        # ... process all chunks of a single document

# Create the document-level semaphore
semaphore = asyncio.Semaphore(self.max_parallel_insert)  # Default 2

# Create a processing task for each document
doc_tasks = []
for doc_id, status_doc in to_process_docs.items():
    doc_tasks.append(
        process_document(
            doc_id, status_doc, split_by_character, split_by_character_only,
            pipeline_status, pipeline_status_lock, semaphore
        )
    )

# Wait for all documents to finish processing
await asyncio.gather(*doc_tasks)
```

## 2. Chunk-Level Concurrency Control

**Control parameter**: `llm_model_max_async`

**Key point**: each document independently creates its own chunk semaphore!

```python
# lightrag/lightrag.py
llm_model_max_async: int = field(default=int(os.getenv("MAX_ASYNC", 4)))
```

### Implementation Mechanism

In the `extract_entities` function, **each document independently creates** its own chunk semaphore:

```python
# lightrag/operate.py - extract_entities function
async def extract_entities(chunks: dict[str, TextChunkSchema], global_config: dict[str, str], ...):
    # 🔥 Key: each document creates this semaphore independently!
    llm_model_max_async = global_config.get("llm_model_max_async", 4)
    semaphore = asyncio.Semaphore(llm_model_max_async)  # Chunk semaphore for each document

    async def _process_with_semaphore(chunk):
        async with semaphore:  # 🔥 Chunk concurrency control within a document
            return await _process_single_content(chunk)

    # Create a task for each chunk
    tasks = []
    for c in ordered_chunks:
        task = asyncio.create_task(_process_with_semaphore(c))
        tasks.append(task)

    # Wait for all chunks to finish processing
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
    chunk_results = [task.result() for task in tasks]
    return chunk_results
```

### Important Inference: Overall System Chunk Concurrency

Since each document creates its own chunk semaphore, the theoretical chunk concurrency of the system is:

**Theoretical chunk concurrency = max_parallel_insert × llm_model_max_async**

For example:

- `max_parallel_insert = 2` (process 2 documents simultaneously)
- `llm_model_max_async = 4` (at most 4 concurrent chunks per document)
- Theoretical result: at most 2 × 4 = 8 chunks in the "processing" state at once
## 3. LLM Request-Level Concurrency Control (The Real Bottleneck)

**Control parameter**: `llm_model_max_async` (globally shared)

**Key**: although up to 8 chunks may be "processing", all LLM requests share the same global priority queue!

```python
# lightrag/lightrag.py - __post_init__ method
self.llm_model_func = priority_limit_async_func_call(self.llm_model_max_async)(
    partial(
        self.llm_model_func,
        hashing_kv=hashing_kv,
        **self.llm_model_kwargs,
    )
)
# 🔥 Global LLM queue size = llm_model_max_async = 4
```

### Priority Queue Implementation

```python
# lightrag/utils.py - priority_limit_async_func_call function
def priority_limit_async_func_call(max_size: int, max_queue_size: int = 1000):
    def final_decro(func):
        queue = asyncio.PriorityQueue(maxsize=max_queue_size)
        tasks = set()

        async def worker():
            """Worker that processes tasks in the priority queue"""
            while not shutdown_event.is_set():
                try:
                    priority, count, future, args, kwargs = await asyncio.wait_for(queue.get(), timeout=1.0)
                    result = await func(*args, **kwargs)  # 🔥 Actual LLM call
                    if not future.done():
                        future.set_result(result)
                except Exception as e:
                    # Error handling...
                finally:
                    queue.task_done()

        # 🔥 Create a fixed number of workers (max_size); this is the real concurrency limit
        for _ in range(max_size):
            task = asyncio.create_task(worker())
            tasks.add(task)
```

## 4. Chunk-Internal Processing Mechanism (Serial)

### Why Serial?

Processing inside each chunk strictly follows this serial execution order:

```python
# lightrag/operate.py - _process_single_content function
async def _process_single_content(chunk_key_dp: tuple[str, TextChunkSchema]):
    # Step 1: initial entity extraction
    hint_prompt = entity_extract_prompt.format(**{**context_base, "input_text": content})
    final_result = await use_llm_func_with_cache(hint_prompt, use_llm_func, ...)

    # Process initial extraction results
    maybe_nodes, maybe_edges = await _process_extraction_result(final_result, chunk_key, file_path)

    # Step 2: gleaning phase
    for now_glean_index in range(entity_extract_max_gleaning):
        # 🔥 Serially wait for gleaning results
        glean_result = await use_llm_func_with_cache(
            continue_prompt, use_llm_func,
            llm_response_cache=llm_response_cache,
            history_messages=history, cache_type="extract"
        )

        # Process gleaning results
        glean_nodes, glean_edges = await _process_extraction_result(glean_result, chunk_key, file_path)
        # Merge results...

        # Step 3: decide whether to continue the loop
        if now_glean_index == entity_extract_max_gleaning - 1:
            break

        # 🔥 Serially wait for the loop-decision result
        if_loop_result = await use_llm_func_with_cache(
            if_loop_prompt, use_llm_func,
            llm_response_cache=llm_response_cache,
            history_messages=history, cache_type="extract"
        )
        if if_loop_result.strip().strip('"').strip("'").lower() != "yes":
            break

    return maybe_nodes, maybe_edges
```
## 5. Complete Concurrency Hierarchy Diagram

![lightrag_indexing.png](../assets/lightrag_indexing.png)

## 6. Real-World Scenario Analysis

### Scenario 1: Single Document with Multiple Chunks

Assume 1 document containing 6 chunks:

- Document level: only 1 document, not limited by `max_parallel_insert`
- Chunk level: at most 4 chunks processed simultaneously (limited by `llm_model_max_async=4`)
- LLM level: at most 4 concurrent LLM requests globally

**Expected behavior**: 4 chunks are processed concurrently while the remaining 2 wait.

### Scenario 2: Multiple Documents with Multiple Chunks

Assume 3 documents, each containing 10 chunks:

- Document level: at most 2 documents processed simultaneously
- Chunk level: at most 4 chunks per document processed simultaneously
- Theoretical chunk concurrency: 2 × 4 = 8 chunks processed simultaneously
- Actual LLM concurrency: only 4 LLM requests actually execute

**Actual state distribution**:

```
# Possible system state:
Document 1: 4 chunks "processing" (2 executing LLM calls, 2 waiting for LLM responses)
Document 2: 4 chunks "processing" (2 executing LLM calls, 2 waiting for LLM responses)
Document 3: waiting for the document-level semaphore

Total:
- 8 chunks in the "processing" state
- 4 LLM requests actually executing
- 4 chunks waiting for LLM responses
```
## 7. Performance Optimization Recommendations

### Understanding the Bottleneck

**The real bottleneck is the global LLM queue, not the chunk semaphores!**

### Adjustment Strategies

**Strategy 1: Increase LLM Concurrency**

```bash
# Environment variable configuration
export MAX_PARALLEL_INSERT=2    # Keep document concurrency
export MAX_ASYNC=8              # 🔥 Increase LLM request concurrency
```

**Strategy 2: Balance Document and LLM Concurrency**

```python
rag = LightRAG(
    max_parallel_insert=3,           # Moderately increase document concurrency
    llm_model_max_async=12,          # Significantly increase LLM concurrency
    entity_extract_max_gleaning=0,   # Reduce serial steps within chunks
)
```

## 8. Summary

Key characteristics of LightRAG's multi-document concurrent processing mechanism:

### Concurrency Layers

1. **Inter-document competition**: controlled by `max_parallel_insert`, default 2 concurrent documents
2. **Theoretical chunk concurrency**: each document creates its own semaphore; total = `max_parallel_insert × llm_model_max_async`
3. **Actual LLM concurrency**: all chunks share the global LLM queue, controlled by `llm_model_max_async`
4. **Serial within a chunk**: the multiple LLM requests inside each chunk execute strictly serially

### Key Insights

- **Theoretical vs. actual**: the system may have many chunks "processing", but only a few are actually executing LLM requests
- **Real bottleneck**: the global LLM request queue is the performance bottleneck, not the chunk semaphores
- **Optimization focus**: increasing `llm_model_max_async` is more effective than increasing `max_parallel_insert`

View file

@ -1,6 +1,5 @@
### This is sample file of .env
### Server Configuration
HOST=0.0.0.0
PORT=9621
@ -51,13 +50,15 @@ OLLAMA_EMULATING_MODEL_TAG=latest
# MAX_TOKEN_RELATION_DESC=4000
# MAX_TOKEN_ENTITY_DESC=4000
### Entity and ralation summarization configuration
### Entity and relation summarization configuration
### Language: English, Chinese, French, German ...
SUMMARY_LANGUAGE=English
### Number of duplicated entities/edges that triggers LLM re-summary on merge (at least 3 is recommended)
# FORCE_LLM_SUMMARY_ON_MERGE=6
### Max tokens for entity/relations description after merge
# MAX_TOKEN_SUMMARY=500
### Maximum number of entity extraction attempts for ambiguous content
# MAX_GLEANING=1
### Number of documents processed in parallel (less than MAX_ASYNC/2 is recommended)
# MAX_PARALLEL_INSERT=2
@ -74,16 +75,20 @@ TIMEOUT=240
TEMPERATURE=0
### Max concurrency requests of LLM
MAX_ASYNC=4
### Max tokens send to LLM for entity relation summaries (less than context size of the model)
### MAX_TOKENS: max tokens send to LLM for entity relation summaries (less than context size of the model)
### MAX_TOKENS: set as num_ctx option for Ollama by API Server
MAX_TOKENS=32768
### LLM Binding type: openai, ollama, lollms
### LLM Binding type: openai, ollama, lollms, azure_openai
LLM_BINDING=openai
LLM_MODEL=gpt-4o
LLM_BINDING_HOST=https://api.openai.com/v1
LLM_BINDING_API_KEY=your_api_key
### Optional for Azure
# AZURE_OPENAI_API_VERSION=2024-08-01-preview
# AZURE_OPENAI_DEPLOYMENT=gpt-4o
### Embedding Configuration
### Embedding Binding type: openai, ollama, lollms
### Embedding Binding type: openai, ollama, lollms, azure_openai
EMBEDDING_BINDING=ollama
EMBEDDING_MODEL=bge-m3:latest
EMBEDDING_DIM=1024
@ -91,26 +96,51 @@ EMBEDDING_BINDING_API_KEY=your_api_key
# If the embedding service is deployed within the same Docker stack, use host.docker.internal instead of localhost
EMBEDDING_BINDING_HOST=http://localhost:11434
### Number of chunks sent to Embedding in a single request
# EMBEDDING_BATCH_NUM=32
# EMBEDDING_BATCH_NUM=10
### Max concurrency requests for Embedding
# EMBEDDING_FUNC_MAX_ASYNC=16
### Maximum tokens sent to Embedding for each chunk (no longer in use?)
# MAX_EMBED_TOKENS=8192
### Optional for Azure
# AZURE_EMBEDDING_DEPLOYMENT=text-embedding-3-large
# AZURE_EMBEDDING_API_VERSION=2023-05-15
# AZURE_EMBEDDING_ENDPOINT=your_endpoint
# AZURE_EMBEDDING_API_KEY=your_api_key
###########################
### Data storage selection
###########################
### In-memory database with local file persistence (recommended for small-scale deployment)
# LIGHTRAG_KV_STORAGE=JsonKVStorage
# LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage
# LIGHTRAG_GRAPH_STORAGE=NetworkXStorage
# LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage
# LIGHTRAG_VECTOR_STORAGE=FaissVectorDBStorage
### PostgreSQL
# LIGHTRAG_KV_STORAGE=PGKVStorage
# LIGHTRAG_VECTOR_STORAGE=PGVectorStorage
# LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
# LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
# LIGHTRAG_VECTOR_STORAGE=PGVectorStorage
### MongoDB (Vector storage only available on Atlas Cloud)
# LIGHTRAG_KV_STORAGE=MongoKVStorage
# LIGHTRAG_DOC_STATUS_STORAGE=MongoDocStatusStorage
# LIGHTRAG_GRAPH_STORAGE=MongoGraphStorage
# LIGHTRAG_VECTOR_STORAGE=MongoVectorDBStorage
### Redis Storage (Recommended for production deployment)
# LIGHTRAG_KV_STORAGE=RedisKVStorage
# LIGHTRAG_DOC_STATUS_STORAGE=RedisDocStatusStorage
### Vector Storage (Recommended for production deployment)
# LIGHTRAG_VECTOR_STORAGE=MilvusVectorDBStorage
# LIGHTRAG_VECTOR_STORAGE=QdrantVectorDBStorage
### Graph Storage (Recommended for production deployment)
# LIGHTRAG_GRAPH_STORAGE=Neo4JStorage
### TiDB Configuration (Deprecated)
# TIDB_HOST=localhost
# TIDB_PORT=4000
# TIDB_USER=your_username
# TIDB_PASSWORD='your_password'
# TIDB_DATABASE=your_database
### Separates all data from different LightRAG instances (deprecated)
# TIDB_WORKSPACE=default
####################################################################
### Default workspace for all storage types
### For the purpose of isolation of data for each LightRAG instance
### Valid characters: a-z, A-Z, 0-9, and _
####################################################################
# WORKSPACE=doc
### PostgreSQL Configuration
POSTGRES_HOST=localhost
@ -119,30 +149,19 @@ POSTGRES_USER=your_username
POSTGRES_PASSWORD='your_password'
POSTGRES_DATABASE=your_database
POSTGRES_MAX_CONNECTIONS=12
### Separates all data from different LightRAG instances (deprecated)
# POSTGRES_WORKSPACE=default
# POSTGRES_WORKSPACE=forced_workspace_name
### Neo4j Configuration
NEO4J_URI=neo4j+s://xxxxxxxx.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD='your_password'
### Independent AGE Configuration (not for AGE embedded in PostgreSQL)
# AGE_POSTGRES_DB=
# AGE_POSTGRES_USER=
# AGE_POSTGRES_PASSWORD=
# AGE_POSTGRES_HOST=
# AGE_POSTGRES_PORT=8529
# AGE graph name (applies to PostgreSQL and independent AGE)
### AGE_GRAPH_NAME is deprecated
# AGE_GRAPH_NAME=lightrag
# NEO4J_WORKSPACE=forced_workspace_name
### MongoDB Configuration
MONGO_URI=mongodb://root:root@localhost:27017/
#MONGO_URI=mongodb+srv://xxxx
MONGO_DATABASE=LightRAG
### Separates all data from different LightRAG instances (deprecated)
# MONGODB_GRAPH=false
# MONGODB_WORKSPACE=forced_workspace_name
### Milvus Configuration
MILVUS_URI=http://localhost:19530
@ -150,10 +169,13 @@ MILVUS_DB_NAME=lightrag
# MILVUS_USER=root
# MILVUS_PASSWORD=your_password
# MILVUS_TOKEN=your_token
# MILVUS_WORKSPACE=forced_workspace_name
### Qdrant
QDRANT_URL=http://localhost:16333
QDRANT_URL=http://localhost:6333
# QDRANT_API_KEY=your-api-key
# QDRANT_WORKSPACE=forced_workspace_name
### Redis
REDIS_URI=redis://localhost:6379
# REDIS_WORKSPACE=forced_workspace_name

View file

@ -1,6 +1,6 @@
import os
import json
from lightrag.utils import xml_to_json
import xml.etree.ElementTree as ET
from neo4j import GraphDatabase
# Constants
@ -14,6 +14,66 @@ NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "your_password"
def xml_to_json(xml_file):
try:
tree = ET.parse(xml_file)
root = tree.getroot()
# Print the root element's tag and attributes to confirm the file has been correctly loaded
print(f"Root element: {root.tag}")
print(f"Root attributes: {root.attrib}")
data = {"nodes": [], "edges": []}
# Use namespace
namespace = {"": "http://graphml.graphdrawing.org/xmlns"}
for node in root.findall(".//node", namespace):
node_data = {
"id": node.get("id").strip('"'),
"entity_type": node.find("./data[@key='d1']", namespace).text.strip('"')
if node.find("./data[@key='d1']", namespace) is not None
else "",
"description": node.find("./data[@key='d2']", namespace).text
if node.find("./data[@key='d2']", namespace) is not None
else "",
"source_id": node.find("./data[@key='d3']", namespace).text
if node.find("./data[@key='d3']", namespace) is not None
else "",
}
data["nodes"].append(node_data)
for edge in root.findall(".//edge", namespace):
edge_data = {
"source": edge.get("source").strip('"'),
"target": edge.get("target").strip('"'),
"weight": float(edge.find("./data[@key='d5']", namespace).text)
if edge.find("./data[@key='d5']", namespace) is not None
else 0.0,
"description": edge.find("./data[@key='d6']", namespace).text
if edge.find("./data[@key='d6']", namespace) is not None
else "",
"keywords": edge.find("./data[@key='d7']", namespace).text
if edge.find("./data[@key='d7']", namespace) is not None
else "",
"source_id": edge.find("./data[@key='d8']", namespace).text
if edge.find("./data[@key='d8']", namespace) is not None
else "",
}
data["edges"].append(edge_data)
# Print the number of nodes and edges found
print(f"Found {len(data['nodes'])} nodes and {len(data['edges'])} edges")
return data
except ET.ParseError as e:
print(f"Error parsing XML file: {e}")
return None
except Exception as e:
print(f"An error occurred: {e}")
return None
def convert_xml_to_json(xml_path, output_path):
"""Converts XML file to JSON and saves the output."""
if not os.path.exists(xml_path):

View file

@ -0,0 +1,229 @@
"""
Example of directly using modal processors
This example demonstrates how to use LightRAG's modal processors directly without going through MinerU.
"""
import asyncio
import argparse
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag import LightRAG
from lightrag.utils import EmbeddingFunc
from raganything.modalprocessors import (
ImageModalProcessor,
TableModalProcessor,
EquationModalProcessor,
)
WORKING_DIR = "./rag_storage"
def get_llm_model_func(api_key: str, base_url: str = None):
return (
lambda prompt,
system_prompt=None,
history_messages=[],
**kwargs: openai_complete_if_cache(
"gpt-4o-mini",
prompt,
system_prompt=system_prompt,
history_messages=history_messages,
api_key=api_key,
base_url=base_url,
**kwargs,
)
)
def get_vision_model_func(api_key: str, base_url: str = None):
return (
lambda prompt,
system_prompt=None,
history_messages=[],
image_data=None,
**kwargs: openai_complete_if_cache(
"gpt-4o",
"",
system_prompt=None,
history_messages=[],
messages=[
{"role": "system", "content": system_prompt} if system_prompt else None,
{
"role": "user",
"content": [
{"type": "text", "text": prompt},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{image_data}"
},
},
],
}
if image_data
else {"role": "user", "content": prompt},
],
api_key=api_key,
base_url=base_url,
**kwargs,
)
if image_data
else openai_complete_if_cache(
"gpt-4o-mini",
prompt,
system_prompt=system_prompt,
history_messages=history_messages,
api_key=api_key,
base_url=base_url,
**kwargs,
)
)
async def process_image_example(lightrag: LightRAG, vision_model_func):
"""Example of processing an image"""
# Create image processor
image_processor = ImageModalProcessor(
lightrag=lightrag, modal_caption_func=vision_model_func
)
# Prepare image content
image_content = {
"img_path": "image.jpg",
"img_caption": ["Example image caption"],
"img_footnote": ["Example image footnote"],
}
# Process image
description, entity_info = await image_processor.process_multimodal_content(
modal_content=image_content,
content_type="image",
file_path="image_example.jpg",
entity_name="Example Image",
)
print("Image Processing Results:")
print(f"Description: {description}")
print(f"Entity Info: {entity_info}")
async def process_table_example(lightrag: LightRAG, llm_model_func):
"""Example of processing a table"""
# Create table processor
table_processor = TableModalProcessor(
lightrag=lightrag, modal_caption_func=llm_model_func
)
# Prepare table content
table_content = {
"table_body": """
| Name | Age | Occupation |
|------|-----|------------|
| John | 25 | Engineer |
| Mary | 30 | Designer |
""",
"table_caption": ["Employee Information Table"],
"table_footnote": ["Data updated as of 2024"],
}
# Process table
description, entity_info = await table_processor.process_multimodal_content(
modal_content=table_content,
content_type="table",
file_path="table_example.md",
entity_name="Employee Table",
)
print("\nTable Processing Results:")
print(f"Description: {description}")
print(f"Entity Info: {entity_info}")
async def process_equation_example(lightrag: LightRAG, llm_model_func):
"""Example of processing a mathematical equation"""
# Create equation processor
equation_processor = EquationModalProcessor(
lightrag=lightrag, modal_caption_func=llm_model_func
)
# Prepare equation content
equation_content = {"text": "E = mc^2", "text_format": "LaTeX"}
# Process equation
description, entity_info = await equation_processor.process_multimodal_content(
modal_content=equation_content,
content_type="equation",
file_path="equation_example.txt",
entity_name="Mass-Energy Equivalence",
)
print("\nEquation Processing Results:")
print(f"Description: {description}")
print(f"Entity Info: {entity_info}")
async def initialize_rag(api_key: str, base_url: str = None):
rag = LightRAG(
working_dir=WORKING_DIR,
embedding_func=EmbeddingFunc(
embedding_dim=3072,
max_token_size=8192,
func=lambda texts: openai_embed(
texts,
model="text-embedding-3-large",
api_key=api_key,
base_url=base_url,
),
),
llm_model_func=lambda prompt,
system_prompt=None,
history_messages=[],
**kwargs: openai_complete_if_cache(
"gpt-4o-mini",
prompt,
system_prompt=system_prompt,
history_messages=history_messages,
api_key=api_key,
base_url=base_url,
**kwargs,
),
)
await rag.initialize_storages()
await initialize_pipeline_status()
return rag
def main():
"""Main function to run the example"""
parser = argparse.ArgumentParser(description="Modal Processors Example")
parser.add_argument("--api-key", required=True, help="OpenAI API key")
parser.add_argument("--base-url", help="Optional base URL for API")
parser.add_argument(
"--working-dir", "-w", default=WORKING_DIR, help="Working directory path"
)
args = parser.parse_args()
# Run examples
asyncio.run(main_async(args.api_key, args.base_url))
async def main_async(api_key: str, base_url: str = None):
# Initialize LightRAG
lightrag = await initialize_rag(api_key, base_url)
# Get model functions
llm_model_func = get_llm_model_func(api_key, base_url)
vision_model_func = get_vision_model_func(api_key, base_url)
# Run examples
await process_image_example(lightrag, vision_model_func)
await process_table_example(lightrag, llm_model_func)
await process_equation_example(lightrag, llm_model_func)
if __name__ == "__main__":
main()

View file

@ -0,0 +1,286 @@
#!/usr/bin/env python
"""
Example script demonstrating the integration of MinerU parser with RAGAnything
This example shows how to:
1. Process parsed documents with RAGAnything
2. Perform multimodal queries on the processed documents
3. Handle different types of content (text, images, tables)
"""
import os
import argparse
import asyncio
import logging
import logging.config
from pathlib import Path
# Add project root directory to Python path
import sys
sys.path.append(str(Path(__file__).parent.parent))
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc, logger, set_verbose_debug
from raganything import RAGAnything, RAGAnythingConfig
def configure_logging():
"""Configure logging for the application"""
# Get log directory path from environment variable or use current directory
log_dir = os.getenv("LOG_DIR", os.getcwd())
log_file_path = os.path.abspath(os.path.join(log_dir, "raganything_example.log"))
print(f"\nRAGAnything example log file: {log_file_path}\n")
    os.makedirs(log_dir, exist_ok=True)
# Get log file max size and backup count from environment variables
log_max_bytes = int(os.getenv("LOG_MAX_BYTES", 10485760)) # Default 10MB
log_backup_count = int(os.getenv("LOG_BACKUP_COUNT", 5)) # Default 5 backups
logging.config.dictConfig(
{
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"default": {
"format": "%(levelname)s: %(message)s",
},
"detailed": {
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
},
},
"handlers": {
"console": {
"formatter": "default",
"class": "logging.StreamHandler",
"stream": "ext://sys.stderr",
},
"file": {
"formatter": "detailed",
"class": "logging.handlers.RotatingFileHandler",
"filename": log_file_path,
"maxBytes": log_max_bytes,
"backupCount": log_backup_count,
"encoding": "utf-8",
},
},
"loggers": {
"lightrag": {
"handlers": ["console", "file"],
"level": "INFO",
"propagate": False,
},
},
}
)
# Set the logger level to INFO
logger.setLevel(logging.INFO)
# Enable verbose debug if needed
set_verbose_debug(os.getenv("VERBOSE", "false").lower() == "true")
async def process_with_rag(
file_path: str,
output_dir: str,
api_key: str,
base_url: str = None,
working_dir: str = None,
):
"""
Process document with RAGAnything
Args:
file_path: Path to the document
output_dir: Output directory for RAG results
api_key: OpenAI API key
base_url: Optional base URL for API
working_dir: Working directory for RAG storage
"""
try:
# Create RAGAnything configuration
config = RAGAnythingConfig(
working_dir=working_dir or "./rag_storage",
mineru_parse_method="auto",
enable_image_processing=True,
enable_table_processing=True,
enable_equation_processing=True,
)
# Define LLM model function
def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
return openai_complete_if_cache(
"gpt-4o-mini",
prompt,
system_prompt=system_prompt,
history_messages=history_messages,
api_key=api_key,
base_url=base_url,
**kwargs,
)
# Define vision model function for image processing
def vision_model_func(
prompt, system_prompt=None, history_messages=[], image_data=None, **kwargs
):
if image_data:
return openai_complete_if_cache(
"gpt-4o",
"",
system_prompt=None,
history_messages=[],
messages=[
# Build the message list without a stray None entry when no system
# prompt is given (the OpenAI API rejects None messages); the outer
# `if image_data:` check already guarantees an image is present.
*(
[{"role": "system", "content": system_prompt}]
if system_prompt
else []
),
{
"role": "user",
"content": [
{"type": "text", "text": prompt},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{image_data}"
},
},
],
},
],
api_key=api_key,
base_url=base_url,
**kwargs,
)
else:
return llm_model_func(prompt, system_prompt, history_messages, **kwargs)
# Define embedding function
embedding_func = EmbeddingFunc(
embedding_dim=3072,
max_token_size=8192,
func=lambda texts: openai_embed(
texts,
model="text-embedding-3-large",
api_key=api_key,
base_url=base_url,
),
)
# Initialize RAGAnything with new dataclass structure
rag = RAGAnything(
config=config,
llm_model_func=llm_model_func,
vision_model_func=vision_model_func,
embedding_func=embedding_func,
)
# Process document
await rag.process_document_complete(
file_path=file_path, output_dir=output_dir, parse_method="auto"
)
# Example queries - demonstrating different query approaches
logger.info("\nQuerying processed document:")
# 1. Pure text queries using aquery()
text_queries = [
"What is the main content of the document?",
"What are the key topics discussed?",
]
for query in text_queries:
logger.info(f"\n[Text Query]: {query}")
result = await rag.aquery(query, mode="hybrid")
logger.info(f"Answer: {result}")
# 2. Multimodal query with specific multimodal content using aquery_with_multimodal()
logger.info(
"\n[Multimodal Query]: Analyzing performance data in context of document"
)
multimodal_result = await rag.aquery_with_multimodal(
"Compare this performance data with any similar results mentioned in the document",
multimodal_content=[
{
"type": "table",
"table_data": """Method,Accuracy,Processing_Time
RAGAnything,95.2%,120ms
Traditional_RAG,87.3%,180ms
Baseline,82.1%,200ms""",
"table_caption": "Performance comparison results",
}
],
mode="hybrid",
)
logger.info(f"Answer: {multimodal_result}")
# 3. Another multimodal query with equation content
logger.info("\n[Multimodal Query]: Mathematical formula analysis")
equation_result = await rag.aquery_with_multimodal(
"Explain this formula and relate it to any mathematical concepts in the document",
multimodal_content=[
{
"type": "equation",
"latex": "F1 = 2 \\cdot \\frac{precision \\cdot recall}{precision + recall}",
"equation_caption": "F1-score calculation formula",
}
],
mode="hybrid",
)
logger.info(f"Answer: {equation_result}")
except Exception as e:
logger.error(f"Error processing with RAG: {str(e)}")
import traceback
logger.error(traceback.format_exc())
def main():
"""Main function to run the example"""
parser = argparse.ArgumentParser(description="MinerU RAG Example")
parser.add_argument("file_path", help="Path to the document to process")
parser.add_argument(
"--working_dir", "-w", default="./rag_storage", help="Working directory path"
)
parser.add_argument(
"--output", "-o", default="./output", help="Output directory path"
)
parser.add_argument(
"--api-key",
default=os.getenv("OPENAI_API_KEY"),
help="OpenAI API key (defaults to OPENAI_API_KEY env var)",
)
parser.add_argument("--base-url", help="Optional base URL for API")
args = parser.parse_args()
# Check if API key is provided
if not args.api_key:
logger.error("Error: OpenAI API key is required")
logger.error("Set OPENAI_API_KEY environment variable or use --api-key option")
return
# Create output directory if specified
if args.output:
os.makedirs(args.output, exist_ok=True)
# Process with RAG
asyncio.run(
process_with_rag(
args.file_path, args.output, args.api_key, args.base_url, args.working_dir
)
)
if __name__ == "__main__":
# Configure logging first
configure_logging()
print("RAGAnything Example")
print("=" * 30)
print("Processing document with multimodal RAG pipeline")
print("=" * 30)
main()
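The vision path above expects `image_data` to be a base64-encoded JPEG string that gets embedded in a `data:` URL. A minimal sketch of producing that string from an image on disk (helper names are illustrative, not part of RAGAnything's API):

```python
import base64


def encode_image(path: str) -> str:
    """Read an image file and return its base64 string for use as image_data."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def to_data_url(image_data: str, mime: str = "image/jpeg") -> str:
    """Wrap a base64 payload in a data URL, as vision_model_func does."""
    return f"data:{mime};base64,{image_data}"
```

A string produced this way can be passed straight into the `image_url` content part of the vision request.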


@@ -52,18 +52,23 @@ async def copy_from_postgres_to_json():
embedding_func=None,
)
# Get all cache data using the new flattened structure
all_data = await from_llm_response_cache.get_all()
# Convert flattened data to hierarchical structure for JsonKVStorage
kv = {}
for c_id in await from_llm_response_cache.all_keys():
print(f"Copying {c_id}")
workspace = c_id["workspace"]
mode = c_id["mode"]
_id = c_id["id"]
postgres_db.workspace = workspace
obj = await from_llm_response_cache.get_by_mode_and_id(mode, _id)
if mode not in kv:
kv[mode] = {}
kv[mode][_id] = obj[_id]
print(f"Object {obj}")
for flattened_key, cache_entry in all_data.items():
# Parse flattened key: {mode}:{cache_type}:{hash}
parts = flattened_key.split(":", 2)
if len(parts) == 3:
mode, cache_type, hash_value = parts
if mode not in kv:
kv[mode] = {}
kv[mode][hash_value] = cache_entry
print(f"Copying {flattened_key} -> {mode}[{hash_value}]")
else:
print(f"Skipping invalid key format: {flattened_key}")
await to_llm_response_cache.upsert(kv)
await to_llm_response_cache.index_done_callback()
print("Mission accomplished!")
@@ -85,13 +90,24 @@ async def copy_from_json_to_postgres():
db=postgres_db,
)
for mode in await from_llm_response_cache.all_keys():
print(f"Copying {mode}")
caches = await from_llm_response_cache.get_by_id(mode)
for k, v in caches.items():
item = {mode: {k: v}}
print(f"\tCopying {item}")
await to_llm_response_cache.upsert(item)
# Get all cache data from JsonKVStorage (hierarchical structure)
all_data = await from_llm_response_cache.get_all()
# Convert hierarchical data to flattened structure for PGKVStorage
flattened_data = {}
for mode, mode_data in all_data.items():
print(f"Processing mode: {mode}")
for hash_value, cache_entry in mode_data.items():
# Determine cache_type from cache entry or use default
cache_type = cache_entry.get("cache_type", "extract")
# Create flattened key: {mode}:{cache_type}:{hash}
flattened_key = f"{mode}:{cache_type}:{hash_value}"
flattened_data[flattened_key] = cache_entry
print(f"\tConverting {mode}[{hash_value}] -> {flattened_key}")
# Upsert the flattened data
await to_llm_response_cache.upsert(flattened_data)
print("Mission accomplished!")
if __name__ == "__main__":
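The hunks above migrate the LLM response cache between JsonKVStorage's hierarchical layout (`{mode: {hash: entry}}`) and PGKVStorage's flattened `{mode}:{cache_type}:{hash}` keys. A minimal standalone sketch of that round trip (function names are illustrative, not the storage classes' actual API):

```python
def flatten(hierarchical: dict, default_cache_type: str = "extract") -> dict:
    """{mode: {hash: entry}} -> {"mode:cache_type:hash": entry}."""
    flat = {}
    for mode, entries in hierarchical.items():
        for hash_value, entry in entries.items():
            # cache_type comes from the entry itself, falling back to a default
            cache_type = entry.get("cache_type", default_cache_type)
            flat[f"{mode}:{cache_type}:{hash_value}"] = entry
    return flat


def unflatten(flat: dict) -> dict:
    """Inverse direction; keys not matching mode:cache_type:hash are skipped."""
    kv = {}
    for key, entry in flat.items():
        parts = key.split(":", 2)
        if len(parts) != 3:
            continue  # invalid key format
        mode, _cache_type, hash_value = parts
        kv.setdefault(mode, {})[hash_value] = entry
    return kv
```

Note that `cache_type` is only encoded in the flattened key, so a flatten/unflatten round trip preserves entries as long as each entry carries its own `cache_type` field.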


@@ -53,7 +53,6 @@ async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwar
prompt,
system_prompt=system_prompt,
history_messages=history_messages,
**kwargs,
)
return response
except Exception as e:


@@ -0,0 +1,155 @@
import os
from lightrag import LightRAG, QueryParam
from lightrag.llm.llama_index_impl import (
llama_index_complete_if_cache,
llama_index_embed,
)
from lightrag.utils import EmbeddingFunc
from llama_index.llms.litellm import LiteLLM
from llama_index.embeddings.litellm import LiteLLMEmbedding
import asyncio
import nest_asyncio
nest_asyncio.apply()
from lightrag.kg.shared_storage import initialize_pipeline_status
# Configure working directory
WORKING_DIR = "./index_default"
print(f"WORKING_DIR: {WORKING_DIR}")
# Model configuration
LLM_MODEL = os.environ.get("LLM_MODEL", "gemma-3-4b")
print(f"LLM_MODEL: {LLM_MODEL}")
EMBEDDING_MODEL = os.environ.get("EMBEDDING_MODEL", "arctic-embed")
print(f"EMBEDDING_MODEL: {EMBEDDING_MODEL}")
EMBEDDING_MAX_TOKEN_SIZE = int(os.environ.get("EMBEDDING_MAX_TOKEN_SIZE", 8192))
print(f"EMBEDDING_MAX_TOKEN_SIZE: {EMBEDDING_MAX_TOKEN_SIZE}")
# LiteLLM configuration
LITELLM_URL = os.environ.get("LITELLM_URL", "http://localhost:4000")
print(f"LITELLM_URL: {LITELLM_URL}")
LITELLM_KEY = os.environ.get("LITELLM_KEY", "sk-4JdvGFKqSA3S0k_5p0xufw")
if not os.path.exists(WORKING_DIR):
os.mkdir(WORKING_DIR)
# Initialize LLM function
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
try:
# Initialize LiteLLM if not in kwargs
if "llm_instance" not in kwargs:
llm_instance = LiteLLM(
model=f"openai/{LLM_MODEL}", # Format: "provider/model_name"
api_base=LITELLM_URL,
api_key=LITELLM_KEY,
temperature=0.7,
)
kwargs["llm_instance"] = llm_instance
chat_kwargs = {}
chat_kwargs["litellm_params"] = {
"metadata": {
"opik": {
"project_name": "lightrag_llamaindex_litellm_opik_demo",
"tags": ["lightrag", "litellm"],
}
}
}
response = await llama_index_complete_if_cache(
kwargs["llm_instance"],
prompt,
system_prompt=system_prompt,
history_messages=history_messages,
chat_kwargs=chat_kwargs,
)
return response
except Exception as e:
print(f"LLM request failed: {str(e)}")
raise
# Initialize embedding function
async def embedding_func(texts):
try:
embed_model = LiteLLMEmbedding(
model_name=f"openai/{EMBEDDING_MODEL}",
api_base=LITELLM_URL,
api_key=LITELLM_KEY,
)
return await llama_index_embed(texts, embed_model=embed_model)
except Exception as e:
print(f"Embedding failed: {str(e)}")
raise
# Get embedding dimension
async def get_embedding_dim():
test_text = ["This is a test sentence."]
embedding = await embedding_func(test_text)
embedding_dim = embedding.shape[1]
print(f"embedding_dim={embedding_dim}")
return embedding_dim
async def initialize_rag():
embedding_dimension = await get_embedding_dim()
rag = LightRAG(
working_dir=WORKING_DIR,
llm_model_func=llm_model_func,
embedding_func=EmbeddingFunc(
embedding_dim=embedding_dimension,
max_token_size=EMBEDDING_MAX_TOKEN_SIZE,
func=embedding_func,
),
)
await rag.initialize_storages()
await initialize_pipeline_status()
return rag
def main():
# Initialize RAG instance
rag = asyncio.run(initialize_rag())
# Insert example text
with open("./book.txt", "r", encoding="utf-8") as f:
rag.insert(f.read())
# Test different query modes
print("\nNaive Search:")
print(
rag.query(
"What are the top themes in this story?", param=QueryParam(mode="naive")
)
)
print("\nLocal Search:")
print(
rag.query(
"What are the top themes in this story?", param=QueryParam(mode="local")
)
)
print("\nGlobal Search:")
print(
rag.query(
"What are the top themes in this story?", param=QueryParam(mode="global")
)
)
print("\nHybrid Search:")
print(
rag.query(
"What are the top themes in this story?", param=QueryParam(mode="hybrid")
)
)
if __name__ == "__main__":
main()


@@ -18,12 +18,12 @@ os.environ["REDIS_URI"] = "redis://localhost:6379"
# neo4j
BATCH_SIZE_NODES = 500
BATCH_SIZE_EDGES = 100
os.environ["NEO4J_URI"] = "bolt://117.50.173.35:7687"
os.environ["NEO4J_URI"] = "neo4j://localhost:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "12345678"
# milvus
os.environ["MILVUS_URI"] = "http://117.50.173.35:19530"
os.environ["MILVUS_URI"] = "http://localhost:19530"
os.environ["MILVUS_USER"] = "root"
os.environ["MILVUS_PASSWORD"] = "Milvus"
os.environ["MILVUS_DB_NAME"] = "lightrag"

k8s-deploy/README-zh.md (new file)

@@ -0,0 +1,191 @@
# LightRAG Helm Chart
这是用于在Kubernetes集群上部署LightRAG服务的Helm chart。
LightRAG有两种推荐的部署方法
1. **轻量级部署**:使用内置轻量级存储,适合测试和小规模使用
2. **生产环境部署**使用外部数据库如PostgreSQL和Neo4J适合生产环境和大规模使用
> 如果您想要部署过程的视频演示,可以查看[bilibili](https://www.bilibili.com/video/BV1bUJazBEq2/)上的视频教程,对于喜欢视觉指导的用户可能会有所帮助。
## 前提条件
确保安装和配置了以下工具:
* **Kubernetes集群**
* 需要一个运行中的Kubernetes集群。
* 对于本地开发或演示,可以使用[Minikube](https://minikube.sigs.k8s.io/docs/start/)需要≥2个CPU≥4GB内存以及Docker/VM驱动支持
* 任何标准的云端或本地Kubernetes集群EKS、GKE、AKS等也可以使用。
* **kubectl**
* Kubernetes命令行工具用于管理集群。
* 按照官方指南安装:[安装和设置kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl)。
* **Helm**v3.x+
* Kubernetes包管理器用于安装LightRAG。
* 通过官方指南安装:[安装Helm](https://helm.sh/docs/intro/install/)。
## 轻量级部署(无需外部数据库)
这种部署选项使用内置的轻量级存储组件,非常适合测试、演示或小规模使用场景。无需外部数据库配置。
您可以使用提供的便捷脚本或直接使用Helm命令部署LightRAG。两种方法都配置了`lightrag/values.yaml`文件中定义的相同环境变量。
### 使用便捷脚本(推荐):
```bash
export OPENAI_API_BASE=<您的OPENAI_API_BASE>
export OPENAI_API_KEY=<您的OPENAI_API_KEY>
bash ./install_lightrag_dev.sh
```
### 或直接使用Helm
```bash
# 您可以覆盖任何想要的环境参数
helm upgrade --install lightrag ./lightrag \
--namespace rag \
--set-string env.LIGHTRAG_KV_STORAGE=JsonKVStorage \
--set-string env.LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage \
--set-string env.LIGHTRAG_GRAPH_STORAGE=NetworkXStorage \
--set-string env.LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage \
--set-string env.LLM_BINDING=openai \
--set-string env.LLM_MODEL=gpt-4o-mini \
--set-string env.LLM_BINDING_HOST=$OPENAI_API_BASE \
--set-string env.LLM_BINDING_API_KEY=$OPENAI_API_KEY \
--set-string env.EMBEDDING_BINDING=openai \
--set-string env.EMBEDDING_MODEL=text-embedding-ada-002 \
--set-string env.EMBEDDING_DIM=1536 \
--set-string env.EMBEDDING_BINDING_API_KEY=$OPENAI_API_KEY
```
### 访问应用程序:
```bash
# 1. 在终端中运行此端口转发命令:
kubectl --namespace rag port-forward svc/lightrag-dev 9621:9621
# 2. 当命令运行时,打开浏览器并导航到:
# http://localhost:9621
```
## 生产环境部署(使用外部数据库)
### 1. 安装数据库
> 如果您已经准备好了数据库,可以跳过此步骤。详细信息可以在:[README.md](databases%2FREADME.md)中找到。
我们推荐使用KubeBlocks进行数据库部署。KubeBlocks是一个云原生数据库操作符可以轻松地在Kubernetes上以生产规模运行任何数据库。
首先安装KubeBlocks和KubeBlocks-Addons如已安装可跳过
```bash
bash ./databases/01-prepare.sh
```
然后安装所需的数据库。默认情况下这将安装PostgreSQL和Neo4J但您可以修改[00-config.sh](databases%2F00-config.sh)以根据需要选择不同的数据库:
```bash
bash ./databases/02-install-database.sh
```
验证集群是否正在运行:
```bash
kubectl get clusters -n rag
# 预期输出:
# NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
# neo4j-cluster Delete Running 39s
# pg-cluster postgresql Delete Running 42s
kubectl get po -n rag
# 预期输出:
# NAME READY STATUS RESTARTS AGE
# neo4j-cluster-neo4j-0 1/1 Running 0 58s
# pg-cluster-postgresql-0 4/4 Running 0 59s
# pg-cluster-postgresql-1 4/4 Running 0 59s
```
### 2. 安装LightRAG
LightRAG及其数据库部署在同一Kubernetes集群中使配置变得简单。
安装脚本会自动从KubeBlocks获取所有数据库连接信息无需手动设置数据库凭证
```bash
export OPENAI_API_BASE=<您的OPENAI_API_BASE>
export OPENAI_API_KEY=<您的OPENAI_API_KEY>
bash ./install_lightrag.sh
```
### 访问应用程序:
```bash
# 1. 在终端中运行此端口转发命令:
kubectl --namespace rag port-forward svc/lightrag 9621:9621
# 2. 当命令运行时,打开浏览器并导航到:
# http://localhost:9621
```
## 配置
### 修改资源配置
您可以通过修改`values.yaml`文件来配置LightRAG的资源使用
```yaml
replicaCount: 1 # 副本数量,可根据需要增加
resources:
limits:
cpu: 1000m # CPU限制可根据需要调整
memory: 2Gi # 内存限制,可根据需要调整
requests:
cpu: 500m # CPU请求可根据需要调整
memory: 1Gi # 内存请求,可根据需要调整
```
### 修改持久存储
```yaml
persistence:
enabled: true
ragStorage:
size: 10Gi # RAG存储大小可根据需要调整
inputs:
size: 5Gi # 输入数据存储大小,可根据需要调整
```
### 配置环境变量
`values.yaml`文件中的`env`部分包含LightRAG的所有环境配置类似于`.env`文件。当使用helm upgrade或helm install命令时可以使用--set标志覆盖这些变量。
```yaml
env:
HOST: 0.0.0.0
PORT: 9621
WEBUI_TITLE: Graph RAG Engine
WEBUI_DESCRIPTION: Simple and Fast Graph Based RAG System
# LLM配置
LLM_BINDING: openai # LLM服务提供商
LLM_MODEL: gpt-4o-mini # LLM模型
LLM_BINDING_HOST: # API基础URL可选
LLM_BINDING_API_KEY: # API密钥
# 嵌入配置
EMBEDDING_BINDING: openai # 嵌入服务提供商
EMBEDDING_MODEL: text-embedding-ada-002 # 嵌入模型
EMBEDDING_DIM: 1536 # 嵌入维度
EMBEDDING_BINDING_API_KEY: # API密钥
# 存储配置
LIGHTRAG_KV_STORAGE: PGKVStorage # 键值存储类型
LIGHTRAG_VECTOR_STORAGE: PGVectorStorage # 向量存储类型
LIGHTRAG_GRAPH_STORAGE: Neo4JStorage # 图存储类型
LIGHTRAG_DOC_STATUS_STORAGE: PGDocStatusStorage # 文档状态存储类型
```
## 注意事项
- 在部署前确保设置了所有必要的环境变量API密钥和数据库密码
- 出于安全原因建议使用环境变量传递敏感信息而不是直接写入脚本或values文件
- 轻量级部署适合测试和小规模使用,但数据持久性和性能可能有限
- 生产环境部署PostgreSQL + Neo4J推荐用于生产环境和大规模使用
- 有关更多自定义配置请参考LightRAG官方文档

k8s-deploy/README.md (new file)

@@ -0,0 +1,191 @@
# LightRAG Helm Chart
This is the Helm chart for LightRAG, used to deploy LightRAG services on a Kubernetes cluster.
There are two recommended deployment methods for LightRAG:
1. **Lightweight Deployment**: Using built-in lightweight storage, suitable for testing and small-scale usage
2. **Production Deployment**: Using external databases (such as PostgreSQL and Neo4J), suitable for production environments and large-scale usage
> If you'd like a video walkthrough of the deployment process, feel free to check out this optional [video tutorial](https://youtu.be/JW1z7fzeKTw?si=vPzukqqwmdzq9Q4q) on YouTube. It might help clarify some steps for those who prefer visual guidance.
## Prerequisites
Make sure the following tools are installed and configured:
* **Kubernetes cluster**
* A running Kubernetes cluster is required.
* For local development or demos you can use [Minikube](https://minikube.sigs.k8s.io/docs/start/) (needs ≥ 2 CPUs, ≥ 4 GB RAM, and Docker/VM-driver support).
* Any standard cloud or on-premises Kubernetes cluster (EKS, GKE, AKS, etc.) also works.
* **kubectl**
* The Kubernetes command-line tool for managing your cluster.
* Follow the official guide: [Install and Set Up kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl).
* **Helm** (v3.x+)
* Kubernetes package manager used to install LightRAG.
* Install it via the official instructions: [Installing Helm](https://helm.sh/docs/intro/install/).
## Lightweight Deployment (No External Databases Required)
This deployment option uses built-in lightweight storage components that are perfect for testing, demos, or small-scale usage scenarios. No external database configuration is required.
You can deploy LightRAG using either the provided convenience script or direct Helm commands. Both methods configure the same environment variables defined in the `lightrag/values.yaml` file.
### Using the convenience script (recommended):
```bash
export OPENAI_API_BASE=<YOUR_OPENAI_API_BASE>
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
bash ./install_lightrag_dev.sh
```
### Or using Helm directly:
```bash
# You can override any env param you want
helm upgrade --install lightrag ./lightrag \
--namespace rag \
--set-string env.LIGHTRAG_KV_STORAGE=JsonKVStorage \
--set-string env.LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage \
--set-string env.LIGHTRAG_GRAPH_STORAGE=NetworkXStorage \
--set-string env.LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage \
--set-string env.LLM_BINDING=openai \
--set-string env.LLM_MODEL=gpt-4o-mini \
--set-string env.LLM_BINDING_HOST=$OPENAI_API_BASE \
--set-string env.LLM_BINDING_API_KEY=$OPENAI_API_KEY \
--set-string env.EMBEDDING_BINDING=openai \
--set-string env.EMBEDDING_MODEL=text-embedding-ada-002 \
--set-string env.EMBEDDING_DIM=1536 \
--set-string env.EMBEDDING_BINDING_API_KEY=$OPENAI_API_KEY
```
### Accessing the application:
```bash
# 1. Run this port-forward command in your terminal:
kubectl --namespace rag port-forward svc/lightrag-dev 9621:9621
# 2. While the command is running, open your browser and navigate to:
# http://localhost:9621
```
## Production Deployment (Using External Databases)
### 1. Install Databases
> You can skip this step if you've already prepared databases. Detailed information can be found in: [README.md](databases%2FREADME.md).
We recommend KubeBlocks for database deployment. KubeBlocks is a cloud-native database operator that makes it easy to run any database on Kubernetes at production scale.
First, install KubeBlocks and KubeBlocks-Addons (skip if already installed):
```bash
bash ./databases/01-prepare.sh
```
Then install the required databases. By default, this will install PostgreSQL and Neo4J, but you can modify [00-config.sh](databases%2F00-config.sh) to select different databases based on your needs:
```bash
bash ./databases/02-install-database.sh
```
Verify that the clusters are up and running:
```bash
kubectl get clusters -n rag
# Expected output:
# NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
# neo4j-cluster Delete Running 39s
# pg-cluster postgresql Delete Running 42s
kubectl get po -n rag
# Expected output:
# NAME READY STATUS RESTARTS AGE
# neo4j-cluster-neo4j-0 1/1 Running 0 58s
# pg-cluster-postgresql-0 4/4 Running 0 59s
# pg-cluster-postgresql-1 4/4 Running 0 59s
```
### 2. Install LightRAG
LightRAG and its databases are deployed within the same Kubernetes cluster, making configuration straightforward.
The installation script automatically retrieves all database connection information from KubeBlocks, eliminating the need to manually set database credentials:
```bash
export OPENAI_API_BASE=<YOUR_OPENAI_API_BASE>
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
bash ./install_lightrag.sh
```
### Accessing the application:
```bash
# 1. Run this port-forward command in your terminal:
kubectl --namespace rag port-forward svc/lightrag 9621:9621
# 2. While the command is running, open your browser and navigate to:
# http://localhost:9621
```
## Configuration
### Modifying Resource Configuration
You can configure LightRAG's resource usage by modifying the `values.yaml` file:
```yaml
replicaCount: 1 # Number of replicas, can be increased as needed
resources:
limits:
cpu: 1000m # CPU limit, can be adjusted as needed
memory: 2Gi # Memory limit, can be adjusted as needed
requests:
cpu: 500m # CPU request, can be adjusted as needed
memory: 1Gi # Memory request, can be adjusted as needed
```
### Modifying Persistent Storage
```yaml
persistence:
enabled: true
ragStorage:
size: 10Gi # RAG storage size, can be adjusted as needed
inputs:
size: 5Gi # Input data storage size, can be adjusted as needed
```
### Configuring Environment Variables
The `env` section in the `values.yaml` file contains all environment configurations for LightRAG, similar to a `.env` file. When running `helm upgrade` or `helm install`, you can override these with the `--set` flag.
```yaml
env:
HOST: 0.0.0.0
PORT: 9621
WEBUI_TITLE: Graph RAG Engine
WEBUI_DESCRIPTION: Simple and Fast Graph Based RAG System
# LLM Configuration
LLM_BINDING: openai # LLM service provider
LLM_MODEL: gpt-4o-mini # LLM model
LLM_BINDING_HOST: # API base URL (optional)
LLM_BINDING_API_KEY: # API key
# Embedding Configuration
EMBEDDING_BINDING: openai # Embedding service provider
EMBEDDING_MODEL: text-embedding-ada-002 # Embedding model
EMBEDDING_DIM: 1536 # Embedding dimension
EMBEDDING_BINDING_API_KEY: # API key
# Storage Configuration
LIGHTRAG_KV_STORAGE: PGKVStorage # Key-value storage type
LIGHTRAG_VECTOR_STORAGE: PGVectorStorage # Vector storage type
LIGHTRAG_GRAPH_STORAGE: Neo4JStorage # Graph storage type
LIGHTRAG_DOC_STATUS_STORAGE: PGDocStatusStorage # Document status storage type
```
## Notes
- Ensure all necessary environment variables (API keys and database passwords) are set before deployment
- For security reasons, it's recommended to pass sensitive information using environment variables rather than writing them directly in scripts or values files
- Lightweight deployment is suitable for testing and small-scale usage, but data persistence and performance may be limited
- Production deployment (PostgreSQL + Neo4J) is recommended for production environments and large-scale usage
- For more customized configurations, please refer to the official LightRAG documentation


@@ -0,0 +1,21 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
source "$DATABASE_SCRIPT_DIR/scripts/common.sh"
# Namespace configuration
NAMESPACE="rag"
# version
KB_VERSION="1.0.0-beta.48"
ADDON_CLUSTER_CHART_VERSION="1.0.0-alpha.0"
# Helm repository
HELM_REPO="https://apecloud.github.io/helm-charts"
# Set to true to enable the database, false to disable
ENABLE_POSTGRESQL=true
ENABLE_REDIS=false
ENABLE_QDRANT=false
ENABLE_NEO4J=true
ENABLE_ELASTICSEARCH=false
ENABLE_MONGODB=false


@@ -0,0 +1,33 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Load configuration file
source "$DATABASE_SCRIPT_DIR/00-config.sh"
check_dependencies
# Check if KubeBlocks is already installed, install it if it is not.
source "$DATABASE_SCRIPT_DIR/install-kubeblocks.sh"
# Create namespaces
print "Creating namespaces..."
kubectl create namespace $NAMESPACE 2>/dev/null || true
# Install database addons
print "Installing KubeBlocks database addons..."
# Add and update Helm repository
print "Adding and updating KubeBlocks Helm repository..."
helm repo add kubeblocks $HELM_REPO
helm repo update
# Install database addons based on configuration
[ "$ENABLE_POSTGRESQL" = true ] && print "Installing PostgreSQL addon..." && helm upgrade --install kb-addon-postgresql kubeblocks/postgresql --namespace kb-system --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_REDIS" = true ] && print "Installing Redis addon..." && helm upgrade --install kb-addon-redis kubeblocks/redis --namespace kb-system --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_ELASTICSEARCH" = true ] && print "Installing Elasticsearch addon..." && helm upgrade --install kb-addon-elasticsearch kubeblocks/elasticsearch --namespace kb-system --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_QDRANT" = true ] && print "Installing Qdrant addon..." && helm upgrade --install kb-addon-qdrant kubeblocks/qdrant --namespace kb-system --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_MONGODB" = true ] && print "Installing MongoDB addon..." && helm upgrade --install kb-addon-mongodb kubeblocks/mongodb --namespace kb-system --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_NEO4J" = true ] && print "Installing Neo4j addon..." && helm upgrade --install kb-addon-neo4j kubeblocks/neo4j --namespace kb-system --version $ADDON_CLUSTER_CHART_VERSION
print_success "KubeBlocks database addons installation completed!"
print "Now you can run 02-install-database.sh to install database clusters"


@@ -0,0 +1,62 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Load configuration file
source "$DATABASE_SCRIPT_DIR/00-config.sh"
print "Installing database clusters..."
# Install database clusters based on configuration
[ "$ENABLE_POSTGRESQL" = true ] && print "Installing PostgreSQL cluster..." && helm upgrade --install pg-cluster kubeblocks/postgresql-cluster -f "$DATABASE_SCRIPT_DIR/postgresql/values.yaml" --namespace $NAMESPACE --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_REDIS" = true ] && print "Installing Redis cluster..." && helm upgrade --install redis-cluster kubeblocks/redis-cluster -f "$DATABASE_SCRIPT_DIR/redis/values.yaml" --namespace $NAMESPACE --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_ELASTICSEARCH" = true ] && print "Installing Elasticsearch cluster..." && helm upgrade --install es-cluster kubeblocks/elasticsearch-cluster -f "$DATABASE_SCRIPT_DIR/elasticsearch/values.yaml" --namespace $NAMESPACE --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_QDRANT" = true ] && print "Installing Qdrant cluster..." && helm upgrade --install qdrant-cluster kubeblocks/qdrant-cluster -f "$DATABASE_SCRIPT_DIR/qdrant/values.yaml" --namespace $NAMESPACE --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_MONGODB" = true ] && print "Installing MongoDB cluster..." && helm upgrade --install mongodb-cluster kubeblocks/mongodb-cluster -f "$DATABASE_SCRIPT_DIR/mongodb/values.yaml" --namespace $NAMESPACE --version $ADDON_CLUSTER_CHART_VERSION
[ "$ENABLE_NEO4J" = true ] && print "Installing Neo4j cluster..." && helm upgrade --install neo4j-cluster kubeblocks/neo4j-cluster -f "$DATABASE_SCRIPT_DIR/neo4j/values.yaml" --namespace $NAMESPACE --version $ADDON_CLUSTER_CHART_VERSION
# Wait for databases to be ready
print "Waiting for databases to be ready..."
TIMEOUT=600 # Set timeout to 10 minutes
START_TIME=$(date +%s)
while true; do
CURRENT_TIME=$(date +%s)
ELAPSED=$((CURRENT_TIME - START_TIME))
if [ $ELAPSED -gt $TIMEOUT ]; then
print_error "Timeout waiting for databases to be ready. Please check database status manually and try again"
exit 1
fi
# Build wait conditions for enabled databases
WAIT_CONDITIONS=()
[ "$ENABLE_POSTGRESQL" = true ] && WAIT_CONDITIONS+=("kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=pg-cluster -n $NAMESPACE --timeout=10s")
[ "$ENABLE_REDIS" = true ] && WAIT_CONDITIONS+=("kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=redis-cluster -n $NAMESPACE --timeout=10s")
[ "$ENABLE_ELASTICSEARCH" = true ] && WAIT_CONDITIONS+=("kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=es-cluster -n $NAMESPACE --timeout=10s")
[ "$ENABLE_QDRANT" = true ] && WAIT_CONDITIONS+=("kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=qdrant-cluster -n $NAMESPACE --timeout=10s")
[ "$ENABLE_MONGODB" = true ] && WAIT_CONDITIONS+=("kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=mongodb-cluster -n $NAMESPACE --timeout=10s")
[ "$ENABLE_NEO4J" = true ] && WAIT_CONDITIONS+=("kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=neo4j-cluster -n $NAMESPACE --timeout=10s")
# Check if all enabled databases are ready
ALL_READY=true
for CONDITION in "${WAIT_CONDITIONS[@]}"; do
if ! eval "$CONDITION &> /dev/null"; then
ALL_READY=false
break
fi
done
if [ "$ALL_READY" = true ]; then
print "All database pods are ready, continuing with deployment..."
break
fi
print "Waiting for database pods to be ready (${ELAPSED}s elapsed)..."
sleep 10
done
print_success "Database clusters installation completed!"
print "Use the following command to check the status of installed clusters:"
print "kubectl get clusters -n $NAMESPACE"


@@ -0,0 +1,20 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Load configuration file
source "$DATABASE_SCRIPT_DIR/00-config.sh"
print "Uninstalling database clusters..."
# Uninstall database clusters based on configuration
[ "$ENABLE_POSTGRESQL" = true ] && print "Uninstalling PostgreSQL cluster..." && helm uninstall pg-cluster --namespace $NAMESPACE 2>/dev/null || true
[ "$ENABLE_REDIS" = true ] && print "Uninstalling Redis cluster..." && helm uninstall redis-cluster --namespace $NAMESPACE 2>/dev/null || true
[ "$ENABLE_ELASTICSEARCH" = true ] && print "Uninstalling Elasticsearch cluster..." && helm uninstall es-cluster --namespace $NAMESPACE 2>/dev/null || true
[ "$ENABLE_QDRANT" = true ] && print "Uninstalling Qdrant cluster..." && helm uninstall qdrant-cluster --namespace $NAMESPACE 2>/dev/null || true
[ "$ENABLE_MONGODB" = true ] && print "Uninstalling MongoDB cluster..." && helm uninstall mongodb-cluster --namespace $NAMESPACE 2>/dev/null || true
[ "$ENABLE_NEO4J" = true ] && print "Uninstalling Neo4j cluster..." && helm uninstall neo4j-cluster --namespace $NAMESPACE 2>/dev/null || true
print_success "Database clusters uninstalled"
print "To uninstall database addons and KubeBlocks, run 04-cleanup.sh"


@@ -0,0 +1,26 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Load configuration file
source "$DATABASE_SCRIPT_DIR/00-config.sh"
print "Uninstalling KubeBlocks database addons..."
# Uninstall database addons based on configuration
[ "$ENABLE_POSTGRESQL" = true ] && print "Uninstalling PostgreSQL addon..." && helm uninstall kb-addon-postgresql --namespace kb-system 2>/dev/null || true
[ "$ENABLE_REDIS" = true ] && print "Uninstalling Redis addon..." && helm uninstall kb-addon-redis --namespace kb-system 2>/dev/null || true
[ "$ENABLE_ELASTICSEARCH" = true ] && print "Uninstalling Elasticsearch addon..." && helm uninstall kb-addon-elasticsearch --namespace kb-system 2>/dev/null || true
[ "$ENABLE_QDRANT" = true ] && print "Uninstalling Qdrant addon..." && helm uninstall kb-addon-qdrant --namespace kb-system 2>/dev/null || true
[ "$ENABLE_MONGODB" = true ] && print "Uninstalling MongoDB addon..." && helm uninstall kb-addon-mongodb --namespace kb-system 2>/dev/null || true
[ "$ENABLE_NEO4J" = true ] && print "Uninstalling Neo4j addon..." && helm uninstall kb-addon-neo4j --namespace kb-system 2>/dev/null || true
print_success "Database addons uninstallation completed!"
source "$DATABASE_SCRIPT_DIR/uninstall-kubeblocks.sh"
kubectl delete namespace $NAMESPACE
kubectl delete namespace kb-system
print_success "KubeBlocks uninstallation completed!"

View file

@ -0,0 +1,170 @@
# Using KubeBlocks to Deploy and Manage Databases
Learn how to quickly deploy and manage various databases in a Kubernetes (K8s) environment through KubeBlocks.
## Introduction to KubeBlocks
KubeBlocks is a production-ready, open-source toolkit that runs any database (SQL, NoSQL, vector, or document) on Kubernetes.
It scales smoothly from quick dev tests to full production clusters, making it a solid choice for RAG workloads like LightRAG that need several data stores working together.
## Prerequisites
Make sure the following tools are installed and configured:
* **Kubernetes cluster**
* A running Kubernetes cluster is required.
* For local development or demos you can use [Minikube](https://minikube.sigs.k8s.io/docs/start/) (needs ≥ 2 CPUs, ≥ 4 GB RAM, and Docker/VM-driver support).
* Any standard cloud or on-premises Kubernetes cluster (EKS, GKE, AKS, etc.) also works.
* **kubectl**
* The Kubernetes command-line interface.
* Follow the official guide: [Install and Set Up kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl).
* **Helm** (v3.x+)
* Kubernetes package manager used by the scripts below.
* Install it via the official instructions: [Installing Helm](https://helm.sh/docs/intro/install/).
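Before running the scripts, you can sanity-check these prerequisites yourself (a minimal sketch; the scripts below perform the same checks through their `check_dependencies` helper):

```shell
# Verify the required tools and cluster access; mirrors check_dependencies in 00-config.sh
missing=0
for tool in kubectl helm; do
  command -v "$tool" >/dev/null 2>&1 || { echo "$tool: MISSING"; missing=1; }
done
kubectl cluster-info >/dev/null 2>&1 || { echo "cluster: not reachable"; missing=1; }
[ "$missing" -eq 0 ] && echo "all prerequisites satisfied"
```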
## Installing
1. **Configure the databases you want**
Edit the `00-config.sh` file and, based on your requirements, set the variable to `true` for each database you want to install.
For example, to install PostgreSQL and Neo4j:
```bash
ENABLE_POSTGRESQL=true
ENABLE_REDIS=false
ENABLE_ELASTICSEARCH=false
ENABLE_QDRANT=false
ENABLE_MONGODB=false
ENABLE_NEO4J=true
```
2. **Prepare the environment and install KubeBlocks add-ons**
```bash
bash ./01-prepare.sh
```
*What the script does*
`01-prepare.sh` performs basic pre-checks (Helm, kubectl, cluster reachability), adds the KubeBlocks Helm repo, and installs any core CRDs or controllers that KubeBlocks itself needs. It also installs the addons for every database you enabled in `00-config.sh`, but **does not** create the actual database clusters yet.
3. **(Optional) Modify database settings**
Before deployment you can edit the `values.yaml` file inside each `<db>/` directory to change `version`, `replicas`, `CPU`, `memory`, `storage size`, etc.
4. **Install the database clusters**
```bash
bash ./02-install-database.sh
```
*What the script does*
`02-install-database.sh` **actually deploys the chosen databases to Kubernetes**.
When the script completes, confirm that the clusters are up. It may take a few minutes for all the clusters to become ready,
especially on the first run, since Kubernetes needs to pull the container images from their registries.
You can monitor the progress using the following commands:
```bash
kubectl get clusters -n rag
NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
es-cluster Delete Running 11m
mongodb-cluster mongodb Delete Running 11m
pg-cluster postgresql Delete Running 11m
qdrant-cluster qdrant Delete Running 11m
redis-cluster redis Delete Running 11m
```
You can also see all the database `Pods` created by KubeBlocks.
Initially, you might see pods in `ContainerCreating` or `Pending` status; this is normal while images are being pulled and containers are starting up.
Wait until all pods show `Running` status:
```bash
kubectl get po -n rag
NAME READY STATUS RESTARTS AGE
es-cluster-mdit-0 2/2 Running 0 11m
mongodb-cluster-mongodb-0 2/2 Running 0 11m
pg-cluster-postgresql-0 4/4 Running 0 11m
pg-cluster-postgresql-1 4/4 Running 0 11m
qdrant-cluster-qdrant-0 2/2 Running 0 11m
redis-cluster-redis-0 2/2 Running 0 11m
```
You can also check the detailed status of a specific pod if it's taking longer than expected:
```bash
kubectl describe pod <pod-name> -n rag
```
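Instead of polling, you can block until every pod in the namespace reports Ready (a convenience sketch; adjust the namespace and timeout to your setup):

```shell
NS=rag
# Wait up to 10 minutes for all database pods in the namespace to become Ready
kubectl wait --for=condition=Ready pods --all -n "$NS" --timeout=600s
```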
## Connect to Databases
To connect to your databases, follow these steps to identify available accounts, retrieve credentials, and establish connections:
### 1. List Available Database Clusters
First, view the database clusters running in your namespace:
```bash
kubectl get cluster -n rag
```
### 2. Retrieve Authentication Credentials
For PostgreSQL, retrieve the username and password from Kubernetes secrets:
```bash
# Get PostgreSQL username
kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.username}' | base64 -d
# Get PostgreSQL password
kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.password}' | base64 -d
```
If you have trouble finding the correct secret name, list all secrets:
```bash
kubectl get secrets -n rag
```
### 3. Port Forward to Local Machine
Use port forwarding to access PostgreSQL from your local machine:
```bash
# Forward PostgreSQL port (5432) to your local machine
# You can see all services with: kubectl get svc -n rag
kubectl port-forward -n rag svc/pg-cluster-postgresql-postgresql 5432:5432
```
### 4. Connect Using Database Client
Now you can connect using your preferred PostgreSQL client with the retrieved credentials:
```bash
# Example: connecting with psql
export PGUSER=$(kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.username}' | base64 -d)
export PGPASSWORD=$(kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.password}' | base64 -d)
psql -h localhost -p 5432 -U $PGUSER
```
Keep the port-forwarding terminal running while you are connected to the database.
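The same pattern works for the other databases. For example, the Neo4j credentials live in an analogous secret; the secret and service names below follow KubeBlocks' `<cluster>-<component>` naming and match the names used by `install_lightrag.sh`:

```shell
# Retrieve the Neo4j password and forward the Bolt port to your local machine
NEO4J_PASSWORD=$(kubectl get secrets -n rag neo4j-cluster-neo4j-account-neo4j \
  -o jsonpath='{.data.password}' | base64 -d)
kubectl port-forward -n rag svc/neo4j-cluster-neo4j 7687:7687 &
# Then connect, e.g. with cypher-shell:
# cypher-shell -a neo4j://localhost:7687 -u neo4j -p "$NEO4J_PASSWORD"
```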
## Uninstalling
1. **Remove the database clusters**
```bash
bash ./03-uninstall-database.sh
```
The script deletes the database clusters that were enabled in `00-config.sh`.
2. **Clean up KubeBlocks add-ons**
```bash
bash ./04-cleanup.sh
```
This removes the addons installed by `01-prepare.sh`.
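After both scripts have finished, you can confirm that nothing is left behind (a quick check; both lookups should fail once cleanup is complete):

```shell
NAMESPACE=rag
# Both commands should report that the resources are gone
kubectl get clusters -n "$NAMESPACE" 2>/dev/null || echo "no database clusters remain in $NAMESPACE"
kubectl get namespace kb-system 2>/dev/null || echo "kb-system namespace removed"
```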
## Reference
* [KubeBlocks Documentation](https://kubeblocks.io/docs/preview/user_docs/overview/introduction)

View file

@ -0,0 +1,36 @@
## description: The version of ElasticSearch.
## default: 8.8.2
version: "8.8.2"
## description: Mode for ElasticSearch
## default: multi-node
## one of: [single-node, multi-node]
mode: single-node
## description: The number of replicas. For single-node mode replicas is 1; for multi-node mode the default is 3.
## default: 1
## minimum: 1
## maximum: 5
replicas: 1
## description: CPU cores.
## default: 1
## minimum: 0.5
## maximum: 64
cpu: 1
## description: Memory, the unit is Gi.
## default: 2
## minimum: 1
## maximum: 1000
memory: 2
## description: Storage size, the unit is Gi.
## default: 20
## minimum: 1
## maximum: 10000
storage: 5
extra:
terminationPolicy: Delete
disableExporter: true

View file

@ -0,0 +1,52 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Load configuration file
source "$DATABASE_SCRIPT_DIR/00-config.sh"
# Check dependencies
check_dependencies
# Function for installing KubeBlocks
install_kubeblocks() {
print "Ready to install KubeBlocks."
# Install CSI Snapshotter CRDs
kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.2.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.2.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.2.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
# Add and update Piraeus repository
helm repo add piraeus-charts https://piraeus.io/helm-charts/
helm repo update
# Install snapshot controller
helm install snapshot-controller piraeus-charts/snapshot-controller -n kb-system --create-namespace
kubectl wait --for=condition=ready pods -l app.kubernetes.io/name=snapshot-controller -n kb-system --timeout=60s
print_success "snapshot-controller installation complete!"
# Install KubeBlocks CRDs
kubectl create -f https://github.com/apecloud/kubeblocks/releases/download/v${KB_VERSION}/kubeblocks_crds.yaml
# Add and update KubeBlocks repository
helm repo add kubeblocks $HELM_REPO
helm repo update
# Install KubeBlocks
helm install kubeblocks kubeblocks/kubeblocks --namespace kb-system --create-namespace --version=${KB_VERSION}
# Verify installation
print "Waiting for KubeBlocks to be ready..."
kubectl wait --for=condition=ready pods -l app.kubernetes.io/instance=kubeblocks -n kb-system --timeout=120s
print_success "KubeBlocks installation complete!"
}
# Check if KubeBlocks is already installed
print "Checking if KubeBlocks is already installed in kb-system namespace..."
if kubectl get namespace kb-system &>/dev/null && kubectl get deployment kubeblocks -n kb-system &>/dev/null; then
print_success "KubeBlocks is already installed in kb-system namespace."
else
# Call the function to install KubeBlocks
install_kubeblocks
fi

View file

@ -0,0 +1,34 @@
## description: Cluster version.
## default: 6.0.16
## one of: [8.0.8, 8.0.6, 8.0.4, 7.0.19, 7.0.16, 7.0.12, 6.0.22, 6.0.20, 6.0.16, 5.0.30, 5.0.28, 4.4.29, 4.2.24, 4.0.28]
version: 6.0.16
## description: Cluster topology mode.
## default: standalone
## one of: [standalone, replicaset]
mode: standalone
## description: CPU cores.
## default: 0.5
## minimum: 0.5
## maximum: 64
cpu: 1
## description: Memory, the unit is Gi.
## default: 0.5
## minimum: 0.5
## maximum: 1000
memory: 1
## description: Storage size, the unit is Gi.
## default: 20
## minimum: 1
## maximum: 10000
storage: 20
## description: Whether to use host network.
## default: enabled
## one of: [enabled, disabled]
hostnetwork: "disabled"
extra:
terminationPolicy: Delete

View file

@ -0,0 +1,46 @@
# Version
# description: Cluster version.
# default: 5.26.5
# one of: [5.26.5, 4.4.42]
version: 5.26.5
# Mode
# description: Cluster topology mode.
# default: singlealone
# one of: [singlealone]
mode: singlealone
# CPU
# description: CPU cores.
# default: 2
# minimum: 2
# maximum: 64
cpu: 2
# Memory(Gi)
# description: Memory, the unit is Gi.
# default: 2
# minimum: 2
# maximum: 1000
memory: 4
# Storage(Gi)
# description: Storage size, the unit is Gi.
# default: 20
# minimum: 1
# maximum: 10000
storage: 20
# Replicas
# description: The number of replicas. For standalone mode replicas is 1; for replicaset mode the default is 3.
# default: 1
# minimum: 1
# maximum: 5
replicas: 1
# Storage Class Name
# description: Storage class name of the data volume
storageClassName: ""
extra:
terminationPolicy: Delete

View file

@ -0,0 +1,33 @@
## description: service version.
## default: 15.7.0
version: 16.4.0
## description: PostgreSQL cluster topology mode. default: replication
mode: replication
## description: The number of replicas. For standalone mode replicas is 1; for replication mode the default is 2.
## default: 1
## minimum: 1
## maximum: 5
replicas: 2
## description: CPU cores.
## default: 0.5
## minimum: 0.5
## maximum: 64
cpu: 1
## description: Memory, the unit is Gi.
## default: 0.5
## minimum: 0.5
## maximum: 1000
memory: 1
## description: Storage size, the unit is Gi.
## default: 20
## minimum: 1
## maximum: 10000
storage: 5
## description: Cluster termination policy. One of: [DoNotTerminate, Delete, WipeOut].
terminationPolicy: Delete

View file

@ -0,0 +1,31 @@
## description: The version of Qdrant.
## default: 1.10.0
version: 1.10.0
## description: The number of replicas.
## default: 1
## minimum: 1
## maximum: 16
replicas: 1
## description: CPU cores.
## default: 1
## minimum: 0.5
## maximum: 64
cpu: 1
## description: Memory, the unit is Gi.
## default: 2
## minimum: 0.5
## maximum: 1000
memory: 1
## description: Storage size, the unit is Gi.
## default: 20
## minimum: 1
## maximum: 10000
storage: 20
## customized default values to override kblib chart's values
extra:
terminationPolicy: Delete

View file

@ -0,0 +1,34 @@
## description: Cluster version.
## default: 7.2.7
version: 7.2.7
## description: Cluster topology mode.
## default: replication
## one of: [standalone, replication, cluster, replication-twemproxy]
mode: standalone
## description: The number of replicas. For standalone mode replicas is 1; for replication mode the default is 2.
## default: 1
## minimum: 1
## maximum: 5
replicas: 1
## description: CPU cores.
## default: 0.5
## minimum: 0.5
## maximum: 64
cpu: 0.5
## description: Memory, the unit is Gi.
## default: 0.5
## minimum: 0.5
## maximum: 1000
memory: 1
## description: Storage size, the unit is Gi.
## default: 20
## minimum: 1
## maximum: 10000
storage: 20
extra:
disableExporter: true

View file

@ -0,0 +1,43 @@
#!/bin/bash
print_title() {
echo "============================================"
echo "$1"
echo "============================================"
}
print_success() {
echo "✅ $1"
}
print_error() {
echo "❌ $1"
}
print_warning() {
echo "⚠️ $1"
}
print_info() {
echo "🔹 $1"
}
print() {
echo "$1"
}
# Check dependencies
check_dependencies(){
print "Checking dependencies..."
command -v kubectl >/dev/null 2>&1 || { print "Error: kubectl command not found"; exit 1; }
command -v helm >/dev/null 2>&1 || { print "Error: helm command not found"; exit 1; }
# Check if Kubernetes is available
print "Checking if Kubernetes is available..."
kubectl cluster-info &>/dev/null
if [ $? -ne 0 ]; then
print "Error: Kubernetes cluster is not accessible. Please ensure you have proper access to a Kubernetes cluster."
exit 1
fi
print_success "Kubernetes cluster is accessible."
}

View file

@ -0,0 +1,51 @@
#!/bin/bash
# Get the directory where this script is located
DATABASE_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Load configuration file
source "$DATABASE_SCRIPT_DIR/00-config.sh"
# Check dependencies
print "Checking dependencies..."
command -v kubectl >/dev/null 2>&1 || { print "Error: kubectl command not found"; exit 1; }
command -v helm >/dev/null 2>&1 || { print "Error: helm command not found"; exit 1; }
print "Checking if Kubernetes is available..."
if ! kubectl cluster-info &>/dev/null; then
print "Error: Kubernetes cluster is not accessible. Please ensure you have proper access to a Kubernetes cluster."
exit 1
fi
print "Checking if KubeBlocks is installed in kb-system namespace..."
if ! kubectl get namespace kb-system &>/dev/null; then
print "KubeBlocks is not installed in kb-system namespace."
exit 0
fi
# Function for uninstalling KubeBlocks
uninstall_kubeblocks() {
print "Uninstalling KubeBlocks..."
# Uninstall KubeBlocks Helm chart
helm uninstall kubeblocks -n kb-system
# Uninstall snapshot controller
helm uninstall snapshot-controller -n kb-system
# Delete KubeBlocks CRDs
kubectl delete -f https://github.com/apecloud/kubeblocks/releases/download/v${KB_VERSION}/kubeblocks_crds.yaml --ignore-not-found=true
# Delete CSI Snapshotter CRDs
kubectl delete -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.2.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml --ignore-not-found=true
kubectl delete -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.2.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml --ignore-not-found=true
kubectl delete -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.2.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml --ignore-not-found=true
# Delete the kb-system namespace
print "Waiting for resources to be removed..."
kubectl delete namespace kb-system --timeout=180s
print "KubeBlocks has been successfully uninstalled!"
}
# Call the function to uninstall KubeBlocks
uninstall_kubeblocks

95
k8s-deploy/install_lightrag.sh Executable file
View file

@ -0,0 +1,95 @@
#!/bin/bash
NAMESPACE=rag
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
if [ -z "$OPENAI_API_KEY" ]; then
echo "OPENAI_API_KEY environment variable is not set"
read -s -p "Enter your OpenAI API key: " OPENAI_API_KEY
if [ -z "$OPENAI_API_KEY" ]; then
echo "Error: OPENAI_API_KEY must be provided"
exit 1
fi
export OPENAI_API_KEY=$OPENAI_API_KEY
fi
if [ -z "$OPENAI_API_BASE" ]; then
echo "OPENAI_API_BASE environment variable is not set, will use default value"
read -p "Enter OpenAI API base URL (press Enter to skip if not needed): " OPENAI_API_BASE
export OPENAI_API_BASE=$OPENAI_API_BASE
fi
# Install KubeBlocks (if not already installed)
bash "$SCRIPT_DIR/databases/01-prepare.sh"
# Install database clusters
bash "$SCRIPT_DIR/databases/02-install-database.sh"
# Create vector extension in PostgreSQL if enabled
echo "Waiting for PostgreSQL pods to be ready..."
if kubectl wait --for=condition=ready pods -l kubeblocks.io/role=primary,app.kubernetes.io/instance=pg-cluster -n $NAMESPACE --timeout=300s; then
echo "Creating vector extension in PostgreSQL..."
kubectl exec $(kubectl get pods -l kubeblocks.io/role=primary,app.kubernetes.io/instance=pg-cluster -n $NAMESPACE -o name) -n $NAMESPACE -- psql -c "CREATE EXTENSION IF NOT EXISTS vector;"
echo "Vector extension created successfully."
else
echo "Warning: PostgreSQL pods not ready within timeout. Vector extension not created."
fi
# Get database passwords from Kubernetes secrets
echo "Retrieving database credentials from Kubernetes secrets..."
POSTGRES_PASSWORD=$(kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.password}' | base64 -d)
if [ -z "$POSTGRES_PASSWORD" ]; then
echo "Error: Could not retrieve PostgreSQL password. Make sure PostgreSQL is deployed and the secret exists."
exit 1
fi
export POSTGRES_PASSWORD=$POSTGRES_PASSWORD
NEO4J_PASSWORD=$(kubectl get secrets -n rag neo4j-cluster-neo4j-account-neo4j -o jsonpath='{.data.password}' | base64 -d)
if [ -z "$NEO4J_PASSWORD" ]; then
echo "Error: Could not retrieve Neo4J password. Make sure Neo4J is deployed and the secret exists."
exit 1
fi
export NEO4J_PASSWORD=$NEO4J_PASSWORD
#REDIS_PASSWORD=$(kubectl get secrets -n rag redis-cluster-redis-account-default -o jsonpath='{.data.password}' | base64 -d)
#if [ -z "$REDIS_PASSWORD" ]; then
# echo "Error: Could not retrieve Redis password. Make sure Redis is deployed and the secret exists."
# exit 1
#fi
#export REDIS_PASSWORD=$REDIS_PASSWORD
echo "Deploying production LightRAG (using external databases)..."
if ! kubectl get namespace rag &> /dev/null; then
echo "creating namespace 'rag'..."
kubectl create namespace rag
fi
helm upgrade --install lightrag $SCRIPT_DIR/lightrag \
--namespace $NAMESPACE \
--set-string env.POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
--set-string env.NEO4J_PASSWORD=$NEO4J_PASSWORD \
--set-string env.LLM_BINDING=openai \
--set-string env.LLM_MODEL=gpt-4o-mini \
--set-string env.LLM_BINDING_HOST=$OPENAI_API_BASE \
--set-string env.LLM_BINDING_API_KEY=$OPENAI_API_KEY \
--set-string env.EMBEDDING_BINDING=openai \
--set-string env.EMBEDDING_MODEL=text-embedding-ada-002 \
--set-string env.EMBEDDING_DIM=1536 \
--set-string env.EMBEDDING_BINDING_API_KEY=$OPENAI_API_KEY
# --set-string env.REDIS_URI="redis://default:${REDIS_PASSWORD}@redis-cluster-redis-redis:6379"
# Wait for LightRAG pod to be ready
echo ""
echo "Waiting for lightrag pod to be ready..."
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=lightrag --timeout=300s -n rag
echo "lightrag pod is ready"
echo ""
echo "Running Port-Forward:"
echo " kubectl --namespace rag port-forward svc/lightrag 9621:9621"
echo "==========================================="
echo ""
echo "✅ You can visit LightRAG at: http://localhost:9621"
echo ""
kubectl --namespace rag port-forward svc/lightrag 9621:9621

View file

@ -0,0 +1,81 @@
#!/bin/bash
NAMESPACE=rag
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
check_dependencies(){
echo "Checking dependencies..."
command -v kubectl >/dev/null 2>&1 || { echo "Error: kubectl command not found"; exit 1; }
command -v helm >/dev/null 2>&1 || { echo "Error: helm command not found"; exit 1; }
# Check if Kubernetes is available
echo "Checking if Kubernetes is available..."
kubectl cluster-info &>/dev/null
if [ $? -ne 0 ]; then
echo "Error: Kubernetes cluster is not accessible. Please ensure you have proper access to a Kubernetes cluster."
exit 1
fi
echo "Kubernetes cluster is accessible."
}
check_dependencies
if [ -z "$OPENAI_API_KEY" ]; then
echo "OPENAI_API_KEY environment variable is not set"
read -s -p "Enter your OpenAI API key: " OPENAI_API_KEY
if [ -z "$OPENAI_API_KEY" ]; then
echo "Error: OPENAI_API_KEY must be provided"
exit 1
fi
export OPENAI_API_KEY=$OPENAI_API_KEY
fi
if [ -z "$OPENAI_API_BASE" ]; then
echo "OPENAI_API_BASE environment variable is not set, will use default value"
read -p "Enter OpenAI API base URL (press Enter to skip if not needed): " OPENAI_API_BASE
export OPENAI_API_BASE=$OPENAI_API_BASE
fi
required_env_vars=("OPENAI_API_BASE" "OPENAI_API_KEY")
for var in "${required_env_vars[@]}"; do
if [ -z "${!var}" ]; then
echo "Error: $var environment variable is not set"
exit 1
fi
done
if ! kubectl get namespace rag &> /dev/null; then
echo "creating namespace 'rag'..."
kubectl create namespace rag
fi
helm upgrade --install lightrag-dev $SCRIPT_DIR/lightrag \
--namespace rag \
--set-string env.LIGHTRAG_KV_STORAGE=JsonKVStorage \
--set-string env.LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage \
--set-string env.LIGHTRAG_GRAPH_STORAGE=NetworkXStorage \
--set-string env.LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage \
--set-string env.LLM_BINDING=openai \
--set-string env.LLM_MODEL=gpt-4o-mini \
--set-string env.LLM_BINDING_HOST=$OPENAI_API_BASE \
--set-string env.LLM_BINDING_API_KEY=$OPENAI_API_KEY \
--set-string env.EMBEDDING_BINDING=openai \
--set-string env.EMBEDDING_MODEL=text-embedding-ada-002 \
--set-string env.EMBEDDING_DIM=1536 \
--set-string env.EMBEDDING_BINDING_API_KEY=$OPENAI_API_KEY
# Wait for LightRAG pod to be ready
echo ""
echo "Waiting for lightrag-dev pod to be ready..."
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=lightrag-dev --timeout=300s -n rag
echo "lightrag-dev pod is ready"
echo ""
echo "Running Port-Forward:"
echo " kubectl --namespace rag port-forward svc/lightrag-dev 9621:9621"
echo "==========================================="
echo ""
echo "✅ You can visit LightRAG at: http://localhost:9621"
echo ""
kubectl --namespace rag port-forward svc/lightrag-dev 9621:9621

View file

@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View file

@ -0,0 +1,10 @@
apiVersion: v2
name: lightrag
description: A Helm chart for LightRAG, an efficient and lightweight RAG system
type: application
version: 0.1.0
appVersion: "1.0.0"
maintainers:
- name: LightRAG Team
- name: earayu
email: earayu@gmail.com

View file

@ -0,0 +1,38 @@
===========================================
LightRAG has been successfully deployed!
===========================================
View application logs:
kubectl logs -f --namespace {{ .Release.Namespace }} deploy/{{ include "lightrag.fullname" . }}
===========================================
Access the application:
{{- if contains "NodePort" .Values.service.type }}
Run these commands to get access information:
-----------------------------------------
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "lightrag.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo "LightRAG is accessible at: http://$NODE_IP:$NODE_PORT"
-----------------------------------------
{{- else if contains "LoadBalancer" .Values.service.type }}
Run these commands to get access information (external IP may take a minute to assign):
-----------------------------------------
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "lightrag.fullname" . }} --template "{{ "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}" }}")
echo "LightRAG is accessible at: http://$SERVICE_IP:{{ .Values.service.port }}"
-----------------------------------------
If SERVICE_IP is empty, retry the command or check service status with:
kubectl get svc --namespace {{ .Release.Namespace }} {{ include "lightrag.fullname" . }}
{{- else if contains "ClusterIP" .Values.service.type }}
For development environments, to access LightRAG from your local machine:
1. Run this port-forward command in your terminal:
kubectl --namespace {{ .Release.Namespace }} port-forward svc/{{ include "lightrag.fullname" . }} {{ .Values.service.port }}:{{ .Values.env.PORT }}
2. While the command is running, open your browser and navigate to:
http://localhost:{{ .Values.service.port }}
Note: To stop port-forwarding, press Ctrl+C in the terminal.
{{- end }}
===========================================

View file

@ -0,0 +1,42 @@
{{/*
Application name
*/}}
{{- define "lightrag.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Full application name
*/}}
{{- define "lightrag.fullname" -}}
{{- default .Release.Name .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "lightrag.labels" -}}
app.kubernetes.io/name: {{ include "lightrag.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "lightrag.selectorLabels" -}}
app.kubernetes.io/name: {{ include "lightrag.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
.env file content
*/}}
{{- define "lightrag.envContent" -}}
{{- $first := true -}}
{{- range $key, $val := .Values.env -}}
{{- if not $first -}}{{- "\n" -}}{{- end -}}
{{- $first = false -}}
{{ $key }}={{ $val }}
{{- end -}}
{{- end -}}

View file

@ -0,0 +1,62 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "lightrag.fullname" . }}
labels:
{{- include "lightrag.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "lightrag.selectorLabels" . | nindent 6 }}
template:
metadata:
annotations:
checksum/config: {{ include "lightrag.envContent" . | sha256sum }}
labels:
{{- include "lightrag.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: {{ .Values.env.PORT }}
protocol: TCP
readinessProbe:
httpGet:
path: /health
port: http
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 2
successThreshold: 1
failureThreshold: 3
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumeMounts:
- name: rag-storage
mountPath: /app/data/rag_storage
- name: inputs
mountPath: /app/data/inputs
- name: env-file
mountPath: /app/.env
subPath: .env
volumes:
- name: env-file
secret:
secretName: {{ include "lightrag.fullname" . }}-env
{{- if .Values.persistence.enabled }}
- name: rag-storage
persistentVolumeClaim:
claimName: {{ include "lightrag.fullname" . }}-rag-storage
- name: inputs
persistentVolumeClaim:
claimName: {{ include "lightrag.fullname" . }}-inputs
{{- else }}
- name: rag-storage
emptyDir: {}
- name: inputs
emptyDir: {}
{{- end }}

View file

@ -0,0 +1,28 @@
{{- if .Values.persistence.enabled }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: {{ include "lightrag.fullname" . }}-rag-storage
labels:
{{- include "lightrag.labels" . | nindent 4 }}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.persistence.ragStorage.size }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: {{ include "lightrag.fullname" . }}-inputs
labels:
{{- include "lightrag.labels" . | nindent 4 }}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.persistence.inputs.size }}
{{- end }}

View file

@ -0,0 +1,10 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ include "lightrag.fullname" . }}-env
labels:
{{- include "lightrag.labels" . | nindent 4 }}
type: Opaque
stringData:
.env: |-
{{- include "lightrag.envContent" . | nindent 4 }}

View file

@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: {{ include "lightrag.fullname" . }}
labels:
{{- include "lightrag.labels" . | nindent 4 }}
spec:
type: {{ .Values.service.type }}
ports:
- port: {{ .Values.service.port }}
targetPort: {{ .Values.env.PORT }}
protocol: TCP
name: http
selector:
{{- include "lightrag.selectorLabels" . | nindent 4 }}

View file

@ -0,0 +1,58 @@
replicaCount: 1
image:
repository: ghcr.io/hkuds/lightrag
tag: latest
service:
type: ClusterIP
port: 9621
resources:
limits:
cpu: 1000m
memory: 2Gi
requests:
cpu: 500m
memory: 1Gi
persistence:
enabled: true
ragStorage:
size: 10Gi
inputs:
size: 5Gi
env:
HOST: 0.0.0.0
PORT: 9621
WEBUI_TITLE: Graph RAG Engine
WEBUI_DESCRIPTION: Simple and Fast Graph Based RAG System
LLM_BINDING: openai
LLM_MODEL: gpt-4o-mini
LLM_BINDING_HOST:
LLM_BINDING_API_KEY:
EMBEDDING_BINDING: openai
EMBEDDING_MODEL: text-embedding-ada-002
EMBEDDING_DIM: 1536
EMBEDDING_BINDING_API_KEY:
LIGHTRAG_KV_STORAGE: PGKVStorage
LIGHTRAG_VECTOR_STORAGE: PGVectorStorage
# LIGHTRAG_KV_STORAGE: RedisKVStorage
# LIGHTRAG_VECTOR_STORAGE: QdrantVectorDBStorage
LIGHTRAG_GRAPH_STORAGE: Neo4JStorage
LIGHTRAG_DOC_STATUS_STORAGE: PGDocStatusStorage
# Replace with your POSTGRES credentials
POSTGRES_HOST: pg-cluster-postgresql-postgresql
POSTGRES_PORT: 5432
POSTGRES_USER: postgres
POSTGRES_PASSWORD:
POSTGRES_DATABASE: postgres
POSTGRES_WORKSPACE: default
# Replace with your NEO4J credentials
NEO4J_URI: neo4j://neo4j-cluster-neo4j:7687
NEO4J_USERNAME: neo4j
NEO4J_PASSWORD:
# Replace with your Qdrant credentials
QDRANT_URL: http://qdrant-cluster-qdrant-qdrant:6333
# REDIS_URI: redis://default:${REDIS_PASSWORD}@redis-cluster-redis-redis:6379

View file

@ -0,0 +1,4 @@
#!/bin/bash
NAMESPACE=rag
helm uninstall lightrag --namespace $NAMESPACE

View file

@ -0,0 +1,4 @@
#!/bin/bash
NAMESPACE=rag
helm uninstall lightrag-dev --namespace $NAMESPACE

View file

@ -1,5 +1,5 @@
from .lightrag import LightRAG as LightRAG, QueryParam as QueryParam
__version__ = "1.3.7"
__version__ = "1.3.10"
__author__ = "Zirui Guo"
__url__ = "https://github.com/HKUDS/LightRAG"

View file

@ -41,7 +41,9 @@ LightRAG needs to integrate both an LLM (large language model) and an embedding model to effectively
* openai or openai-compatible
* azure_openai
It is recommended to configure the LightRAG Server through environment variables. There is a sample environment variable file named `env.example` in the project root directory. Please copy this file to your startup directory and rename it to `.env`. You can then modify the LLM- and embedding-related parameters in the `.env` file. Note that the LightRAG Server loads the environment variables from `.env` into the system environment on every startup. Since the LightRAG Server gives priority to the settings in the system environment variables, if you modify the `.env` file after starting the LightRAG Server from the command line, you need to run `source .env` for the new settings to take effect.
It is recommended to configure the LightRAG Server through environment variables. There is a sample environment variable file named `env.example` in the project root directory. Please copy this file to your startup directory and rename it to `.env`. You can then modify the LLM- and embedding-related parameters in the `.env` file. Note that the LightRAG Server loads the environment variables from `.env` into the system environment on every startup. **The LightRAG Server gives priority to the settings in the system environment variables.**
> Since VS Code with the Python extension installed may automatically load the `.env` file in its integrated terminal, please open a new terminal session after each change to the `.env` file.
Here are some common configuration examples for LLM and embedding models:
@ -92,49 +94,93 @@ lightrag-server
```
lightrag-gunicorn --workers 4
```
The `.env` file must be placed in the startup directory. On startup, the LightRAG Server creates a document directory (default `./inputs`) and a data directory (default `./rag_storage`). This allows you to launch multiple LightRAG Server instances from different directories, each configured to listen on a different network port.
When starting LightRAG, the current working directory must contain the `.env` configuration file. **Requiring the `.env` file to be in the startup directory is a deliberate design decision.** The purpose is to let users run multiple LightRAG instances at the same time, with a different `.env` file for each instance. **After modifying the `.env` file, you need to open a new terminal for the new settings to take effect.** This is because on every startup the LightRAG Server loads the environment variables from the `.env` file into the system environment, and system environment variables take higher priority.
Here are some common startup parameters:
Configuration in the `.env` file can be overridden with command-line arguments at startup. Common command-line arguments include:
- `--host`: server listening address (default: 0.0.0.0)
- `--port`: server listening port (default: 9621)
- `--timeout`: LLM request timeout (default: 150 seconds)
- `--log-level`: log level (default: INFO)
- `--input-dir`: directory to scan for documents (default: ./input)
- `--working-dir`: database persistence directory (default: ./rag_storage)
- `--input-dir`: directory for uploaded files (default: ./inputs)
- `--workspace`: workspace name, used to logically isolate data between multiple LightRAG instances (default: empty)
> - **Requiring the `.env` file to be in the startup directory is a deliberate design decision.** The purpose is to let users run multiple LightRAG instances at the same time, with a different `.env` file for each instance.
> - **After modifying the `.env` file, you need to open a new terminal for the new settings to take effect.** This is because on every startup the LightRAG Server loads the environment variables from the `.env` file into the system environment, and system environment variables take higher priority.
### Launching the LightRAG Server with Docker Compose
* Clone the repository:
```
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
```
### Launching the LightRAG Server with Docker
* Configure the .env file:
Create a personalized .env file by copying env.example, and set the LLM and Embedding parameters according to your needs.
Create a personalized .env file by copying the sample file [`env.example`](env.example), and set the LLM and Embedding parameters according to your needs.
* Create a file named docker-compose.yml:
```yaml
services:
lightrag:
container_name: lightrag
image: ghcr.io/hkuds/lightrag:latest
ports:
- "${PORT:-9621}:9621"
volumes:
- ./data/rag_storage:/app/data/rag_storage
- ./data/inputs:/app/data/inputs
- ./config.ini:/app/config.ini
- ./.env:/app/.env
env_file:
- .env
restart: unless-stopped
extra_hosts:
- "host.docker.internal:host-gateway"
```
* Start the LightRAG Server with the following command:
```
docker compose up
# Add --build to rebuild if you have pulled a new version
docker compose up --build
```
> Get historical versions of LightRAG docker images here: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
```shell
docker compose up
# If you want the program to run in the background after startup, add the -d parameter at the end of the command.
```
> You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
### Auto scan on startup
When starting any of the servers with the `--auto-scan-at-startup` parameter, the system will automatically:
When starting the LightRAG Server with the `--auto-scan-at-startup` parameter, the system will automatically:
1. Scan for new files in the input directory
2. Index new documents that are not already in the database
3. Make all content immediately available for RAG queries
This provides a convenient way to launch ad-hoc RAG tasks.
> The `--input-dir` parameter specifies the input directory to scan. You can trigger an input directory scan from the Web UI.
### Starting Multiple LightRAG Instances
There are two ways to start multiple LightRAG instances. The first is to configure a completely independent working environment for each instance. This requires creating a separate working directory for each instance and placing a dedicated `.env` configuration file in it. The server listening ports in the configuration files of different instances must not be the same. Then start the service by running `lightrag-server` in each working directory.
The second way is for all instances to share the same `.env` configuration file, and then use command-line arguments to specify a different server listening port and workspace for each instance. You can start multiple LightRAG instances in the same working directory with different command-line arguments. For example:
```
# Start instance 1
lightrag-server --port 9621 --workspace space1
# Start instance 2
lightrag-server --port 9622 --workspace space2
```
The purpose of a workspace is to isolate data between different instances. Therefore, the `workspace` parameter must differ between instances; otherwise, their data will be mixed up and corrupted.
### Data Isolation Between LightRAG Instances
Configuring an independent working directory and a dedicated `.env` configuration file for each instance can generally ensure that the locally persisted files of the in-memory databases are saved in their respective working directories, keeping the data isolated. By default, LightRAG uses in-memory databases for all storage, and this method of data isolation is sufficient. However, if you use external databases and different instances access the same database instance, you must configure workspaces to achieve data isolation; otherwise, the data of different instances will conflict and be corrupted.
Both the command-line `workspace` argument and the `WORKSPACE` environment variable in the `.env` file can be used to specify the workspace name for the current instance, with the command-line argument taking precedence. Here is how workspaces are implemented for different types of storage:
- **For local file-based databases, data isolation is achieved through workspace subdirectories:** JsonKVStorage, JsonDocStatusStorage, NetworkXStorage, NanoVectorDBStorage, FaissVectorDBStorage.
- **For databases that store data in collections, it is done by adding a workspace prefix to the collection name:** RedisKVStorage, RedisDocStatusStorage, MilvusVectorDBStorage, QdrantVectorDBStorage, MongoKVStorage, MongoDocStatusStorage, MongoVectorDBStorage, MongoGraphStorage, PGGraphStorage.
- **For relational databases, data isolation is achieved by adding a `workspace` field to the tables for logical separation:** PGKVStorage, PGVectorStorage, PGDocStatusStorage.
- **For the Neo4j graph database, logical data isolation is achieved through labels:** Neo4JStorage
To maintain compatibility with legacy data, the default workspace for PostgreSQL is `default` and for Neo4j is `base` when no workspace is configured. For all external storages, the system provides dedicated workspace environment variables to override the common `WORKSPACE` setting. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`.
### Multiple workers for Gunicorn + Uvicorn
The LightRAG Server can run in `Gunicorn + Uvicorn` preload mode. Gunicorn's multiple-worker (multiprocess) capability prevents document indexing tasks from blocking RAG queries. Using CPU-intensive document extraction tools such as docling can block the entire system in pure Uvicorn mode.
@ -445,8 +491,6 @@ EMBEDDING_BINDING_HOST=http://localhost:11434
# WHITELIST_PATHS=/health,/api/*
```
#### Running the LightRAG Server with Ollama's default local server as the LLM and embedding backend
Ollama is the default backend for both the LLM and embeddings, so you can run lightrag-server without any arguments and the default values will be used. Make sure Ollama is installed and running, and that the default models are already installed on Ollama.
@ -516,6 +560,23 @@ lightrag-server --help
pip install lightrag-hku
```
## Document and Chunk Processing Logic Clarification
The document processing pipeline in LightRAG is somewhat complex and is divided into two main stages: the extraction stage (entity and relationship extraction) and the merging stage (entity and relationship merging). Two key parameters control pipeline concurrency: the maximum number of files processed in parallel (`MAX_PARALLEL_INSERT`) and the maximum number of concurrent LLM requests (`MAX_ASYNC`). The workflow is as follows:
1. `MAX_PARALLEL_INSERT` controls the number of files processed in parallel during the extraction stage.
2. `MAX_ASYNC` limits the total number of concurrent LLM requests in the system, including those for querying, extraction, and merging. LLM requests have different priorities: query operations have the highest priority, followed by merging, and then extraction.
3. Within a single file, entity and relationship extraction from different text blocks is processed concurrently, with the degree of concurrency set by `MAX_ASYNC`. Only after `MAX_ASYNC` text blocks have been processed will the system move on to the next batch in the same file.
4. The merging stage begins only after all text blocks in a file have completed entity and relationship extraction. When a file enters the merging stage, the pipeline allows the next file to begin extraction.
5. Since the extraction stage is usually faster than the merging stage, the actual number of files processed concurrently may exceed `MAX_PARALLEL_INSERT`, as this parameter only controls parallelism in the extraction stage.
6. To prevent race conditions, the merging stage does not support concurrent processing of multiple files; only one file can be merged at a time, while other files wait in a queue.
7. Each file is treated as an atomic processing unit in the pipeline. A file is marked as successfully processed only after all of its text blocks have completed extraction and merging. If any error occurs during processing, the entire file is marked as failed and must be reprocessed.
8. When a file is reprocessed after an error, previously processed text blocks can be skipped quickly thanks to the LLM cache. Although the LLM cache is also used in the merging stage, inconsistencies in merge order may limit its effectiveness there.
9. If an error occurs during extraction, the system does not retain any intermediate results. If an error occurs during merging, already merged entities and relationships may be preserved; when the same file is reprocessed, re-extracted entities and relationships are merged with the existing ones without affecting query results.
10. At the end of the merging stage, all entity and relationship data are updated in the vector database. If an error occurs at this point, some updates may be retained. However, the next processing attempt will overwrite the previous results, ensuring that successfully reprocessed files do not affect the integrity of future query results.
Large files should be split into smaller segments to enable incremental processing. Reprocessing of failed files can be initiated by pressing the "Scan" button in the Web UI.
## API Endpoints
All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality. When the API server is running, visit:

View file

@ -41,7 +41,9 @@ LightRAG necessitates the integration of both an LLM (Large Language Model) and
* openai or openai compatible
* azure_openai
It is recommended to use environment variables to configure the LightRAG Server. There is an example environment variable file named `env.example` in the root directory of the project. Please copy this file to the startup directory and rename it to `.env`. After that, you can modify the parameters related to the LLM and Embedding models in the `.env` file. It is important to note that the LightRAG Server will load the environment variables from `.env` into the system environment variables each time it starts. Since the LightRAG Server will prioritize the settings in the system environment variables, if you modify the `.env` file after starting the LightRAG Server via the command line, you need to execute `source .env` to make the new settings take effect.
It is recommended to use environment variables to configure the LightRAG Server. There is an example environment variable file named `env.example` in the root directory of the project. Please copy this file to the startup directory and rename it to `.env`. After that, you can modify the parameters related to the LLM and Embedding models in the `.env` file. It is important to note that the LightRAG Server will load the environment variables from `.env` into the system environment variables each time it starts. **The LightRAG Server will prioritize the settings in the system environment variables over those in the `.env` file**.
> Since VS Code with the Python extension may automatically load the .env file in the integrated terminal, please open a new terminal session after each modification to the .env file.
Here are some examples of common settings for LLM and Embedding models:
@ -92,51 +94,95 @@ lightrag-server
```
lightrag-gunicorn --workers 4
```
The `.env` file **must be placed in the startup directory**.
Upon launching, the LightRAG Server will create a documents directory (default is `./inputs`) and a data directory (default is `./rag_storage`). This allows you to initiate multiple instances of LightRAG Server from different directories, with each instance configured to listen on a distinct network port.
When starting LightRAG, the current working directory must contain the `.env` configuration file. **It is intentionally designed that the `.env` file must be placed in the startup directory**. The purpose of this is to allow users to launch multiple LightRAG instances simultaneously and configure different `.env` files for different instances. **After modifying the `.env` file, you need to reopen the terminal for the new settings to take effect.** This is because each time LightRAG Server starts, it loads the environment variables from the `.env` file into the system environment variables, and system environment variables have higher precedence.
Here are some commonly used startup parameters:
During startup, configurations in the `.env` file can be overridden by command-line parameters. Common command-line parameters include:
- `--host`: Server listening address (default: 0.0.0.0)
- `--port`: Server listening port (default: 9621)
- `--timeout`: LLM request timeout (default: 150 seconds)
- `--log-level`: Logging level (default: INFO)
- `--input-dir`: Specifying the directory to scan for documents (default: ./inputs)
- `--log-level`: Log level (default: INFO)
- `--working-dir`: Database persistence directory (default: ./rag_storage)
- `--input-dir`: Directory for uploaded files (default: ./inputs)
- `--workspace`: Workspace name, used to logically isolate data between multiple LightRAG instances (default: empty)
> - The requirement for the .env file to be in the startup directory is intentionally designed this way. The purpose is to support users in launching multiple LightRAG instances simultaneously, allowing different .env files for different instances.
> - **After changing the .env file, you need to open a new terminal to make the new settings take effect.** This is because the LightRAG Server will load the environment variables from .env into the system environment variables each time it starts, and the LightRAG Server will prioritize the settings in the system environment variables.
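The load-but-don't-override behavior described above can be modeled in a few lines of standard-library Python. This is a toy illustration, not LightRAG's actual loader: variables already present in the environment win over values read from `.env`.

```python
import os

def load_env_line(line: str, override: bool = False) -> None:
    # Minimal .env-style loader: variables already in the environment win
    key, _, value = line.partition("=")
    if override or key not in os.environ:
        os.environ[key] = value

os.environ["PORT"] = "9700"   # value already exported in the current shell
load_env_line("PORT=9621")    # the .env value is ignored
print(os.environ["PORT"])     # 9700
```

This is why reopening the terminal (clearing the inherited variables) is needed before a changed `.env` takes effect.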
### Launching the LightRAG Server with Docker Compose
* Clone the repository:
```
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
```
### Launching LightRAG Server with Docker
* Prepare the .env file:
Create a personalized .env file by duplicating env.example. Configure the LLM and embedding parameters according to your requirements.
Create a personalized .env file by copying the sample file [`env.example`](env.example). Configure the LLM and embedding parameters according to your requirements.
* Start the LightRAG Server using the following commands:
* Create a file named `docker-compose.yml`:
```yaml
services:
lightrag:
container_name: lightrag
image: ghcr.io/hkuds/lightrag:latest
ports:
- "${PORT:-9621}:9621"
volumes:
- ./data/rag_storage:/app/data/rag_storage
- ./data/inputs:/app/data/inputs
- ./config.ini:/app/config.ini
- ./.env:/app/.env
env_file:
- .env
restart: unless-stopped
extra_hosts:
- "host.docker.internal:host-gateway"
```
* Start the LightRAG Server with the following command:
```shell
docker compose up
# Use --build if you have pulled a new version
docker compose up --build
# If you want the program to run in the background after startup, add the -d parameter at the end of the command.
```
> Historical versions of LightRAG docker images can be found here: [LightRAG Docker Images]( https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
> You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
### Auto scan on startup
When starting any of the servers with the `--auto-scan-at-startup` parameter, the system will automatically:
When starting the LightRAG Server with the `--auto-scan-at-startup` parameter, the system will automatically:
1. Scan for new files in the input directory
2. Index new documents that aren't already in the database
3. Make all content immediately available for RAG queries
This offers an efficient method for deploying ad-hoc RAG processes.
> The `--input-dir` parameter specifies the input directory to scan. You can trigger the input directory scan from the Web UI.
### Starting Multiple LightRAG Instances
There are two ways to start multiple LightRAG instances. The first way is to configure a completely independent working environment for each instance. This requires creating a separate working directory for each instance and placing a dedicated `.env` configuration file in that directory. The server listening ports in the configuration files of different instances cannot be the same. Then, you can start the service by running `lightrag-server` in the working directory.
The second way is for all instances to share the same set of `.env` configuration files, and then use command-line arguments to specify different server listening ports and workspaces for each instance. You can start multiple LightRAG instances in the same working directory with different command-line arguments. For example:
```
# Start instance 1
lightrag-server --port 9621 --workspace space1
# Start instance 2
lightrag-server --port 9622 --workspace space2
```
The purpose of a workspace is to achieve data isolation between different instances. Therefore, the `workspace` parameter must be different for different instances; otherwise, it will lead to data confusion and corruption.
### Data Isolation Between LightRAG Instances
Configuring an independent working directory and a dedicated `.env` configuration file for each instance can generally ensure that the locally persisted files of the in-memory databases are saved in their respective working directories, achieving data isolation. By default, LightRAG uses in-memory databases for all storage, and this method of data isolation is sufficient. However, if you are using an external database and different instances access the same database instance, you need to use workspaces to achieve data isolation; otherwise, the data of different instances will conflict and be corrupted.
The command-line `workspace` argument and the `WORKSPACE` environment variable in the `.env` file can both be used to specify the workspace name for the current instance, with the command-line argument having higher priority. Here is how workspaces are implemented for different types of storage:
- **For local file-based databases, data isolation is achieved through workspace subdirectories:** `JsonKVStorage`, `JsonDocStatusStorage`, `NetworkXStorage`, `NanoVectorDBStorage`, `FaissVectorDBStorage`.
- **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `QdrantVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`.
- **For relational databases, data isolation is achieved by adding a `workspace` field to the tables for logical data separation:** `PGKVStorage`, `PGVectorStorage`, `PGDocStatusStorage`.
- **For the Neo4j graph database, logical data isolation is achieved through labels:** `Neo4JStorage`
To maintain compatibility with legacy data, the default workspace for PostgreSQL is `default` and for Neo4j is `base` when no workspace is configured. For all external storages, the system provides dedicated workspace environment variables to override the common `WORKSPACE` environment variable configuration. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`.
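As an illustration (all values here are hypothetical), a shared `.env` could set a common workspace for every storage backend and override it for just one of them:

```
# Common workspace applied to all storages unless overridden
WORKSPACE=team_a
# Storage-specific override: Neo4j data for this instance lives under its own label
NEO4J_WORKSPACE=team_a_graph
```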
### Multiple workers for Gunicorn + Uvicorn
The LightRAG Server can operate in the `Gunicorn + Uvicorn` preload mode. Gunicorn's multiple worker (multiprocess) capability prevents document indexing tasks from blocking RAG queries. Using CPU-exhaustive document extraction tools, such as docling, can lead to the entire system being blocked in pure Uvicorn mode.
@ -449,6 +495,22 @@ EMBEDDING_BINDING_HOST=http://localhost:11434
```
## Document and Chunk Processing Logic Clarification
The document processing pipeline in LightRAG is somewhat complex and is divided into two primary stages: the Extraction stage (entity and relationship extraction) and the Merging stage (entity and relationship merging). There are two key parameters that control pipeline concurrency: the maximum number of files processed in parallel (MAX_PARALLEL_INSERT) and the maximum number of concurrent LLM requests (MAX_ASYNC). The workflow is described as follows:
1. MAX_PARALLEL_INSERT controls the number of files processed in parallel during the extraction stage.
2. MAX_ASYNC limits the total number of concurrent LLM requests in the system, including those for querying, extraction, and merging. LLM requests have different priorities: query operations have the highest priority, followed by merging, and then extraction.
3. Within a single file, entity and relationship extractions from different text blocks are processed concurrently, with the degree of concurrency set by MAX_ASYNC. Only after MAX_ASYNC text blocks are processed will the system proceed to the next batch within the same file.
4. The merging stage begins only after all text blocks in a file have completed entity and relationship extraction. When a file enters the merging stage, the pipeline allows the next file to begin extraction.
5. Since the extraction stage is generally faster than merging, the actual number of files being processed concurrently may exceed MAX_PARALLEL_INSERT, as this parameter only controls parallelism during the extraction stage.
6. To prevent race conditions, the merging stage does not support concurrent processing of multiple files; only one file can be merged at a time, while other files must wait in queue.
7. Each file is treated as an atomic processing unit in the pipeline. A file is marked as successfully processed only after all its text blocks have completed extraction and merging. If any error occurs during processing, the entire file is marked as failed and must be reprocessed.
8. When a file is reprocessed due to errors, previously processed text blocks can be quickly skipped thanks to LLM caching. Although LLM cache is also utilized during the merging stage, inconsistencies in merging order may limit its effectiveness in this stage.
9. If an error occurs during extraction, the system does not retain any intermediate results. If an error occurs during merging, already merged entities and relationships might be preserved; when the same file is reprocessed, re-extracted entities and relationships will be merged with the existing ones, without impacting the query results.
10. At the end of the merging stage, all entity and relationship data are updated in the vector database. Should an error occur at this point, some updates may be retained. However, the next processing attempt will overwrite previous results, ensuring that successfully reprocessed files do not affect the integrity of future query results.
Large files should be divided into smaller segments to enable incremental processing. Reprocessing of failed files can be initiated by pressing the "Scan" button on the web UI.
## API Endpoints

View file

@ -1 +1 @@
__api_version__ = "0170"
__api_version__ = "0180"

View file

@ -184,10 +184,10 @@ def parse_args() -> argparse.Namespace:
# Namespace
parser.add_argument(
"--namespace-prefix",
"--workspace",
type=str,
default=get_env_value("NAMESPACE_PREFIX", ""),
help="Prefix of the namespace",
default=get_env_value("WORKSPACE", ""),
help="Default workspace for all storage",
)
parser.add_argument(
@ -244,6 +244,9 @@ def parse_args() -> argparse.Namespace:
# Get MAX_PARALLEL_INSERT from environment
args.max_parallel_insert = get_env_value("MAX_PARALLEL_INSERT", 2, int)
# Get MAX_GRAPH_NODES from environment
args.max_graph_nodes = get_env_value("MAX_GRAPH_NODES", 1000, int)
# Handle openai-ollama special case
if args.llm_binding == "openai-ollama":
args.llm_binding = "openai"

View file

@ -112,8 +112,8 @@ def create_app(args):
# Check if API key is provided either through env var or args
api_key = os.getenv("LIGHTRAG_API_KEY") or args.key
# Initialize document manager
doc_manager = DocumentManager(args.input_dir)
# Initialize document manager with workspace support for data isolation
doc_manager = DocumentManager(args.input_dir, workspace=args.workspace)
@asynccontextmanager
async def lifespan(app: FastAPI):
@ -295,6 +295,7 @@ def create_app(args):
if args.llm_binding in ["lollms", "ollama", "openai"]:
rag = LightRAG(
working_dir=args.working_dir,
workspace=args.workspace,
llm_model_func=lollms_model_complete
if args.llm_binding == "lollms"
else ollama_model_complete
@ -325,11 +326,13 @@ def create_app(args):
enable_llm_cache=args.enable_llm_cache,
auto_manage_storages_states=False,
max_parallel_insert=args.max_parallel_insert,
max_graph_nodes=args.max_graph_nodes,
addon_params={"language": args.summary_language},
)
else: # azure_openai
rag = LightRAG(
working_dir=args.working_dir,
workspace=args.workspace,
llm_model_func=azure_openai_model_complete,
chunk_token_size=int(args.chunk_size),
chunk_overlap_token_size=int(args.chunk_overlap_size),
@ -351,11 +354,18 @@ def create_app(args):
enable_llm_cache=args.enable_llm_cache,
auto_manage_storages_states=False,
max_parallel_insert=args.max_parallel_insert,
max_graph_nodes=args.max_graph_nodes,
addon_params={"language": args.summary_language},
)
# Add routes
app.include_router(create_document_routes(rag, doc_manager, api_key))
app.include_router(
create_document_routes(
rag,
doc_manager,
api_key,
)
)
app.include_router(create_query_routes(rag, api_key, args.top_k))
app.include_router(create_graph_routes(rag, api_key))
@ -466,6 +476,8 @@ def create_app(args):
"vector_storage": args.vector_storage,
"enable_llm_cache_for_extract": args.enable_llm_cache_for_extract,
"enable_llm_cache": args.enable_llm_cache,
"workspace": args.workspace,
"max_graph_nodes": args.max_graph_nodes,
},
"auth_mode": auth_mode,
"pipeline_busy": pipeline_status.get("busy", False),
@ -478,16 +490,31 @@ def create_app(args):
logger.error(f"Error getting health status: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
# Custom StaticFiles class to prevent caching of HTML files
class NoCacheStaticFiles(StaticFiles):
# Custom StaticFiles class for smart caching
class SmartStaticFiles(StaticFiles): # Renamed from NoCacheStaticFiles
async def get_response(self, path: str, scope):
response = await super().get_response(path, scope)
if path.endswith(".html"):
response.headers["Cache-Control"] = (
"no-cache, no-store, must-revalidate"
)
response.headers["Pragma"] = "no-cache"
response.headers["Expires"] = "0"
elif (
"/assets/" in path
): # Assets (JS, CSS, images, fonts) generated by Vite with hash in filename
response.headers["Cache-Control"] = (
"public, max-age=31536000, immutable"
)
# Add other rules here if needed for non-HTML, non-asset files
# Ensure correct Content-Type
if path.endswith(".js"):
response.headers["Content-Type"] = "application/javascript"
elif path.endswith(".css"):
response.headers["Content-Type"] = "text/css"
return response
# Webui mount webui/index.html
@ -495,7 +522,9 @@ def create_app(args):
static_dir.mkdir(exist_ok=True)
app.mount(
"/webui",
NoCacheStaticFiles(directory=static_dir, html=True, check_dir=True),
SmartStaticFiles(
directory=static_dir, html=True, check_dir=True
), # Use SmartStaticFiles
name="webui",
)

View file

@ -4,7 +4,6 @@ asyncpg
distro
dotenv
fastapi
graspologic>=3.4.1
httpcore
httpx
jiter

View file

@ -12,11 +12,18 @@ import pipmaster as pm
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional, Any, Literal
from fastapi import APIRouter, BackgroundTasks, Depends, File, HTTPException, UploadFile
from fastapi import (
APIRouter,
BackgroundTasks,
Depends,
File,
HTTPException,
UploadFile,
)
from pydantic import BaseModel, Field, field_validator
from lightrag import LightRAG
from lightrag.base import DocProcessingStatus, DocStatus
from lightrag.base import DeletionResult, DocProcessingStatus, DocStatus
from lightrag.api.utils_api import get_combined_auth_dependency
from ..config import global_args
@ -55,6 +62,51 @@ router = APIRouter(
temp_prefix = "__tmp__"
def sanitize_filename(filename: str, input_dir: Path) -> str:
"""
Sanitize uploaded filename to prevent Path Traversal attacks.
Args:
filename: The original filename from the upload
input_dir: The target input directory
Returns:
str: Sanitized filename that is safe to use
Raises:
HTTPException: If the filename is unsafe or invalid
"""
# Basic validation
if not filename or not filename.strip():
raise HTTPException(status_code=400, detail="Filename cannot be empty")
# Remove path separators and traversal sequences
clean_name = filename.replace("/", "").replace("\\", "")
clean_name = clean_name.replace("..", "")
# Remove control characters and null bytes
clean_name = "".join(c for c in clean_name if ord(c) >= 32 and c != "\x7f")
# Remove leading/trailing whitespace and dots
clean_name = clean_name.strip().strip(".")
# Check if anything is left after sanitization
if not clean_name:
raise HTTPException(
status_code=400, detail="Invalid filename after sanitization"
)
# Verify the final path stays within the input directory
try:
final_path = (input_dir / clean_name).resolve()
if not final_path.is_relative_to(input_dir.resolve()):
raise HTTPException(status_code=400, detail="Unsafe filename detected")
except (OSError, ValueError):
raise HTTPException(status_code=400, detail="Invalid filename")
return clean_name
class ScanResponse(BaseModel):
"""Response model for document scanning operation
@ -84,22 +136,30 @@ class InsertTextRequest(BaseModel):
Attributes:
text: The text content to be inserted into the RAG system
file_source: Source of the text (optional)
"""
text: str = Field(
min_length=1,
description="The text to insert",
)
file_source: str = Field(default=None, min_length=0, description="File Source")
@field_validator("text", mode="after")
@classmethod
def strip_after(cls, text: str) -> str:
def strip_text_after(cls, text: str) -> str:
return text.strip()
@field_validator("file_source", mode="after")
@classmethod
def strip_source_after(cls, file_source: str) -> str:
return file_source.strip()
class Config:
json_schema_extra = {
"example": {
"text": "This is a sample text to be inserted into the RAG system."
"text": "This is a sample text to be inserted into the RAG system.",
"file_source": "Source of the text (optional)",
}
}
@ -109,25 +169,37 @@ class InsertTextsRequest(BaseModel):
Attributes:
texts: List of text contents to be inserted into the RAG system
file_sources: Sources of the texts (optional)
"""
texts: list[str] = Field(
min_length=1,
description="The texts to insert",
)
file_sources: list[str] = Field(
default=None, min_length=0, description="Sources of the texts"
)
@field_validator("texts", mode="after")
@classmethod
def strip_after(cls, texts: list[str]) -> list[str]:
def strip_texts_after(cls, texts: list[str]) -> list[str]:
return [text.strip() for text in texts]
@field_validator("file_sources", mode="after")
@classmethod
def strip_sources_after(cls, file_sources: list[str]) -> list[str]:
return [file_source.strip() for file_source in file_sources]
class Config:
json_schema_extra = {
"example": {
"texts": [
"This is the first text to be inserted.",
"This is the second text to be inserted.",
]
],
"file_sources": [
"First file source (optional)",
],
}
}
@ -232,6 +304,55 @@ Attributes:
"""
class DeleteDocRequest(BaseModel):
doc_ids: List[str] = Field(..., description="The IDs of the documents to delete.")
delete_file: bool = Field(
default=False,
description="Whether to delete the corresponding file in the upload directory.",
)
@field_validator("doc_ids", mode="after")
@classmethod
def validate_doc_ids(cls, doc_ids: List[str]) -> List[str]:
if not doc_ids:
raise ValueError("Document IDs list cannot be empty")
validated_ids = []
for doc_id in doc_ids:
if not doc_id or not doc_id.strip():
raise ValueError("Document ID cannot be empty")
validated_ids.append(doc_id.strip())
# Check for duplicates
if len(validated_ids) != len(set(validated_ids)):
raise ValueError("Document IDs must be unique")
return validated_ids
class DeleteEntityRequest(BaseModel):
entity_name: str = Field(..., description="The name of the entity to delete.")
@field_validator("entity_name", mode="after")
@classmethod
def validate_entity_name(cls, entity_name: str) -> str:
if not entity_name or not entity_name.strip():
raise ValueError("Entity name cannot be empty")
return entity_name.strip()
class DeleteRelationRequest(BaseModel):
source_entity: str = Field(..., description="The name of the source entity.")
target_entity: str = Field(..., description="The name of the target entity.")
@field_validator("source_entity", "target_entity", mode="after")
@classmethod
def validate_entity_names(cls, entity_name: str) -> str:
if not entity_name or not entity_name.strip():
raise ValueError("Entity name cannot be empty")
return entity_name.strip()
class DocStatusResponse(BaseModel):
id: str = Field(description="Document identifier")
content_summary: str = Field(description="Summary of document content")
@ -354,6 +475,7 @@ class DocumentManager:
def __init__(
self,
input_dir: str,
workspace: str = "", # New parameter for workspace isolation
supported_extensions: tuple = (
".txt",
".md",
@ -394,10 +516,19 @@ class DocumentManager:
".less", # LESS CSS
),
):
self.input_dir = Path(input_dir)
# Store the base input directory and workspace
self.base_input_dir = Path(input_dir)
self.workspace = workspace
self.supported_extensions = supported_extensions
self.indexed_files = set()
# Create workspace-specific input directory
# If workspace is provided, create a subdirectory for data isolation
if workspace:
self.input_dir = self.base_input_dir / workspace
else:
self.input_dir = self.base_input_dir
# Create input directory if it doesn't exist
self.input_dir.mkdir(parents=True, exist_ok=True)
@ -593,6 +724,12 @@ async def pipeline_enqueue_file(rag: LightRAG, file_path: Path) -> bool:
# Insert into the RAG queue
if content:
# Check if content contains only whitespace characters
if not content.strip():
logger.warning(
f"File contains only whitespace characters. file_paths={file_path.name}"
)
await rag.apipeline_enqueue_documents(content, file_paths=file_path.name)
logger.info(f"Successfully fetched and enqueued file: {file_path.name}")
return True
@ -656,16 +793,25 @@ async def pipeline_index_files(rag: LightRAG, file_paths: List[Path]):
logger.error(traceback.format_exc())
async def pipeline_index_texts(rag: LightRAG, texts: List[str]):
async def pipeline_index_texts(
rag: LightRAG, texts: List[str], file_sources: List[str] = None
):
"""Index a list of texts
Args:
rag: LightRAG instance
texts: The texts to index
file_sources: Sources of the texts
"""
if not texts:
return
await rag.apipeline_enqueue_documents(texts)
if file_sources is not None:
if len(file_sources) != 0 and len(file_sources) != len(texts):
for _ in range(len(file_sources), len(texts)):
    file_sources.append("unknown_source")
await rag.apipeline_enqueue_documents(input=texts, file_paths=file_sources)
await rag.apipeline_process_enqueue_documents()
@ -698,7 +844,7 @@ async def run_scanning_process(rag: LightRAG, doc_manager: DocumentManager):
try:
new_files = doc_manager.scan_directory_for_new_files()
total_files = len(new_files)
logger.info(f"Found {total_files} new files to index.")
logger.info(f"Found {total_files} files to index.")
if not new_files:
return
@ -712,6 +858,161 @@ async def run_scanning_process(rag: LightRAG, doc_manager: DocumentManager):
logger.error(traceback.format_exc())
async def background_delete_documents(
rag: LightRAG,
doc_manager: DocumentManager,
doc_ids: List[str],
delete_file: bool = False,
):
"""Background task to delete multiple documents"""
from lightrag.kg.shared_storage import (
get_namespace_data,
get_pipeline_status_lock,
)
pipeline_status = await get_namespace_data("pipeline_status")
pipeline_status_lock = get_pipeline_status_lock()
total_docs = len(doc_ids)
successful_deletions = []
failed_deletions = []
# Double-check pipeline status before proceeding
async with pipeline_status_lock:
if pipeline_status.get("busy", False):
logger.warning("Error: Unexpected pipeline busy state, aborting deletion.")
return # Abort deletion operation
# Set pipeline status to busy for deletion
pipeline_status.update(
{
"busy": True,
"job_name": f"Deleting {total_docs} Documents",
"job_start": datetime.now().isoformat(),
"docs": total_docs,
"batchs": total_docs,
"cur_batch": 0,
"latest_message": "Starting document deletion process",
}
)
# Use slice assignment to clear the list in place
pipeline_status["history_messages"][:] = ["Starting document deletion process"]
try:
# Loop through each document ID and delete them one by one
for i, doc_id in enumerate(doc_ids, 1):
async with pipeline_status_lock:
start_msg = f"Deleting document {i}/{total_docs}: {doc_id}"
logger.info(start_msg)
pipeline_status["cur_batch"] = i
pipeline_status["latest_message"] = start_msg
pipeline_status["history_messages"].append(start_msg)
            file_path = "#"
            try:
                result = await rag.adelete_by_doc_id(doc_id)
                file_path = getattr(result, "file_path", "-")
if result.status == "success":
successful_deletions.append(doc_id)
success_msg = (
f"Deleted document {i}/{total_docs}: {doc_id}[{file_path}]"
)
logger.info(success_msg)
async with pipeline_status_lock:
pipeline_status["history_messages"].append(success_msg)
# Handle file deletion if requested and file_path is available
if (
delete_file
and result.file_path
and result.file_path != "unknown_source"
):
try:
file_path = doc_manager.input_dir / result.file_path
if file_path.exists():
file_path.unlink()
file_delete_msg = (
f"Successfully deleted file: {result.file_path}"
)
logger.info(file_delete_msg)
async with pipeline_status_lock:
pipeline_status["latest_message"] = file_delete_msg
pipeline_status["history_messages"].append(
file_delete_msg
)
else:
file_not_found_msg = (
f"File not found for deletion: {result.file_path}"
)
logger.warning(file_not_found_msg)
async with pipeline_status_lock:
pipeline_status["latest_message"] = (
file_not_found_msg
)
pipeline_status["history_messages"].append(
file_not_found_msg
)
except Exception as file_error:
file_error_msg = f"Failed to delete file {result.file_path}: {str(file_error)}"
logger.error(file_error_msg)
async with pipeline_status_lock:
pipeline_status["latest_message"] = file_error_msg
pipeline_status["history_messages"].append(
file_error_msg
)
elif delete_file:
no_file_msg = f"No valid file path found for document {doc_id}"
logger.warning(no_file_msg)
async with pipeline_status_lock:
pipeline_status["latest_message"] = no_file_msg
pipeline_status["history_messages"].append(no_file_msg)
else:
failed_deletions.append(doc_id)
error_msg = f"Failed to delete {i}/{total_docs}: {doc_id}[{file_path}] - {result.message}"
logger.error(error_msg)
async with pipeline_status_lock:
pipeline_status["latest_message"] = error_msg
pipeline_status["history_messages"].append(error_msg)
except Exception as e:
failed_deletions.append(doc_id)
error_msg = f"Error deleting document {i}/{total_docs}: {doc_id}[{file_path}] - {str(e)}"
logger.error(error_msg)
logger.error(traceback.format_exc())
async with pipeline_status_lock:
pipeline_status["latest_message"] = error_msg
pipeline_status["history_messages"].append(error_msg)
except Exception as e:
error_msg = f"Critical error during batch deletion: {str(e)}"
logger.error(error_msg)
logger.error(traceback.format_exc())
async with pipeline_status_lock:
pipeline_status["history_messages"].append(error_msg)
finally:
# Final summary and check for pending requests
async with pipeline_status_lock:
pipeline_status["busy"] = False
completion_msg = f"Deletion completed: {len(successful_deletions)} successful, {len(failed_deletions)} failed"
pipeline_status["latest_message"] = completion_msg
pipeline_status["history_messages"].append(completion_msg)
# Check if there are pending document indexing requests
has_pending_request = pipeline_status.get("request_pending", False)
# If there are pending requests, start document processing pipeline
if has_pending_request:
try:
logger.info(
"Processing pending document indexing requests after deletion"
)
await rag.apipeline_process_enqueue_documents()
except Exception as e:
logger.error(f"Error processing pending documents after deletion: {e}")
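The busy-flag handshake used above (check-and-set under the lock, do the work, clear the flag in `finally`) can be sketched independently; all names below are illustrative, not LightRAG APIs:

```python
import asyncio

# Illustrative stand-ins for the shared pipeline status and its lock
pipeline_status = {"busy": False}
status_lock = asyncio.Lock()

async def run_exclusive(job):
    # Check-and-set under the lock so two jobs cannot both observe busy=False
    async with status_lock:
        if pipeline_status["busy"]:
            return "busy"
        pipeline_status["busy"] = True
    try:
        return await job()
    finally:
        # Always clear the flag, even if the job raised
        async with status_lock:
            pipeline_status["busy"] = False

async def demo():
    async def job():
        await asyncio.sleep(0.01)
        return "done"
    # The second concurrent caller sees busy=True and bails out
    first, second = await asyncio.gather(run_exclusive(job), run_exclusive(job))
    return sorted([first, second])

print(asyncio.run(demo()))
# → ['busy', 'done']
```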
def create_document_routes(
rag: LightRAG, doc_manager: DocumentManager, api_key: Optional[str] = None
):
@ -764,18 +1065,21 @@ def create_document_routes(
HTTPException: If the file type is not supported (400) or other errors occur (500).
"""
try:
            # Sanitize filename to prevent Path Traversal attacks
            safe_filename = sanitize_filename(file.filename, doc_manager.input_dir)

            if not doc_manager.is_supported_file(safe_filename):
                raise HTTPException(
                    status_code=400,
                    detail=f"Unsupported file type. Supported types: {doc_manager.supported_extensions}",
                )

            file_path = doc_manager.input_dir / safe_filename
# Check if file already exists
if file_path.exists():
return InsertResponse(
status="duplicated",
                    message=f"File '{safe_filename}' already exists in the input directory.",
)
with open(file_path, "wb") as buffer:
@ -786,7 +1090,7 @@ def create_document_routes(
return InsertResponse(
status="success",
                message=f"File '{safe_filename}' uploaded successfully. Processing will continue in background.",
)
except Exception as e:
logger.error(f"Error /documents/upload: {file.filename}: {str(e)}")
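The `sanitize_filename` helper itself is not shown in this hunk; a minimal sketch of the idea, assuming the goal is to keep only the final path component and verify the resolved target stays inside the input directory (an illustrative implementation, not the library's):

```python
from pathlib import Path

def sanitize_filename_sketch(filename: str, target_dir: Path) -> str:
    # Keep only the final path component, discarding any directory parts
    name = Path(filename).name
    if name in ("", ".", ".."):
        raise ValueError(f"Invalid filename: {filename!r}")
    # Verify the resolved target still lives inside target_dir
    resolved = (target_dir / name).resolve()
    if not resolved.is_relative_to(target_dir.resolve()):
        raise ValueError(f"Path traversal attempt: {filename!r}")
    return name

print(sanitize_filename_sketch("../../etc/passwd", Path("/tmp/inputs")))
# → passwd  (the directory components are stripped, not honored)
```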
@ -816,7 +1120,12 @@ def create_document_routes(
HTTPException: If an error occurs during text processing (500).
"""
try:
            background_tasks.add_task(
                pipeline_index_texts,
                rag,
                [request.text],
                file_sources=[request.file_source],
            )
return InsertResponse(
status="success",
message="Text successfully received. Processing will continue in background.",
@ -851,7 +1160,12 @@ def create_document_routes(
HTTPException: If an error occurs during text processing (500).
"""
try:
            background_tasks.add_task(
                pipeline_index_texts,
                rag,
                request.texts,
                file_sources=request.file_sources,
            )
return InsertResponse(
status="success",
message="Text successfully received. Processing will continue in background.",
@ -861,114 +1175,6 @@ def create_document_routes(
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
# TODO: deprecated, use /upload instead
@router.post(
"/file", response_model=InsertResponse, dependencies=[Depends(combined_auth)]
)
async def insert_file(
background_tasks: BackgroundTasks, file: UploadFile = File(...)
):
"""
Insert a file directly into the RAG system.
This endpoint accepts a file upload and processes it for inclusion in the RAG system.
The file is saved temporarily and processed in the background.
Args:
background_tasks: FastAPI BackgroundTasks for async processing
file (UploadFile): The file to be processed
Returns:
InsertResponse: A response object containing the status of the operation.
Raises:
HTTPException: If the file type is not supported (400) or other errors occur (500).
"""
try:
if not doc_manager.is_supported_file(file.filename):
raise HTTPException(
status_code=400,
detail=f"Unsupported file type. Supported types: {doc_manager.supported_extensions}",
)
temp_path = await save_temp_file(doc_manager.input_dir, file)
# Add to background tasks
background_tasks.add_task(pipeline_index_file, rag, temp_path)
return InsertResponse(
status="success",
message=f"File '{file.filename}' saved successfully. Processing will continue in background.",
)
except Exception as e:
logger.error(f"Error /documents/file: {str(e)}")
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
# TODO: deprecated, use /upload instead
@router.post(
"/file_batch",
response_model=InsertResponse,
dependencies=[Depends(combined_auth)],
)
async def insert_batch(
background_tasks: BackgroundTasks, files: List[UploadFile] = File(...)
):
"""
Process multiple files in batch mode.
This endpoint allows uploading and processing multiple files simultaneously.
It handles partial successes and provides detailed feedback about failed files.
Args:
background_tasks: FastAPI BackgroundTasks for async processing
files (List[UploadFile]): List of files to process
Returns:
InsertResponse: A response object containing:
- status: "success", "partial_success", or "failure"
- message: Detailed information about the operation results
Raises:
HTTPException: If an error occurs during processing (500).
"""
try:
inserted_count = 0
failed_files = []
temp_files = []
for file in files:
if doc_manager.is_supported_file(file.filename):
# Create a temporary file to save the uploaded content
temp_files.append(await save_temp_file(doc_manager.input_dir, file))
inserted_count += 1
else:
failed_files.append(f"{file.filename} (unsupported type)")
if temp_files:
background_tasks.add_task(pipeline_index_files, rag, temp_files)
# Prepare status message
if inserted_count == len(files):
status = "success"
status_message = f"Successfully inserted all {inserted_count} documents"
elif inserted_count > 0:
status = "partial_success"
status_message = f"Successfully inserted {inserted_count} out of {len(files)} documents"
if failed_files:
status_message += f". Failed files: {', '.join(failed_files)}"
else:
status = "failure"
status_message = "No documents were successfully inserted"
if failed_files:
status_message += f". Failed files: {', '.join(failed_files)}"
return InsertResponse(status=status, message=status_message)
except Exception as e:
logger.error(f"Error /documents/batch: {str(e)}")
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
@router.delete(
"", response_model=ClearDocumentsResponse, dependencies=[Depends(combined_auth)]
)
@ -1279,6 +1485,94 @@ def create_document_routes(
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
class DeleteDocByIdResponse(BaseModel):
"""Response model for single document deletion operation."""
status: Literal["deletion_started", "busy", "not_allowed"] = Field(
description="Status of the deletion operation"
)
message: str = Field(description="Message describing the operation result")
doc_id: str = Field(description="The ID of the document to delete")
@router.delete(
"/delete_document",
response_model=DeleteDocByIdResponse,
dependencies=[Depends(combined_auth)],
summary="Delete a document and all its associated data by its ID.",
)
async def delete_document(
delete_request: DeleteDocRequest,
background_tasks: BackgroundTasks,
) -> DeleteDocByIdResponse:
"""
Delete documents and all their associated data by their IDs using background processing.
Deletes specific documents and all their associated data, including their status,
text chunks, vector embeddings, and any related graph data.
The deletion process runs in the background to avoid blocking the client connection.
        It is disabled when the LLM cache for entity extraction is disabled.
This operation is irreversible and will interact with the pipeline status.
Args:
delete_request (DeleteDocRequest): The request containing the document IDs and delete_file options.
background_tasks: FastAPI BackgroundTasks for async processing
Returns:
DeleteDocByIdResponse: The result of the deletion operation.
- status="deletion_started": The document deletion has been initiated in the background.
- status="busy": The pipeline is busy with another operation.
- status="not_allowed": Operation not allowed when LLM cache for entity extraction is disabled.
Raises:
HTTPException:
- 500: If an unexpected internal error occurs during initialization.
"""
doc_ids = delete_request.doc_ids
# The rag object is initialized from the server startup args,
# so we can access its properties here.
if not rag.enable_llm_cache_for_entity_extract:
return DeleteDocByIdResponse(
status="not_allowed",
message="Operation not allowed when LLM cache for entity extraction is disabled.",
doc_id=", ".join(delete_request.doc_ids),
)
try:
from lightrag.kg.shared_storage import get_namespace_data
pipeline_status = await get_namespace_data("pipeline_status")
# Check if pipeline is busy
if pipeline_status.get("busy", False):
return DeleteDocByIdResponse(
status="busy",
message="Cannot delete documents while pipeline is busy",
doc_id=", ".join(doc_ids),
)
# Add deletion task to background tasks
background_tasks.add_task(
background_delete_documents,
rag,
doc_manager,
doc_ids,
delete_request.delete_file,
)
return DeleteDocByIdResponse(
status="deletion_started",
message=f"Document deletion for {len(doc_ids)} documents has been initiated. Processing will continue in background.",
doc_id=", ".join(doc_ids),
)
except Exception as e:
error_msg = f"Error initiating document deletion for {delete_request.doc_ids}: {str(e)}"
logger.error(error_msg)
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=error_msg)
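Note that this route expects a JSON body on a DELETE request, which some HTTP clients do not send by default; a sketch of building such a request with the standard library (URL and payload are illustrative):

```python
import json
import urllib.request

payload = {"doc_ids": ["doc-123"], "delete_file": True}
req = urllib.request.Request(
    "http://localhost:9621/documents/delete_document",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    # urllib only attaches a body to DELETE when the method is set explicitly
    method="DELETE",
)
print(req.get_method())
# → DELETE
```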
@router.post(
"/clear_cache",
response_model=ClearCacheResponse,
@ -1332,4 +1626,77 @@ def create_document_routes(
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=str(e))
@router.delete(
"/delete_entity",
response_model=DeletionResult,
dependencies=[Depends(combined_auth)],
)
async def delete_entity(request: DeleteEntityRequest):
"""
Delete an entity and all its relationships from the knowledge graph.
Args:
request (DeleteEntityRequest): The request body containing the entity name.
Returns:
DeletionResult: An object containing the outcome of the deletion process.
Raises:
HTTPException: If the entity is not found (404) or an error occurs (500).
"""
try:
result = await rag.adelete_by_entity(entity_name=request.entity_name)
if result.status == "not_found":
raise HTTPException(status_code=404, detail=result.message)
if result.status == "fail":
raise HTTPException(status_code=500, detail=result.message)
# Set doc_id to empty string since this is an entity operation, not document
result.doc_id = ""
return result
except HTTPException:
raise
except Exception as e:
error_msg = f"Error deleting entity '{request.entity_name}': {str(e)}"
logger.error(error_msg)
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=error_msg)
@router.delete(
"/delete_relation",
response_model=DeletionResult,
dependencies=[Depends(combined_auth)],
)
async def delete_relation(request: DeleteRelationRequest):
"""
Delete a relationship between two entities from the knowledge graph.
Args:
request (DeleteRelationRequest): The request body containing the source and target entity names.
Returns:
DeletionResult: An object containing the outcome of the deletion process.
Raises:
HTTPException: If the relation is not found (404) or an error occurs (500).
"""
try:
result = await rag.adelete_by_relation(
source_entity=request.source_entity,
target_entity=request.target_entity,
)
if result.status == "not_found":
raise HTTPException(status_code=404, detail=result.message)
if result.status == "fail":
raise HTTPException(status_code=500, detail=result.message)
# Set doc_id to empty string since this is a relation operation, not document
result.doc_id = ""
return result
except HTTPException:
raise
except Exception as e:
error_msg = f"Error deleting relation from '{request.source_entity}' to '{request.target_entity}': {str(e)}"
logger.error(error_msg)
logger.error(traceback.format_exc())
raise HTTPException(status_code=500, detail=error_msg)
return router

View file

@ -1,7 +1,7 @@
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel
from typing import List, Dict, Any, Optional, Type
from lightrag.utils import logger
import time
import json
import re
@ -95,6 +95,68 @@ class OllamaTagResponse(BaseModel):
models: List[OllamaModel]
class OllamaRunningModelDetails(BaseModel):
parent_model: str
format: str
family: str
families: List[str]
parameter_size: str
quantization_level: str
class OllamaRunningModel(BaseModel):
name: str
model: str
size: int
digest: str
details: OllamaRunningModelDetails
expires_at: str
size_vram: int
class OllamaPsResponse(BaseModel):
models: List[OllamaRunningModel]
async def parse_request_body(
request: Request, model_class: Type[BaseModel]
) -> BaseModel:
"""
Parse request body based on Content-Type header.
Supports both application/json and application/octet-stream.
Args:
request: The FastAPI Request object
model_class: The Pydantic model class to parse the request into
Returns:
An instance of the provided model_class
"""
content_type = request.headers.get("content-type", "").lower()
try:
if content_type.startswith("application/json"):
# FastAPI already handles JSON parsing for us
body = await request.json()
elif content_type.startswith("application/octet-stream"):
# Manually parse octet-stream as JSON
body_bytes = await request.body()
body = json.loads(body_bytes.decode("utf-8"))
else:
# Try to parse as JSON for any other content type
body_bytes = await request.body()
body = json.loads(body_bytes.decode("utf-8"))
# Create an instance of the model
return model_class(**body)
except json.JSONDecodeError:
raise HTTPException(status_code=400, detail="Invalid JSON in request body")
except Exception as e:
raise HTTPException(
status_code=400, detail=f"Error parsing request body: {str(e)}"
)
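The content-type dispatch above reduces to `json.loads` over raw UTF-8 bytes for anything that is not already parsed JSON; a minimal synchronous reproduction:

```python
import json

def parse_body_bytes(content_type: str, body: bytes) -> dict:
    # application/json and application/octet-stream are both decoded as
    # UTF-8 JSON; only the error reporting differs in the real handler
    content_type = content_type.lower()
    try:
        return json.loads(body.decode("utf-8"))
    except json.JSONDecodeError as exc:
        raise ValueError("Invalid JSON in request body") from exc

print(parse_body_bytes("application/octet-stream", b'{"prompt": "hi"}'))
# → {'prompt': 'hi'}
```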
def estimate_tokens(text: str) -> int:
"""Estimate the number of tokens in text using tiktoken"""
tokens = TiktokenTokenizer().encode(text)
@ -172,7 +234,7 @@ class OllamaAPI:
@self.router.get("/version", dependencies=[Depends(combined_auth)])
async def get_version():
"""Get Ollama version information"""
            return OllamaVersionResponse(version="0.9.3")
@self.router.get("/tags", dependencies=[Depends(combined_auth)])
async def get_tags():
@ -182,9 +244,9 @@ class OllamaAPI:
{
"name": self.ollama_server_infos.LIGHTRAG_MODEL,
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
                        "size": self.ollama_server_infos.LIGHTRAG_SIZE,
                        "digest": self.ollama_server_infos.LIGHTRAG_DIGEST,
                        "modified_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"details": {
"parent_model": "",
"format": "gguf",
@ -197,13 +259,43 @@ class OllamaAPI:
]
)
@self.router.get("/ps", dependencies=[Depends(combined_auth)])
async def get_running_models():
"""List Running Models - returns currently running models"""
return OllamaPsResponse(
models=[
{
"name": self.ollama_server_infos.LIGHTRAG_MODEL,
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"size": self.ollama_server_infos.LIGHTRAG_SIZE,
"digest": self.ollama_server_infos.LIGHTRAG_DIGEST,
"details": {
"parent_model": "",
"format": "gguf",
"family": "llama",
"families": ["llama"],
"parameter_size": "7.2B",
"quantization_level": "Q4_0",
},
"expires_at": "2050-12-31T14:38:31.83753-07:00",
"size_vram": self.ollama_server_infos.LIGHTRAG_SIZE,
}
]
)
@self.router.post(
"/generate", dependencies=[Depends(combined_auth)], include_in_schema=True
)
async def generate(raw_request: Request):
"""Handle generate completion requests acting as an Ollama model
            For compatibility purposes, the request is not processed by LightRAG
            and will be handled by the underlying LLM model.
Supports both application/json and application/octet-stream Content-Types.
"""
try:
# Parse the request body manually
request = await parse_request_body(raw_request, OllamaGenerateRequest)
query = request.prompt
start_time = time.time_ns()
prompt_tokens = estimate_tokens(query)
@ -245,7 +337,10 @@ class OllamaAPI:
data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"response": "",
"done": True,
"done_reason": "stop",
"context": [],
"total_duration": total_time,
"load_duration": 0,
"prompt_eval_count": prompt_tokens,
@ -278,13 +373,14 @@ class OllamaAPI:
else:
error_msg = f"Provider error: {error_msg}"
logger.error(f"Stream error: {error_msg}")
# Send error message to client
error_data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"response": f"\n\nError: {error_msg}",
"error": f"\n\nError: {error_msg}",
"done": False,
}
yield f"{json.dumps(error_data, ensure_ascii=False)}\n"
@ -293,6 +389,7 @@ class OllamaAPI:
final_data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"response": "",
"done": True,
}
yield f"{json.dumps(final_data, ensure_ascii=False)}\n"
@ -307,7 +404,10 @@ class OllamaAPI:
data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"response": "",
"done": True,
"done_reason": "stop",
"context": [],
"total_duration": total_time,
"load_duration": 0,
"prompt_eval_count": prompt_tokens,
@ -352,6 +452,8 @@ class OllamaAPI:
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"response": str(response_text),
"done": True,
"done_reason": "stop",
"context": [],
"total_duration": total_time,
"load_duration": 0,
"prompt_eval_count": prompt_tokens,
@ -363,13 +465,19 @@ class OllamaAPI:
trace_exception(e)
raise HTTPException(status_code=500, detail=str(e))
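Each streamed line above is a standalone JSON object terminated by a newline (NDJSON); a client-side sketch of consuming such a stream until a `done` marker arrives (canned data, no HTTP):

```python
import json

def read_ndjson_stream(lines):
    # Collect "response" fragments until a line with done=True arrives
    chunks = []
    for line in lines:
        obj = json.loads(line)
        if "response" in obj:
            chunks.append(obj["response"])
        if obj.get("done"):
            break
    return "".join(chunks)

# Canned stream in the shape emitted by the /generate handler above
stream = [
    '{"model": "lightrag:latest", "response": "Hello", "done": false}',
    '{"model": "lightrag:latest", "response": ", world", "done": false}',
    '{"model": "lightrag:latest", "response": "", "done": true}',
]
print(read_ndjson_stream(stream))
# → Hello, world
```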
@self.router.post(
"/chat", dependencies=[Depends(combined_auth)], include_in_schema=True
)
async def chat(raw_request: Request):
"""Process chat completion requests acting as an Ollama model
Routes user queries through LightRAG by selecting query mode based on prefix indicators.
            Detects and forwards OpenWebUI session-related requests (for metadata generation tasks) directly to the LLM.
Supports both application/json and application/octet-stream Content-Types.
"""
try:
# Parse the request body manually
request = await parse_request_body(raw_request, OllamaChatRequest)
# Get all messages
messages = request.messages
if not messages:
@ -459,6 +567,12 @@ class OllamaAPI:
data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"message": {
"role": "assistant",
"content": "",
"images": None,
},
"done_reason": "stop",
"done": True,
"total_duration": total_time,
"load_duration": 0,
@ -496,7 +610,7 @@ class OllamaAPI:
else:
error_msg = f"Provider error: {error_msg}"
logger.error(f"Stream error: {error_msg}")
# Send error message to client
error_data = {
@ -507,6 +621,7 @@ class OllamaAPI:
"content": f"\n\nError: {error_msg}",
"images": None,
},
"error": f"\n\nError: {error_msg}",
"done": False,
}
yield f"{json.dumps(error_data, ensure_ascii=False)}\n"
@ -515,6 +630,11 @@ class OllamaAPI:
final_data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"message": {
"role": "assistant",
"content": "",
"images": None,
},
"done": True,
}
yield f"{json.dumps(final_data, ensure_ascii=False)}\n"
@ -530,6 +650,12 @@ class OllamaAPI:
data = {
"model": self.ollama_server_infos.LIGHTRAG_MODEL,
"created_at": self.ollama_server_infos.LIGHTRAG_CREATED_AT,
"message": {
"role": "assistant",
"content": "",
"images": None,
},
"done_reason": "stop",
"done": True,
"total_duration": total_time,
"load_duration": 0,
@ -594,6 +720,7 @@ class OllamaAPI:
"content": str(response_text),
"images": None,
},
"done_reason": "stop",
"done": True,
"total_duration": total_time,
"load_duration": 0,

View file

@ -78,6 +78,10 @@ class QueryRequest(BaseModel):
description="Number of complete conversation turns (user-assistant pairs) to consider in the response context.",
)
ids: list[str] | None = Field(
default=None, description="List of ids to filter the results."
)
user_prompt: Optional[str] = Field(
default=None,
description="User-provided prompt for the query. If provided, this will be used instead of the default value from prompt template.",
@ -179,6 +183,9 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60):
if isinstance(response, str):
# If it's a string, send it all at once
yield f"{json.dumps({'response': response})}\n"
elif response is None:
# Handle None response (e.g., when only_need_context=True but no context found)
yield f"{json.dumps({'response': 'No relevant context found for the query.'})}\n"
else:
# If it's an async generator, send chunks one by one
try:

View file

@ -175,12 +175,24 @@ def display_splash_screen(args: argparse.Namespace) -> None:
args: Parsed command line arguments
"""
# Banner
top_border = "╔══════════════════════════════════════════════════════════════╗"
bottom_border = "╚══════════════════════════════════════════════════════════════╝"
width = len(top_border) - 4 # width inside the borders
line1_text = f"LightRAG Server v{core_version}/{api_version}"
line2_text = "Fast, Lightweight RAG Server Implementation"
line1 = f"{line1_text.center(width)}"
line2 = f"{line2_text.center(width)}"
banner = f"""
{top_border}
{line1}
{line2}
{bottom_border}
"""
ASCIIColors.cyan(banner)
# Server Configuration
ASCIIColors.magenta("\n📡 Server Configuration:")
@ -284,8 +296,10 @@ def display_splash_screen(args: argparse.Namespace) -> None:
ASCIIColors.yellow(f"{args.vector_storage}")
ASCIIColors.white(" ├─ Graph Storage: ", end="")
ASCIIColors.yellow(f"{args.graph_storage}")
    ASCIIColors.white("    ├─ Document Status Storage: ", end="")
ASCIIColors.yellow(f"{args.doc_status_storage}")
ASCIIColors.white(" └─ Workspace: ", end="")
ASCIIColors.yellow(f"{args.workspace if args.workspace else '-'}")
# Server Status
ASCIIColors.green("\n✨ Server starting up...\n")
