cherry-pick 1d2f534f
parent: 2aeee59fb9
commit: 5f51abb88f
2 changed files with 201 additions and 96 deletions

README-zh.md (142 changes)

@@ -53,28 +53,24 @@

## 🎉 新闻

- [x] [2025.11.05]🎯📢添加**基于RAGAS的**LightRAG评估框架。
- [x] [2025.10.22]🎯📢消除处理**大规模数据集**的瓶颈。
- [x] [2025.09.15]🎯📢显著提升**小型LLM**(如Qwen3-30B-A3B)的知识图谱提取准确性。
- [x] [2025.08.29]🎯📢现已支持**Reranker**,显著提升混合查询性能。
- [x] [2025.08.04]🎯📢支持**文档删除**并重新生成知识图谱以确保查询性能。
- [x] [2025.06.16]🎯📢我们的团队发布了[RAG-Anything](https://github.com/HKUDS/RAG-Anything),一个用于无缝处理文本、图像、表格和方程式的全功能多模态 RAG 系统。
- [X] [2025.06.05]🎯📢LightRAG现已集成[RAG-Anything](https://github.com/HKUDS/RAG-Anything),支持全面的多模态文档解析与RAG能力(PDF、图片、Office文档、表格、公式等)。详见下方[多模态处理模块](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#多模态文档处理rag-anything集成)。
- [X] [2025.03.18]🎯📢LightRAG现已支持引文功能。
- [X] [2025.02.05]🎯📢我们团队发布了[VideoRAG](https://github.com/HKUDS/VideoRAG),用于理解超长上下文视频。
- [X] [2025.01.13]🎯📢我们团队发布了[MiniRAG](https://github.com/HKUDS/MiniRAG),使用小型模型简化RAG。
- [X] [2025.01.06]🎯📢现在您可以[使用PostgreSQL进行存储](#using-postgresql-for-storage)。
- [X] [2024.11.25]🎯📢LightRAG现在支持无缝集成[自定义知识图谱](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#insert-custom-kg),使用户能够用自己的领域专业知识增强系统。
- [X] [2024.11.19]🎯📢LightRAG的综合指南现已在[LearnOpenCV](https://learnopencv.com/lightrag)上发布。非常感谢博客作者。
- [X] [2024.11.11]🎯📢LightRAG现在支持[通过实体名称删除实体](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete)。
- [X] [2024.11.09]🎯📢推出[LightRAG Gui](https://lightrag-gui.streamlit.app),允许您插入、查询、可视化和下载LightRAG知识。
- [X] [2024.11.04]🎯📢现在您可以[使用Neo4J进行存储](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage)。
- [X] [2024.10.29]🎯📢LightRAG现在通过`textract`支持多种文件类型,包括PDF、DOC、PPT和CSV。
- [X] [2024.10.20]🎯📢我们为LightRAG添加了一个新功能:图形可视化。
- [X] [2024.10.18]🎯📢我们添加了[LightRAG介绍视频](https://youtu.be/oageL-1I0GE)的链接。感谢作者!
- [X] [2024.10.17]🎯📢我们创建了一个[Discord频道](https://discord.gg/yF2MmDJyGJ)!欢迎加入分享和讨论!🎉🎉
- [X] [2024.10.16]🎯📢LightRAG现在支持[Ollama模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)!
- [X] [2024.10.15]🎯📢LightRAG现在支持[Hugging Face模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)!
- [x] [2025.11.05]🎯添加**基于RAGAS的**评估框架和**Langfuse**可观测性支持(API可随查询结果返回召回上下文)。
- [x] [2025.10.22]🎯消除处理**大规模数据集**的性能瓶颈。
- [x] [2025.09.15]🎯显著提升**小型LLM**(如Qwen3-30B-A3B)的知识图谱提取准确性。
- [x] [2025.08.29]🎯现已支持**Reranker**,显著提升混合查询性能(现已设为默认查询模式)。
- [x] [2025.08.04]🎯支持**文档删除**并重新生成知识图谱以确保查询性能。
- [x] [2025.06.16]🎯我们的团队发布了[RAG-Anything](https://github.com/HKUDS/RAG-Anything),一个用于无缝处理文本、图像、表格和方程式的全功能多模态 RAG 系统。
- [x] [2025.06.05]🎯LightRAG现已集成[RAG-Anything](https://github.com/HKUDS/RAG-Anything),支持全面的多模态文档解析与RAG能力(PDF、图片、Office文档、表格、公式等)。详见下方[多模态处理模块](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#多模态文档处理rag-anything集成)。
- [x] [2025.03.18]🎯LightRAG现已支持参考文献功能。
- [x] [2025.02.12]🎯现在您可以使用MongoDB作为一体化存储解决方案。
- [x] [2025.02.05]🎯我们团队发布了[VideoRAG](https://github.com/HKUDS/VideoRAG),用于理解超长上下文视频。
- [x] [2025.01.13]🎯我们团队发布了[MiniRAG](https://github.com/HKUDS/MiniRAG),使用小型模型简化RAG。
- [x] [2025.01.06]🎯现在您可以使用PostgreSQL作为一体化存储解决方案。
- [x] [2024.11.19]🎯LightRAG的综合指南现已在[LearnOpenCV](https://learnopencv.com/lightrag)上发布。非常感谢博客作者。
- [x] [2024.11.09]🎯推出LightRAG WebUI,允许您插入、查询、可视化LightRAG知识。
- [x] [2024.11.04]🎯现在您可以[使用Neo4J进行存储](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage)。
- [x] [2024.10.18]🎯我们添加了[LightRAG介绍视频](https://youtu.be/oageL-1I0GE)的链接。感谢作者!
- [x] [2024.10.17]🎯我们创建了一个[Discord频道](https://discord.gg/yF2MmDJyGJ)!欢迎加入分享和讨论!🎉🎉
- [x] [2024.10.16]🎯LightRAG现在支持[Ollama模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)!

<details>
<summary style="font-size: 1.4em; font-weight: bold; cursor: pointer; display: list-item;">

@@ -90,6 +86,11 @@

## 安装

> **💡 使用 uv 进行包管理**: 本项目使用 [uv](https://docs.astral.sh/uv/) 进行快速可靠的 Python 包管理。
> 首先安装 uv: `curl -LsSf https://astral.sh/uv/install.sh | sh` (Unix/macOS) 或 `powershell -c "irm https://astral.sh/uv/install.ps1 | iex"` (Windows)
>
> **注意**: 如果您更喜欢使用 pip 也可以,但我们推荐使用 uv 以获得更好的性能和更可靠的依赖管理。

### 安装LightRAG服务器

LightRAG服务器旨在提供Web UI和API支持。Web UI便于文档索引、知识图谱探索和简单的RAG查询界面。LightRAG服务器还提供兼容Ollama的接口,旨在将LightRAG模拟为Ollama聊天模型。这使得AI聊天机器人(如Open WebUI)可以轻松访问LightRAG。

@@ -97,8 +98,13 @@

* 从PyPI安装

```bash
# 使用 uv (推荐)
uv pip install "lightrag-hku[api]"
# 或使用 pip
# pip install "lightrag-hku[api]"

cp env.example .env  # 使用你的LLM和Embedding模型访问参数更新.env文件

lightrag-server
```

@@ -107,9 +113,17 @@

```bash
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG

# 使用 uv (推荐)
# 注意: uv sync 会自动在 .venv/ 目录创建虚拟环境
uv sync --extra api
source .venv/bin/activate  # 激活虚拟环境 (Linux/macOS)
# Windows 系统: .venv\Scripts\activate

# 或使用 pip 和虚拟环境
# python -m venv .venv
# source .venv/bin/activate  # Windows: .venv\Scripts\activate
# pip install -e ".[api]"

cp env.example .env  # 使用你的LLM和Embedding模型访问参数更新.env文件
```

@@ -140,13 +154,19 @@

```bash
cd LightRAG
# 注意: uv sync 会自动在 .venv/ 目录创建虚拟环境
uv sync
source .venv/bin/activate  # 激活虚拟环境 (Linux/macOS)
# Windows 系统: .venv\Scripts\activate

# 或: pip install -e .
```

* 从PyPI安装

```bash
uv pip install lightrag-hku
# 或: pip install lightrag-hku
```

## 快速开始

@@ -198,6 +218,10 @@

> ⚠️ **如果您希望将LightRAG集成到您的项目中,建议您使用LightRAG Server提供的REST API**。LightRAG Core通常用于嵌入式应用,或供希望进行研究与评估的学者使用。

### ⚠️ 重要:初始化要求

LightRAG 在使用前需要显式初始化。创建 LightRAG 实例后,您必须调用 `await rag.initialize_storages()`,否则将出现错误。

### 一个简单程序

以下Python代码片段演示了如何初始化LightRAG、插入文本并进行查询:

@@ -207,7 +231,6 @@

import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

@@ -222,9 +245,7 @@

        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()
    return rag

async def main():
    try:

@@ -418,8 +439,6 @@

    )

    await rag.initialize_storages()

    return rag
```

@@ -548,7 +567,6 @@

from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from lightrag.utils import setup_logger

# 为LightRAG设置日志处理程序

@@ -565,8 +583,6 @@

    )

    await rag.initialize_storages()

    return rag

def main():

@@ -816,8 +832,6 @@

    # 初始化数据库连接
    await rag.initialize_storages()

    return rag
```

@@ -867,8 +881,8 @@

对于生产级场景,您很可能想要利用企业级解决方案。PostgreSQL可以为您提供一站式存储解决方案,作为KV存储、向量数据库(pgvector)和图数据库(apache AGE)。支持的PostgreSQL版本为16.6或以上。

* 如果您是初学者并想避免麻烦,推荐使用docker,请从这个镜像开始(默认帐号密码:rag/rag):https://hub.docker.com/r/gzdaniel/postgres-for-rag
* Apache AGE的性能不如Neo4j。追求高性能的图数据库请使用Neo4j。

</details>

@@ -1463,6 +1477,50 @@

![iamge](https://github.com/user-attachments/assets/b2110567-5102-4700-af8d-fe265a558995)

## Langfuse 可观测性集成

Langfuse 为 OpenAI 客户端提供了直接替代方案,可自动跟踪所有 LLM 交互,使开发者能够在无需修改代码的情况下监控、调试和优化其 RAG 系统。

### 安装 Langfuse 可选依赖

```bash
pip install "lightrag-hku[observability]"

# 或从源代码安装(可编辑模式)
pip install -e ".[observability]"
```

### 配置 Langfuse 环境变量

修改 `.env` 文件:

```
## Langfuse 可观测性(可选)
# LLM 可观测性和追踪平台
# 安装命令: pip install "lightrag-hku[observability]"
# 注册地址: https://cloud.langfuse.com 或自托管部署
LANGFUSE_SECRET_KEY=""
LANGFUSE_PUBLIC_KEY=""
LANGFUSE_HOST="https://cloud.langfuse.com"  # 或您的自托管实例地址
LANGFUSE_ENABLE_TRACE=true
```

### Langfuse 使用说明

安装并配置完成后,Langfuse 会自动追踪所有 OpenAI LLM 调用。Langfuse 仪表板功能包括:

- **追踪**:查看完整的 LLM 调用链
- **分析**:Token 使用量、延迟、成本指标
- **调试**:检查提示词和响应内容
- **评估**:比较模型输出结果
- **监控**:实时告警功能

### 重要提示

**注意**:LightRAG 目前仅把 OpenAI 兼容的 API 调用接入了 Langfuse。Ollama、Azure 和 AWS Bedrock 等 API 暂不支持 Langfuse 可观测性功能。

## RAGAS评估

**RAGAS**(Retrieval Augmented Generation Assessment,检索增强生成评估)是一个使用LLM对RAG系统进行无参考评估的框架。我们提供了基于RAGAS的评估脚本。详细信息请参阅[基于RAGAS的评估框架](lightrag/evaluation/README.md)。
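
为了直观说明"无参考评估"的思路,下面给出一段极简的 Python 玩具示例:只依据召回上下文与答案计算一个词重叠式的"忠实度"近似值。这只是概念演示,并非 RAGAS 的实际指标实现(RAGAS 的真实指标由 LLM 评判):

```python
# 概念演示:无参考评估只需要问题、召回上下文和答案,
# 不需要人工标注的标准答案。下面用词重叠率近似"忠实度"。
# (玩具代码,非 RAGAS 官方实现)

def toy_faithfulness(answer: str, contexts: list[str]) -> float:
    """答案中能在召回上下文里找到的词所占比例(仅作概念演示)。"""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(" ".join(contexts).lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

score = toy_faithfulness(
    "LightRAG supports incremental updates",
    ["LightRAG supports incremental updates and graph storage"],
)
print(score)  # 1.0:答案的每个词都能在上下文中找到
```

实际评估请使用上文提到的基于RAGAS的评估脚本。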

README.md (155 changes)

@@ -51,28 +51,24 @@

---
## 🎉 News

- [x] [2025.11.05]🎯📢Add **RAGAS-based** Evaluation Framework for LightRAG.
- [x] [2025.10.22]🎯📢Eliminate bottlenecks in processing **large-scale datasets**.
- [x] [2025.09.15]🎯📢Significantly enhances KG extraction accuracy for **small LLMs** like Qwen3-30B-A3B.
- [x] [2025.08.29]🎯📢**Reranker** is now supported, significantly boosting performance for mixed queries.
- [x] [2025.08.04]🎯📢**Document deletion** with KG regeneration to ensure query performance.
- [x] [2025.06.16]🎯📢Our team has released [RAG-Anything](https://github.com/HKUDS/RAG-Anything), an All-in-One Multimodal RAG System for seamless text, image, table, and equation processing.
- [X] [2025.06.05]🎯📢LightRAG now supports comprehensive multimodal data handling through [RAG-Anything](https://github.com/HKUDS/RAG-Anything) integration, enabling seamless document parsing and RAG capabilities across diverse formats including PDFs, images, Office documents, tables, and formulas. Please refer to the new [multimodal section](https://github.com/HKUDS/LightRAG/?tab=readme-ov-file#multimodal-document-processing-rag-anything-integration) for details.
- [X] [2025.03.18]🎯📢LightRAG now supports citation functionality, enabling proper source attribution.
- [X] [2025.02.05]🎯📢Our team has released [VideoRAG](https://github.com/HKUDS/VideoRAG) for understanding extremely long-context videos.
- [X] [2025.01.13]🎯📢Our team has released [MiniRAG](https://github.com/HKUDS/MiniRAG), making RAG simpler with small models.
- [X] [2025.01.06]🎯📢You can now [use PostgreSQL for Storage](#using-postgresql-for-storage).
- [X] [2024.11.25]🎯📢LightRAG now supports seamless integration of [custom knowledge graphs](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#insert-custom-kg), empowering users to enhance the system with their own domain expertise.
- [X] [2024.11.19]🎯📢A comprehensive guide to LightRAG is now available on [LearnOpenCV](https://learnopencv.com/lightrag). Many thanks to the blog author.
- [X] [2024.11.11]🎯📢LightRAG now supports [deleting entities by their names](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete).
- [X] [2024.11.09]🎯📢Introducing the [LightRAG Gui](https://lightrag-gui.streamlit.app), which allows you to insert, query, visualize, and download LightRAG knowledge.
- [X] [2024.11.04]🎯📢You can now [use Neo4J for Storage](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage).
- [X] [2024.10.29]🎯📢LightRAG now supports multiple file types, including PDF, DOC, PPT, and CSV via `textract`.
- [X] [2024.10.20]🎯📢We've added a new feature to LightRAG: Graph Visualization.
- [X] [2024.10.18]🎯📢We've added a link to a [LightRAG Introduction Video](https://youtu.be/oageL-1I0GE). Thanks to the author!
- [X] [2024.10.17]🎯📢We have created a [Discord channel](https://discord.gg/yF2MmDJyGJ)! Welcome to join for sharing and discussions! 🎉🎉
- [X] [2024.10.16]🎯📢LightRAG now supports [Ollama models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)!
- [X] [2024.10.15]🎯📢LightRAG now supports [Hugging Face models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)!
- [x] [2025.11.05]🎯Add **RAGAS-based** Evaluation Framework and **Langfuse** observability for LightRAG (the API can return retrieved contexts with query results).
- [x] [2025.10.22]🎯Eliminate bottlenecks in processing **large-scale datasets**.
- [x] [2025.09.15]🎯Significantly enhances KG extraction accuracy for **small LLMs** like Qwen3-30B-A3B.
- [x] [2025.08.29]🎯**Reranker** is now supported, significantly boosting performance for mixed queries (set as the default query mode now).
- [x] [2025.08.04]🎯**Document deletion** with KG regeneration to ensure query performance.
- [x] [2025.06.16]🎯Our team has released [RAG-Anything](https://github.com/HKUDS/RAG-Anything), an All-in-One Multimodal RAG System for seamless text, image, table, and equation processing.
- [x] [2025.06.05]🎯LightRAG now supports comprehensive multimodal data handling through [RAG-Anything](https://github.com/HKUDS/RAG-Anything) integration, enabling seamless document parsing and RAG capabilities across diverse formats including PDFs, images, Office documents, tables, and formulas. Please refer to the new [multimodal section](https://github.com/HKUDS/LightRAG/?tab=readme-ov-file#multimodal-document-processing-rag-anything-integration) for details.
- [x] [2025.03.18]🎯LightRAG now supports citation functionality, enabling proper source attribution.
- [x] [2025.02.12]🎯You can now use MongoDB as all-in-one storage.
- [x] [2025.02.05]🎯Our team has released [VideoRAG](https://github.com/HKUDS/VideoRAG) for understanding extremely long-context videos.
- [x] [2025.01.13]🎯Our team has released [MiniRAG](https://github.com/HKUDS/MiniRAG), making RAG simpler with small models.
- [x] [2025.01.06]🎯You can now use PostgreSQL as all-in-one storage.
- [x] [2024.11.19]🎯A comprehensive guide to LightRAG is now available on [LearnOpenCV](https://learnopencv.com/lightrag). Many thanks to the blog author.
- [x] [2024.11.09]🎯Introducing the LightRAG WebUI, which allows you to insert, query, and visualize LightRAG knowledge.
- [x] [2024.11.04]🎯You can now [use Neo4J for Storage](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage).
- [x] [2024.10.18]🎯We've added a link to a [LightRAG Introduction Video](https://youtu.be/oageL-1I0GE). Thanks to the author!
- [x] [2024.10.17]🎯We have created a [Discord channel](https://discord.gg/yF2MmDJyGJ)! Welcome to join for sharing and discussions! 🎉🎉
- [x] [2024.10.16]🎯LightRAG now supports [Ollama models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)!

<details>
<summary style="font-size: 1.4em; font-weight: bold; cursor: pointer; display: list-item;">

@@ -88,6 +84,11 @@

## Installation

> **💡 Using uv for Package Management**: This project uses [uv](https://docs.astral.sh/uv/) for fast and reliable Python package management.
> Install uv first: `curl -LsSf https://astral.sh/uv/install.sh | sh` (Unix/macOS) or `powershell -c "irm https://astral.sh/uv/install.ps1 | iex"` (Windows)
>
> **Note**: You can also use pip if you prefer, but uv is recommended for better performance and more reliable dependency management.
>
> **📦 Offline Deployment**: For offline or air-gapped environments, see the [Offline Deployment Guide](./docs/OfflineDeployment.md) for instructions on pre-installing all dependencies and cache files.

### Install LightRAG Server

@@ -97,8 +98,13 @@

* Install from PyPI

```bash
# Using uv (recommended)
uv pip install "lightrag-hku[api]"
# Or using pip
# pip install "lightrag-hku[api]"

cp env.example .env  # Update the .env with your LLM and embedding configurations

lightrag-server
```

@@ -107,9 +113,17 @@

```bash
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG

# Using uv (recommended)
# Note: uv sync automatically creates a virtual environment in .venv/
uv sync --extra api
source .venv/bin/activate  # Activate the virtual environment (Linux/macOS)
# Or on Windows: .venv\Scripts\activate

# Or using pip with a virtual environment
# python -m venv .venv
# source .venv/bin/activate  # Windows: .venv\Scripts\activate
# pip install -e ".[api]"

cp env.example .env  # Update the .env with your LLM and embedding configurations
```

@@ -136,17 +150,23 @@

### Install LightRAG Core

* Install from source (Recommended)

```bash
cd LightRAG
# Note: uv sync automatically creates a virtual environment in .venv/
uv sync
source .venv/bin/activate  # Activate the virtual environment (Linux/macOS)
# Or on Windows: .venv\Scripts\activate

# Or: pip install -e .
```

* Install from PyPI

```bash
uv pip install lightrag-hku
# Or: pip install lightrag-hku
```

## Quick Start
|
|||
|
||||
### ⚠️ Important: Initialization Requirements
|
||||
|
||||
**LightRAG requires explicit initialization before use.** You must call both `await rag.initialize_storages()` and `await initialize_pipeline_status()` after creating a LightRAG instance, otherwise you will encounter errors like:
|
||||
|
||||
- `AttributeError: __aenter__` - if storages are not initialized
|
||||
- `KeyError: 'history_messages'` - if pipeline status is not initialized
|
||||
**LightRAG requires explicit initialization before use.** You must call `await rag.initialize_storages()` after creating a LightRAG instance, otherwise you will encounter errors.
|
||||
|
||||
### A Simple Program
|
||||
|
||||
|
|

@@ -214,7 +231,6 @@

import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

@@ -230,9 +246,7 @@

        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()  # Initialize storage backends
    return rag

async def main():
    try:
|
|||
)
|
||||
|
||||
await rag.initialize_storages()
|
||||
await initialize_pipeline_status()
|
||||
|
||||
return rag
|
||||
```
|
||||
|
||||
|
|
@ -553,7 +565,6 @@ from lightrag import LightRAG
|
|||
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
|
||||
from llama_index.embeddings.openai import OpenAIEmbedding
|
||||
from llama_index.llms.openai import OpenAI
|
||||
from lightrag.kg.shared_storage import initialize_pipeline_status
|
||||
from lightrag.utils import setup_logger
|
||||
|
||||
# Setup log handler for LightRAG
|
||||
|
|
@ -570,8 +581,6 @@ async def initialize_rag():
|
|||
)
|
||||
|
||||
await rag.initialize_storages()
|
||||
await initialize_pipeline_status()
|
||||
|
||||
return rag
|
||||
|
||||
def main():
|
||||
|
|
@ -823,8 +832,6 @@ async def initialize_rag():
|
|||
# Initialize database connections
|
||||
await rag.initialize_storages()
|
||||
# Initialize pipeline status for document processing
|
||||
await initialize_pipeline_status()
|
||||
|
||||
return rag
|
||||
```
|
||||
|
||||
|
|
@ -838,7 +845,7 @@ see test_neo4j.py for a working example.
|
|||
For production level scenarios you will most likely want to leverage an enterprise solution. PostgreSQL can provide a one-stop solution for you as KV store, VectorDB (pgvector) and GraphDB (apache AGE). PostgreSQL version 16.6 or higher is supported.
|
||||
|
||||
* PostgreSQL is lightweight,the whole binary distribution including all necessary plugins can be zipped to 40MB: Ref to [Windows Release](https://github.com/ShanGor/apache-age-windows/releases/tag/PG17%2Fv1.5.0-rc0) as it is easy to install for Linux/Mac.
|
||||
* If you prefer docker, please start with this image if you are a beginner to avoid hiccups (DO read the overview): https://hub.docker.com/r/shangor/postgres-for-rag
|
||||
* If you prefer docker, please start with this image if you are a beginner to avoid hiccups (Default user password:rag/rag): https://hub.docker.com/r/gzdaniel/postgres-for-rag
|
||||
* How to start? Ref to: [examples/lightrag_zhipu_postgres_demo.py](https://github.com/HKUDS/LightRAG/blob/main/examples/lightrag_zhipu_postgres_demo.py)
|
||||
* For high-performance graph database requirements, Neo4j is recommended as Apache AGE's performance is not as competitive.
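
As an illustrative sketch, an all-PostgreSQL setup selects the PG backends for every storage type in `.env`. The variable names below follow the pattern of the project's `env.example` and should be treated as assumptions to verify against your copy; the storage class names (`PGKVStorage`, `PGVectorStorage`, `PGGraphStorage`, `PGDocStatusStorage`) are the ones referenced elsewhere in this README:

```
LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage
LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage

POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=rag
POSTGRES_PASSWORD=rag
POSTGRES_DATABASE=rag
```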

@@ -909,8 +916,6 @@

    # Initialize database connections
    await rag.initialize_storages()

    return rag
```
|
|||
The `workspace` parameter ensures data isolation between different LightRAG instances. Once initialized, the `workspace` is immutable and cannot be changed.Here is how workspaces are implemented for different types of storage:
|
||||
|
||||
- **For local file-based databases, data isolation is achieved through workspace subdirectories:** `JsonKVStorage`, `JsonDocStatusStorage`, `NetworkXStorage`, `NanoVectorDBStorage`, `FaissVectorDBStorage`.
|
||||
- **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `QdrantVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`.
|
||||
- **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`.
|
||||
- **For Qdrant vector database, data isolation is achieved through payload-based partitioning (Qdrant's recommended multitenancy approach):** `QdrantVectorDBStorage` uses shared collections with payload filtering for unlimited workspace scalability.
|
||||
- **For relational databases, data isolation is achieved by adding a `workspace` field to the tables for logical data separation:** `PGKVStorage`, `PGVectorStorage`, `PGDocStatusStorage`.
|
||||
- **For the Neo4j graph database, logical data isolation is achieved through labels:** `Neo4JStorage`
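
The subdirectory and prefix schemes above can be sketched in a few lines (hypothetical helpers for illustration, not LightRAG's actual code):

```python
from pathlib import Path

# Toy illustration of the two isolation schemes described above
# (hypothetical helpers, not LightRAG's implementation).

def collection_name(workspace: str, base: str) -> str:
    """Collection-based stores: prefix the collection with the workspace."""
    return f"{workspace}_{base}" if workspace else base

def storage_dir(working_dir: str, workspace: str) -> Path:
    """File-based stores: each workspace gets its own subdirectory."""
    return Path(working_dir) / workspace if workspace else Path(working_dir)

print(collection_name("space1", "chunks"))     # space1_chunks
print(storage_dir("./rag_storage", "space1"))  # rag_storage/space1
```

Either way, two instances configured with different workspaces never read or write each other's data.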

@@ -1517,16 +1523,13 @@

2. **`KeyError: 'history_messages'`**
   - **Cause**: Pipeline status not initialized
   - **Solution**: Call `await rag.initialize_storages()` after creating the LightRAG instance
3. **Both errors in sequence**
   - **Cause**: Neither initialization method was called
   - **Solution**: Always follow this pattern:
     ```python
     rag = LightRAG(...)
     await rag.initialize_storages()
     ```

### Model Switching Issues

@@ -1542,6 +1545,50 @@

![iamge](https://github.com/user-attachments/assets/b2110567-5102-4700-af8d-fe265a558995)

## Langfuse Observability Integration

Langfuse provides a drop-in replacement for the OpenAI client that automatically tracks all LLM interactions, enabling developers to monitor, debug, and optimize their RAG systems without code changes.
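
Conceptually, a drop-in tracing client mirrors the wrapped client's interface and records each call before delegating, so calling code stays unchanged. The sketch below is a toy illustration of that idea, not Langfuse's implementation (with Langfuse itself, the documented pattern is an import swap along the lines of `from langfuse.openai import openai`):

```python
# Toy sketch of a drop-in tracing wrapper (not Langfuse's actual client):
# the wrapper exposes the same interface as the wrapped client, so existing
# calling code is unchanged, while every call is recorded as a trace.

class TracingClient:
    def __init__(self, client):
        self._client = client
        self.traces = []  # recorded calls: (method name, kwargs)

    def __getattr__(self, name):
        attr = getattr(self._client, name)
        if not callable(attr):
            return attr
        def traced(*args, **kwargs):
            self.traces.append((name, kwargs))
            return attr(*args, **kwargs)
        return traced

class FakeLLM:
    def complete(self, prompt=""):
        return f"echo: {prompt}"

client = TracingClient(FakeLLM())
print(client.complete(prompt="hi"))  # echo: hi  (behavior unchanged)
print(len(client.traces))            # 1 call recorded
```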

### Install the Langfuse option

```bash
pip install "lightrag-hku[observability]"

# Or install from source (editable mode)
pip install -e ".[observability]"
```

### Configure Langfuse environment variables

Modify the `.env` file:

```
## Langfuse Observability (Optional)
# LLM observability and tracing platform
# Install with: pip install "lightrag-hku[observability]"
# Sign up at: https://cloud.langfuse.com or self-host
LANGFUSE_SECRET_KEY=""
LANGFUSE_PUBLIC_KEY=""
LANGFUSE_HOST="https://cloud.langfuse.com"  # or your self-hosted instance
LANGFUSE_ENABLE_TRACE=true
```

### Langfuse Usage

Once installed and configured, Langfuse automatically traces all OpenAI LLM calls. Langfuse dashboard features include:

- **Tracing**: View complete LLM call chains
- **Analytics**: Token usage, latency, and cost metrics
- **Debugging**: Inspect prompts and responses
- **Evaluation**: Compare model outputs
- **Monitoring**: Real-time alerting

### Important Notice

**Note**: LightRAG currently integrates only OpenAI-compatible API calls with Langfuse. APIs such as Ollama, Azure, and AWS Bedrock are not yet supported for Langfuse observability.

## RAGAS-based Evaluation

**RAGAS** (Retrieval Augmented Generation Assessment) is a framework for reference-free evaluation of RAG systems using LLMs. An evaluation script based on RAGAS is provided; for details, see the [RAGAS-based Evaluation Framework](lightrag/evaluation/README.md).