diff --git a/README-zh.md b/README-zh.md
index 5f9614e3..d5c358df 100644
--- a/README-zh.md
+++ b/README-zh.md
@@ -135,6 +135,22 @@ pip install lightrag-hku
 
 ## 快速开始
 
+### LightRAG的LLM及配套技术栈要求
+
+LightRAG对大型语言模型（LLM）的能力要求远高于传统RAG，因为它需要LLM执行文档中的实体关系抽取任务。配置合适的Embedding和Reranker模型对提高查询表现也至关重要。
+
+- **LLM选型**：
+  - 推荐选用参数量至少为32B的LLM。
+  - 上下文长度至少为32KB，推荐达到64KB。
+- **Embedding模型**：
+  - 高性能的Embedding模型对RAG至关重要。
+  - 推荐使用主流的多语言Embedding模型，例如：BAAI/bge-m3 和 text-embedding-3-large。
+  - **重要提示**：在文档索引前必须确定使用的Embedding模型，且在文档查询阶段必须沿用与索引阶段相同的模型。
+- **Reranker模型配置**：
+  - 配置Reranker模型能够显著提升LightRAG的检索效果。
+  - 启用Reranker模型后，推荐将“mix模式”设为默认查询模式。
+  - 推荐选用主流的Reranker模型，例如：BAAI/bge-reranker-v2-m3 或 Jina 等服务商提供的模型。
+
 ### 使用LightRAG服务器
 
 **有关LightRAG服务器的更多信息，请参阅[LightRAG服务器](./lightrag/api/README.md)。**
@@ -831,7 +847,7 @@ rag = LightRAG(
 create INDEX CONCURRENTLY entity_idx_node_id ON dickens."Entity" (ag_catalog.agtype_access_operator(properties, '"node_id"'::agtype));
 CREATE INDEX CONCURRENTLY entity_node_id_gin_idx ON dickens."Entity" using gin(properties);
 ALTER TABLE dickens."DIRECTED" CLUSTER ON directed_sid_idx;
-
+
 -- 如有必要可以删除
 drop INDEX entity_p_idx;
 drop INDEX vertex_p_idx;
@@ -1189,17 +1205,17 @@ LightRAG 现已与 [RAG-Anything](https://github.com/HKUDS/RAG-Anything) 实现
 from lightrag.llm.openai import openai_complete_if_cache, openai_embed
 from lightrag.utils import EmbeddingFunc
 import os
-
+
 async def load_existing_lightrag():
     # 首先，创建或加载现有的 LightRAG 实例
     lightrag_working_dir = "./existing_lightrag_storage"
-
+
     # 检查是否存在之前的 LightRAG 实例
     if os.path.exists(lightrag_working_dir) and os.listdir(lightrag_working_dir):
         print("✅ Found existing LightRAG instance, loading...")
     else:
         print("❌ No existing LightRAG instance found, will create new one")
-
+
     # 使用您的配置创建/加载 LightRAG 实例
     lightrag_instance = LightRAG(
         working_dir=lightrag_working_dir,
@@ -1222,10 +1238,10 @@ LightRAG 现已与 [RAG-Anything](https://github.com/HKUDS/RAG-Anything) 实现
             ),
         )
     )
-
+
     # 初始化存储（如果有现有数据，这将加载现有数据）
     await lightrag_instance.initialize_storages()
-
+
     # 现在使用现有的 LightRAG 实例初始化 RAGAnything
     rag = RAGAnything(
         lightrag=lightrag_instance,  # 传递现有的 LightRAG 实例
@@ -1254,20 +1270,20 @@ LightRAG 现已与 [RAG-Anything](https://github.com/HKUDS/RAG-Anything) 实现
         )
         # 注意：working_dir、llm_model_func、embedding_func 等都从 lightrag_instance 继承
     )
-
+
     # 查询现有的知识库
     result = await rag.query_with_multimodal(
         "What data has been processed in this LightRAG instance?",
         mode="hybrid"
     )
     print("Query result:", result)
-
+
     # 向现有的 LightRAG 实例添加新的多模态文档
     await rag.process_document_complete(
         file_path="path/to/new/multimodal_document.pdf",
         output_dir="./output"
     )
-
+
 if __name__ == "__main__":
     asyncio.run(load_existing_lightrag())
 ```
diff --git a/README.md b/README.md
index 0fa6c3d1..b594938b 100644
--- a/README.md
+++ b/README.md
@@ -134,6 +134,22 @@ pip install lightrag-hku
 
 ## Quick Start
 
+### LLM and Technology Stack Requirements for LightRAG
+
+LightRAG's demands on the capabilities of Large Language Models (LLMs) are significantly higher than those of traditional RAG, as it requires the LLM to perform entity-relationship extraction tasks from documents. Configuring appropriate Embedding and Reranker models is also crucial for improving query performance.
+
+- **LLM Selection**:
+  - It is recommended to use an LLM with at least 32 billion parameters.
+  - The context length should be at least 32KB, with 64KB being recommended.
+- **Embedding Model**:
+  - A high-performance Embedding model is essential for RAG.
+  - We recommend using mainstream multilingual Embedding models, such as: `BAAI/bge-m3` and `text-embedding-3-large`.
+  - **Important Note**: The Embedding model must be determined before document indexing, and the same model must be used during the document query phase.
+- **Reranker Model Configuration**:
+  - Configuring a Reranker model can significantly enhance LightRAG's retrieval performance.
+  - When a Reranker model is enabled, it is recommended to set the "mix mode" as the default query mode.
+  - We recommend using mainstream Reranker models, such as: `BAAI/bge-reranker-v2-m3` or models provided by services like Jina.
+
 ### Quick Start for LightRAG Server
 
 * For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).
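Both hunks stress the same invariant: the embedding model chosen before indexing must also be the one used at query time, since vectors produced by different models live in incompatible spaces. A minimal, framework-agnostic sketch of one way to enforce this (the `save_index_meta`/`check_query_model` helpers and the `index_meta.json` file are illustrative, not part of LightRAG's API):

```python
import json
import os
import tempfile

def save_index_meta(index_dir: str, embed_model: str) -> None:
    # Record which embedding model produced the vectors in this index.
    with open(os.path.join(index_dir, "index_meta.json"), "w") as f:
        json.dump({"embedding_model": embed_model}, f)

def check_query_model(index_dir: str, embed_model: str) -> None:
    # Fail fast if the query-time model differs from the indexing model,
    # rather than silently comparing vectors from incompatible spaces.
    with open(os.path.join(index_dir, "index_meta.json")) as f:
        meta = json.load(f)
    if meta["embedding_model"] != embed_model:
        raise ValueError(
            f"index built with {meta['embedding_model']!r}, "
            f"queried with {embed_model!r}"
        )

with tempfile.TemporaryDirectory() as d:
    save_index_meta(d, "BAAI/bge-m3")
    check_query_model(d, "BAAI/bge-m3")  # same model: passes silently
    try:
        check_query_model(d, "text-embedding-3-large")
    except ValueError as e:
        print("Caught mismatch:", e)
```

Persisting the model name next to the index turns a hard-to-debug relevance regression into an immediate, explicit error.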