Update README.md
parent 2bf0d397ed
commit 1c53c5c764
2 changed files with 41 additions and 9 deletions
34 README-zh.md
@@ -135,6 +135,22 @@ pip install lightrag-hku

## Quick Start

### LLM and Technology Stack Requirements for LightRAG

LightRAG's demands on LLM capability are far higher than those of traditional RAG, because it relies on the LLM to extract entities and relationships from documents. Configuring appropriate Embedding and Reranker models is also crucial for good query performance.

- **LLM Selection**:
  - It is recommended to use an LLM with at least 32B parameters.
  - The context length should be at least 32KB, with 64KB recommended.
- **Embedding Model**:
  - A high-performance Embedding model is essential for RAG.
  - We recommend mainstream multilingual Embedding models such as `BAAI/bge-m3` and `text-embedding-3-large`.
  - **Important Note**: The Embedding model must be chosen before document indexing, and the same model must be used during the query phase.
- **Reranker Model Configuration**:
  - Configuring a Reranker model significantly improves LightRAG's retrieval quality.
  - When a Reranker is enabled, it is recommended to set "mix mode" as the default query mode.
  - We recommend mainstream Reranker models such as `BAAI/bge-reranker-v2-m3` or models from providers such as Jina; a configuration sketch follows this list.
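To make these recommendations concrete, here is a minimal configuration sketch. It assumes an OpenAI-compatible endpoint at `http://localhost:8000/v1` serving a 32B-class chat model plus `BAAI/bge-m3` embeddings; the model names, endpoint, API key, and embedding dimension are placeholders for your own deployment, not LightRAG defaults.

```python
import asyncio

from lightrag import LightRAG, QueryParam
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc

BASE_URL = "http://localhost:8000/v1"  # assumed OpenAI-compatible endpoint
API_KEY = "your-api-key"               # placeholder


async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    # A 32B-class instruct model, per the recommendation above; the name is a placeholder.
    return await openai_complete_if_cache(
        "qwen2.5-32b-instruct",
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        base_url=BASE_URL,
        api_key=API_KEY,
        **kwargs,
    )


async def main():
    rag = LightRAG(
        working_dir="./rag_storage",
        llm_model_func=llm_model_func,
        embedding_func=EmbeddingFunc(
            embedding_dim=1024,   # bge-m3 outputs 1024-dimensional vectors
            max_token_size=8192,  # bge-m3 accepts inputs up to 8192 tokens
            func=lambda texts: openai_embed(
                texts, model="BAAI/bge-m3", base_url=BASE_URL, api_key=API_KEY
            ),
        ),
        # Reranker wiring is omitted here; the parameter names vary across versions.
    )
    await rag.initialize_storages()
    await initialize_pipeline_status()  # required by recent LightRAG versions

    await rag.ainsert("Your document text goes here.")
    # With a reranker configured, "mix" is the recommended default query mode.
    print(await rag.aquery("What is this document about?", param=QueryParam(mode="mix")))


if __name__ == "__main__":
    asyncio.run(main())
```

Note that the embedding model fixed here must stay the same between indexing and querying; switching it invalidates the stored vectors.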
### Using the LightRAG Server

**For more information about the LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).**
@@ -831,7 +847,7 @@ rag = LightRAG(
CREATE INDEX CONCURRENTLY entity_idx_node_id ON dickens."Entity" (ag_catalog.agtype_access_operator(properties, '"node_id"'::agtype));
CREATE INDEX CONCURRENTLY entity_node_id_gin_idx ON dickens."Entity" USING GIN (properties);
ALTER TABLE dickens."DIRECTED" CLUSTER ON directed_sid_idx;

-- Drop these if necessary
DROP INDEX entity_p_idx;
DROP INDEX vertex_p_idx;
@@ -1189,17 +1205,17 @@ LightRAG is now integrated with [RAG-Anything](https://github.com/HKUDS/RAG-Anything)
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc
import os


async def load_existing_lightrag():
    # First, create or load an existing LightRAG instance
    lightrag_working_dir = "./existing_lightrag_storage"

    # Check whether a previous LightRAG instance exists
    if os.path.exists(lightrag_working_dir) and os.listdir(lightrag_working_dir):
        print("✅ Found existing LightRAG instance, loading...")
    else:
        print("❌ No existing LightRAG instance found, will create new one")

    # Create/load the LightRAG instance with your configuration
    lightrag_instance = LightRAG(
        working_dir=lightrag_working_dir,
@@ -1222,10 +1238,10 @@ LightRAG is now integrated with [RAG-Anything](https://github.com/HKUDS/RAG-Anything)
            ),
        )
    )

    # Initialize storages (this loads existing data if any is present)
    await lightrag_instance.initialize_storages()

    # Now initialize RAGAnything with the existing LightRAG instance
    rag = RAGAnything(
        lightrag=lightrag_instance,  # Pass the existing LightRAG instance
@@ -1254,20 +1270,20 @@ LightRAG is now integrated with [RAG-Anything](https://github.com/HKUDS/RAG-Anything)
        )
        # Note: working_dir, llm_model_func, embedding_func, etc. are all inherited from lightrag_instance
    )

    # Query the existing knowledge base
    result = await rag.query_with_multimodal(
        "What data has been processed in this LightRAG instance?",
        mode="hybrid"
    )
    print("Query result:", result)

    # Add new multimodal documents to the existing LightRAG instance
    await rag.process_document_complete(
        file_path="path/to/new/multimodal_document.pdf",
        output_dir="./output"
    )


if __name__ == "__main__":
    asyncio.run(load_existing_lightrag())
```
16 README.md
@@ -134,6 +134,22 @@ pip install lightrag-hku

## Quick Start

### LLM and Technology Stack Requirements for LightRAG

LightRAG's demands on the capabilities of Large Language Models (LLMs) are significantly higher than those of traditional RAG, as it requires the LLM to perform entity-relationship extraction tasks from documents. Configuring appropriate Embedding and Reranker models is also crucial for improving query performance.

- **LLM Selection**:
  - It is recommended to use an LLM with at least 32 billion parameters.
  - The context length should be at least 32KB, with 64KB being recommended.
- **Embedding Model**:
  - A high-performance Embedding model is essential for RAG.
  - We recommend using mainstream multilingual Embedding models, such as `BAAI/bge-m3` and `text-embedding-3-large`.
  - **Important Note**: The Embedding model must be determined before document indexing, and the same model must be used during the document query phase.
- **Reranker Model Configuration**:
  - Configuring a Reranker model can significantly enhance LightRAG's retrieval performance.
  - When a Reranker model is enabled, it is recommended to set "mix mode" as the default query mode.
  - We recommend using mainstream Reranker models, such as `BAAI/bge-reranker-v2-m3` or models provided by services like Jina; a standalone reranking sketch follows this list.
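To illustrate what the reranking step contributes (independently of LightRAG's own rerank hook, whose wiring varies by version), here is a minimal standalone sketch that scores query-passage pairs with `BAAI/bge-reranker-v2-m3` through the `sentence-transformers` `CrossEncoder`; the query and passages are made-up examples.

```python
# Standalone illustration of cross-encoder reranking; this is not
# LightRAG's internal rerank API.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")

query = "How does LightRAG extract entities from documents?"
passages = [  # stand-ins for retrieved chunks
    "LightRAG asks the LLM to extract entities and relationships from each chunk.",
    "The LightRAG Server exposes an OpenAI-compatible chat completion endpoint.",
    "Embedding models map text to dense vectors for similarity search.",
]

# Score each (query, passage) pair jointly, then reorder passages, best first.
scores = reranker.predict([(query, p) for p in passages])
for score, passage in sorted(zip(scores, passages), key=lambda x: x[0], reverse=True):
    print(f"{score:7.3f}  {passage}")
```

Because a cross-encoder reads the query and the passage together, it captures interactions that plain embedding similarity misses, which is why enabling a reranker noticeably improves retrieval quality.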
### Quick Start for LightRAG Server

* For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).