Update README
parent 718025dbea
commit 8c6b5f4a3a
4 changed files with 62 additions and 110 deletions
35
README-zh.md
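@@ -789,7 +789,7 @@ MongoDocStatusStorage MongoDB
Example connection configurations for each storage type can be found in the `env.example` file. The database instance in the connection string must be created by you on the database server beforehand. LightRAG is only responsible for creating tables within the database instance, not for creating the database instance itself. If you use Redis as storage, remember to configure automatic data persistence rules for Redis; otherwise data will be lost after the Redis service restarts. If you use PostgreSQL, version 16.6 or above is recommended.

<details>
<summary> <b>Using Neo4J for Storage</b> </summary>
<summary> <b>Using Neo4J Storage</b> </summary>

* For production-level scenarios you will most likely want to leverage an enterprise solution
* for KG storage. Running Neo4J in Docker is recommended for seamless local testing.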
@@ -827,7 +827,7 @@ async def initialize_rag():
</details>

<details>
<summary> <b>Using Faiss for Storage</b> </summary>
<summary> <b>Using Faiss Storage</b> </summary>
Before using the Faiss vector database, you must manually install `faiss-cpu` or `faiss-gpu`.

- Install the required dependencies:
@@ -864,18 +864,39 @@ rag = LightRAG(
</details>

<details>
<summary> <b>Using PostgreSQL for Storage</b> </summary>
<summary> <b>Using PostgreSQL Storage</b> </summary>

For production-level scenarios you will most likely want to leverage an enterprise solution. PostgreSQL can provide a one-stop solution for you as KV store, vector database (pgvector) and graph database (Apache AGE). PostgreSQL version 16.6 or higher is supported.
For production-level scenarios you will most likely want to leverage an enterprise solution. PostgreSQL can provide a one-stop storage solution for you as KV store, vector database (pgvector) and graph database (Apache AGE). PostgreSQL version 16.6 or higher is supported.

* PostgreSQL is lightweight; the whole binary distribution, including all necessary plugins, can be zipped to 40MB: see the [Windows release](https://github.com/ShanGor/apache-age-windows/releases/tag/PG17%2Fv1.5.0-rc0). It is also easy to install on Linux/Mac.
* If you are a beginner and want to avoid hassle, Docker is recommended; start with this image (be sure to read the overview): https://hub.docker.com/r/shangor/postgres-for-rag
* How to start? See: [examples/lightrag_zhipu_postgres_demo.py](https://github.com/HKUDS/LightRAG/blob/main/examples/lightrag_zhipu_postgres_demo.py)

* Apache AGE does not perform as well as Neo4j. If you need a high-performance graph database, use Neo4j.

</details>

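<details>
<summary> <b>Using MongoDB Storage</b> </summary>

MongoDB provides a one-stop storage solution for LightRAG. MongoDB offers native KV storage and vector storage. LightRAG uses MongoDB collections to implement a simple graph storage. MongoDB's official vector search functionality (`$vectorSearch`) currently requires the official cloud service, MongoDB Atlas. It cannot be used on self-hosted MongoDB Community/Enterprise versions.

</details>
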
<details>
<summary> <b>Using Redis Storage</b> </summary>

LightRAG supports Redis as KV storage. When using Redis storage, pay attention to the persistence and memory usage configuration. The following is the recommended Redis configuration:

```
save 900 1
save 300 10
save 60 1000
stop-writes-on-bgsave-error yes
maxmemory 4gb
maxmemory-policy noeviction
maxclients 500
```
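
</details>

### Data Isolation Between LightRAG Instances

The `workspace` parameter isolates the stored data of different LightRAG instances. The workspace is fixed once LightRAG is initialized; changing it afterwards has no effect. The following shows how workspaces are implemented for the different storage types: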
30
README.md
@@ -800,7 +800,7 @@ MongoDocStatusStorage MongoDB
Example connection configurations for each storage type can be found in the `env.example` file. The database instance in the connection string needs to be created by you on the database server beforehand. LightRAG is only responsible for creating tables within the database instance, not for creating the database instance itself. If using Redis as storage, remember to configure automatic data persistence rules for Redis, otherwise data will be lost after the Redis service restarts. If using PostgreSQL, it is recommended to use version 16.6 or above.

<details>
<summary> <b>Using Neo4J for Storage</b> </summary>
<summary> <b>Using Neo4J Storage</b> </summary>

* For production level scenarios you will most likely want to leverage an enterprise solution
* for KG storage. Running Neo4J in Docker is recommended for seamless local testing.
@@ -839,7 +839,7 @@ see test_neo4j.py for a working example.
</details>

<details>
<summary> <b>Using PostgreSQL for Storage</b> </summary>
<summary> <b>Using PostgreSQL Storage</b> </summary>

For production level scenarios you will most likely want to leverage an enterprise solution. PostgreSQL can provide a one-stop solution for you as KV store, VectorDB (pgvector) and GraphDB (Apache AGE). PostgreSQL version 16.6 or higher is supported.

@@ -851,7 +851,7 @@ For production level scenarios you will most likely want to leverage an enterpri
</details>

<details>
<summary> <b>Using Faiss for Storage</b> </summary>
<summary> <b>Using Faiss Storage</b> </summary>
Before using Faiss vector database, you must manually install `faiss-cpu` or `faiss-gpu`.

- Install the required dependencies:
@@ -922,6 +922,30 @@ async def initialize_rag():

</details>

<details>
<summary> <b>Using MongoDB Storage</b> </summary>

MongoDB provides a one-stop storage solution for LightRAG. MongoDB offers native KV storage and vector storage. LightRAG uses MongoDB collections to implement a simple graph storage. MongoDB's official vector search functionality (`$vectorSearch`) currently requires their official cloud service MongoDB Atlas. This functionality cannot be used on self-hosted MongoDB Community/Enterprise versions.
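
For orientation, a `$vectorSearch` aggregation stage generally has the shape sketched below. This is not LightRAG's implementation: the index name `vector_index`, the field name `vector`, and the helper `build_vector_search_pipeline` are hypothetical. Because running the stage requires an Atlas cluster with a vector search index, the sketch only builds the pipeline as plain data.

```python
from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    top_k: int = 5,
    index_name: str = "vector_index",  # hypothetical Atlas index name
    vector_field: str = "vector",      # hypothetical document field holding embeddings
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline that starts with a $vectorSearch stage."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": vector_field,
                "queryVector": query_vector,
                # Number of candidates the approximate search considers;
                # it must not be smaller than the limit.
                "numCandidates": top_k * 10,
                "limit": top_k,
            }
        },
        # Return the similarity score next to each document id.
        {"$project": {"score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], top_k=3)
```

With a driver such as PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on an Atlas-hosted collection.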

</details>

<details>
<summary> <b>Using Redis Storage</b> </summary>

LightRAG supports using Redis as KV storage. When using Redis storage, pay attention to the persistence configuration and the memory usage configuration. The following is the recommended Redis configuration:

```
save 900 1
save 300 10
save 60 1000
stop-writes-on-bgsave-error yes
maxmemory 4gb
maxmemory-policy noeviction
maxclients 500
```
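
As a quick sanity check, settings like these can be verified against a configuration file before the server is started. The following is a minimal sketch using only the standard library; the helper names `parse_redis_conf` and `check_redis_conf` are hypothetical, and the checks are illustrative rather than something LightRAG performs.

```python
def parse_redis_conf(text: str) -> dict[str, list[str]]:
    """Collect directives from redis.conf text; repeated ones (e.g. save) accumulate."""
    directives: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(parts[0].lower(), []).append(value)
    return directives


def check_redis_conf(text: str) -> list[str]:
    """Return warnings for settings that risk losing stored data."""
    conf = parse_redis_conf(text)
    warnings = []
    # `save ""` disables snapshots, so it does not count as a rule.
    saves = [v for v in conf.get("save", []) if v not in ('""', "")]
    if not saves:
        warnings.append("no 'save' rule: RDB snapshots are disabled")
    # Any policy other than noeviction lets Redis drop keys under memory pressure.
    if conf.get("maxmemory-policy", ["noeviction"])[-1] != "noeviction":
        warnings.append("eviction policy may silently drop stored keys")
    if "maxmemory" not in conf:
        warnings.append("no 'maxmemory' limit set")
    return warnings


RECOMMENDED = """\
save 900 1
save 300 10
save 60 1000
stop-writes-on-bgsave-error yes
maxmemory 4gb
maxmemory-policy noeviction
maxclients 500
"""

print(check_redis_conf(RECOMMENDED))  # prints []
```

The `noeviction` policy matters for a KV store that holds primary data: when the memory limit is reached, Redis rejects writes instead of discarding existing keys.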

</details>

### Data Isolation Between LightRAG Instances

The `workspace` parameter ensures data isolation between different LightRAG instances. Once initialized, the `workspace` is immutable and cannot be changed. Here is how workspaces are implemented for different types of storage:
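@@ -389,51 +389,9 @@ LightRAG uses 4 types of storage for different purposes:
* GRAPH_STORAGE: entity relation graph
* DOC_STATUS_STORAGE: document indexing status

Each storage type has several implementations:
Each storage type has multiple implementations. The default storage implementation of LightRAG Server is an in-memory database, with data persisted to files in the WORKING_DIR directory. LightRAG also supports PostgreSQL, MongoDB, FAISS, Milvus, Qdrant, Neo4j, Memgraph, and Redis as storage implementations. For details on the supported storage options, see the storage section of the `README.md` file in the root directory.
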
* KV_STORAGE supported implementations:

```
JsonKVStorage    JsonFile (default)
PGKVStorage      Postgres
RedisKVStorage   Redis
MongoKVStorage   MongoDB
```

* GRAPH_STORAGE supported implementations:

```
NetworkXStorage      NetworkX (default)
Neo4JStorage         Neo4J
PGGraphStorage       PostgreSQL with AGE plugin
```

> In testing, the Neo4j graph database showed better performance than PostgreSQL AGE.

* VECTOR_STORAGE supported implementations:

```
NanoVectorDBStorage         NanoVector (default)
PGVectorStorage             Postgres
MilvusVectorDBStorage       Milvus
FaissVectorDBStorage        Faiss
QdrantVectorDBStorage       Qdrant
MongoVectorDBStorage        MongoDB
```

* DOC_STATUS_STORAGE supported implementations:

```
JsonDocStatusStorage        JsonFile (default)
PGDocStatusStorage          Postgres
MongoDocStatusStorage       MongoDB
```

Example connection configurations for each storage type can be found in the `env.example` file. The database instance in the connection string must be created by you on the database server beforehand. LightRAG is only responsible for creating tables within the database instance, not for creating the database instance itself. If you use Redis as storage, remember to configure automatic data persistence rules for Redis; otherwise data will be lost after the Redis service restarts. If you use PostgreSQL, version 16.6 or above is recommended.

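### How to Select Storage Implementation

You can select the storage implementation through environment variables. Before the first start of the API server, you can set the following environment variables to a specific storage implementation name:
You can select the storage implementation through environment variables. For example, before the first start of the API server, you can set the following environment variables to a specific storage implementation name:

```
LIGHTRAG_KV_STORAGE=PGKVStorage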
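@@ -442,7 +400,7 @@ LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
```

You cannot change the storage implementation selection after adding documents to LightRAG. Migrating data from one storage implementation to another is not supported yet. For more information, read the sample env file or the config.ini file.
You cannot change the storage implementation selection after adding documents to LightRAG. Migrating data from one storage implementation to another is not supported yet. For more configuration information, read the sample `env.example` file.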
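
Because the selection cannot be changed once documents have been added, it can help to validate the variables before the first start. The sketch below is one way to do that, not part of LightRAG: the helper `resolve_storage_selection` is hypothetical, the implementation names are the ones listed in the README, and the variable name `LIGHTRAG_VECTOR_STORAGE` is an assumption made by analogy with the other three.

```python
import os
from typing import Mapping, Optional

# Implementation names as listed in the README; the first entry is the default.
SUPPORTED = {
    "LIGHTRAG_KV_STORAGE": [
        "JsonKVStorage", "PGKVStorage", "RedisKVStorage", "MongoKVStorage",
    ],
    "LIGHTRAG_GRAPH_STORAGE": [
        "NetworkXStorage", "Neo4JStorage", "PGGraphStorage", "MemgraphStorage",
    ],
    # Assumed variable name, by analogy with the other three.
    "LIGHTRAG_VECTOR_STORAGE": [
        "NanoVectorDBStorage", "PGVectorStorage", "MilvusVectorDBStorage",
        "FaissVectorDBStorage", "QdrantVectorDBStorage", "MongoVectorDBStorage",
    ],
    "LIGHTRAG_DOC_STATUS_STORAGE": [
        "JsonDocStatusStorage", "PGDocStatusStorage", "MongoDocStatusStorage",
    ],
}


def resolve_storage_selection(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the chosen implementation per variable, rejecting unknown names."""
    env = os.environ if env is None else env
    selection = {}
    for var, names in SUPPORTED.items():
        value = env.get(var, names[0])  # fall back to the default implementation
        if value not in names:
            raise ValueError(f"{var}={value!r} is not one of {names}")
        selection[var] = value
    return selection


selection = resolve_storage_selection({"LIGHTRAG_KV_STORAGE": "PGKVStorage"})
print(selection["LIGHTRAG_KV_STORAGE"])     # prints PGKVStorage
print(selection["LIGHTRAG_GRAPH_STORAGE"])  # prints NetworkXStorage
```

Failing fast on a misspelled name is cheaper than discovering after indexing that data went to the default file-based storage.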

### LightRAG API Server Command-Line Options

@@ -453,18 +411,14 @@ LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
| --working-dir | ./rag_storage | Working directory for RAG storage |
| --input-dir | ./inputs | Directory containing input documents |
| --max-async | 4 | Maximum number of async operations |
| --max-tokens | 32768 | Maximum token size |
| --timeout | 150 | Timeout in seconds. None for infinite timeout (not recommended) |
| --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| --verbose | - | Verbose debug output (True, False) |
| --key | None | API key for authentication. Protects the LightRAG server against unauthorized access |
| --ssl | False | Enable HTTPS |
| --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
| --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
| --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. |
| --cosine-threshold | 0.4 | The cosine threshold for node and relation retrieval; works with top-k to control the retrieval of nodes and relations. |
| --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai) |
| --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai) |
| --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai, aws_bedrock) |
| --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai, aws_bedrock) |
| --auto-scan-at-startup | - | Scan input directory for new files and start indexing |
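
The rule that `--ssl` requires both `--ssl-certfile` and `--ssl-keyfile` can be illustrated with a small stand-alone parser. This is a sketch with `argparse`, not the server's actual argument handling; it covers only a subset of the options in the table, and the functions `build_parser` and `parse_args` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare an illustrative subset of the options with the defaults from the table."""
    parser = argparse.ArgumentParser(description="Illustrative subset of server options")
    parser.add_argument("--working-dir", default="./rag_storage")
    parser.add_argument("--input-dir", default="./inputs")
    parser.add_argument("--max-async", type=int, default=4)
    parser.add_argument("--ssl", action="store_true")
    parser.add_argument("--ssl-certfile", default=None)
    parser.add_argument("--ssl-keyfile", default=None)
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv and enforce that HTTPS comes with a certificate and a key."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.ssl and not (args.ssl_certfile and args.ssl_keyfile):
        # parser.error prints a usage message and exits with status 2.
        parser.error("--ssl requires both --ssl-certfile and --ssl-keyfile")
    return args


args = parse_args(["--ssl", "--ssl-certfile", "cert.pem", "--ssl-keyfile", "key.pem"])
print(args.ssl, args.ssl_certfile, args.max_async)  # prints True cert.pem 4
```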

### .env File Example
@@ -390,52 +390,9 @@ LightRAG uses 4 types of storage for different purposes:
* GRAPH_STORAGE: entity relation graph
* DOC_STATUS_STORAGE: document indexing status

Each storage type has several implementations:
LightRAG Server offers various storage implementations, with the default being an in-memory database that persists data to the WORKING_DIR directory. Additionally, LightRAG supports a wide range of storage solutions including PostgreSQL, MongoDB, FAISS, Milvus, Qdrant, Neo4j, Memgraph, and Redis. For detailed information on supported storage options, please refer to the storage section in the README.md file located in the root directory.

* KV_STORAGE supported implementations:

```
JsonKVStorage    JsonFile (default)
PGKVStorage      Postgres
RedisKVStorage   Redis
MongoKVStorage   MongoDB
```

* GRAPH_STORAGE supported implementations:

```
NetworkXStorage      NetworkX (default)
Neo4JStorage         Neo4J
PGGraphStorage       PostgreSQL with AGE plugin
MemgraphStorage      Memgraph
```

> Testing has shown that Neo4J delivers superior performance in production environments compared to PostgreSQL with AGE plugin.

* VECTOR_STORAGE supported implementations:

```
NanoVectorDBStorage         NanoVector (default)
PGVectorStorage             Postgres
MilvusVectorDBStorage       Milvus
FaissVectorDBStorage        Faiss
QdrantVectorDBStorage       Qdrant
MongoVectorDBStorage        MongoDB
```

* DOC_STATUS_STORAGE supported implementations:

```
JsonDocStatusStorage        JsonFile (default)
PGDocStatusStorage          Postgres
MongoDocStatusStorage       MongoDB
```
Example connection configurations for each storage type can be found in the `env.example` file. The database instance in the connection string needs to be created by you on the database server beforehand. LightRAG is only responsible for creating tables within the database instance, not for creating the database instance itself. If using Redis as storage, remember to configure automatic data persistence rules for Redis, otherwise data will be lost after the Redis service restarts. If using PostgreSQL, it is recommended to use version 16.6 or above.


### How to Select Storage Implementation

You can select storage implementation by environment variables. You can set the following environment variables to a specific storage implementation name before the first start of the API Server:
You can select the storage implementation by configuring environment variables. For instance, prior to the initial launch of the API server, you can set the following environment variable to specify your desired storage implementation:

```
LIGHTRAG_KV_STORAGE=PGKVStorage
@@ -455,16 +412,12 @@ You cannot change storage implementation selection after adding documents to Lig
| --working-dir | ./rag_storage | Working directory for RAG storage |
| --input-dir | ./inputs | Directory containing input documents |
| --max-async | 4 | Maximum number of async operations |
| --max-tokens | 32768 | Maximum token size |
| --timeout | 150 | Timeout in seconds. None for infinite timeout (not recommended) |
| --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| --verbose | - | Verbose debug output (True, False) |
| --key | None | API key for authentication. Protects the LightRAG server against unauthorized access |
| --ssl | False | Enable HTTPS |
| --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
| --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
| --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. |
| --cosine-threshold | 0.4 | The cosine threshold for nodes and relation retrieval, works with top-k to control the retrieval of nodes and relations. |
| --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai, aws_bedrock) |
| --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai, aws_bedrock) |
| --auto-scan-at-startup | - | Scan input directory for new files and start indexing |
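
To make the interplay of `--top-k` and `--cosine-threshold` concrete, the sketch below filters candidates by cosine similarity first and then keeps the best `top_k`. It is a generic illustration of the technique in pure Python, not LightRAG's retrieval code; the function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: list[float],
    candidates: dict[str, list[float]],
    top_k: int = 50,
    cosine_threshold: float = 0.4,
) -> list[tuple[str, float]]:
    """Drop candidates below the threshold, then return the top_k most similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= cosine_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


candidates = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}
hits = retrieve([1.0, 0.0], candidates, top_k=2)
print([name for name, _ in hits])  # prints ['a', 'b']
```

The threshold bounds quality (candidate `c` is dropped regardless of `top_k`), while `top_k` bounds quantity, so the two settings limit the result set independently.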