* Initial plan * Initialize reverse documentation directory and create analysis plan Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com> * Add executive summary and comprehensive architecture documentation Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com> * Add comprehensive data models and dependency migration documentation Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com> * Complete comprehensive TypeScript migration documentation suite Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>
568 lines
32 KiB
Markdown
568 lines
32 KiB
Markdown
# Architecture Documentation: LightRAG System Design
|
|
|
|
## Table of Contents
|
|
1. [System Architecture Overview](#system-architecture-overview)
|
|
2. [Component Interaction Architecture](#component-interaction-architecture)
|
|
3. [Document Indexing Data Flow](#document-indexing-data-flow)
|
|
4. [Query Processing Data Flow](#query-processing-data-flow)
|
|
5. [Storage Layer Architecture](#storage-layer-architecture)
|
|
6. [Concurrency and State Management](#concurrency-and-state-management)
|
|
|
|
## System Architecture Overview
|
|
|
|
LightRAG follows a layered architecture pattern with clear separation of concerns. The system is structured into five primary layers, each with specific responsibilities and well-defined interfaces.
|
|
|
|
```mermaid
|
|
graph TB
|
|
subgraph "Presentation Layer"
|
|
A1["WebUI (TypeScript)<br/>React + Vite"]
|
|
A2["API Clients<br/>REST/OpenAPI"]
|
|
end
|
|
|
|
subgraph "API Gateway Layer"
|
|
B1["FastAPI Server<br/>lightrag_server.py"]
|
|
B2["Authentication<br/>JWT + API Keys"]
|
|
B3["Request Validation<br/>Pydantic Models"]
|
|
B4["Route Handlers<br/>Query/Document/Graph"]
|
|
end
|
|
|
|
subgraph "Business Logic Layer"
|
|
C1["LightRAG Core<br/>lightrag.py"]
|
|
C2["Operations Module<br/>operate.py"]
|
|
C3["Utilities<br/>utils.py + utils_graph.py"]
|
|
C4["Prompt Templates<br/>prompt.py"]
|
|
end
|
|
|
|
subgraph "Integration Layer"
|
|
D1["LLM Providers<br/>OpenAI, Ollama, etc."]
|
|
D2["Embedding Providers<br/>text-embedding-*"]
|
|
D3["Storage Adapters<br/>KV/Vector/Graph/Status"]
|
|
end
|
|
|
|
subgraph "Infrastructure Layer"
|
|
E1["PostgreSQL<br/>Relational + Vector"]
|
|
E2["Neo4j/Memgraph<br/>Graph Database"]
|
|
E3["Redis/MongoDB<br/>Cache + NoSQL"]
|
|
E4["File System<br/>JSON + FAISS"]
|
|
end
|
|
|
|
A1 --> B1
|
|
A2 --> B1
|
|
B1 --> B2
|
|
B2 --> B3
|
|
B3 --> B4
|
|
B4 --> C1
|
|
C1 --> C2
|
|
C1 --> C3
|
|
C2 --> C4
|
|
C1 --> D1
|
|
C1 --> D2
|
|
C1 --> D3
|
|
D3 --> E1
|
|
D3 --> E2
|
|
D3 --> E3
|
|
D3 --> E4
|
|
|
|
style A1 fill:#E6F3FF
|
|
style B1 fill:#FFE6E6
|
|
style C1 fill:#E6FFE6
|
|
style D1 fill:#FFF5E6
|
|
style E1 fill:#FFE6F5
|
|
```
|
|
|
|
### Layer Responsibilities
|
|
|
|
**Presentation Layer**: Handles user interactions through a React-based WebUI and provides REST API client capabilities. Responsible for rendering data, handling user input, and managing client-side state. Written in TypeScript with React, this layer communicates with the API Gateway exclusively through HTTP/HTTPS.
|
|
|
|
**API Gateway Layer**: Manages all external communication with the system. Implements authentication and authorization using JWT tokens and API keys, validates incoming requests using Pydantic models, handles rate limiting, and routes requests to appropriate handlers. Built with FastAPI, it provides automatic OpenAPI documentation and request/response validation.
|
|
|
|
**Business Logic Layer**: Contains the core intelligence of LightRAG. The LightRAG class orchestrates all operations, managing document processing pipelines, query execution, and storage coordination. The Operations module handles entity extraction, graph merging, and retrieval algorithms. Utilities provide helper functions for text processing, tokenization, hashing, and caching. Prompt templates define structured prompts for LLM interactions.
|
|
|
|
**Integration Layer**: Abstracts external dependencies through consistent interfaces. LLM provider adapters normalize different API formats (OpenAI, Anthropic, Ollama, etc.) into a common interface. Embedding provider adapters handle various embedding services. Storage adapters implement the abstract storage interfaces (BaseKVStorage, BaseVectorStorage, BaseGraphStorage, DocStatusStorage) for different backends.
|
|
|
|
**Infrastructure Layer**: Provides the foundational data persistence and retrieval capabilities. Supports multiple database systems including PostgreSQL (with pgvector for vector storage), Neo4j and Memgraph (for graph storage), Redis and MongoDB (for caching and document storage), and file-based storage (JSON files, FAISS indexes) for development and small deployments.
|
|
|
|
## Component Interaction Architecture
|
|
|
|
This diagram illustrates how major components interact during typical operations, showing both document indexing and query execution flows.
|
|
|
|
```mermaid
|
|
graph LR
|
|
subgraph "External Systems"
|
|
LLM["LLM Service<br/>(OpenAI/Ollama)"]
|
|
Embed["Embedding Service"]
|
|
end
|
|
|
|
subgraph "Core Components"
|
|
RAG["LightRAG<br/>Core Engine"]
|
|
OPS["Operations<br/>Module"]
|
|
PIPE["Pipeline<br/>Manager"]
|
|
end
|
|
|
|
subgraph "Storage System"
|
|
KV["KV Storage<br/>Chunks/Cache"]
|
|
VEC["Vector Storage<br/>Embeddings"]
|
|
GRAPH["Graph Storage<br/>Entities/Relations"]
|
|
STATUS["Status Storage<br/>Pipeline State"]
|
|
end
|
|
|
|
subgraph "Processing Components"
|
|
CHUNK["Chunking<br/>Engine"]
|
|
EXTRACT["Entity<br/>Extraction"]
|
|
MERGE["Graph<br/>Merging"]
|
|
QUERY["Query<br/>Engine"]
|
|
end
|
|
|
|
RAG --> PIPE
|
|
PIPE --> CHUNK
|
|
CHUNK --> EXTRACT
|
|
EXTRACT --> OPS
|
|
OPS --> LLM
|
|
OPS --> Embed
|
|
OPS --> MERGE
|
|
MERGE --> VEC
|
|
MERGE --> GRAPH
|
|
CHUNK --> KV
|
|
PIPE --> STATUS
|
|
|
|
RAG --> QUERY
|
|
QUERY --> OPS
|
|
QUERY --> VEC
|
|
QUERY --> GRAPH
|
|
QUERY --> KV
|
|
QUERY --> LLM
|
|
|
|
style RAG fill:#E6FFE6
|
|
style LLM fill:#FFF5E6
|
|
style KV fill:#FFE6E6
|
|
style VEC fill:#FFE6E6
|
|
style GRAPH fill:#FFE6E6
|
|
style STATUS fill:#FFE6E6
|
|
```
|
|
|
|
### Component Interaction Patterns
|
|
|
|
**Document Ingestion Pattern**: Client submits documents to the API Gateway, which authenticates and validates the request before passing it to the LightRAG core. The core initializes a pipeline instance with a unique track ID, stores the document in KV storage, and updates its status in Status storage. The Pipeline Manager coordinates the chunking, extraction, merging, and indexing stages, maintaining progress information throughout.
|
|
|
|
**Entity Extraction Pattern**: The Operations module receives text chunks from the Pipeline Manager and constructs prompts using templates from prompt.py. These prompts are sent to the configured LLM service, which returns structured entity and relationship data. The Operations module parses the response, normalizes entity names, and prepares data for graph merging.
|
|
|
|
**Graph Merging Pattern**: When new entities and relationships are extracted, the Merge component compares them against existing graph data. For matching entities (based on name similarity), it consolidates descriptions, merges metadata, and updates source references. For relationships, it deduplicates based on source-target pairs and aggregates weights. The merged data is then stored in Graph storage and vector representations are computed and stored in Vector storage.
|
|
|
|
**Query Execution Pattern**: The Query Engine receives a user query and determines the appropriate retrieval strategy based on the query mode. It extracts high-level and low-level keywords using the LLM, retrieves relevant entities and relationships from Graph storage, fetches related chunks from Vector storage, builds a context respecting token budgets, and finally generates a response using the LLM with the assembled context.
|
|
|
|
## Document Indexing Data Flow
|
|
|
|
This sequence diagram details the complete flow of document processing from ingestion to indexing.
|
|
|
|
```mermaid
|
|
sequenceDiagram
|
|
participant Client
|
|
participant API as API Server
|
|
participant Core as LightRAG Core
|
|
participant Pipeline
|
|
participant Chunking
|
|
participant Extraction
|
|
participant Merging
|
|
participant KV as KV Storage
|
|
participant Vec as Vector Storage
|
|
participant Graph as Graph Storage
|
|
participant Status as Status Storage
|
|
participant LLM as LLM Service
|
|
participant Embed as Embedding Service
|
|
|
|
Client->>API: POST /documents/upload
|
|
API->>API: Authenticate & Validate
|
|
API->>Core: ainsert(document, file_path)
|
|
|
|
Note over Core: Initialization Phase
|
|
Core->>Core: Generate track_id
|
|
Core->>Status: Create doc status (PENDING)
|
|
Core->>KV: Store document content
|
|
Core->>Pipeline: apipeline_process_enqueue_documents()
|
|
|
|
Note over Pipeline: Chunking Phase
|
|
Pipeline->>Status: Update status (CHUNKING)
|
|
Pipeline->>Chunking: chunking_by_token_size()
|
|
Chunking->>Chunking: Tokenize & split by overlap
|
|
Chunking-->>Pipeline: chunks[]
|
|
Pipeline->>KV: Store chunks with metadata
|
|
|
|
Note over Pipeline: Extraction Phase
|
|
Pipeline->>Status: Update status (EXTRACTING)
|
|
loop For each chunk
|
|
Pipeline->>Extraction: extract_entities(chunk)
|
|
Extraction->>LLM: Generate entities/relations
|
|
LLM-->>Extraction: Structured output
|
|
Extraction->>Extraction: Parse & normalize
|
|
Extraction-->>Pipeline: entities[], relations[]
|
|
end
|
|
|
|
Note over Pipeline: Merging Phase
|
|
Pipeline->>Status: Update status (MERGING)
|
|
Pipeline->>Merging: merge_nodes_and_edges()
|
|
|
|
par Parallel Entity Processing
|
|
loop For each entity
|
|
Merging->>Graph: Check if entity exists
|
|
alt Entity exists
|
|
Merging->>LLM: Summarize descriptions
|
|
LLM-->>Merging: Merged description
|
|
Merging->>Graph: Update entity
|
|
else New entity
|
|
Merging->>Graph: Insert entity
|
|
end
|
|
Merging->>Embed: Generate embedding
|
|
Embed-->>Merging: embedding vector
|
|
Merging->>Vec: Store entity embedding
|
|
end
|
|
and Parallel Relationship Processing
|
|
loop For each relationship
|
|
Merging->>Graph: Check if relation exists
|
|
alt Relation exists
|
|
Merging->>Graph: Update weight & metadata
|
|
else New relation
|
|
Merging->>Graph: Insert relation
|
|
end
|
|
Merging->>Embed: Generate embedding
|
|
Embed-->>Merging: embedding vector
|
|
Merging->>Vec: Store relation embedding
|
|
end
|
|
end
|
|
|
|
Note over Pipeline: Indexing Phase
|
|
Pipeline->>Status: Update status (INDEXING)
|
|
Pipeline->>Vec: Index chunk embeddings
|
|
Pipeline->>Graph: Build graph indices
|
|
Pipeline->>KV: Commit cache
|
|
|
|
Note over Pipeline: Completion
|
|
Pipeline->>Status: Update status (COMPLETED)
|
|
Pipeline-->>Core: Success
|
|
Core-->>API: track_id, status
|
|
API-->>Client: 200 OK {track_id}
|
|
|
|
Note over Client: Client can poll /status/{track_id}
|
|
```
|
|
|
|
### Indexing Phase Details
|
|
|
|
**Document Reception and Validation**: The API server receives the document, validates the file format and size, authenticates the request, and generates a unique track ID for monitoring. The document content is immediately stored in KV storage with metadata including file path, upload timestamp, and original filename.
|
|
|
|
**Chunking Strategy**: Documents are split into overlapping chunks using a token-based approach. The system tokenizes the entire document using tiktoken, creates chunks of configurable size (default 1200 tokens), adds overlap between consecutive chunks (default 100 tokens) to preserve context, and stores each chunk with position metadata and references to the source document.
|
|
|
|
**Entity and Relationship Extraction**: For each chunk, the system constructs a specialized prompt that instructs the LLM to identify entities with specific types and relationships between them. The LLM returns structured output in a specific format (entity|name|type|description for entities, relation|source|target|keywords|description for relationships). The system parses this output, normalizes entity names using case-insensitive matching, and validates the structure before proceeding.
|
|
|
|
**Graph Construction and Merging**: New entities are compared against existing entities in the graph using fuzzy matching. When duplicates are found, descriptions are merged using either LLM-based summarization (for complex cases) or simple concatenation (for simple cases). Relationships are deduplicated based on source-target pairs, with weights aggregated when duplicates are found. All graph modifications are protected by keyed locks to ensure consistency in concurrent operations.
|
|
|
|
**Vector Embedding Generation**: Entity descriptions, relationship descriptions, and chunk content are sent to the embedding service in batches for efficient processing. Embeddings are generated using the configured model (e.g., text-embedding-3-small for OpenAI), and vectors are stored in Vector storage with metadata linking back to their source entities, relationships, or chunks. The system uses semaphores to limit concurrent embedding requests and prevent rate limit errors.
|
|
|
|
**Status Tracking Throughout**: Every stage updates the document status in Status storage, recording the current phase, progress percentage, error messages if any, and timing information. This enables clients to poll for progress and provides diagnostic information for debugging failed indexing operations.
|
|
|
|
## Query Processing Data Flow
|
|
|
|
This sequence diagram illustrates the retrieval and response generation process for different query modes.
|
|
|
|
```mermaid
|
|
sequenceDiagram
|
|
participant Client
|
|
participant API as API Server
|
|
participant Core as LightRAG Core
|
|
participant Query as Query Engine
|
|
participant KW as Keyword Extractor
|
|
participant Graph as Graph Storage
|
|
participant Vec as Vector Storage
|
|
participant KV as KV Storage
|
|
participant Context as Context Builder
|
|
participant LLM as LLM Service
|
|
participant Rerank as Rerank Service
|
|
|
|
Client->>API: POST /query (query, mode, params)
|
|
API->>API: Authenticate & Validate
|
|
API->>Core: aquery(query, QueryParam)
|
|
Core->>Query: Execute query
|
|
|
|
Note over Query: Keyword Extraction Phase
|
|
Query->>KW: Extract keywords
|
|
KW->>LLM: Generate high/low level keywords
|
|
LLM-->>KW: {hl_keywords[], ll_keywords[]}
|
|
KW-->>Query: keywords
|
|
|
|
alt Mode: local (Entity-centric)
|
|
Note over Query: Local Mode - Focus on Entities
|
|
Query->>Vec: Query entity vectors (ll_keywords)
|
|
Vec-->>Query: top_k entity_ids[]
|
|
Query->>Graph: Get entities by IDs
|
|
Graph-->>Query: entities[]
|
|
Query->>Graph: Get connected relations
|
|
Graph-->>Query: relations[]
|
|
|
|
else Mode: global (Relationship-centric)
|
|
Note over Query: Global Mode - Focus on Relations
|
|
Query->>Vec: Query relation vectors (hl_keywords)
|
|
Vec-->>Query: top_k relation_ids[]
|
|
Query->>Graph: Get relations by IDs
|
|
Graph-->>Query: relations[]
|
|
Query->>Graph: Get connected entities
|
|
Graph-->>Query: entities[]
|
|
|
|
else Mode: hybrid
|
|
Note over Query: Hybrid Mode - Combined
|
|
par Parallel Retrieval
|
|
Query->>Vec: Query entity vectors
|
|
Vec-->>Query: entity_ids[]
|
|
and
|
|
Query->>Vec: Query relation vectors
|
|
Vec-->>Query: relation_ids[]
|
|
end
|
|
Query->>Graph: Get entities and relations
|
|
Graph-->>Query: entities[], relations[]
|
|
Query->>Query: Merge with round-robin
|
|
|
|
else Mode: mix
|
|
Note over Query: Mix Mode - KG + Chunks
|
|
par Parallel Retrieval
|
|
Query->>Vec: Query entity vectors
|
|
Vec-->>Query: entity_ids[]
|
|
Query->>Graph: Get entities
|
|
Graph-->>Query: entities[]
|
|
and
|
|
Query->>Vec: Query chunk vectors
|
|
Vec-->>Query: chunk_ids[]
|
|
end
|
|
|
|
else Mode: naive
|
|
Note over Query: Naive Mode - Pure Vector
|
|
Query->>Vec: Query chunk vectors only
|
|
Vec-->>Query: top_k chunk_ids[]
|
|
end
|
|
|
|
Note over Query: Chunk Retrieval Phase
|
|
alt Mode != bypass
|
|
Query->>Query: Get related chunks from entities/relations
|
|
Query->>KV: Get chunks by IDs
|
|
KV-->>Query: chunks[]
|
|
|
|
opt Rerank enabled
|
|
Query->>Rerank: Rerank chunks
|
|
Rerank-->>Query: reranked_chunks[]
|
|
end
|
|
end
|
|
|
|
Note over Query: Context Building Phase
|
|
Query->>Context: Build context with token budget
|
|
Context->>Context: Allocate tokens (entities/relations/chunks)
|
|
Context->>Context: Truncate to fit budget
|
|
Context->>Context: Format entities/relations/chunks
|
|
Context-->>Query: context_string, references[]
|
|
|
|
Note over Query: Response Generation Phase
|
|
Query->>Query: Build prompt with context
|
|
opt Include conversation history
|
|
Query->>Query: Add history messages
|
|
end
|
|
|
|
alt Stream enabled
|
|
Query->>LLM: Stream generate (prompt, context)
|
|
loop Streaming chunks
|
|
LLM-->>Query: chunk
|
|
Query-->>API: chunk
|
|
API-->>Client: SSE chunk
|
|
end
|
|
else Stream disabled
|
|
Query->>LLM: Generate (prompt, context)
|
|
LLM-->>Query: response
|
|
Query-->>Core: response, references
|
|
Core-->>API: {response, references, metadata}
|
|
API-->>Client: 200 OK {response}
|
|
end
|
|
```
|
|
|
|
### Query Processing Phase Details
|
|
|
|
**Keyword Extraction Phase**: The system sends the user query to the LLM with a specialized prompt that asks for high-level keywords (abstract concepts and themes) and low-level keywords (specific entities and terms). The LLM returns structured JSON with both keyword types, which guide the subsequent retrieval strategy. This two-level keyword approach enables the system to retrieve both broad contextual information and specific detailed facts.
|
|
|
|
**Mode-Specific Retrieval Strategies**:
|
|
|
|
*Local Mode* focuses on entity-centric retrieval by querying the vector storage using low-level keywords to find the most relevant entities, retrieving full entity details including descriptions and metadata, and then fetching all relationships connected to those entities. This mode is optimal for questions about specific entities or localized information.
|
|
|
|
*Global Mode* emphasizes relationship-centric retrieval by querying vector storage using high-level keywords to find relevant relationships, retrieving relationship details including keywords and descriptions, and then fetching the entities connected by those relationships. This mode excels at questions about connections, patterns, and higher-level concepts.
|
|
|
|
*Hybrid Mode* combines both approaches by running local and global retrieval in parallel and then merging results using a round-robin strategy to balance entity and relationship information. This provides comprehensive coverage for complex queries that require both types of information.
|
|
|
|
*Mix Mode* integrates knowledge graph retrieval with direct chunk retrieval by querying entity vectors to get graph-based context, simultaneously querying chunk vectors for relevant document sections, and combining both types of results. This mode provides the most complete context by including both structured knowledge and raw document content.
|
|
|
|
*Naive Mode* performs pure vector similarity search without using the knowledge graph, simply retrieving the most similar chunks based on embedding distance. This mode is fastest and works well for simple similarity-based retrieval without needing entity or relationship context.
|
|
|
|
*Bypass Mode* skips retrieval entirely and sends the query directly to the LLM, useful for general questions that don't require specific document context or when testing the LLM's base knowledge.
|
|
|
|
**Context Building with Token Budgets**: The system implements a sophisticated token budget management system that allocates a maximum number of tokens across different context components. It allocates tokens to entity descriptions (default 6000 tokens), relationship descriptions (default 8000 tokens), and chunk content (remaining budget, with a cap defined by chunk_top_k). The system truncates each component to fit within its budget using the tokenizer, prioritizing higher-ranked items when truncation is necessary, and ensures the total context doesn't exceed the max_total_tokens limit (default 30000 tokens).
|
|
|
|
**Reranking for Improved Relevance**: When enabled, the reranking phase takes retrieved chunks and reranks them using a specialized reranking model (like Cohere rerank or Jina rerank). This cross-encoder approach provides more accurate relevance scoring than pure vector similarity, especially for semantic matching. Chunks below the minimum rerank score threshold are filtered out, and only the top-k chunks after reranking are included in the final context.
|
|
|
|
**Response Generation with Streaming**: For streaming responses, the system establishes a connection to the LLM with stream=True, receives response tokens incrementally, and immediately forwards them to the client via Server-Sent Events (SSE). This provides real-time feedback to users and reduces perceived latency. For non-streaming responses, the system waits for the complete LLM response before returning it to the client along with metadata about entities, relationships, and chunks used in the context.
|
|
|
|
## Storage Layer Architecture
|
|
|
|
The storage layer implements a plugin architecture with abstract base classes defining the contract for each storage type.
|
|
|
|
```mermaid
|
|
graph TB
|
|
subgraph "Abstract Interfaces"
|
|
BASE["StorageNameSpace<br/>(Base Class)"]
|
|
KV_BASE["BaseKVStorage"]
|
|
VEC_BASE["BaseVectorStorage"]
|
|
GRAPH_BASE["BaseGraphStorage"]
|
|
STATUS_BASE["DocStatusStorage"]
|
|
end
|
|
|
|
subgraph "KV Storage Implementations"
|
|
JSON_KV["JsonKVStorage<br/>(File-based)"]
|
|
PG_KV["PGKVStorage<br/>(PostgreSQL)"]
|
|
MONGO_KV["MongoKVStorage<br/>(MongoDB)"]
|
|
REDIS_KV["RedisKVStorage<br/>(Redis)"]
|
|
end
|
|
|
|
subgraph "Vector Storage Implementations"
|
|
NANO["NanoVectorDBStorage<br/>(In-memory)"]
|
|
FAISS["FaissVectorDBStorage<br/>(FAISS)"]
|
|
PG_VEC["PGVectorStorage<br/>(pgvector)"]
|
|
MILVUS["MilvusVectorStorage<br/>(Milvus)"]
|
|
QDRANT["QdrantVectorStorage<br/>(Qdrant)"]
|
|
end
|
|
|
|
subgraph "Graph Storage Implementations"
|
|
NX["NetworkXStorage<br/>(NetworkX)"]
|
|
NEO4J["Neo4jStorage<br/>(Neo4j)"]
|
|
MEMGRAPH["MemgraphStorage<br/>(Memgraph)"]
|
|
PG_GRAPH["PGGraphStorage<br/>(PostgreSQL)"]
|
|
end
|
|
|
|
subgraph "Doc Status Implementations"
|
|
JSON_STATUS["JsonDocStatusStorage<br/>(File-based)"]
|
|
PG_STATUS["PGDocStatusStorage<br/>(PostgreSQL)"]
|
|
MONGO_STATUS["MongoDocStatusStorage<br/>(MongoDB)"]
|
|
end
|
|
|
|
BASE --> KV_BASE
|
|
BASE --> VEC_BASE
|
|
BASE --> GRAPH_BASE
|
|
BASE --> STATUS_BASE
|
|
|
|
KV_BASE --> JSON_KV
|
|
KV_BASE --> PG_KV
|
|
KV_BASE --> MONGO_KV
|
|
KV_BASE --> REDIS_KV
|
|
|
|
VEC_BASE --> NANO
|
|
VEC_BASE --> FAISS
|
|
VEC_BASE --> PG_VEC
|
|
VEC_BASE --> MILVUS
|
|
VEC_BASE --> QDRANT
|
|
|
|
GRAPH_BASE --> NX
|
|
GRAPH_BASE --> NEO4J
|
|
GRAPH_BASE --> MEMGRAPH
|
|
GRAPH_BASE --> PG_GRAPH
|
|
|
|
STATUS_BASE --> JSON_STATUS
|
|
STATUS_BASE --> PG_STATUS
|
|
STATUS_BASE --> MONGO_STATUS
|
|
|
|
style BASE fill:#E6FFE6
|
|
style JSON_KV fill:#E6F3FF
|
|
style NANO fill:#FFE6E6
|
|
style NX fill:#FFF5E6
|
|
style JSON_STATUS fill:#FFE6F5
|
|
```
|
|
|
|
### Storage Interface Contracts
|
|
|
|
**BaseKVStorage Interface**: Key-value storage manages cached data, text chunks, and full documents. Core methods include get_by_id(id) for retrieving a single value, get_by_ids(ids) for batch retrieval, filter_keys(keys) to check which keys don't exist, upsert(data) for inserting or updating entries, delete(ids) for removing entries, and index_done_callback() for persisting changes to disk. This interface supports both in-memory implementations with persistence and direct database implementations.
|
|
|
|
**BaseVectorStorage Interface**: Vector storage handles embeddings for entities, relationships, and chunks. Core methods include query(query, top_k, query_embedding) for similarity search, upsert(data) for storing vectors with metadata, delete(ids) for removing vectors, delete_entity(entity_name) for removing entity-related vectors, delete_entity_relation(entity_name) for removing relationship vectors, get_by_id(id) and get_by_ids(ids) for retrieving full vector data, and get_vectors_by_ids(ids) for efficient vector-only retrieval. All implementations must support cosine similarity search and metadata filtering.
|
|
|
|
**BaseGraphStorage Interface**: Graph storage maintains the entity-relationship graph structure. Core methods include has_node(node_id) and has_edge(source, target) for existence checks, node_degree(node_id) and edge_degree(src, tgt) for connectivity metrics, get_node(node_id) and upsert_node(node_id, data) for node operations, upsert_edge(source, target, data) for relationship operations, get_knowledge_graph(node_label, max_depth, max_nodes) for graph traversal and export, and delete operations for nodes and edges. The interface treats all relationships as undirected unless explicitly specified.
|
|
|
|
**DocStatusStorage Interface**: Document status storage tracks processing pipeline state for each document. Core methods include upsert(status) for updating document status, get_by_id(doc_id) for retrieving status, get_by_ids(doc_ids) for batch retrieval, filter_ids(doc_ids) for checking existence, delete(doc_ids) for cleanup, get_by_status(status) for finding documents in a specific state, and count_by_status() for pipeline metrics. This enables comprehensive monitoring and recovery of document processing operations.
|
|
|
|
### Storage Implementation Patterns
|
|
|
|
**File-Based Storage (JSON)**: Simple implementations store data in JSON files with in-memory caching for performance. All modifications are held in memory until index_done_callback() triggers a write to disk. These implementations are suitable for development, small deployments, and single-process scenarios. They provide atomic writes using temporary files and rename operations, handle concurrent access through file locking, and support workspace isolation through directory structure.
|
|
|
|
**PostgreSQL Storage**: Comprehensive implementations that leverage PostgreSQL's capabilities including JSON columns for flexible metadata, pgvector extension for vector similarity search, advisory locks for distributed coordination, connection pooling for performance, and transaction support for consistency. PostgreSQL implementations can handle all four storage types in a single database, simplifying deployment and backup. They support multi-tenant deployments through schema-based isolation and provide excellent performance for mixed workloads.
|
|
|
|
**Specialized Vector Databases**: Dedicated vector storage implementations like FAISS, Milvus, and Qdrant provide optimized vector similarity search with features like approximate nearest neighbor (ANN) search, GPU acceleration for large-scale similarity search, advanced indexing strategies (IVF, HNSW), and high-performance batch operations. These are recommended for deployments with large document sets (>1M chunks) or high query throughput requirements.
|
|
|
|
**Graph Databases (Neo4j/Memgraph)**: Specialized graph implementations optimize graph traversal and pattern matching with native graph storage and indexing, Cypher query language for complex graph queries, visualization capabilities for knowledge graph exploration, and optimized algorithms for shortest path, centrality, and community detection. These are ideal for use cases requiring complex graph analytics and when graph visualization is a primary feature.
|
|
|
|
## Concurrency and State Management
|
|
|
|
LightRAG implements sophisticated concurrency control to handle parallel document processing and query execution.
|
|
|
|
```mermaid
|
|
graph TB
|
|
subgraph "Concurrency Control"
|
|
SEM1["Semaphore<br/>LLM Calls<br/>(max_async)"]
|
|
SEM2["Semaphore<br/>Embeddings<br/>(embedding_func_max_async)"]
|
|
SEM3["Semaphore<br/>Graph Merging<br/>(graph_max_async)"]
|
|
LOCK1["Keyed Locks<br/>Entity Processing"]
|
|
LOCK2["Keyed Locks<br/>Relation Processing"]
|
|
LOCK3["Pipeline Status Lock"]
|
|
end
|
|
|
|
subgraph "State Management"
|
|
GLOBAL["Global Config<br/>workspace, paths, settings"]
|
|
PIPELINE["Pipeline Status<br/>current job, progress"]
|
|
NAMESPACE["Namespace Data<br/>storage instances, locks"]
|
|
end
|
|
|
|
subgraph "Task Coordination"
|
|
QUEUE["Task Queue<br/>Document Processing"]
|
|
PRIORITY["Priority Limiter<br/>Async Function Calls"]
|
|
TRACK["Track ID System<br/>Monitoring & Logging"]
|
|
end
|
|
|
|
SEM1 --> PRIORITY
|
|
SEM2 --> PRIORITY
|
|
SEM3 --> PRIORITY
|
|
|
|
LOCK1 --> NAMESPACE
|
|
LOCK2 --> NAMESPACE
|
|
LOCK3 --> PIPELINE
|
|
|
|
QUEUE --> TRACK
|
|
PRIORITY --> TRACK
|
|
|
|
GLOBAL --> NAMESPACE
|
|
PIPELINE --> TRACK
|
|
|
|
style SEM1 fill:#FFE6E6
|
|
style GLOBAL fill:#E6FFE6
|
|
style QUEUE fill:#E6F3FF
|
|
```
|
|
|
|
### Concurrency Patterns
|
|
|
|
**Semaphore-Based Rate Limiting**: The system uses asyncio semaphores to limit concurrent operations and prevent overwhelming external services or exhausting resources. Different semaphores control different types of operations: LLM calls are limited by max_async (default 4), embedding function calls by embedding_func_max_async (default 8), and graph merging operations by graph_max_async (calculated as llm_model_max_async * 2). These semaphores ensure respectful API usage and prevent rate limit errors.
|
|
|
|
**Keyed Locks for Data Consistency**: When processing entities and relationships concurrently, the system uses keyed locks to ensure that multiple processes don't modify the same entity or relationship simultaneously. Each entity or relationship gets a unique lock based on its identifier, preventing race conditions during graph merging while still allowing parallel processing of different entities. This pattern enables high concurrency without sacrificing data consistency.
|
|
|
|
**Pipeline Status Tracking**: A shared pipeline status object tracks the current state of document processing including the active job name, number of documents being processed, current batch number, latest status message, and history of messages for debugging. This status is protected by an async lock and can be queried by clients to monitor progress. The status persists across the entire pipeline and provides visibility into long-running operations.
|
|
|
|
**Workspace Isolation**: The workspace concept provides multi-tenant isolation by prefixing all storage namespaces with a workspace identifier. Different workspaces maintain completely separate data including separate storage instances, independent configuration, isolated locks and semaphores, and separate pipeline status. This enables running multiple LightRAG instances in the same infrastructure without interference.
|
|
|
|
### TypeScript Migration Considerations for Concurrency
|
|
|
|
**Semaphore Implementation**: Node.js doesn't have built-in semaphores, but the pattern can be implemented using the p-limit library, which provides similar functionality with a cleaner API. Example: `const limiter = pLimit(4); await limiter(() => callLLM())`.
|
|
|
|
**Keyed Locks**: For single-process deployments, a Map<string, Promise> can implement keyed locks. For multi-process deployments, consider using Redis with the Redlock algorithm or a dedicated lock service. The key insight is ensuring that operations on the same entity/relationship are serialized while different entities can be processed in parallel.
|
|
|
|
**Shared State Management**: Python's global dictionaries need to be replaced with class-based state management in TypeScript. For multi-process deployments, shared state should be externalized to Redis or a similar store. For single-process deployments, singleton classes can manage state with proper TypeScript visibility controls.
|
|
|
|
**Pipeline Status Updates**: Real-time status updates can be implemented using EventEmitter in Node.js for in-process communication, or Redis Pub/Sub for multi-process scenarios. WebSocket connections can provide real-time updates to clients without polling.
|
|
|
|
## Summary
|
|
|
|
This architecture documentation provides a comprehensive view of LightRAG's design, from high-level layer organization to detailed component interactions and concurrency patterns. The system's modular design, with clear interfaces and abstractions, makes it well-suited for migration to TypeScript. The key architectural principles—layered separation of concerns, plugin-based storage abstraction, async-first concurrency, and comprehensive state tracking—translate well to TypeScript and Node.js idioms.
|
|
|
|
The subsequent documentation sections build on this architectural foundation, providing detailed specifications for data models, storage implementations, LLM integrations, and API contracts. Together, these documents form a complete blueprint for implementing a production-ready TypeScript version of LightRAG.
|