# ADR 006: Architecture Diagrams and Alternatives Analysis
## Status: Proposed
## Proposed Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ LightRAG Multi-Tenant System │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Application │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Request Middleware Layer │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ • CORS Middleware │ │ │
│ │ │ • HTTPS Redirect │ │ │
│ │ │ • Rate Limiting (per tenant) │ │ │
│ │ │ • Request Logging & Audit │ │ │
│ │ │ • Idempotency Key Handling │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Authentication & Tenant Context Extraction │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ 1. Parse JWT token or API key from headers │ │ │
│ │ │ 2. Validate signature and expiration │ │ │
│ │ │ 3. Extract tenant_id, kb_id, user_id, permissions │ │ │
│ │ │ 4. Verify token.tenant_id == path.tenant_id │ │ │
│ │ │ 5. Verify user can access kb_id │ │ │
│ │ │ → Returns TenantContext object │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ API Routing Layer │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ /api/v1/tenants/{tenant_id}/ │ │ │
│ │ │ ├─ knowledge-bases/{kb_id}/documents/* │ │ │
│ │ │ ├─ knowledge-bases/{kb_id}/query* │ │ │
│ │ │ ├─ knowledge-bases/{kb_id}/graph/* │ │ │
│ │ │ ├─ knowledge-bases/* │ │ │
│ │ │ └─ api-keys/* │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Request Handlers (with TenantContext injected) │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ • Validate permissions on TenantContext │ │ │
│ │ │ • Get tenant-specific RAG instance │ │ │
│ │ │ • Pass context to business logic │ │ │
│ │ │ • Return response with audit trail │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Tenant-Aware LightRAG Instance Manager │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Instance Cache: │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ (tenant_1, kb_1) → LightRAG@memory │ │ │
│ │ │ (tenant_1, kb_2) → LightRAG@memory │ │ │
│ │ │ (tenant_2, kb_1) → LightRAG@memory │ │ │
│ │ │ (tenant_3, kb_1) → LightRAG@memory │ │ │
│ │ │ ... │ │ │
│ │ │ Max: 100 instances (configurable) │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Each LightRAG instance: │ │
│ │ • Uses tenant-specific configuration (LLM, embedding models) │ │
│ │ • Works with dedicated namespace: {tenant_id}_{kb_id} │ │
│ │ • Isolated storage connections │ │
│ │                                                                  │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Storage Access Layer (with Tenant Filtering) │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Query Modification: │ │
│ │ Before: SELECT * FROM documents WHERE doc_id = 'abc' │ │
│ │ After: SELECT * FROM documents │ │
│ │ WHERE tenant_id = 'acme' AND kb_id = 'docs' │ │
│ │ AND doc_id = 'abc' │ │
│ │ │ │
│ │ • All queries automatically scoped to current tenant/KB │ │
│ │ • Prevents accidental cross-tenant data access │ │
│ │ • Storage layer enforces isolation (defense in depth) │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Storage Backends (Shared) │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────┐ ┌────────────────────┐ │ │
│ │ │ PostgreSQL │ │ Neo4j │ │ Milvus/Qdrant │ │ │
│ │ │ (Shared DB) │ │ (Shared) │ │ (Vector Store) │ │ │
│ │ ├─────────────────┤ ├─────────────┤ ├────────────────────┤ │ │
│ │ │ • Documents │ │ • Entities │ │ • Embeddings │ │ │
│ │ │ • Chunks │ │ • Relations │ │ • Entity vectors │ │ │
│ │ │ • Entities │ │ │ │ │ │ │
│ │ │ • API Keys │ │ Each node │ │ Each vector │ │ │
│ │ │ • Tenants │ │ tagged with │ │ tagged with │ │ │
│ │ │ • KBs │ │ tenant_id + │ │ tenant_id + kb_id │ │ │
│ │ │ │ │ kb_id │ │ │ │ │
│ │ │ Filtered by: │ │ │ │ Filtered by: │ │ │
│ │ │ tenant_id, │ │ Filtered by:│ │ tenant_id, │ │ │
│ │ │ kb_id in WHERE │ │ tenant_id + │ │ kb_id in query │ │ │
│ │ │ │ │ kb_id │ │ │ │ │
│ │ └─────────────────┘ └─────────────┘ └────────────────────┘ │ │
│ │ │ │
│ │ All with tenant/KB isolation at schema/data level │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
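The instance cache shown above can be sketched as a small LRU map keyed by `(tenant_id, kb_id)`. This is an illustrative assumption, not LightRAG's actual manager API; `RAGManager` and the dict stand-in for a `LightRAG` instance are hypothetical names:

```python
from collections import OrderedDict

class RAGManager:
    """Caches one RAG instance per (tenant_id, kb_id), evicting LRU entries."""

    def __init__(self, max_instances: int = 100):
        self.max_instances = max_instances
        self._cache: "OrderedDict[tuple[str, str], object]" = OrderedDict()

    def get_instance(self, tenant_id: str, kb_id: str):
        key = (tenant_id, kb_id)
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as most recently used
            return self._cache[key]
        if len(self._cache) >= self.max_instances:
            self._cache.popitem(last=False)   # evict least recently used
        # Each instance works in a dedicated namespace: {tenant_id}_{kb_id}
        instance = {"namespace": f"{tenant_id}_{kb_id}"}  # stand-in for LightRAG(...)
        self._cache[key] = instance
        return instance
```

Keeping the cache bounded means the 100-instance ceiling in the diagram is a memory cap, not a tenant cap: idle tenants are evicted and lazily recreated on their next request.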
## Data Flow Diagrams
### Query Execution Flow
```
1. Client Request
├─ POST /api/v1/tenants/acme/knowledge-bases/docs/query
├─ Body: {"query": "What is..."}
└─ Header: Authorization: Bearer <token>
2. Middleware Validation
├─ Extract tenant_id, kb_id from URL path
├─ Extract token from Authorization header
├─ Validate token signature and expiration
├─ Extract user_id, tenant_id_in_token, permissions
└─ VERIFY: tenant_id (path) == tenant_id_in_token
3. Dependency Injection
├─ Create TenantContext(
│ tenant_id="acme",
│ kb_id="docs",
│ user_id="john",
│ role="editor",
│ permissions={"query:run": true}
└─ )
4. Handler Authorization
├─ Check TenantContext.permissions["query:run"] == true
└─ If false → 403 Forbidden
5. Get RAG Instance
├─ RAGManager.get_instance(tenant_id="acme", kb_id="docs")
├─ Check cache → Found → Use cached instance
└─ (If not cached: create new with tenant config)
6. Execute Query
├─ RAG.aquery(query="What is...", tenant_context=context)
│ └─ All internal queries will include tenant/kb filters:
│ └─ Storage layer automatically adds:
│ WHERE tenant_id='acme' AND kb_id='docs'
7. Storage Layer Filtering
├─ Vector search: Find embeddings WHERE tenant_id='acme' AND kb_id='docs'
├─ Graph query: Match entities {tenant_id:'acme', kb_id:'docs'}
├─ KV lookup: Get items with key prefix 'acme:docs:'
└─ Returns only acme/docs data (NO cross-tenant leakage possible)
8. Response Generation
├─ RAG generates response from filtered data
├─ Response object created
└─ Handler receives response with TenantContext
9. Audit Logging
├─ Log: {
│ user_id: "john",
│ tenant_id: "acme",
│ kb_id: "docs",
│ action: "query_executed",
│ status: "success",
│ timestamp: <now>
└─ }
10. Response Returned to Client
└─ HTTP 200 with query result
```
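Steps 2-4 of the flow above can be sketched as a context builder plus a permission check. `TenantContext`, `ForbiddenError`, and the helper names are assumptions for illustration; in the real service these would back a FastAPI dependency and map to HTTP 403:

```python
from dataclasses import dataclass, field

@dataclass
class TenantContext:
    tenant_id: str
    kb_id: str
    user_id: str
    role: str
    permissions: dict = field(default_factory=dict)

class ForbiddenError(Exception):
    """Maps to HTTP 403 Forbidden in the request handler."""

def build_context(path_tenant_id: str, kb_id: str, token_claims: dict) -> TenantContext:
    # Step 2: the tenant in the URL path must match the tenant baked into the token.
    if token_claims["tenant_id"] != path_tenant_id:
        raise ForbiddenError("tenant mismatch between path and token")
    return TenantContext(
        tenant_id=path_tenant_id,
        kb_id=kb_id,
        user_id=token_claims["user_id"],
        role=token_claims.get("role", "viewer"),
        permissions=token_claims.get("permissions", {}),
    )

def require(ctx: TenantContext, permission: str) -> None:
    # Step 4: deny unless the permission is explicitly granted.
    if not ctx.permissions.get(permission, False):
        raise ForbiddenError(f"missing permission: {permission}")
```

The mismatch check is the key isolation property: a valid token for tenant A can never be replayed against tenant B's URLs.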
### Document Upload Flow
```
1. Client uploads document
├─ POST /api/v1/tenants/acme/knowledge-bases/docs/documents/add
├─ File: document.pdf
└─ Header: Authorization: Bearer <token>
2. Authentication & Authorization
├─ Validate token, extract TenantContext
├─ Check permission: document:create
└─ Verify tenant_id matches path and token
3. File Validation
├─ Check file type (PDF, DOCX, etc.)
├─ Check file size < quota
├─ Sanitize filename
└─ Generate unique doc_id
4. Queue Document Processing
├─ Store temp file: /{working_dir}/{tenant_id}/{kb_id}/__tmp__/{doc_id}
├─ Create DocStatus record with status="processing"
├─ Return to client: {status: "processing", track_id: "..."}
└─ Start async processing task
5. Async Document Processing (background task)
├─ Get RAG instance for (acme, docs)
├─ Insert document:
│ └─ RAG.ainsert(file_path, tenant_id="acme", kb_id="docs")
│ └─ Internal processing automatically tags data with:
│ └─ tenant_id="acme", kb_id="docs"
├─ Update DocStatus:
│ ├─ status → "success"
│ ├─ chunks_processed → 42
│ └─ entities_extracted → 15
└─ Move file: __tmp__ → {kb_id}/documents/
6. Storage Writes (tenant-scoped)
├─ PostgreSQL:
│ └─ INSERT INTO chunks (tenant_id, kb_id, doc_id, content)
│ VALUES ('acme', 'docs', 'doc-123', '...')
├─ Neo4j:
│ └─ CREATE (e:Entity {tenant_id:'acme', kb_id:'docs', name:'...'})-[:IN_KB]->(kb)
└─ Milvus:
└─ Insert vector with metadata: {tenant_id:'acme', kb_id:'docs'}
7. Client Polls for Status
├─ GET /api/v1/tenants/acme/knowledge-bases/docs/documents/{doc_id}/status
├─ Returns: {status: "success", chunks: 42, entities: 15}
└─ Client confirms upload complete
```
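The tenant-scoped writes in step 6 can be sketched with `sqlite3` standing in for PostgreSQL. The schema and helper names are illustrative, but the pattern is the one described above: every row carries `tenant_id` and `kb_id`, and every read filters on both:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE chunks (
        tenant_id TEXT NOT NULL,
        kb_id     TEXT NOT NULL,
        doc_id    TEXT NOT NULL,
        content   TEXT NOT NULL
    )
""")

def insert_chunk(tenant_id, kb_id, doc_id, content):
    # Every write carries the tenant/KB columns so later reads can filter.
    conn.execute(
        "INSERT INTO chunks (tenant_id, kb_id, doc_id, content) VALUES (?, ?, ?, ?)",
        (tenant_id, kb_id, doc_id, content),
    )

def read_chunks(tenant_id, kb_id):
    # Reads are always scoped to the caller's tenant/KB: defense in depth
    # against cross-tenant leakage even if an upstream check is missed.
    cur = conn.execute(
        "SELECT doc_id, content FROM chunks WHERE tenant_id = ? AND kb_id = ?",
        (tenant_id, kb_id),
    )
    return cur.fetchall()

insert_chunk("acme", "docs", "doc-123", "chunk one")
insert_chunk("other", "docs", "doc-999", "not visible to acme")
```

A composite index on `(tenant_id, kb_id)` would keep these scoped lookups cheap as the shared tables grow.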
## Alternatives Considered
### Alternative 1: Separate Database Per Tenant
**Architecture:**
- Each tenant gets dedicated PostgreSQL database
- Separate Neo4j instances per tenant
- Separate Milvus collections per tenant
```
Tenant A Server → PostgreSQL A
→ Neo4j A
→ Milvus A
Tenant B Server → PostgreSQL B
→ Neo4j B
→ Milvus B
```
**Pros:**
- Maximum isolation (physical separation)
- Easier compliance (HIPAA, GDPR)
- Better disaster recovery per tenant
- Easier scaling (scale out per tenant)
**Cons:**
- ❌ Massive operational overhead
- Each database needs separate backup, upgrade, monitoring
- 100 tenants = 100 databases to manage
- Database licensing costs multiply (100x more expensive)
- ❌ Complex deployment & maintenance
- Infrastructure-as-Code becomes complex
- Database credentials management nightmare
- Harder debugging with distributed databases
- ❌ Impossible resource sharing
- Cannot leverage shared compute resources
- Cannot optimize resource usage globally
- Waste of resources (each DB has minimum overhead)
- ❌ Cross-tenant features impossible
- Data sharing between tenants difficult
- Consolidated reporting/analytics hard to implement
**Decision: REJECTED**
Too expensive and operationally complex for moderate scale.
---
### Alternative 2: Dedicated Server Per Tenant
**Architecture:**
- Each tenant runs own LightRAG instance
- Own Python process, own configurations
- Own memory/CPU allocation
```
Tenant A → LightRAG Process A (port 9621)
Tenant B → LightRAG Process B (port 9622)
Tenant C → LightRAG Process C (port 9623)
```
**Pros:**
- Complete isolation (separate processes)
- Easy to manage per-tenant configs
- Can use different models per tenant
**Cons:**
- ❌ Massive resource waste
- Minimum ~500MB RAM per instance × 100 tenants = 50GB+ RAM
- Minimum CPU overhead per process
- ❌ Extremely expensive at scale
- 100 tenants × 4GB allocated = 400GB RAM needed
- Infrastructure costs prohibitive
- ❌ Operational nightmare
- 100 processes to monitor
- 100 upgrades/patches to manage
- Complex deployment orchestration
- ❌ Poor utilization
- Most tenants underutilize their resources
- Cannot rebalance resources dynamically
- Peak loads unpredictable per tenant
**Decision: REJECTED**
Not economically viable for enterprise deployments.
---
### Alternative 3: Simple Workspace Rename (No Knowledge Base)
**Architecture:**
- Rename "workspace" to "tenant"
- No KB concept
- Assume 1 KB per tenant
```
POST /api/v1/workspaces/{workspace_id}/query
→ becomes
POST /api/v1/tenants/{tenant_id}/query
```
**Pros:**
- Minimal code changes
- Backward compatible
- Quick implementation (1 week)
**Cons:**
- ❌ No knowledge base isolation
- Tenant with multiple unrelated KBs must share config
- Cannot have tenant-specific KB settings
- All data mixed together
- ❌ Cannot enforce cross-tenant access prevention
- Workspace is just a directory/field
- No API-level enforcement
- Easy to make mistakes
- ❌ No RBAC
- Cannot grant access to specific KBs
- All-or-nothing tenant access
- No fine-grained permissions
- ❌ No tenant-specific configuration
- All tenants must use same LLM/embedding models
- Cannot customize per tenant needs
- ❌ Limited compliance features
- No audit trails of who accessed what
- Difficult to enforce data residency
- No resource quotas
**Decision: REJECTED**
Doesn't meet business requirements for true multi-tenancy.
---
### Alternative 4: Shared Single LightRAG for All Tenants
**Architecture:**
- One LightRAG instance for all tenants
- Single namespace, single graph
- Tenant filtering only at API layer
```
API Layer → Filters query by tenant → Single LightRAG Instance
```
**Pros:**
- Minimal resource usage
- Single deployment
- Simple to maintain
**Cons:**
- ❌ Data isolation risk is CRITICAL
- Single point of failure for all tenants
- One query mistake → cross-tenant data leak
- Cannot be patched without affecting all
- ❌ Performance bottleneck
- Single instance cannot scale with tenants
- All LLM calls compete for resources
- All embedding calls serialized
- ❌ Tenant-specific configuration impossible
- All tenants share same models
- Cannot customize chunk size, top_k, etc. per tenant
- ❌ No blast radius isolation
- One tenant's bad data can corrupt all
- One tenant's quota exhaustion affects all
- ❌ Compliance impossible
- Data residency requirements: cannot guarantee where data is
- GDPR right to deletion: must delete entire system
- Audit requirements: cannot track per-tenant operations
**Decision: REJECTED**
Unacceptable security and operational risks.
---
### Alternative 5: Sharding by Tenant Hash
**Architecture:**
- Hash tenant ID
- Route to specific shard server
- Multiple instances with different tenant ranges
```
Tenant Hash % 3
├─ Shard 0: LightRAG A (tenants 0, 3, 6, 9...)
├─ Shard 1: LightRAG B (tenants 1, 4, 7, 10...)
└─ Shard 2: LightRAG C (tenants 2, 5, 8, 11...)
```
**Pros:**
- Distributes load across instances
- Better than single instance
- Can grow to 3+ instances
**Cons:**
- ❌ Breaks operational simplicity
- Need load balancer + routing logic
- Shards must be preconfigured
- Adding tenant requires determining shard
- ❌ Rebalancing is complex
- Adding new shard requires data migration
- Tenant addition might change shard assignment
- Hotspots impossible to fix dynamically
- ❌ Doesn't reduce fundamental costs
- Still need multiple instances
- Each instance has full overhead
- Only slightly better than per-tenant instances
- ❌ More complex than multi-tenant single instance
- Routing logic adds latency
- Debugging harder (data could be on any shard)
- Cross-shard features harder to implement
**Decision: REJECTED**
Adds routing and rebalancing complexity without enough benefit over the proposed multi-tenant single-instance approach.
---
### Comparison Table
| Approach | Isolation | Cost | Complexity | Scalability | Selected |
|----------|-----------|------|-----------|-------------|----------|
| **Proposed: Single Instance Multi-Tenant** | ✓ Good | ✓ Low | ✓ Medium | ✓ Excellent | **✓ YES** |
| Alt 1: DB Per Tenant | ✓✓ Perfect | ✗✗ 100x | ✗✗ Very High | ✗ Limited | ✗ |
| Alt 2: Server Per Tenant | ✓ Good | ✗✗ 50x | ✗ High | ✗ Limited | ✗ |
| Alt 3: Workspace Rename | ~ Weak | ✓ Very Low | ✓ Very Low | ✓ Good | ✗ |
| Alt 4: Single Instance | ✗ Poor | ✓ Very Low | ✓ Very Low | ✗ Poor | ✗ |
| Alt 5: Sharding | ✓ Good | ✗ 10-20x | ✗✗ High | ✓ Good | ✗ |
## Why This Approach Wins
The proposed **single instance, multi-tenant, multi-KB** architecture offers the optimal balance:
1. **Security**: Complete tenant isolation through multiple layers
2. **Cost**: Efficient resource sharing (100 tenants ≈ 1.1x cost of single tenant)
3. **Complexity**: Manageable (dependency injection handles most complexity)
4. **Scalability**: Single instance can serve 100s of tenants, scales vertically well
5. **Compliance**: Audit trails and data isolation support compliance needs
6. **Features**: Supports RBAC, per-tenant config, resource quotas
---
**Document Version**: 1.0
**Last Updated**: 2025-11-20
**Related Files**: 001-multi-tenant-architecture-overview.md