ADR 006: Architecture Diagrams and Alternatives Analysis
Status: Proposed
Proposed Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│ LightRAG Multi-Tenant System │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Application │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Request Middleware Layer │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ • CORS Middleware │ │ │
│ │ │ • HTTPS Redirect │ │ │
│ │ │ • Rate Limiting (per tenant) │ │ │
│ │ │ • Request Logging & Audit │ │ │
│ │ │ • Idempotency Key Handling │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Authentication & Tenant Context Extraction │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ 1. Parse JWT token or API key from headers │ │ │
│ │ │ 2. Validate signature and expiration │ │ │
│ │ │ 3. Extract tenant_id, kb_id, user_id, permissions │ │ │
│ │ │ 4. Verify token.tenant_id == path.tenant_id │ │ │
│ │ │ 5. Verify user can access kb_id │ │ │
│ │ │ → Returns TenantContext object │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ API Routing Layer │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ /api/v1/tenants/{tenant_id}/ │ │ │
│ │ │ ├─ knowledge-bases/{kb_id}/documents/* │ │ │
│ │ │ ├─ knowledge-bases/{kb_id}/query* │ │ │
│ │ │ ├─ knowledge-bases/{kb_id}/graph/* │ │ │
│ │ │ ├─ knowledge-bases/* │ │ │
│ │ │ └─ api-keys/* │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Request Handlers (with TenantContext injected) │ │ │
│ │ ├─────────────────────────────────────────────────────────┤ │ │
│ │ │ • Validate permissions on TenantContext │ │ │
│ │ │ • Get tenant-specific RAG instance │ │ │
│ │ │ • Pass context to business logic │ │ │
│ │ │ • Return response with audit trail │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Tenant-Aware LightRAG Instance Manager │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Instance Cache: │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ (tenant_1, kb_1) → LightRAG@memory │ │ │
│ │ │ (tenant_1, kb_2) → LightRAG@memory │ │ │
│ │ │ (tenant_2, kb_1) → LightRAG@memory │ │ │
│ │ │ (tenant_3, kb_1) → LightRAG@memory │ │ │
│ │ │ ... │ │ │
│ │ │ Max: 100 instances (configurable) │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Each LightRAG instance: │ │
│ │ • Uses tenant-specific configuration (LLM, embedding models) │ │
│ │ • Works with dedicated namespace: {tenant_id}_{kb_id} │ │
│ │ • Isolated storage connections │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Storage Access Layer (with Tenant Filtering) │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Query Modification: │ │
│ │ Before: SELECT * FROM documents WHERE doc_id = 'abc' │ │
│ │ After: SELECT * FROM documents │ │
│ │ WHERE tenant_id = 'acme' AND kb_id = 'docs' │ │
│ │ AND doc_id = 'abc' │ │
│ │ │ │
│ │ • All queries automatically scoped to current tenant/KB │ │
│ │ • Prevents accidental cross-tenant data access │ │
│ │ • Storage layer enforces isolation (defense in depth) │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Storage Backends (Shared) │ │
│ ├──────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────┐ ┌────────────────────┐ │ │
│ │ │ PostgreSQL │ │ Neo4j │ │ Milvus/Qdrant │ │ │
│ │ │ (Shared DB) │ │ (Shared) │ │ (Vector Store) │ │ │
│ │ ├─────────────────┤ ├─────────────┤ ├────────────────────┤ │ │
│ │ │ • Documents │ │ • Entities │ │ • Embeddings │ │ │
│ │ │ • Chunks │ │ • Relations │ │ • Entity vectors │ │ │
│ │ │ • Entities │ │ │ │ │ │ │
│ │ │ • API Keys │ │ Each node │ │ Each vector │ │ │
│ │ │ • Tenants │ │ tagged with │ │ tagged with │ │ │
│ │ │ • KBs │ │ tenant_id + │ │ tenant_id + kb_id │ │ │
│ │ │ │ │ kb_id │ │ │ │ │
│ │ │ Filtered by: │ │ │ │ Filtered by: │ │ │
│ │ │ tenant_id, │ │ Filtered by:│ │ tenant_id, │ │ │
│ │ │ kb_id in WHERE │ │ tenant_id + │ │ kb_id in query │ │ │
│ │ │ │ │ kb_id │ │ │ │ │
│ │ └─────────────────┘ └─────────────┘ └────────────────────┘ │ │
│ │ │ │
│ │ All with tenant/KB isolation at schema/data level │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
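The "Authentication & Tenant Context Extraction" layer in the diagram above maps naturally onto a FastAPI dependency. The following is a minimal sketch under assumptions: `TenantContext`, `get_tenant_context`, the JWT claim names, and `JWT_SECRET` are illustrative, not the project's actual identifiers.

```python
# Illustrative sketch: TenantContext, the claim names, and JWT_SECRET are
# assumptions, not the project's actual identifiers.
from dataclasses import dataclass, field

import jwt  # PyJWT
from fastapi import Header, HTTPException

JWT_SECRET = "change-me"  # assumed shared secret for this sketch


@dataclass
class TenantContext:
    tenant_id: str
    kb_id: str
    user_id: str
    role: str
    permissions: dict[str, bool] = field(default_factory=dict)


async def get_tenant_context(
    tenant_id: str,                    # taken from the URL path by FastAPI
    kb_id: str,                        # taken from the URL path by FastAPI
    authorization: str = Header(...),  # "Bearer <token>"
) -> TenantContext:
    token = authorization.removeprefix("Bearer ").strip()
    try:
        # Step 2 in the diagram: validate signature and expiration.
        claims = jwt.decode(token, JWT_SECRET, algorithms=["HS256"])
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="Invalid or expired token")

    # Step 4: the tenant in the token must match the tenant in the path.
    if claims.get("tenant_id") != tenant_id:
        raise HTTPException(status_code=403, detail="Tenant mismatch")

    # Step 5: the caller must be allowed to access this knowledge base.
    if kb_id not in claims.get("kb_ids", []):
        raise HTTPException(status_code=403, detail="No access to this knowledge base")

    return TenantContext(
        tenant_id=tenant_id,
        kb_id=kb_id,
        user_id=claims.get("sub", ""),
        role=claims.get("role", "viewer"),
        permissions=claims.get("permissions", {}),
    )
```

A route handler would then declare `ctx: TenantContext = Depends(get_tenant_context)` to receive the context that step 3 of the query flow below injects.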
Data Flow Diagrams
Query Execution Flow
1. Client Request
├─ POST /api/v1/tenants/acme/knowledge-bases/docs/query
├─ Body: {"query": "What is..."}
└─ Header: Authorization: Bearer <token>
│
▼
2. Middleware Validation
├─ Extract tenant_id, kb_id from URL path
├─ Extract token from Authorization header
├─ Validate token signature and expiration
├─ Extract user_id, tenant_id_in_token, permissions
└─ VERIFY: tenant_id (path) == tenant_id_in_token
│
▼
3. Dependency Injection
├─ Create TenantContext(
│ tenant_id="acme",
│ kb_id="docs",
│ user_id="john",
│ role="editor",
│ permissions={"query:run": true}
└─ )
│
▼
4. Handler Authorization
├─ Check TenantContext.permissions["query:run"] == true
└─ If false → 403 Forbidden
│
▼
5. Get RAG Instance
├─ RAGManager.get_instance(tenant_id="acme", kb_id="docs")
├─ Check cache → Found → Use cached instance
└─ (If not cached: create new with tenant config)
│
▼
6. Execute Query
├─ RAG.aquery(query="What is...", tenant_context=context)
│ └─ All internal queries will include tenant/kb filters:
│ └─ Storage layer automatically adds:
│ WHERE tenant_id='acme' AND kb_id='docs'
│
▼
7. Storage Layer Filtering
├─ Vector search: Find embeddings WHERE tenant_id='acme' AND kb_id='docs'
├─ Graph query: Match entities {tenant_id:'acme', kb_id:'docs'}
├─ KV lookup: Get items with key prefix 'acme:docs:'
└─ Returns only acme/docs data (NO cross-tenant leakage possible)
│
▼
8. Response Generation
├─ RAG generates response from filtered data
├─ Response object created
└─ Handler receives response with TenantContext
│
▼
9. Audit Logging
├─ Log: {
│ user_id: "john",
│ tenant_id: "acme",
│ kb_id: "docs",
│ action: "query_executed",
│ status: "success",
│ timestamp: <now>
└─ }
│
▼
10. Response Returned to Client
└─ HTTP 200 with query result
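Steps 4–6 of this flow can be sketched as below. `RAGManager`, its LRU eviction policy, and the `aquery(..., tenant_context=...)` signature are assumptions made for illustration; the real instance manager and LightRAG API may differ.

```python
# Sketch only: RAGManager, its cache policy, and the aquery signature are assumed,
# as is the TenantContext object produced by the authentication dependency.
from collections import OrderedDict
from typing import Any

MAX_INSTANCES = 100  # mirrors the "Max: 100 instances (configurable)" note above


class RAGManager:
    """Caches one LightRAG instance per (tenant_id, kb_id), evicting the oldest."""

    def __init__(self, max_instances: int = MAX_INSTANCES) -> None:
        self._max = max_instances
        self._cache: "OrderedDict[tuple[str, str], Any]" = OrderedDict()

    def get_instance(self, tenant_id: str, kb_id: str) -> Any:
        key = (tenant_id, kb_id)
        if key in self._cache:
            self._cache.move_to_end(key)      # LRU: refresh on access
            return self._cache[key]
        instance = self._build_instance(tenant_id, kb_id)
        self._cache[key] = instance
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)   # evict the least recently used entry
        return instance

    def _build_instance(self, tenant_id: str, kb_id: str) -> Any:
        # Would construct LightRAG with the tenant's LLM/embedding configuration
        # and the dedicated namespace f"{tenant_id}_{kb_id}".
        raise NotImplementedError


async def run_query(ctx, query: str, manager: RAGManager):
    # Step 4: handler-level authorization against the injected TenantContext.
    if not ctx.permissions.get("query:run"):
        raise PermissionError("Missing query:run permission")

    # Step 5: tenant/KB-scoped RAG instance from the cache.
    rag = manager.get_instance(ctx.tenant_id, ctx.kb_id)

    # Step 6: the storage layer is expected to add tenant_id/kb_id filters.
    return await rag.aquery(query, tenant_context=ctx)
```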
Document Upload Flow
1. Client uploads document
├─ POST /api/v1/tenants/acme/knowledge-bases/docs/documents/add
├─ File: document.pdf
└─ Header: Authorization: Bearer <token>
│
▼
2. Authentication & Authorization
├─ Validate token, extract TenantContext
├─ Check permission: document:create
└─ Verify tenant_id matches path and token
│
▼
3. File Validation
├─ Check file type (PDF, DOCX, etc.)
├─ Check file size < quota
├─ Sanitize filename
└─ Generate unique doc_id
│
▼
4. Queue Document Processing
├─ Store temp file: /{working_dir}/{tenant_id}/{kb_id}/__tmp__/{doc_id}
├─ Create DocStatus record with status="processing"
├─ Return to client: {status: "processing", track_id: "..."}
└─ Start async processing task
│
▼
5. Async Document Processing (background task)
├─ Get RAG instance for (acme, docs)
├─ Insert document:
│ └─ RAG.ainsert(file_path, tenant_id="acme", kb_id="docs")
│ └─ Internal processing automatically tags data with:
│ └─ tenant_id="acme", kb_id="docs"
│
├─ Update DocStatus:
│ ├─ status → "success"
│ ├─ chunks_processed → 42
│ └─ entities_extracted → 15
│
└─ Move file: __tmp__ → {kb_id}/documents/
│
▼
6. Storage Writes (tenant-scoped)
├─ PostgreSQL:
│ └─ INSERT INTO chunks (tenant_id, kb_id, doc_id, content)
│ VALUES ('acme', 'docs', 'doc-123', '...')
│
├─ Neo4j:
│ └─ CREATE (e:Entity {tenant_id:'acme', kb_id:'docs', name:'...'})-[:IN_KB]->(kb)
│
└─ Milvus:
└─ Insert vector with metadata: {tenant_id:'acme', kb_id:'docs'}
│
▼
7. Client Polls for Status
├─ GET /api/v1/tenants/acme/knowledge-bases/docs/documents/{doc_id}/status
├─ Returns: {status: "success", chunks: 42, entities: 15}
└─ Client confirms upload complete
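Step 6 of this flow (tenant-scoped storage writes) comes down to carrying tenant_id and kb_id into every write and repeating them in every read. A minimal asyncpg sketch, assuming a chunks table with tenant_id/kb_id columns; table and column names are illustrative, not the actual schema:

```python
# Illustrative only: table and column names are assumptions, not the actual schema.
import asyncpg


async def insert_chunk(pool: asyncpg.Pool, ctx, doc_id: str,
                       chunk_id: str, content: str) -> None:
    # Every write carries the tenant/KB scope; no chunk row exists without it.
    await pool.execute(
        """
        INSERT INTO chunks (tenant_id, kb_id, doc_id, chunk_id, content)
        VALUES ($1, $2, $3, $4, $5)
        """,
        ctx.tenant_id, ctx.kb_id, doc_id, chunk_id, content,
    )


async def fetch_chunks(pool: asyncpg.Pool, ctx, doc_id: str):
    # Every read is filtered by the same scope (defense in depth: even a leaked
    # doc_id cannot reach another tenant's rows).
    return await pool.fetch(
        """
        SELECT chunk_id, content
        FROM chunks
        WHERE tenant_id = $1 AND kb_id = $2 AND doc_id = $3
        """,
        ctx.tenant_id, ctx.kb_id, doc_id,
    )
```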
Alternatives Considered
Alternative 1: Separate Database Per Tenant
Architecture:
- Each tenant gets dedicated PostgreSQL database
- Separate Neo4j instances per tenant
- Separate Milvus collections per tenant
Tenant A Server → PostgreSQL A
→ Neo4j A
→ Milvus A
Tenant B Server → PostgreSQL B
→ Neo4j B
→ Milvus B
Pros:
- Maximum isolation (physical separation)
- Easier compliance (HIPAA, GDPR)
- Better disaster recovery per tenant
- Easier scaling (scale out per tenant)
Cons:
- ❌ Massive operational overhead
- Each database needs separate backup, upgrade, monitoring
- 100 tenants = 100 databases to manage
- Database licensing costs multiply (100x more expensive)
- ❌ Complex deployment & maintenance
- Infrastructure-as-Code becomes complex
- Database credentials management nightmare
- Harder debugging with distributed databases
- ❌ Impossible resource sharing
- Cannot leverage shared compute resources
- Cannot optimize resource usage globally
- Waste of resources (each DB has minimum overhead)
- ❌ Cross-tenant features impossible
- Data sharing between tenants difficult
- Consolidated reporting/analytics hard to implement
Decision: REJECTED. Too expensive and operationally complex for moderate scale.
Alternative 2: Dedicated Server Per Tenant
Architecture:
- Each tenant runs own LightRAG instance
- Own Python process, own configurations
- Own memory/CPU allocation
Tenant A → LightRAG Process A (port 9621)
Tenant B → LightRAG Process B (port 9622)
Tenant C → LightRAG Process C (port 9623)
Pros:
- Complete isolation (separate processes)
- Easy to manage per-tenant configs
- Can use different models per tenant
Cons:
- ❌ Massive resource waste
- Minimum ~500MB RAM per instance × 100 tenants = 50GB+ RAM
- Minimum CPU overhead per process
- ❌ Extremely expensive at scale
- 100 tenants × 4GB allocated = 400GB RAM needed
- Infrastructure costs prohibitive
- ❌ Operational nightmare
- 100 processes to monitor
- 100 upgrades/patches to manage
- Complex deployment orchestration
- ❌ Poor utilization
- Most tenants underutilize their resources
- Cannot rebalance resources dynamically
- Peak loads unpredictable per tenant
Decision: REJECTED. Not economically viable for enterprise deployments.
Alternative 3: Simple Workspace Rename (No Knowledge Base)
Architecture:
- Rename "workspace" to "tenant"
- No KB concept
- Assume 1 KB per tenant
POST /api/v1/workspaces/{workspace_id}/query
→ becomes
POST /api/v1/tenants/{tenant_id}/query
Pros:
- Minimal code changes
- Backward compatible
- Quick implementation (1 week)
Cons:
- ❌ No knowledge base isolation
- Tenant with multiple unrelated KBs must share config
- Cannot have tenant-specific KB settings
- All data mixed together
- ❌ Cannot enforce cross-tenant access prevention
- Workspace is just a directory/field
- No API-level enforcement
- Easy to make mistakes
- ❌ No RBAC
- Cannot grant access to specific KBs
- All-or-nothing tenant access
- No fine-grained permissions
- ❌ No tenant-specific configuration
- All tenants must use same LLM/embedding models
- Cannot customize per tenant needs
- ❌ Limited compliance features
- No audit trails of who accessed what
- Difficult to enforce data residency
- No resource quotas
Decision: REJECTED. Does not meet the business requirements for true multi-tenancy.
Alternative 4: Shared Single LightRAG for All Tenants
Architecture:
- One LightRAG instance for all tenants
- Single namespace, single graph
- Tenant filtering only at API layer
API Layer → Filters query by tenant → Single LightRAG Instance
Pros:
- Minimal resource usage
- Single deployment
- Simple to maintain
Cons:
- ❌ Data isolation risk is CRITICAL
- Single point of failure for all tenants
- One query mistake → cross-tenant data leak
- Cannot be patched without affecting all
- ❌ Performance bottleneck
- Single instance cannot scale with tenants
- All LLM calls compete for resources
- All embedding calls serialized
- ❌ Tenant-specific configuration impossible
- All tenants share same models
- Cannot customize chunk size, top_k, etc. per tenant
- ❌ No blast radius isolation
- One tenant's bad data can corrupt all
- One tenant's quota exhaustion affects all
- ❌ Compliance impossible
- Data residency requirements: cannot guarantee where data is
- GDPR right to deletion: must delete entire system
- Audit requirements: cannot track per-tenant operations
Decision: REJECTED. Unacceptable security and operational risks.
Alternative 5: Sharding by Tenant Hash
Architecture:
- Hash tenant ID
- Route to specific shard server
- Multiple instances with different tenant ranges
Tenant Hash % 3
├─ Shard 0: LightRAG A (tenants 0, 3, 6, 9...)
├─ Shard 1: LightRAG B (tenants 1, 4, 7, 10...)
└─ Shard 2: LightRAG C (tenants 2, 5, 8, 11...)
Pros:
- Distributes load across instances
- Better than single instance
- Can grow to 3+ instances
Cons:
- ❌ Breaks operational simplicity
- Need load balancer + routing logic
- Shards must be preconfigured
- Adding tenant requires determining shard
- ❌ Rebalancing is complex
- Adding new shard requires data migration
- Tenant addition might change shard assignment
- Hotspots impossible to fix dynamically
- ❌ Doesn't reduce fundamental costs
- Still need multiple instances
- Each instance has full overhead
- Only slightly better than per-tenant instances
- ❌ More complex than multi-tenant single instance
- Routing logic adds latency
- Debugging harder (data could be on any shard)
- Cross-shard features harder to implement
Decision: REJECTED. Adds complexity without enough benefit over the proposed multi-tenant single-instance approach.
Comparison Table
| Approach | Isolation | Cost | Complexity | Scalability | Selected |
|---|---|---|---|---|---|
| Proposed: Single Instance Multi-Tenant | ✓ Good | ✓ Low | ✓ Medium | ✓ Excellent | ✓ YES |
| Alt 1: DB Per Tenant | ✓✓ Perfect | ✗✗ 100x | ✗✗ Very High | ✗ Limited | ✗ |
| Alt 2: Server Per Tenant | ✓ Good | ✗✗ 50x | ✗ High | ✗ Limited | ✗ |
| Alt 3: Workspace Rename | ~ Weak | ✓ Very Low | ✓ Very Low | ✓ Good | ✗ |
| Alt 4: Single Instance | ✗ Poor | ✓ Very Low | ✓ Very Low | ✗ Poor | ✗ |
| Alt 5: Sharding | ✓ Good | ✗ 10-20x | ✗✗ High | ✓ Good | ✗ |
Why This Approach Wins
The proposed single instance, multi-tenant, multi-KB architecture offers the optimal balance:
- Security: Complete tenant isolation through multiple layers
- Cost: Efficient resource sharing (100 tenants ≈ 1.1x cost of single tenant)
- Complexity: Manageable (dependency injection handles most complexity)
- Scalability: Single instance can serve 100s of tenants, scales vertically well
- Compliance: Audit trails and data isolation support compliance needs
- Features: Supports RBAC, per-tenant config, resource quotas
Document Version: 1.0
Last Updated: 2025-11-20
Related Files: 001-multi-tenant-architecture-overview.md