LightRAG/docs/archives/0001-multi-tenant-architecture.md
Raphael MANSUY 2b292d4924
docs: Enterprise Edition & Multi-tenancy attribution (#5)
2025-12-04 18:09:15 +08:00


Multi-Tenant Architecture

A comprehensive guide to understanding, activating, and implementing multi-tenant support across all storage backends

Last Updated: November 20, 2025
Status: Production Ready
Audience: Developers, DevOps Engineers, System Architects


Table of Contents

  1. Overview
  2. Architecture Model
  3. Multi-Tenant Concept
  4. Supported Backends
  5. How It Works
  6. Getting Started
  7. Implementation Examples
  8. Security & Isolation
  9. Migration Guide
  10. Troubleshooting

Overview

LightRAG now supports a complete multi-tenant architecture across all 10 storage backends. It isolates the data of multiple organizations, teams, or customers within a single LightRAG deployment.

Key Benefits

  • Complete Data Isolation: Database-level filtering prevents cross-tenant access
  • Easy Activation: Simple configuration with backward compatibility
  • All Backends Supported: Works with PostgreSQL, MongoDB, Redis, Neo4j, and vector/graph databases
  • Zero Breaking Changes: Existing code continues to work with defaults
  • Scale Efficiently: Run one instance for multiple tenants

Real-World Scenario

┌─────────────────────────────────────────────────────────────────┐
│                    Single LightRAG Deployment                   │
│                                                                 │
│  ┌──────────────────────┐      ┌──────────────────────┐         │
│  │  Tenant: Acme Corp   │      │  Tenant: TechStart   │         │
│  │                      │      │                      │         │
│  │  KB: kb-prod         │      │  KB: kb-main         │         │
│  │  KB: kb-dev          │      │  KB: kb-staging      │         │
│  │                      │      │                      │         │
│  └──────────────────────┘      └──────────────────────┘         │
│           │                             │                       │
│           └─────────────┬───────────────┘                       │
│                         │                                       │
│        All data isolated at database level                      │
│                         │                                       │
│          ┌──────────────┴──────────────┐                        │
│          ▼                             ▼                        │
│      ┌─────────────────┐      ┌─────────────────┐               │
│      │   PostgreSQL    │      │     MongoDB     │               │
│      │  (tenant_id+kb) │      │  (tenant_id+kb) │               │
│      └─────────────────┘      └─────────────────┘               │
└─────────────────────────────────────────────────────────────────┘

Architecture Model

Hierarchical Structure

graph TD
    A["Deployment"] --> B["Tenant: Acme Corp"]
    A --> C["Tenant: TechStart"]
    B --> D["KB: kb-prod"]
    B --> E["KB: kb-dev"]
    C --> F["KB: kb-main"]
    C --> G["KB: kb-staging"]
    D --> H["Documents"]
    D --> I["Entities & Relations"]
    D --> J["Vectors"]
    E --> K["Documents"]
    E --> L["Entities & Relations"]
    F --> M["Documents"]
    G --> N["Entities & Relations"]
    
    style A fill:#E8F5E9,stroke:#2E7D32,stroke-width:2px,color:#1B5E20
    style B fill:#F3E5F5,stroke:#6A1B9A,stroke-width:2px,color:#38006B
    style C fill:#F3E5F5,stroke:#6A1B9A,stroke-width:2px,color:#38006B
    style D fill:#E0F2F1,stroke:#00796B,stroke-width:2px,color:#004D40
    style E fill:#E0F2F1,stroke:#00796B,stroke-width:2px,color:#004D40
    style F fill:#E0F2F1,stroke:#00796B,stroke-width:2px,color:#004D40
    style G fill:#E0F2F1,stroke:#00796B,stroke-width:2px,color:#004D40

Data Model - Composite Key Pattern

Each resource is identified by a composite key: (tenant_id, kb_id, resource_id)

┌────────────────────────────────────────────────────────────┐
│                    Composite Key Pattern                   │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  tenant_id  │   kb_id    │   resource_id   │   data       │
│  ─────────  │  ─────────  │   ─────────     │   ────       │
│  "acme"     │  "kb-prod"  │  "doc-123"      │   {...}      │
│  "acme"     │  "kb-dev"   │  "doc-456"      │   {...}      │
│  "techst"   │  "kb-main"  │  "doc-789"      │   {...}      │
│                                                            │
│  Same resource_id in different tenant/kb = different data │
│  Prevents accidental cross-tenant access                  │
│                                                            │
└────────────────────────────────────────────────────────────┘
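
The composite key pattern above can be sketched with a plain dictionary keyed by the full triple. This is a minimal illustration, not LightRAG's storage code: `ResourceKey`, `store`, and `get_resource` are hypothetical names introduced here.

```python
from typing import Any, Dict, NamedTuple


class ResourceKey(NamedTuple):
    """Hypothetical composite key: (tenant_id, kb_id, resource_id)."""
    tenant_id: str
    kb_id: str
    resource_id: str


# In-memory stand-in for a storage backend, keyed by the full composite key.
store: Dict[ResourceKey, Any] = {}

# The same resource_id under two tenants yields two independent entries.
store[ResourceKey("acme", "kb-prod", "doc-123")] = {"title": "Acme production doc"}
store[ResourceKey("techst", "kb-main", "doc-123")] = {"title": "TechStart doc"}


def get_resource(tenant_id: str, kb_id: str, resource_id: str) -> Any:
    """Look up a resource; returns None when the scope does not own the ID."""
    return store.get(ResourceKey(tenant_id, kb_id, resource_id))
```

Because the tenant and KB are part of the key itself, a lookup with the right `resource_id` but the wrong scope finds nothing.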

Multi-Tenant Concept

Three-Level Isolation

┌─────────────────────────────────────────────────────────────────┐
│                    Multi-Tenant Isolation Levels                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Level 1: TENANT                                               │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │ Organization/Customer/Account (highest level)            │  │
│  │ Example: "acme-corp", "techstart-inc"                    │  │
│  │ Isolation: Complete separation between tenants           │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Level 2: KNOWLEDGE BASE (KB)                                  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │ Project/Environment/Domain within tenant                 │  │
│  │ Examples:                                                │  │
│  │   - Acme Corp: kb-prod, kb-dev, kb-staging              │  │
│  │   - TechStart: kb-main, kb-backup                        │  │
│  │ Isolation: Separate data per KB within same tenant       │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Level 3: RESOURCES                                            │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │ Documents, Entities, Vectors, Relations (lowest level)   │  │
│  │ Automatically filtered by tenant + kb context            │  │
│  │ Isolation: Only accessible via tenant/kb scope           │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Data Access Pattern

sequenceDiagram
    participant Client as Client Application
    participant API as LightRAG API
    participant TenantCtx as Tenant Context
    participant Storage as Storage Backend

    Client->>API: GET /documents<br/>(tenant: acme-corp, kb: kb-prod)
    API->>TenantCtx: Validate & extract<br/>tenant_id, kb_id
    TenantCtx->>Storage: Query WHERE<br/>tenant_id='acme-corp'<br/>AND kb_id='kb-prod'
    Storage-->>API: Return filtered results
    API-->>Client: Documents (acme-corp only)

    Note over TenantCtx: Even if request<br/>contains tenant_id in URL,<br/>storage layer<br/>enforces isolation

Supported Backends

Complete Backend Coverage

| Backend | Isolation Method | Status | Module |
|---|---|---|---|
| PostgreSQL | Column filtering + composite keys | Complete | postgres_tenant_support.py |
| MongoDB | Document field filtering | Complete | mongo_tenant_support.py |
| Redis | Key prefixing (tenant:kb:key) | Complete | redis_tenant_support.py |
| Neo4j | Cypher + node relationships | Complete | graph_tenant_support.py |
| Memgraph | openCypher + properties | Complete | graph_tenant_support.py |
| NetworkX | Subgraph extraction | Complete | graph_tenant_support.py |
| Qdrant | Metadata filtering | Complete | vector_tenant_support.py |
| Milvus | WHERE expression filtering | Complete | vector_tenant_support.py |
| FAISS | Index naming + metadata | Complete | vector_tenant_support.py |
| Nano Vector DB | Document metadata | Complete | vector_tenant_support.py |

Backend Architecture Diagram

graph TB
    subgraph "Storage Backends"
        Relational["Relational"]
        Document["Document"]
        KV["Key-Value"]
        Vector["Vector"]
        Graph["Graph"]
        
        PG["PostgreSQL"]
        Mongo["MongoDB"]
        Redis["Redis"]
        Qdrant["Qdrant"]
        Milvus["Milvus"]
        FAISS["FAISS"]
        Nano["Nano VDB"]
        Neo4j["Neo4j"]
        Memgraph["Memgraph"]
        NetworkX["NetworkX"]
    end
    
    subgraph "Support Modules"
        PGSupport["postgres_tenant<br/>_support.py"]
        MongoSupport["mongo_tenant<br/>_support.py"]
        RedisSupport["redis_tenant<br/>_support.py"]
        VectorSupport["vector_tenant<br/>_support.py"]
        GraphSupport["graph_tenant<br/>_support.py"]
    end
    
    Relational --> PG
    Document --> Mongo
    KV --> Redis
    Vector --> Qdrant
    Vector --> Milvus
    Vector --> FAISS
    Vector --> Nano
    Graph --> Neo4j
    Graph --> Memgraph
    Graph --> NetworkX
    
    PG -.-> PGSupport
    Mongo -.-> MongoSupport
    Redis -.-> RedisSupport
    Qdrant -.-> VectorSupport
    Milvus -.-> VectorSupport
    FAISS -.-> VectorSupport
    Nano -.-> VectorSupport
    Neo4j -.-> GraphSupport
    Memgraph -.-> GraphSupport
    NetworkX -.-> GraphSupport
    
    style Relational fill:#F1F8E9,stroke:#558B2F,stroke-width:2px,color:#33691E
    style Document fill:#ECE7F3,stroke:#7B1FA2,stroke-width:2px,color:#4A148C
    style KV fill:#E0F2F1,stroke:#00897B,stroke-width:2px,color:#004D40
    style Vector fill:#FFF3E0,stroke:#E65100,stroke-width:2px,color:#BF360C
    style Graph fill:#F3E5F5,stroke:#6A1B9A,stroke-width:2px,color:#38006B
    style PGSupport fill:#C8E6C9,stroke:#2E7D32,stroke-width:2px,color:#1B5E20
    style MongoSupport fill:#D8C5E5,stroke:#512DA8,stroke-width:2px,color:#311B92
    style RedisSupport fill:#B2DFDB,stroke:#00695C,stroke-width:2px,color:#004D40
    style VectorSupport fill:#FFD8A8,stroke:#D84315,stroke-width:2px,color:#BF360C
    style GraphSupport fill:#E1BEE7,stroke:#7B1FA2,stroke-width:2px,color:#4A148C

How It Works

Query Execution Flow

┌──────────────────────────────────────────────────────────────┐
│             Typical Query Execution Flow                     │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  1. Client Request                                          │
│     GET /api/documents                                      │
│     Headers: {tenant: "acme-corp", kb: "kb-prod"}          │
│                                                              │
│  2. Extract Tenant Context                                  │
│     tenant_id = extract_from_request(request)              │
│     kb_id = extract_from_request(request)                  │
│                                                              │
│  3. Build Tenant-Aware Query                                │
│     Base Query:                                             │
│       SELECT * FROM documents WHERE status='active'         │
│                                                              │
│     Add Tenant Filter:                                      │
│       SELECT * FROM documents                               │
│       WHERE status='active'                                 │
│       AND tenant_id='acme-corp'                             │
│       AND kb_id='kb-prod'                                   │
│                                                              │
│  4. Execute Query                                           │
│     Storage backend executes filtered query                 │
│                                                              │
│  5. Return Results                                          │
│     Only documents from acme-corp/kb-prod returned          │
│                                                              │
└──────────────────────────────────────────────────────────────┘
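
Step 3 of the flow above (adding the tenant filter to a base query) might look roughly like the sketch below. It is an illustration under stated assumptions, not the implementation of `TenantSQLBuilder`: `add_tenant_filter` is a hypothetical helper, it uses `:name` placeholders as in the examples in this guide, and it only handles simple SELECT statements without ORDER BY, GROUP BY, or LIMIT.

```python
import re
from typing import Any, Dict, Optional, Tuple


def add_tenant_filter(
    base_query: str,
    tenant_id: str,
    kb_id: str,
    params: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Append tenant/KB predicates to a simple SELECT (hypothetical helper)."""
    if not tenant_id or not kb_id:
        raise ValueError("tenant_id and kb_id are both required")

    query = base_query.strip().rstrip(";")
    # Use AND when the query already has a WHERE clause, otherwise start one.
    has_where = re.search(r"\bwhere\b", query, re.IGNORECASE) is not None
    joiner = " AND " if has_where else " WHERE "
    filtered = f"{query}{joiner}tenant_id = :tenant_id AND kb_id = :kb_id"

    merged = dict(params or {})
    # Values travel as bound parameters and are never interpolated into the SQL.
    merged.update({"tenant_id": tenant_id, "kb_id": kb_id})
    return filtered, merged
```

Keeping the tenant values in the parameter dictionary, rather than formatting them into the SQL string, avoids injection through a crafted tenant header.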

Storage Layer Filtering

Each backend has its own filtering mechanism:

┌──────────────────────────────────────────────────────────────┐
│           Backend-Specific Filtering Methods                 │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  PostgreSQL:                                                │
│  WHERE clause + composite PRIMARY KEY                       │
│    (tenant_id, kb_id, id)                                   │
│                                                              │
│  MongoDB:                                                   │
│  Document filter                                            │
│    {tenant_id: "acme-corp", kb_id: "kb-prod"}             │
│                                                              │
│  Redis:                                                     │
│  Key prefix pattern                                         │
│    acme-corp:kb-prod:original_key                          │
│                                                              │
│  Qdrant (Vector DB):                                        │
│  Metadata filter                                            │
│    {"must": [{"key": "tenant_id", ...}, ...]}              │
│                                                              │
│  Neo4j (Graph DB):                                          │
│  Cypher property matching                                   │
│    WHERE node.tenant_id = 'acme-corp'                      │
│    AND node.kb_id = 'kb-prod'                              │
│                                                              │
└──────────────────────────────────────────────────────────────┘
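
To make the backend-specific forms above concrete, here is a small sketch that builds three of them as plain Python values. The function names are hypothetical and are not the helpers shipped in the support modules. The Qdrant filter is written in the JSON shape of Qdrant's payload filters rather than with client model classes.

```python
from typing import Any, Dict, Optional


def redis_key(tenant_id: str, kb_id: str, key: str) -> str:
    """Prefix a key as tenant:kb:key, matching the Redis pattern above."""
    return f"{tenant_id}:{kb_id}:{key}"


def mongo_filter(
    tenant_id: str, kb_id: str, extra: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Merge caller filters with the tenant fields; tenant fields always win."""
    return {**(extra or {}), "tenant_id": tenant_id, "kb_id": kb_id}


def qdrant_filter(tenant_id: str, kb_id: str) -> Dict[str, Any]:
    """Build a Qdrant-style payload filter as a plain dict."""
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant_id}},
            {"key": "kb_id", "match": {"value": kb_id}},
        ]
    }
```

In `mongo_filter`, the tenant fields are written last so that a caller-supplied `tenant_id` in `extra` cannot override the scope.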

Getting Started

Quick Activation

Multi-tenant support is built-in and automatically enabled. Here's how to use it:

Step 1: Import Support Modules

# For PostgreSQL
from lightrag.kg.postgres_tenant_support import TenantSQLBuilder

# For MongoDB  
from lightrag.kg.mongo_tenant_support import MongoTenantHelper

# For Redis
from lightrag.kg.redis_tenant_support import RedisTenantNamespace

# For Vector DBs (Qdrant, Milvus, FAISS, Nano)
from lightrag.kg.vector_tenant_support import QdrantTenantHelper

# For Graph DBs (Neo4j, Memgraph, NetworkX)
from lightrag.kg.graph_tenant_support import Neo4jTenantHelper

Step 2: Use Tenant Context

# Set the tenant context for your operation
tenant_id = "acme-corp"
kb_id = "kb-prod"

# Pass these values to the support-module helpers (see Implementation Examples).
# The helpers add the tenant/kb filter, so application code does not build it by hand.

Step 3: That's It!

Every operation that goes through the helpers is scoped to the given tenant and KB. Existing code keeps working with the default values.

Configuration

Minimal configuration is needed. To set defaults through environment variables (shell):

# Optional: Set default tenant for single-tenant scenarios
export LIGHTRAG_DEFAULT_TENANT="default"
export LIGHTRAG_DEFAULT_KB="default"

Or set the context at runtime (Python):

context = TenantContext(tenant_id="acme-corp", kb_id="kb-prod")
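
As a rough sketch of how the two environment variables could feed a runtime context, the snippet below defines a small immutable scope object. `TenantScope` and `scope_from_env` are hypothetical names and are not LightRAG's `TenantContext`. Falling back to "default" when a variable is unset is an assumption based on the values shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantScope:
    """Hypothetical stand-in for a tenant context; not LightRAG's own class."""
    tenant_id: str
    kb_id: str

    def __post_init__(self) -> None:
        # Reject empty identifiers early instead of issuing unscoped queries.
        if not self.tenant_id or not self.kb_id:
            raise ValueError("tenant_id and kb_id must be non-empty")


def scope_from_env() -> TenantScope:
    """Read the defaults from the environment, assuming "default" when unset."""
    return TenantScope(
        tenant_id=os.environ.get("LIGHTRAG_DEFAULT_TENANT", "default"),
        kb_id=os.environ.get("LIGHTRAG_DEFAULT_KB", "default"),
    )
```

Making the scope frozen means a request handler cannot switch tenant halfway through an operation by mutating a shared object.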

Implementation Examples

PostgreSQL Example

from lightrag.kg.postgres_tenant_support import TenantSQLBuilder

# Build a tenant-aware query
sql = "SELECT * FROM LIGHTRAG_DOC_FULL WHERE status = :status"
params = {"status": "active"}

# Add tenant filtering
filtered_sql, filtered_params = TenantSQLBuilder.build_filtered_query(
    base_query=sql,
    tenant_id="acme-corp",
    kb_id="kb-prod",
    additional_params=[params]
)

# Execute
result = await db.query(filtered_sql, filtered_params)
# Result: Only documents from acme-corp/kb-prod with status=active

MongoDB Example

from lightrag.kg.mongo_tenant_support import MongoTenantHelper

# Build tenant-aware filter
tenant_filter = MongoTenantHelper.get_tenant_filter(
    tenant_id="acme-corp",
    kb_id="kb-prod",
    additional_filter={"status": "active"}
)

# Use in query
document = await collection.find_one(tenant_filter)
# Result: Only returns documents from acme-corp/kb-prod

Redis Example

from lightrag.kg.redis_tenant_support import RedisTenantNamespace

# Create a tenant-scoped namespace
ns = RedisTenantNamespace(
    redis_client=redis,
    tenant_id="acme-corp",
    kb_id="kb-prod"
)

# All operations are automatically tenant-scoped
value = await ns.get("user:123")
await ns.set("user:123", json_data)
await ns.delete("user:123")

# Key stored as: "acme-corp:kb-prod:user:123"
# No tenant/kb prefix needed in application code

Vector DB (Qdrant) Example

from lightrag.kg.vector_tenant_support import QdrantTenantHelper

# Build tenant filter
tenant_filter = QdrantTenantHelper.build_qdrant_filter(
    tenant_id="acme-corp",
    kb_id="kb-prod"
)

# Search with automatic tenant isolation
results = await qdrant.search(
    collection_name="embeddings",
    query_vector=query_embedding,
    query_filter=tenant_filter,  # Automatic isolation
    limit=10
)
# Result: Only vectors from acme-corp/kb-prod

Graph DB (Neo4j) Example

from lightrag.kg.graph_tenant_support import Neo4jTenantHelper

helper = Neo4jTenantHelper()

# Build tenant-aware Cypher query
base_query = "MATCH (n:Entity) RETURN n"
query, params = helper.build_tenant_aware_query(
    base_query=base_query,
    tenant_id="acme-corp",
    kb_id="kb-prod",
    node_var="n"
)

# Execute
result = await session.run(query, params)
# Result: Only entities from acme-corp/kb-prod

Complete Application Example

from fastapi import Depends, FastAPI, Header
from lightrag.kg.postgres_tenant_support import TenantSQLBuilder

# get_db is an application-provided dependency that returns a database handle (not shown)

app = FastAPI()

@app.get("/documents")
async def get_documents(
    tenant_id: str = Header(...),
    kb_id: str = Header(...),
    db = Depends(get_db)
):
    """Get documents for a specific tenant/kb"""
    
    # Build tenant-scoped query
    query = "SELECT id, title, content FROM documents"
    
    filtered_sql, params = TenantSQLBuilder.build_filtered_query(
        base_query=query,
        tenant_id=tenant_id,
        kb_id=kb_id,
        additional_params=[]
    )
    
    # Execute (tenant context enforced at storage layer)
    documents = await db.query(filtered_sql, params)
    
    return {
        "tenant": tenant_id,
        "kb": kb_id,
        "documents": documents,
        "count": len(documents)
    }


@app.post("/documents/{doc_id}")
async def add_document(
    doc_id: str,
    content: dict,  # request body; parameters without defaults must come first
    tenant_id: str = Header(...),
    kb_id: str = Header(...),
    db = Depends(get_db)
):
    """Add a document for a specific tenant/kb"""
    
    # Composite key: (tenant_id, kb_id, doc_id)
    query = """
        INSERT INTO documents (tenant_id, kb_id, id, content)
        VALUES (:tenant_id, :kb_id, :id, :content)
    """
    
    result = await db.execute(query, {
        "tenant_id": tenant_id,
        "kb_id": kb_id,
        "id": doc_id,
        "content": content
    })
    
    return {
        "status": "created",
        "tenant": tenant_id,
        "kb": kb_id,
        "doc_id": doc_id
    }

Security & Isolation

Isolation Guarantees

┌─────────────────────────────────────────────────────────────┐
│          Multi-Tenant Isolation Guarantees                  │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Database-Level Enforcement                               │
│  Every query includes (tenant_id, kb_id) filtering         │
│  Impossible to retrieve data from other tenants            │
│                                                             │
│  Composite Key Constraints                                 │
│  PRIMARY KEY (tenant_id, kb_id, id)                        │
│  Prevents accidental ID collisions between tenants         │
│                                                             │
│  No Application-Level Trust                                │
│  Even if app code has bugs, storage layer enforces        │
│  Tenant isolation is deterministic, not probabilistic      │
│                                                             │
│  Migration Safety                                           │
│  Legacy single-tenant data maps to default tenant          │
│  Gradual migration path without data loss                  │
│                                                             │
│  Audit Trail                                                │
│  All operations include tenant context                     │
│  Easy to track which tenant accessed what                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
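
The composite key constraint and the per-query filter can be tried out with Python's built-in `sqlite3` module. SQLite is only a stand-in here for the PostgreSQL setup described in this guide, and the `documents` table layout and helper names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        tenant_id TEXT NOT NULL,
        kb_id     TEXT NOT NULL,
        id        TEXT NOT NULL,
        content   TEXT,
        PRIMARY KEY (tenant_id, kb_id, id)
    )
    """
)

# The same document ID can exist once per (tenant, KB) scope.
rows = [
    ("acme-corp", "kb-prod", "doc-1", "Acme production notes"),
    ("techstart", "kb-main", "doc-1", "TechStart notes"),
]
conn.executemany("INSERT INTO documents VALUES (?, ?, ?, ?)", rows)


def fetch_documents(tenant_id: str, kb_id: str) -> list:
    """Return only the rows that belong to the given tenant and KB."""
    cursor = conn.execute(
        "SELECT id, content FROM documents WHERE tenant_id = ? AND kb_id = ?",
        (tenant_id, kb_id),
    )
    return cursor.fetchall()


def insert_duplicate() -> bool:
    """Try to reuse an ID inside one scope; the composite key rejects it."""
    try:
        conn.execute(
            "INSERT INTO documents VALUES (?, ?, ?, ?)",
            ("acme-corp", "kb-prod", "doc-1", "duplicate"),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

Both tenants keep their own `doc-1`, a scoped query sees only its own row, and a second `doc-1` inside the same scope raises an integrity error.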

Security Checklist

# DO: Always include tenant context
@app.get("/documents")
async def get_docs(tenant_id: str = Header(...), kb_id: str = Header(...)):
    # build_filtered_query returns the SQL text and its parameters
    sql, params = TenantSQLBuilder.build_filtered_query(
        "SELECT * FROM documents", tenant_id, kb_id
    )
    return await db.query(sql, params)

# DON'T: Query without tenant filtering
@app.get("/documents")  # WRONG - no tenant context
async def get_docs():
    return await db.query("SELECT * FROM documents")

# DO: Validate tenant context early
async def validate_tenant_access(tenant_id, user_tenant):
    if tenant_id != user_tenant:
        raise PermissionError(f"Cannot access {tenant_id}")

# DO: Use composite keys consistently
key = f"{tenant_id}:{kb_id}:{resource_id}"

# DON'T: Use resource IDs without tenant prefix
key = f"doc:{resource_id}"  # WRONG - can collide with other tenants

Migration Guide

Migrating Existing Single-Tenant Data

The support modules include one-time migration utilities that move existing single-tenant data into a default tenant and KB. Examples for PostgreSQL, MongoDB, and Redis follow.

PostgreSQL Migration

from lightrag.kg.postgres_tenant_support import add_tenant_columns_migration

# Run one-time migration
await add_tenant_columns_migration(
    db=database_connection,
    default_tenant_id="default",
    default_kb_id="default"
)

# What it does:
# 1. Adds tenant_id and kb_id columns to all tables
# 2. Sets existing rows to default values
# 3. Creates composite indexes for performance
# 4. Updates PRIMARY KEY constraints

MongoDB Migration

from lightrag.kg.mongo_tenant_support import add_tenant_fields_to_collection

# Run migration on each collection
await add_tenant_fields_to_collection(
    collection=mongodb_collection,
    default_tenant_id="default",
    default_kb_id="default"
)

# Creates indexes:
# {tenant_id: 1, kb_id: 1, _id: 1}

Redis Migration (with Dry-Run)

from lightrag.kg.redis_tenant_support import migrate_redis_to_tenant

# Test migration first (dry-run)
stats = await migrate_redis_to_tenant(
    redis_client=redis,
    old_key_pattern="user:*",
    default_tenant_id="default",
    default_kb_id="default",
    dry_run=True  # Preview only
)

print(f"Will migrate: {stats['migrated']} keys")
print(f"Will skip: {stats['skipped']} keys")
print(f"Failed: {stats['failed']} keys")

# Run actual migration
stats = await migrate_redis_to_tenant(
    redis_client=redis,
    old_key_pattern="user:*",
    default_tenant_id="default",
    default_kb_id="default",
    dry_run=False  # Apply changes
)
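
The dry-run idea can be illustrated without a Redis server. The sketch below works on a plain dictionary and reports the same three counters as the example above. `migrate_keys` is a hypothetical stand-in, and the real `migrate_redis_to_tenant` may behave differently, for example in how it treats keys that already exist.

```python
from fnmatch import fnmatchcase
from typing import Dict


def migrate_keys(
    data: Dict[str, str],
    pattern: str,
    tenant_id: str,
    kb_id: str,
    dry_run: bool = True,
) -> Dict[str, int]:
    """Rename matching keys to tenant:kb:key and report what happened."""
    prefix = f"{tenant_id}:{kb_id}:"
    stats = {"migrated": 0, "skipped": 0, "failed": 0}

    for key in list(data):  # snapshot the keys because the dict may change
        if not fnmatchcase(key, pattern):
            continue
        new_key = prefix + key
        if key.startswith(prefix):
            stats["skipped"] += 1  # already namespaced
        elif new_key in data:
            stats["failed"] += 1  # target exists; do not overwrite it
        else:
            if not dry_run:
                data[new_key] = data.pop(key)
            stats["migrated"] += 1
    return stats
```

A dry run and the real run walk the same code path, so the preview counts match what the actual migration will do as long as the data does not change in between.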

Migration Workflow

┌──────────────────────────────────────────────────────┐
│         Safe Migration Process                       │
├──────────────────────────────────────────────────────┤
│                                                      │
│  1. BACKUP                                          │
│     - Create database snapshots                     │
│     - Export critical data                          │
│                                                      │
│  2. TEST ENVIRONMENT                                │
│     - Restore backup to test DB                     │
│     - Run migration with dry-run                    │
│     - Verify statistics match expectations          │
│                                                      │
│  3. PRODUCTION STAGING                              │
│     - Run migration on staging with dry-run         │
│     - Test application with new schema              │
│     - Monitor performance                           │
│                                                      │
│  4. PRODUCTION EXECUTION                            │
│     - Schedule maintenance window                   │
│     - Stop application                              │
│     - Run actual migration (dry_run=False)          │
│     - Verify data integrity                         │
│     - Restart application                           │
│                                                      │
│  5. VALIDATION                                      │
│     - Run integration tests                         │
│     - Check application logs                        │
│     - Verify tenant isolation                       │
│     - Monitor for 24 hours                          │
│                                                      │
└──────────────────────────────────────────────────────┘

Troubleshooting

Common Issues & Solutions

Issue 1: No tenant context found

# Problem
async def get_documents(db):
    result = await db.query("SELECT * FROM documents")
    # Error: No tenant context provided

# Solution
from lightrag.kg.postgres_tenant_support import TenantSQLBuilder

async def get_documents(
    db,
    tenant_id: str = Header(...),
    kb_id: str = Header(...),  # take the KB from the request instead of hardcoding it
):
    query = "SELECT * FROM documents"
    filtered_sql, params = TenantSQLBuilder.build_filtered_query(
        query, tenant_id, kb_id
    )
    return await db.query(filtered_sql, params)

Issue 2: Cross-tenant data visible

# Problem
filter_dict = {"status": "active"}  # Missing tenant fields!
result = await collection.find(filter_dict).to_list(length=None)

# Solution
from lightrag.kg.mongo_tenant_support import MongoTenantHelper

filter_dict = MongoTenantHelper.get_tenant_filter(
    "acme-corp", "kb-prod",
    additional_filter={"status": "active"}
)
# With an async driver such as Motor, find() returns a cursor; to_list() is what gets awaited
result = await collection.find(filter_dict).to_list(length=None)

Issue 3: Performance degradation after migration

# Solution: Ensure indexes exist
from lightrag.kg.postgres_tenant_support import get_tenant_indexes

# Get recommended indexes
indexes = get_tenant_indexes()

# Create in PostgreSQL
for index_sql in indexes:
    await db.execute(index_sql)

-- Verify in psql (the statements below are SQL, not Python)
ANALYZE documents;  -- Update statistics
EXPLAIN SELECT * FROM documents 
    WHERE tenant_id='acme-corp' 
    AND kb_id='kb-prod';  -- Check query plan

Issue 4: Backward compatibility broken

# Solution: Use default tenant values
context = TenantContext(
    tenant_id="default",  # Default for legacy code
    kb_id="default"
)

# Legacy code continues to work
result = await db.query(legacy_query)  # Uses default context

Debugging Multi-Tenant Issues

# Enable debug logging
import logging
logging.basicConfig(level=logging.DEBUG)

# Add tenant context to logs
import contextvars

tenant_context = contextvars.ContextVar(
    'tenant_context',
    default={'tenant_id': 'unknown', 'kb_id': 'unknown'}
)

# In middleware
def set_tenant_context(tenant_id, kb_id):
    tenant_context.set({'tenant_id': tenant_id, 'kb_id': kb_id})

# In logging
class TenantFilter(logging.Filter):
    def filter(self, record):
        ctx = tenant_context.get()
        record.tenant = ctx['tenant_id']
        record.kb = ctx['kb_id']
        return True

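handler = logging.StreamHandler()
handler.addFilter(TenantFilter())
# The formatter is what puts the tenant fields into each log line
handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(tenant)s:%(kb)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
))
# Replace the handler installed by basicConfig to avoid duplicate lines
logging.getLogger().handlers = [handler]

# Logs will show:
# 2025-11-20 10:30:45 [acme-corp:kb-prod] SELECT from documents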

Best Practices

DO

  • Always pass tenant context to every operation
  • Use support module helpers (don't build queries manually)
  • Create composite indexes on (tenant_id, kb_id, ...)
  • Validate tenant context early in request pipeline
  • Log all tenant-related operations
  • Test with multiple tenants before production
  • Monitor tenant-specific metrics
  • Document tenant requirements for new features

DON'T

  • Hardcode tenant IDs in application code
  • Query without tenant filtering
  • Assume application code enforces isolation
  • Skip index creation after migration
  • Mix tenants in a single transaction
  • Cache results across tenants without keying
  • Forget to pass tenant context to batch operations
  • Assume default values work for production
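
The caching rule in the list above can be sketched as a cache whose key always carries the tenant and KB. `TenantCache` is a hypothetical in-memory example introduced for illustration. It has no eviction or expiry, which a production cache would need.

```python
from typing import Any, Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]


class TenantCache:
    """Hypothetical cache whose keys always include tenant and KB."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, Any] = {}

    def get_or_compute(
        self,
        tenant_id: str,
        kb_id: str,
        query: str,
        compute: Callable[[], Any],
    ) -> Any:
        """Return the cached value for this scope, computing it on a miss."""
        key = (tenant_id, kb_id, query)
        if key not in self._entries:
            self._entries[key] = compute()
        return self._entries[key]
```

Two tenants that run the same query text get separate cache entries, so one tenant's results are never served to another.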

Performance Optimization

Index Strategy

-- PostgreSQL: composite index on all three columns
CREATE INDEX idx_doc_tenant_kb_id 
ON documents(tenant_id, kb_id, id);

-- For range queries
CREATE INDEX idx_doc_tenant_kb_created
ON documents(tenant_id, kb_id, created_at DESC);

// MongoDB (mongosh): compound index
db.documents.createIndex({
    tenant_id: 1,
    kb_id: 1,
    _id: 1
})

// For sorting
db.documents.createIndex({
    tenant_id: 1,
    kb_id: 1,
    created_at: -1
})

Query Optimization Tips

-- Good: Specific tenant filter
SELECT * FROM documents 
WHERE tenant_id='acme-corp' 
AND kb_id='kb-prod'
AND status='active'
ORDER BY created_at DESC;

-- Bad: Full table scan
SELECT * FROM documents 
WHERE status='active'
ORDER BY created_at DESC;

-- Good: Use indexes
EXPLAIN SELECT * FROM documents 
WHERE tenant_id='acme-corp' 
AND kb_id='kb-prod' 
AND created_at > NOW() - INTERVAL '7 days';

-- Result should show: "Index Scan" (not "Seq Scan")
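
The same check can be scripted. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, so the plan wording differs from PostgreSQL's `EXPLAIN` output, but it shows the same principle: the tenant-scoped query can use the composite index, while the unscoped one cannot. The table layout and the `query_plan` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "tenant_id TEXT, kb_id TEXT, id TEXT, status TEXT, created_at TEXT)"
)
conn.execute(
    "CREATE INDEX idx_doc_tenant_kb_created "
    "ON documents(tenant_id, kb_id, created_at DESC)"
)


def query_plan(sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


scoped_plan = query_plan(
    "SELECT * FROM documents WHERE tenant_id = ? AND kb_id = ? "
    "ORDER BY created_at DESC",
    ("acme-corp", "kb-prod"),
)
unscoped_plan = query_plan(
    "SELECT * FROM documents WHERE status = ? ORDER BY created_at DESC",
    ("active",),
)
```

Comparing `scoped_plan` and `unscoped_plan` in a test is one way to catch a query that silently loses its tenant filter and falls back to a full scan.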

Summary

Multi-Tenant Architecture at a Glance

┌─────────────────────────────────────────────────────┐
│   Multi-Tenant Architecture Summary                 │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Goal: Securely isolate data for multiple          │
│         tenants in a single deployment             │
│                                                     │
│  Method: Database-level filtering by               │
│           (tenant_id, kb_id)                       │
│                                                     │
│  Supported: All 10 storage backends                │
│                                                     │
│  Activation: Use support modules, pass             │
│              tenant context to every operation     │
│                                                     │
│  Backward Compatible: Existing code works          │
│                       with default values          │
│                                                     │
│  Secure: Storage layer enforces isolation          │
│          even if application has bugs              │
│                                                     │
│  Scalable: One instance, many tenants              │
│                                                     │
└─────────────────────────────────────────────────────┘

Next Steps

  1. Review Implementation Examples above for your backend
  2. Run Tests: pytest tests/test_multi_tenant_backends.py -v
  3. Plan Migration: Use migration utilities with dry-run first
  4. Deploy: Follow the safe migration workflow in the Migration Guide section
  5. Monitor: Watch tenant-specific metrics in production

Additional Resources

  • Complete API Reference: See QUICK_REFERENCE_MULTI_TENANT.md
  • Deployment Guide: See PHASE4_COMPLETE_MULTI_TENANT_SUMMARY.md
  • Architecture Details: See MULTI_TENANT_COMPLETE_IMPLEMENTATION.md
  • Code Examples: See support modules in lightrag/kg/
  • Test Suite: See tests/test_multi_tenant_backends.py

Questions? Check the Troubleshooting section or review code examples