Storage Layer Multi-Tenant Audit

Date: November 29, 2025
Status: In Progress

Overview

This document audits the multi-tenant implementation in the LightRAG storage layer, covering PostgreSQL, Redis, and the vector database backends.

Components Under Audit

1. PostgreSQL Multi-Tenant Support

Table Schema (`kg/postgres_tenant_support.py`)

Table DDL Pattern:

CREATE TABLE LIGHTRAG_DOC_STATUS (
    tenant_id VARCHAR(255) NOT NULL,
    kb_id VARCHAR(255) NOT NULL,
    workspace VARCHAR(255),
    id VARCHAR(255) NOT NULL,
    ...
    CONSTRAINT LIGHTRAG_DOC_STATUS_PK PRIMARY KEY (tenant_id, kb_id, id)
)

✅ Strengths:

All tables have tenant_id and kb_id columns
Composite primary keys enforce uniqueness per tenant/KB
Indexes designed for tenant-scoped queries

Tables with Multi-Tenant Support:

Table	tenant_id	kb_id	Composite PK
LIGHTRAG_DOC_FULL	✅	✅	✅
LIGHTRAG_DOC_CHUNKS	✅	✅	✅
LIGHTRAG_VDB_CHUNKS	✅	✅	✅
LIGHTRAG_VDB_ENTITY	✅	✅	✅
LIGHTRAG_VDB_RELATION	✅	✅	✅
LIGHTRAG_LLM_CACHE	✅	✅	✅
LIGHTRAG_DOC_STATUS	✅	✅	✅
LIGHTRAG_FULL_ENTITIES	✅	✅	✅
LIGHTRAG_FULL_RELATIONS	✅	✅	✅

SQL Builder (`TenantSQLBuilder`)

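@staticmethod
def add_tenant_filter(sql: str, table_alias: str = "", param_index: int = 1) -> Tuple[str, int]:
    # Assumed: the excerpt does not show how prefix is derived; qualify columns when an alias is given.
    prefix = f"{table_alias}." if table_alias else ""
    tenant_filter = f"{prefix}tenant_id=${param_index} AND {prefix}kb_id=${param_index + 1}"
    if "WHERE" in sql:
        # Inject the tenant predicate ahead of the first (case-sensitive) WHERE.
        sql = sql.replace("WHERE", f"WHERE {tenant_filter} AND", 1)
    else:
        sql += f" WHERE {tenant_filter}"
    # Return the next free placeholder index so callers can append their own parameters.
    return sql, param_index + 2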

✅ Strengths:

Automatic injection of tenant filters
Parameterized queries (SQL injection safe)
Handles both existing WHERE and new WHERE clauses

⚠️ Potential Issues:

Simple, case-sensitive string replacement: can fail on CTEs, subqueries, or a lowercase where (see STG-002)
No validation of the sql input

Context Variable (`utils_context.py`)

tenant_id_var: ContextVar[Optional[str]] = ContextVar("tenant_id", default=None)

def get_current_tenant_id() -> Optional[str]:
    return tenant_id_var.get()

✅ Strengths:

Thread-safe and async-safe via ContextVar
Can be accessed deep in the call stack

⚠️ Potential Issues:

Returns None by default (needs checking by callers)
No kb_id context variable observed

PostgreSQL RLS (`postgres_rls.sql`)

Purpose: Row-Level Security for additional protection.

-- Tenant RLS policy
CREATE POLICY tenant_isolation ON LIGHTRAG_DOC_STATUS
    USING (tenant_id = current_setting('app.current_tenant', true));

✅ Strengths:

Defense-in-depth security
Database-level enforcement
Even if application code omits the tenant filter, RLS blocks cross-tenant access

⚠️ Potential Issues:

Requires setting app.current_tenant on the connection before each query
Policy predicates are evaluated on every row access, which may affect performance

2. Redis Multi-Tenant Support (`kg/redis_tenant_support.py`)

Key Pattern

@staticmethod
def make_tenant_key(tenant_id: str, kb_id: str, original_key: str) -> str:
    return f"{tenant_id}:{kb_id}:{original_key}"

Format: tenant_id:kb_id:original_key

Examples:

acme:kb-prod:doc-123
techstart:kb-main:entity-456

✅ Strengths:

Consistent namespace prefixing
Easy to scan for tenant-specific keys
Clear separation of concerns

⚠️ Potential Issues:

A : in tenant_id or kb_id would break key parsing (a : inside original_key is handled, see STG-003)
No encryption of tenant data

Namespace Manager (`RedisTenantNamespace`)

class RedisTenantNamespace:
    async def get(self, key: str) -> Optional[Any]:
        tenant_key = RedisTenantHelper.make_tenant_key(self.tenant_id, self.kb_id, key)
        return await self.redis.get(tenant_key)

✅ Strengths:

Encapsulates tenant logic
Prevents accidental access to other tenants
Batch operations supported
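
A minimal sketch of how such a namespace wrapper could cover single and batch reads. The constructor signature, the `set` and `mget` methods, and the `FakeRedis` stand-in are assumptions for illustration and are not taken from `kg/redis_tenant_support.py`; the fake mimics only the `get`/`set`/`mget` calls of an async Redis client.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeRedis:
    """In-memory stand-in for an async Redis client (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    async def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    async def set(self, key: str, value: Any) -> None:
        self._data[key] = value

    async def mget(self, keys: List[str]) -> List[Optional[Any]]:
        return [self._data.get(k) for k in keys]


class RedisTenantNamespace:
    """Prefixes every key with tenant_id:kb_id before touching Redis."""

    def __init__(self, redis: Any, tenant_id: str, kb_id: str) -> None:
        self.redis = redis
        self.tenant_id = tenant_id
        self.kb_id = kb_id

    def _key(self, key: str) -> str:
        return f"{self.tenant_id}:{self.kb_id}:{key}"

    async def get(self, key: str) -> Optional[Any]:
        return await self.redis.get(self._key(key))

    async def set(self, key: str, value: Any) -> None:
        await self.redis.set(self._key(key), value)

    async def mget(self, keys: List[str]) -> List[Optional[Any]]:
        # Batch read: every key is namespaced before the single round trip.
        return await self.redis.mget([self._key(k) for k in keys])


async def demo() -> List[Optional[Any]]:
    redis = FakeRedis()
    acme = RedisTenantNamespace(redis, "acme", "kb-prod")
    other = RedisTenantNamespace(redis, "techstart", "kb-main")
    await acme.set("doc-123", "acme data")
    # The same logical key is visible to acme but not to the other tenant.
    return [await acme.get("doc-123"), await other.get("doc-123")]
```

Because callers only ever pass the unprefixed key, a namespace object cannot address another tenant's keys unless it is constructed with that tenant's identifiers.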

3. Vector Database Multi-Tenant Support (`kg/vector_tenant_support.py`)

Metadata Injection

@staticmethod
def add_tenant_metadata(payload: Dict[str, Any], tenant_id: str, kb_id: str) -> Dict[str, Any]:
    payload["tenant_id"] = tenant_id
    payload["kb_id"] = kb_id
    return payload

Query Filtering

Qdrant Filter:

def build_qdrant_filter(tenant_id: str, kb_id: str, additional_filter: Dict = None) -> Dict[str, Any]:
    must_conditions = [
        {"key": "tenant_id", "match": {"value": tenant_id}},
        {"key": "kb_id", "match": {"value": kb_id}}
    ]
    return {"must": must_conditions}

Milvus Expression:

def build_milvus_expr(tenant_id: str, kb_id: str, additional_expr: str = None) -> str:
    expr = f'tenant_id == "{tenant_id}" && kb_id == "{kb_id}"'
    # Handling of additional_expr is not shown in this excerpt.
    return expr

✅ Strengths:

Supports multiple vector DB backends
Filter-based isolation (no collection per tenant needed)
Efficient for a large number of tenants

⚠️ Potential Issues:

Filter overhead on every query
No index on tenant_id/kb_id in some backends
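
The Qdrant excerpt above accepts `additional_filter` but does not show how it is combined with the tenant conditions. The following is one possible way to merge them, a sketch under the assumption that callers pass a Qdrant-style dict with optional `must`, `should`, and `must_not` lists; it is not the repository's implementation.

```python
from typing import Any, Dict, List, Optional


def build_qdrant_filter(
    tenant_id: str,
    kb_id: str,
    additional_filter: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    # Tenant conditions always come first in the "must" list.
    must: List[Dict[str, Any]] = [
        {"key": "tenant_id", "match": {"value": tenant_id}},
        {"key": "kb_id", "match": {"value": kb_id}},
    ]
    merged: Dict[str, Any] = {}
    if additional_filter:
        # Caller-supplied "must" conditions are appended, never substituted,
        # so the tenant scope cannot be dropped by the caller.
        must.extend(additional_filter.get("must", []))
        for clause in ("should", "must_not"):
            if clause in additional_filter:
                merged[clause] = list(additional_filter[clause])
    merged["must"] = must
    return merged
```

Building a fresh `must` list on every call also avoids mutating the caller's filter dict.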

Collection Naming (Alternative Approach)

@staticmethod
def create_tenant_collection_name(base_name: str, tenant_id: str, kb_id: str) -> str:
    return f"{base_name}_{tenant_id}_{kb_id}".replace("-", "_")

Use Case: Separate collections per tenant for:

Stronger isolation
Easier tenant deletion
Independent scaling
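
One property of the naming helper worth checking before relying on it for isolation: because parts are joined with `_` and `-` is also mapped to `_`, different tenant/KB pairs can yield the same collection name. The sketch below reproduces the helper and shows one possible mitigation, a short digest suffix. The function `tenant_collection_name_with_digest` and the sample identifiers are hypothetical.

```python
import hashlib


def create_tenant_collection_name(base_name: str, tenant_id: str, kb_id: str) -> str:
    # Same logic as the helper shown above.
    return f"{base_name}_{tenant_id}_{kb_id}".replace("-", "_")


def tenant_collection_name_with_digest(base_name: str, tenant_id: str, kb_id: str) -> str:
    # Hypothetical variant: keep the readable name, then append a short hash of
    # the exact (tenant_id, kb_id) pair so ambiguous joins no longer collide.
    readable = create_tenant_collection_name(base_name, tenant_id, kb_id)
    digest = hashlib.sha256(f"{tenant_id}\x00{kb_id}".encode("utf-8")).hexdigest()[:8]
    return f"{readable}_{digest}"


# Two different (tenant_id, kb_id) pairs that map to the same original name.
pair_a = ("acme", "kb_prod")
pair_b = ("acme_kb", "prod")
collides = (
    create_tenant_collection_name("chunks", *pair_a)
    == create_tenant_collection_name("chunks", *pair_b)
)
```

Validating that `tenant_id` and `kb_id` contain neither `_` nor `-` would be an alternative, at the cost of restricting identifier formats.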

Detailed Findings

Finding STG-001: No kb_id in ContextVar

Severity: Medium
Location: utils_context.py

Description: Only tenant_id is stored in ContextVar. The kb_id must be passed explicitly, which could lead to inconsistencies.

Recommendation: Add kb_id_var: ContextVar[Optional[str]] for complete context propagation.
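
A minimal sketch of what this recommendation could look like. `kb_id_var` follows the name suggested above; `require_tenant_context` and `tenant_context` are hypothetical helpers introduced only for illustration.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator, Optional, Tuple

tenant_id_var: ContextVar[Optional[str]] = ContextVar("tenant_id", default=None)
kb_id_var: ContextVar[Optional[str]] = ContextVar("kb_id", default=None)


def require_tenant_context() -> Tuple[str, str]:
    """Return (tenant_id, kb_id) or fail loudly instead of returning None."""
    tenant_id, kb_id = tenant_id_var.get(), kb_id_var.get()
    if not tenant_id or not kb_id:
        raise RuntimeError("tenant context is not set")
    return tenant_id, kb_id


@contextmanager
def tenant_context(tenant_id: str, kb_id: str) -> Iterator[None]:
    """Set both variables for the duration of a block, then restore them."""
    tenant_token = tenant_id_var.set(tenant_id)
    kb_token = kb_id_var.set(kb_id)
    try:
        yield
    finally:
        kb_id_var.reset(kb_token)
        tenant_id_var.reset(tenant_token)
```

Setting both values in one place keeps them consistent, and the raising accessor means storage code does not need its own None checks.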

Finding STG-002: Simple SQL String Replacement

Severity: Low
Location: postgres_tenant_support.py

Description: The add_tenant_filter function uses simple string replacement:

sql = sql.replace("WHERE", f"WHERE {tenant_filter} AND", 1)

This could fail on:

CTEs with nested WHERE clauses
Complex subqueries
Case variations (where vs WHERE)

Recommendation: Use proper SQL parsing or ORM-based filtering.
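
The failure modes listed above can be reproduced with a standalone copy of the function. The `prefix` line is an assumption (the excerpt does not show how it is derived), and the table names `docs` and `allowed` are made up for the demonstration.

```python
from typing import Tuple


def add_tenant_filter(sql: str, table_alias: str = "", param_index: int = 1) -> Tuple[str, int]:
    prefix = f"{table_alias}." if table_alias else ""  # assumed derivation
    tenant_filter = f"{prefix}tenant_id=${param_index} AND {prefix}kb_id=${param_index + 1}"
    if "WHERE" in sql:
        sql = sql.replace("WHERE", f"WHERE {tenant_filter} AND", 1)
    else:
        sql += f" WHERE {tenant_filter}"
    return sql, param_index + 2


# Case variation: the lowercase "where" is not detected, so a second WHERE
# is appended and the statement is no longer valid SQL.
lowercase_result, _ = add_tenant_filter("SELECT * FROM docs where id = $3")
# -> "SELECT * FROM docs where id = $3 WHERE tenant_id=$1 AND kb_id=$2"

# CTE: the first WHERE belongs to the CTE, so the filter lands on "allowed"
# while the outer query on "docs" stays unfiltered.
cte_result, _ = add_tenant_filter(
    "WITH ids AS (SELECT id FROM allowed WHERE active) "
    "SELECT * FROM docs JOIN ids USING (id)"
)
# -> "WITH ids AS (SELECT id FROM allowed WHERE tenant_id=$1 AND kb_id=$2 AND active) SELECT * FROM docs JOIN ids USING (id)"
```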

Finding STG-003: Redis Key Collision Risk

Severity: Low
Location: redis_tenant_support.py

Description: A : inside original_key is parsed correctly, because the split is limited to the first two separators:

parts = tenant_key.split(":", 2)
# With key "acme:kb-prod:my:special:key"
# Returns: tenant_id="acme", kb_id="kb-prod", original_key="my:special:key" ✅

However, nothing prevents a : in tenant_id or kb_id. Such a value would shift the field boundaries, so the parsed tenant_id and kb_id would be wrong and keys from different tenants could collide.

Recommendation: Validate that tenant_id and kb_id don't contain the separator character.
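
A sketch of that validation, assuming it is acceptable to reject such identifiers with a `ValueError`. The `parse_tenant_key` helper and the `SEPARATOR` constant are illustrative names, not taken from the repository.

```python
from typing import Tuple

SEPARATOR = ":"


def make_tenant_key(tenant_id: str, kb_id: str, original_key: str) -> str:
    # Reject empty identifiers and identifiers containing the separator;
    # original_key may contain ":" because parsing splits at most twice.
    for name, value in (("tenant_id", tenant_id), ("kb_id", kb_id)):
        if not value or SEPARATOR in value:
            raise ValueError(f"{name} must be non-empty and must not contain {SEPARATOR!r}")
    return f"{tenant_id}{SEPARATOR}{kb_id}{SEPARATOR}{original_key}"


def parse_tenant_key(tenant_key: str) -> Tuple[str, str, str]:
    parts = tenant_key.split(SEPARATOR, 2)
    if len(parts) != 3:
        raise ValueError(f"not a tenant-prefixed key: {tenant_key!r}")
    tenant_id, kb_id, original_key = parts
    return tenant_id, kb_id, original_key
```

With the check in place, `parse_tenant_key(make_tenant_key(t, k, key))` returns the original triple for every accepted input.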

Finding STG-004: RLS Setting Not Always Applied

Severity: Medium
Location: postgres_impl.py

Description: The tenant context is set in specific places:

tenant_id = get_current_tenant_id()
if tenant_id:
    await connection.execute(f"SET app.current_tenant = '{tenant_id}'")

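If get_current_tenant_id() returns None, the SET is skipped. On a session where app.current_tenant was never set, current_setting('app.current_tenant', true) then returns NULL, the policy predicate is never true, and RLS hides every row.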

Recommendation: Ensure tenant context is always set before any database operation.
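
One way to make the setting mandatory is a single helper that every database operation calls first. The sketch below assumes an asyncpg-style connection whose `execute(query, *args)` accepts `$1` placeholders, and uses PostgreSQL's `set_config(name, value, is_local)` so the tenant is passed as a bound parameter instead of being interpolated into the statement. The helper name and the `RecordingConnection` test double are hypothetical.

```python
import asyncio
from typing import Any, List, Optional, Tuple


async def set_tenant_for_transaction(connection: Any, tenant_id: Optional[str]) -> None:
    """Fail fast when no tenant is known; otherwise scope it to the transaction."""
    if not tenant_id:
        raise RuntimeError("refusing to run a query without a tenant context")
    # is_local=true limits the setting to the current transaction, so this
    # must be called after the transaction has been opened.
    await connection.execute(
        "SELECT set_config('app.current_tenant', $1, true)", tenant_id
    )


class RecordingConnection:
    """Test double that records execute() calls instead of talking to PostgreSQL."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Tuple[Any, ...]]] = []

    async def execute(self, query: str, *args: Any) -> None:
        self.calls.append((query, args))


connection = RecordingConnection()
asyncio.run(set_tenant_for_transaction(connection, "acme"))
```

Scoping the value to the transaction also helps with pooled connections, where a session-level setting could otherwise outlive the request that set it.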

Finding STG-005: Vector Metadata Not Indexed

Severity: Low
Location: Vector DB implementations

Description: Tenant filtering adds overhead to every vector query. Without proper indexing on tenant_id/kb_id, queries may be slow with many tenants.

Recommendation:

Create an index on the tenant_id and kb_id metadata fields
Consider partitioning collections by tenant for high-volume deployments
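
For Qdrant, the first recommendation could look roughly like the sketch below. It assumes the `create_payload_index` method of the `qdrant-client` package with `collection_name`, `field_name`, and `field_schema` keyword arguments; check this against the client version in use. The helper name and the `RecordingClient` stand-in are hypothetical, and other backends need their own index calls.

```python
from typing import Any, Dict, Iterable, List


def ensure_tenant_payload_indexes(
    client: Any,
    collection_name: str,
    fields: Iterable[str] = ("tenant_id", "kb_id"),
) -> None:
    # One keyword index per tenant field, so filtered searches do not have to
    # scan every payload in the collection.
    for field in fields:
        client.create_payload_index(
            collection_name=collection_name,
            field_name=field,
            field_schema="keyword",
        )


class RecordingClient:
    """Stand-in that records calls so the sketch runs without a Qdrant server."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def create_payload_index(self, **kwargs: Any) -> None:
        self.calls.append(kwargs)


client = RecordingClient()
ensure_tenant_payload_indexes(client, "lightrag_chunks")  # collection name is made up
```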

Data Isolation Verification

Test: PostgreSQL Isolation

-- Verify tenant_id is always set
SELECT COUNT(*) FROM lightrag_doc_status WHERE tenant_id IS NULL;
-- Expected: 0

-- Verify no cross-tenant data
SELECT tenant_id, kb_id, COUNT(*) 
FROM lightrag_doc_status 
GROUP BY tenant_id, kb_id;
-- Each row should show isolated counts

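-- Test RLS (should return empty without setting tenant)
-- Run as a non-owner, non-superuser role: table owners and superusers bypass RLS
-- unless FORCE ROW LEVEL SECURITY is set on the table.
SELECT * FROM lightrag_doc_status LIMIT 5;
-- With RLS enabled and no app.current_tenant set: 0 rows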

Test: Redis Isolation

# List all keys for a tenant (--scan iterates incrementally; KEYS blocks the server on large datasets)
redis-cli --scan --pattern "tenant_a:*"

# Verify no keys without tenant prefix
redis-cli --scan | grep -v ":"
# Should be empty (all keys should be tenant-prefixed)

Test: Vector DB Isolation

# Query without tenant filter (should fail or return nothing)
results = collection.search(query_vector)
# Expected: Empty or error

# Query with correct tenant filter
results = collection.search(
    query_vector,
    filter={"tenant_id": "tenant_a", "kb_id": "kb_1"}
)
# Expected: Only tenant_a data

Composite Key Pattern

The multi-tenant system uses composite keys throughout:

Layer	Key Format
PostgreSQL PK	`(tenant_id, kb_id, id)`
Redis Key	`tenant_id:kb_id:original_key`
Vector ID	`tenant_id:kb_id:original_id`
Vector Metadata	`{tenant_id, kb_id, ...}`

Benefits:

Consistent isolation pattern
Easy to identify tenant ownership
Natural grouping for batch operations

Drawbacks:

Longer keys/IDs
Parsing overhead
Can't use simple auto-increment IDs

Migration Support

Adding Tenant Columns

async def add_tenant_columns_migration(db, table_name: str, tenant_id: str = "default", kb_id: str = "default"):
    # Adds tenant_id and kb_id columns
    # Populates with default values for existing data
    ...  # body not shown in this excerpt

✅ Strengths:

Safe migration for existing deployments
Default values prevent null issues

⚠️ Caution: Existing data in a "default" tenant should be migrated to proper tenants.

Conclusion

The storage layer has comprehensive multi-tenant support:

PostgreSQL: Composite PKs, parameterized queries, RLS support
Redis: Namespace prefixes, helper classes
Vector DBs: Metadata filtering, collection naming

Key concerns:

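Medium: No kb_id in ContextVar (STG-001)
Medium: RLS tenant setting is skipped when the tenant context is missing (STG-004)
Low: Simple SQL string replacement (STG-002)
Low: Potential key parsing edge cases (STG-003)
Low: Vector tenant metadata not indexed (STG-005)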

Recommendations:

Add kb_id to ContextVar for complete context
Validate tenant context is set before all DB operations
Validate that tenant_id and kb_id do not contain the : separator
Add index on tenant metadata in vector DBs
Consider SQL parsing library for complex queries
