docs: Enterprise Edition & Multi-tenancy attribution (#5 )

* Remove outdated documentation files: Quick Start Guide, Apache AGE Analysis, and Scratchpad.

* Add multi-tenant testing strategy and ADR index documentation

- Introduced ADR 008 detailing the multi-tenant testing strategy for the ./starter environment, covering compatibility and multi-tenant modes, testing scenarios, and implementation details.
- Created a comprehensive ADR index (README.md) summarizing all architecture decision records related to the multi-tenant implementation, including purpose, key sections, and reading paths for different roles.

* feat(docs): Add comprehensive multi-tenancy guide and README for LightRAG Enterprise

- Introduced `0008-multi-tenancy.md` detailing multi-tenancy architecture, key concepts, roles, permissions, configuration, and API endpoints.
- Created `README.md` as the main documentation index, outlining features, quick start, system overview, and deployment options.
- Documented the LightRAG architecture, storage backends, LLM integrations, and query modes.
- Established a task log (`2025-01-21-lightrag-documentation-log.md`) summarizing documentation creation actions, decisions, and insights.

2025-12-04 18:09:15 +08:00

21 KiB

Raw Blame History

ADR 008: Multi-Tenant Testing Strategy for ./starter Environment

Status: Proposed

Context

The ./starter directory provides a local development and testing environment for the LightRAG multi-tenant implementation. This environment must support two distinct operational modes:

No Multi-Tenant Mode (Compatibility Mode): Behaves like the original single-workspace LightRAG, maintaining backward compatibility with the main branch
Multi-Tenant Mode (Production Mode): Demonstrates full multi-tenant isolation with multiple tenants and knowledge bases

The testing strategy must ensure both modes work correctly and that switching between them is seamless and reproducible.

Problem Statement

Current state (as of November 2025):

The multi-tenant architecture is implemented in feat/multi-tenant branch
The ./starter directory uses Docker Compose with PostgreSQL, Redis, LightRAG API, and WebUI
The environment supports automatic tenant/KB resolution with "default" tenant and KB names
No clear testing protocol exists for validating both single-tenant and multi-tenant scenarios
Documentation doesn't reflect the actual implementation details

Key Challenges

Backward Compatibility: Must verify that old single-tenant code paths still work
Switching Modes: Need clear procedures to enable/disable multi-tenancy for testing
Data Isolation: Must verify tenant/KB isolation at both API and database levels
Reproducibility: Tests must be idempotent and produce consistent results
Environment Configuration: Starter must clearly document all configuration options for each mode

Decision

Multi-Mode Testing Architecture

We implement a configurable testing environment in ./starter that can operate in two modes via environment variables:

Mode Selection:
├─ MULTITENANT_MODE=off    → Single-tenant compatibility mode (like main branch)
├─ MULTITENANT_MODE=on     → Full multi-tenant mode (default)
└─ MULTITENANT_MODE=demo   → Multi-tenant demo mode (2 pre-configured tenants)

Scenario 1: No Multi-Tenant Mode (Compatibility Mode)

Goal: Verify that LightRAG works exactly as it did before multi-tenancy was added.

Configuration:

MULTITENANT_MODE=off
DEFAULT_TENANT=default
DEFAULT_KB=default
WORKSPACE_ISOLATION_TYPE=legacy

Behavior:

API endpoints work WITHOUT requiring X-Tenant-ID or X-KB-ID headers
All operations use "default" tenant and "default" KB internally
Storage uses legacy workspace namespace: tenant_id_kb_id → default_default
No tenant context validation errors
Fully backward compatible with main branch code

Testing Scenarios:

Test Case	Description	Expected Result
T1.1	Upload document without tenant headers	✓ Document stored in default workspace
T1.2	Query without tenant headers	✓ Results returned from default workspace
T1.3	Create knowledge graph without tenant headers	✓ Graph entities stored in default workspace
T1.4	Access WebUI without specifying tenant	✓ UI works with implicit default tenant
T1.5	Database contains no tenant_id/kb_id fields	✓ Tables use legacy workspace field only
T1.6	Mix of old client and new client requests	✓ Both work without conflicts
T1.7	Verify authorization doesn't require tenant access	✓ Auth tokens work with role only (no tenant scope)
T1.8	All stored data is in `default_default` namespace	✓ Data isolation via workspace namespace only

Backward Compatibility Verification:

# Should work identically to main branch
rag = LightRAG(working_dir="./rag_data")  # No tenant_id/kb_id
await rag.insert("document text")
results = await rag.query("query text")

# Should NOT require X-Tenant-ID header
curl -X POST http://localhost:8000/api/v1/documents/insert \
  -H "Authorization: Bearer token"  \
  -H "Content-Type: application/json" \
  -d '{"document": "text"}'

Database Schema:

Uses legacy tables: lightrag_doc_full, lightrag_doc_chunks, etc.
workspace field acts as tenant/KB namespace
No tenant_id, kb_id columns added
Composite indexes: (workspace, id) not (tenant_id, kb_id, id)

Scenario 2: Single Tenant with Multiple KBs (Controlled Multi-Tenant)

Goal: Test multi-tenant architecture with a single tenant serving multiple knowledge bases.

Configuration:

MULTITENANT_MODE=on
DEFAULT_TENANT=tenant-1
DEFAULT_KB=kb-default
CREATE_DEFAULT_KB=kb-default,kb-secondary,kb-experimental

Behavior:

API requires X-Tenant-ID (resolves to tenant-1 if not provided)
API requires X-KB-ID (can be any KB in the tenant)
Different KBs can coexist for the same tenant
Complete data isolation: tenant-1:kb-default:entity1 vs tenant-1:kb-secondary:entity1
Same tenant has access to all its KBs

Testing Scenarios:

Test Case	Description	Expected Result
T2.1	Insert into kb-default	✓ Document stored in tenant-1_kb-default namespace
T2.2	Insert into kb-secondary	✓ Document stored in tenant-1_kb-secondary namespace
T2.3	Query from kb-default returns kb-default data only	✓ No cross-KB data leakage
T2.4	Query from kb-secondary returns kb-secondary data only	✓ No cross-KB data leakage
T2.5	Switch KB in same request sequence	✓ Isolation maintained at database level
T2.6	Create duplicate entity names in different KBs	✓ Database allows duplicates (namespace isolated)
T2.7	Generate graph for kb-default	✓ Graph contains only kb-default entities
T2.8	Database contains tenant_id and kb_id columns	✓ Composite keys prevent collisions

Query Examples:

# Query KB-1
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer token" \
  -H "X-Tenant-ID: tenant-1" \
  -H "X-KB-ID: kb-default" \
  -d '{"query": "text"}'

# Query KB-2 (same tenant)
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer token" \
  -H "X-Tenant-ID: tenant-1" \
  -H "X-KB-ID: kb-secondary" \
  -d '{"query": "text"}'

Database Schema:

Tables include both workspace (legacy) and tenant_id, kb_id (new)
Workspace auto-generated as tenant_id_kb_id for backward compatibility
Composite indexes: (tenant_id, kb_id, id)
Unique constraints prevent accidental cross-KB entity conflicts

Scenario 3: Multiple Tenants (Full Multi-Tenant)

Goal: Demonstrate complete multi-tenant isolation with multiple independent organizations.

Configuration:

MULTITENANT_MODE=demo
# Pre-configured demo tenants:
# - Tenant 1: "acme-corp"
#   ├─ kb-prod (Production knowledge base)
#   └─ kb-dev (Development knowledge base)
# - Tenant 2: "techstart"
#   ├─ kb-main (Main knowledge base)
#   └─ kb-backup (Backup knowledge base)

Behavior:

API requires X-Tenant-ID and X-KB-ID on every request
No "default" fallback - must be explicit
Complete isolation: acme-corp cannot access techstart data
Separate resource quotas per tenant
Independent configurations (LLM models, embedding models)

Testing Scenarios:

Test Case	Description	Expected Result
T3.1	acme-corp inserts into kb-prod	✓ Data isolated in acme-corp_kb-prod
T3.2	techstart inserts into kb-main	✓ Data isolated in techstart_kb-main
T3.3	acme-corp queries kb-prod	✓ Returns only acme-corp_kb-prod data
T3.4	techstart queries kb-main	✓ Returns only techstart_kb-main data
T3.5	acme-corp attempts to query techstart KB	✗ 403 Forbidden (access denied)
T3.6	User with acme-corp JWT tries techstart KB	✗ Permission denied at API layer
T3.7	Different entity names in different tenants	✓ Database allows identical names in different tenant_id values
T3.8	Database NEVER returns cross-tenant data	✓ Query filters enforce `tenant_id` constraint
T3.9	Even with DB admin creds, workspace isolation works	✓ Cannot accidentally query wrong tenant at SQL level
T3.10	Delete acme-corp entity doesn't affect techstart	✓ DELETE uses composite key (tenant_id, kb_id, id)
T3.11	Verify JWT tokens scope to specific tenant	✓ Token contains tenant_id, prevents cross-tenant access
T3.12	Resource quotas enforced per tenant	✓ acme-corp quota limits don't affect techstart

Request Examples:

# acme-corp accessing kb-prod
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer acme-corp-token" \
  -H "X-Tenant-ID: acme-corp" \
  -H "X-KB-ID: kb-prod" \
  -d '{"query": "revenue"}'

# techstart accessing kb-main
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer techstart-token" \
  -H "X-Tenant-ID: techstart" \
  -H "X-KB-ID: kb-main" \
  -d '{"query": "funding"}'

# acme-corp trying to access techstart data (should fail)
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer acme-corp-token" \
  -H "X-Tenant-ID: techstart" \
  -H "X-KB-ID: kb-main" \
  -d '{"query": "data"}' \
# Response: 403 Forbidden - "User does not have access to tenant"

Database Verification:

-- Verify tenant isolation at SQL level
SELECT COUNT(*) FROM lightrag_doc_full 
WHERE tenant_id = 'acme-corp' AND kb_id = 'kb-prod';  -- Count acme-corp documents

SELECT COUNT(*) FROM lightrag_doc_full 
WHERE tenant_id = 'techstart' AND kb_id = 'kb-main';  -- Count techstart documents

-- Verify no cross-tenant data in single query
SELECT DISTINCT tenant_id, kb_id FROM lightrag_doc_full;  -- Should see: acme-corp, techstart only

-- Verify composite indexes exist
\di lightrag_doc_full*  -- Should show idx on (tenant_id, kb_id, id)

Implementation Details

Environment Variable Configuration

File: ./starter/env.example (updated with new options):

# ============================================================================
# TESTING MODE CONFIGURATION
# ============================================================================

# Choose testing mode:
#   off   = Single-tenant compatibility mode (like main branch)
#   on    = Multi-tenant mode with single default tenant
#   demo  = Multi-tenant mode with 2 pre-configured tenants
MULTITENANT_MODE=demo

# For MULTITENANT_MODE=on, create additional KBs
# Format: kb_name,kb_name,kb_name
CREATE_DEFAULT_KB=kb-default,kb-secondary,kb-experimental

# ============================================================================
# TENANT CONFIGURATION (Used when MULTITENANT_MODE != off)
# ============================================================================

DEFAULT_TENANT=default
DEFAULT_KB=default

# Pre-configured demo tenants (for MULTITENANT_MODE=demo)
DEMO_TENANT_1_NAME=acme-corp
DEMO_TENANT_1_KBS=kb-prod,kb-dev

DEMO_TENANT_2_NAME=techstart
DEMO_TENANT_2_KBS=kb-main,kb-backup

# ============================================================================

Docker Compose Modifications

File: ./starter/docker-compose.yml (add initialization script):

services:
  postgres:
    environment:
      MULTITENANT_MODE: ${MULTITENANT_MODE:-demo}
    volumes:
      - ./init-postgres.sql:/docker-entrypoint-initdb.d/01-init.sql:ro
      - ./init-demo-tenants.sql:/docker-entrypoint-initdb.d/02-demo-tenants.sql:ro  # New

  lightrag-api:
    environment:
      MULTITENANT_MODE: ${MULTITENANT_MODE:-demo}
      DEFAULT_TENANT: ${DEFAULT_TENANT:-default}
      DEFAULT_KB: ${DEFAULT_KB:-default}
      CREATE_DEFAULT_KB: ${CREATE_DEFAULT_KB:-kb-default}

Initialization SQL Scripts

New File: ./starter/init-demo-tenants.sql

Creates the demo tenants and KBs when MULTITENANT_MODE=demo:

-- Only run if MULTITENANT_MODE=demo
-- This is handled via environment variable in docker-entrypoint

INSERT INTO tenants (tenant_id, tenant_name, description, is_active)
VALUES 
  ('acme-corp', 'ACME Corporation', 'Demo tenant 1: Large enterprise', true),
  ('techstart', 'TechStart Inc', 'Demo tenant 2: Startup', true);

INSERT INTO knowledge_bases (kb_id, tenant_id, kb_name, description, is_active)
VALUES 
  ('kb-prod', 'acme-corp', 'Production KB', 'Live production data', true),
  ('kb-dev', 'acme-corp', 'Development KB', 'Dev/staging data', true),
  ('kb-main', 'techstart', 'Main KB', 'Primary knowledge base', true),
  ('kb-backup', 'techstart', 'Backup KB', 'Backup and archival', true);

Testing Procedures

Procedure 1: Run Compatibility Mode Tests

cd ./starter

# 1. Configure for compatibility mode
cp env.example .env
echo "MULTITENANT_MODE=off" >> .env
echo "WORKSPACE_ISOLATION_TYPE=legacy" >> .env

# 2. Start services
make setup
make up
make init-db

# 3. Run compatibility tests
pytest ../tests/test_backward_compatibility.py -v

# 4. Run manual tests
python3 reproduce_issue.py  # Should work without tenant headers

Expected Results:

All T1.x test cases pass
No authorization failures due to missing tenant context
Database uses workspace namespace only
Queries return data regardless of tenant headers (or missing headers)

Procedure 2: Run Single-Tenant Multi-KB Tests

cd ./starter

# 1. Configure for single tenant with multiple KBs
cp env.example .env
echo "MULTITENANT_MODE=on" >> .env
echo "DEFAULT_TENANT=tenant-1" >> .env
echo "CREATE_DEFAULT_KB=kb-default,kb-secondary,kb-experimental" >> .env

# 2. Start services
make setup
make up
make init-db

# 3. Run isolation tests
pytest ../tests/test_multi_tenant_backends.py::TestTenantIsolation -v

# 4. Manual test: Verify KB isolation
# Create document in kb-default
curl -X POST http://localhost:8000/api/v1/documents/insert \
  -H "X-Tenant-ID: tenant-1" \
  -H "X-KB-ID: kb-default" \
  -d '{"document": "document in kb-default"}'

# Query should return document
curl -X POST http://localhost:8000/api/v1/query \
  -H "X-Tenant-ID: tenant-1" \
  -H "X-KB-ID: kb-default"

# Query in different KB should return NOTHING
curl -X POST http://localhost:8000/api/v1/query \
  -H "X-Tenant-ID: tenant-1" \
  -H "X-KB-ID: kb-secondary"

Expected Results:

All T2.x test cases pass
Data isolated by KB within same tenant
No cross-KB data leakage at API or database level
Composite keys work correctly

Procedure 3: Run Full Multi-Tenant Tests

cd ./starter

# 1. Configure for full multi-tenant demo mode (default)
cp env.example .env
echo "MULTITENANT_MODE=demo" >> .env

# 2. Start services
make setup
make up
make init-db

# 3. Run full multi-tenant tests
pytest ../tests/test_multi_tenant_backends.py -v
pytest ../tests/test_tenant_security.py -v

# 4. Manual test: Verify cross-tenant isolation
# Insert as acme-corp
curl -X POST http://localhost:8000/api/v1/documents/insert \
  -H "Authorization: Bearer acme-token" \
  -H "X-Tenant-ID: acme-corp" \
  -H "X-KB-ID: kb-prod" \
  -d '{"document": "acme document"}'

# Try to query as techstart (should fail or return empty)
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer acme-token" \
  -H "X-Tenant-ID: techstart" \
  -H "X-KB-ID: kb-main"
# Expected: 403 Forbidden

# Query as acme should succeed
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer acme-token" \
  -H "X-Tenant-ID: acme-corp" \
  -H "X-KB-ID: kb-prod"
# Expected: 200 OK with results

# 5. Database verification
make db-shell
# SELECT COUNT(*) FROM lightrag_doc_full WHERE tenant_id='acme-corp';
# SELECT DISTINCT tenant_id FROM lightrag_doc_full;

Expected Results:

All T3.x test cases pass
Cross-tenant access denied (403)
Complete data isolation at database level
No authorization bypasses

Testing Matrix

Mode	Tenant Headers Required	Default Tenant	Multiple KBs	Multiple Tenants	Test File
off	No	Always default	❌ Single workspace	❌ Single tenant	`test_backward_compatibility.py`
on	Yes	Provided/resolved	✓ Multiple per tenant	❌ Single tenant only	`test_multi_tenant_backends.py`
demo	Yes	None/explicit	✓ Multiple per tenant	✓ 2 pre-configured	`test_tenant_security.py`

Test Coverage

Unit Tests

test_backward_compatibility.py: Validates old code paths still work
test_multi_tenant_backends.py: Validates storage layer isolation
test_tenant_models.py: Validates data models
test_tenant_security.py: Validates permission/authorization

Integration Tests

API Layer: test_tenant_api_routes.py
Database: test_tenant_storage_phase3.py
Graph Operations: test_graph_storage.py

Manual Verification

Database schema validation (composite keys, indexes)
Cross-tenant access attempts (should fail)
KB isolation verification
Authorization enforcement

Consequences

Positive

Flexible Testing: Can test backward compatibility and new multi-tenant features
Clear Procedures: Step-by-step procedures for each testing scenario
Reproducibility: Environment variables make tests repeatable
Safety: Explicit mode selection prevents accidental data mixing
Documentation: Clear understanding of what each mode does
Validation: Comprehensive test matrix covers all scenarios

Negative/Tradeoffs

Configuration Complexity: Three modes add configuration overhead
Initialization Scripts: Must maintain both legacy and multi-tenant initialization
Testing Duration: Running all three modes sequentially takes time
Documentation Maintenance: Must keep mode-specific docs up to date
Docker Image Size: Includes both legacy and new code paths

Rollback/Migration

From Compatibility Mode to Multi-Tenant

# 1. Back up existing data
make db-backup

# 2. Switch mode
sed -i 's/MULTITENANT_MODE=off/MULTITENANT_MODE=on/' .env

# 3. Migrate database schema
# This requires: DROP old tables, CREATE new tables with tenant columns
# Migration script: scripts/migrate_to_multitenant.sql

# 4. Restart services
make restart

From Multi-Tenant back to Compatibility Mode

# 1. Back up multi-tenant data
make db-backup

# 2. Switch mode
sed -i 's/MULTITENANT_MODE=on/MULTITENANT_MODE=off/' .env

# 3. Extract single tenant data
# SELECT * FROM lightrag_doc_full WHERE tenant_id='default' 
# INTO workspace-based tables

# 4. Restart services
make restart

Verification Checklist

Before considering the ADR complete:

MULTITENANT_MODE=off works identically to main branch
MULTITENANT_MODE=on prevents cross-KB data access
MULTITENANT_MODE=demo prevents cross-tenant data access
Environment variable switching is seamless
All test cases (T1-T3) pass in their respective modes
Database schema matches mode requirements
Documentation reflects actual implementation
Integration tests run successfully
Manual verification procedures validate isolation
Authorization failures work correctly (403, 401, etc.)

References

ADR 001: Multi-Tenant Architecture Overview
ADR 002: Implementation Strategy
ADR 003: Data Models and Storage
ADR 004: API Design
ADR 005: Security Analysis

Implementation Files

lightrag/models/tenant.py: TenantContext, Tenant, KnowledgeBase models
lightrag/tenant_rag_manager.py: Per-tenant instance management
lightrag/api/dependencies.py: Tenant context extraction
tests/test_backward_compatibility.py: Legacy compatibility tests
tests/test_multi_tenant_backends.py: Multi-tenant backend tests
tests/test_tenant_security.py: Security validation

Starter Files

starter/docker-compose.yml: Service orchestration
starter/env.example: Configuration template
starter/Makefile: Testing procedures
starter/init-postgres.sql: Database initialization

Next Steps

Implement environment variable handling in docker-entrypoint-initdb.d scripts
Create demo tenant initialization SQL script (init-demo-tenants.sql)
Update Makefile with mode-specific test targets
Create test runner script that runs appropriate tests for each mode
Document mode selection in README.md
Create CI/CD workflow to test all three modes automatically
Add health checks that validate mode-specific expectations
Create migration scripts for switching between modes
Update all existing ADRs to reference this testing strategy
Add mode detection to API startup (warn if wrong mode configuration)

Document Version: 1.0
Last Updated: 2025-11-22
Author: Architecture Design Process
Status: Proposed - Ready for Implementation Review

Implementation Notes:

Based on actual code examination of feat/multi-tenant branch
Verified against: tenant.py, tenant_rag_manager.py, dependencies.py, docker-compose.yml
Tested scenarios aligned with actual test files in tests/ directory
Configuration options match env.example and existing environment setup

21 KiB Raw Blame History

ADR 008: Multi-Tenant Testing Strategy for ./starter Environment

Status: Proposed

Context

Problem Statement

Key Challenges

Decision

Multi-Mode Testing Architecture

Scenario 1: No Multi-Tenant Mode (Compatibility Mode)

Scenario 2: Single Tenant with Multiple KBs (Controlled Multi-Tenant)

Scenario 3: Multiple Tenants (Full Multi-Tenant)

Implementation Details

Environment Variable Configuration

Docker Compose Modifications

Initialization SQL Scripts

Testing Procedures

Procedure 1: Run Compatibility Mode Tests

Procedure 2: Run Single-Tenant Multi-KB Tests

Procedure 3: Run Full Multi-Tenant Tests

Testing Matrix

Test Coverage

Unit Tests

Integration Tests

Manual Verification

Consequences

Positive

Negative/Tradeoffs

Rollback/Migration

From Compatibility Mode to Multi-Tenant

From Multi-Tenant back to Compatibility Mode

Verification Checklist

References

Related ADRs

Implementation Files

Starter Files

Next Steps

21 KiB

Raw Blame History