Enable a single LightRAG server instance to serve multiple isolated workspaces via HTTP header-based routing. This allows multi-tenant SaaS deployments where each tenant's data is completely isolated.

Key features:
- Header-based workspace routing (LIGHTRAG-WORKSPACE, X-Workspace-ID fallback)
- Process-local pool of LightRAG instances with LRU eviction
- FastAPI dependency (get_rag) for workspace resolution per request
- Full backward compatibility - existing deployments work unchanged
- Strict multi-tenant mode option (LIGHTRAG_ALLOW_DEFAULT_WORKSPACE=false)
- Configurable pool size (LIGHTRAG_MAX_WORKSPACES_IN_POOL)
- Graceful shutdown with workspace finalization

Configuration:
- LIGHTRAG_DEFAULT_WORKSPACE: Default workspace (falls back to WORKSPACE)
- LIGHTRAG_ALLOW_DEFAULT_WORKSPACE: Require explicit header when false
- LIGHTRAG_MAX_WORKSPACES_IN_POOL: Max concurrent workspace instances (default: 50)

Files:
- New: lightrag/api/workspace_manager.py (core multi-workspace module)
- New: tests/test_multi_workspace_server.py (17 unit tests)
- New: render.yaml (Render deployment blueprint)
- Modified: All route files to use get_rag dependency
- Updated: README.md, env.example with documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Research: Multi-Workspace Server Support
Date: 2025-12-01 Feature: 001-multi-workspace-server
Executive Summary
Research confirms that the existing LightRAG codebase provides a solid foundation for multi-workspace support at the server level. The core library already isolates data by workspace; the gap is purely at the API server layer.
Research Findings
1. Existing Workspace Support in LightRAG Core
Decision: Leverage existing workspace parameter in LightRAG class
Findings:
- `LightRAG` class accepts `workspace: str` parameter (default: `os.getenv("WORKSPACE", "")`)
- Storage implementations use `get_final_namespace(namespace, workspace)` to create isolated keys
- Namespace format: `"{workspace}:{namespace}"` when workspace is set, else just `"{namespace}"`
- Pipeline status, locks, and in-memory state are all workspace-aware via `shared_storage.py`
- `DocumentManager` creates workspace-specific input directories
Evidence:

    # lightrag/lightrag.py
    workspace: str = field(default_factory=lambda: os.getenv("WORKSPACE", ""))

    # lightrag/kg/shared_storage.py
    def get_final_namespace(namespace: str, workspace: str | None = None) -> str:
        if workspace is None:
            workspace = get_default_workspace()
        if not workspace:
            return namespace
        return f"{workspace}:{namespace}"

Implications: No changes needed to core isolation; just need to instantiate separate LightRAG objects with different workspace values.
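To make the isolation concrete, the sketch below copies the namespacing rule from the evidence above into a standalone function and applies it to a toy key-value store. The `final_namespace`, `put`, and `get` helpers and the `store` dict are illustrative assumptions, not LightRAG APIs.

```python
from __future__ import annotations


def final_namespace(namespace: str, workspace: str = "") -> str:
    """Prefix the namespace with the workspace when one is set."""
    return f"{workspace}:{namespace}" if workspace else namespace


# Toy backend shared by every workspace (hypothetical, for illustration only).
store: dict[str, dict[str, str]] = {}


def put(workspace: str, namespace: str, key: str, value: str) -> None:
    """Write a value under the workspace-qualified namespace."""
    store.setdefault(final_namespace(namespace, workspace), {})[key] = value


def get(workspace: str, namespace: str, key: str) -> str | None:
    """Read a value; other workspaces' namespaces are never consulted."""
    return store.get(final_namespace(namespace, workspace), {}).get(key)


put("tenant_a", "full_docs", "doc-1", "hello")
print(get("tenant_a", "full_docs", "doc-1"))  # value written by tenant_a
print(get("tenant_b", "full_docs", "doc-1"))  # tenant_b sees no such key
```

Because both tenants share one backend but never share a namespace string, a lookup from `tenant_b` cannot reach `tenant_a`'s entry.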
2. Current Server Architecture
Decision: Refactor from closure pattern to FastAPI dependency injection
Findings:
- Server creates a single global `LightRAG` instance in `create_app(args)`
- Routes receive the RAG instance via closure (factory function pattern):

      def create_document_routes(rag: LightRAG, doc_manager, api_key):
          @router.post("/scan")
          async def scan_for_new_documents(...):
              # rag captured from enclosing scope

- This pattern prevents per-request workspace switching
Alternative Considered: Keep closure pattern and add workspace switching to existing instance
- Rejected Because: LightRAG instance configuration is immutable after creation; switching workspace would require re-initializing storage connections
Chosen Approach: Replace closure with FastAPI Depends() that resolves workspace → instance
3. Instance Pool Design
Decision: Use asyncio.Lock protected dictionary with LRU eviction
Findings:
- Python's `asyncio.Lock` is appropriate for protecting async operations
- LRU eviction via `collections.OrderedDict` or manual tracking
- Instance initialization is async (`await rag.initialize_storages()`)
- Concurrent requests for same new workspace must share initialization
Pattern:

    _instances: dict[str, LightRAG] = {}
    _lock = asyncio.Lock()
    _lru_order: list[str] = []  # Most recent at end

    async def get_instance(workspace: str) -> LightRAG:
        async with _lock:
            if workspace in _instances:
                # Move to end of LRU list
                _lru_order.remove(workspace)
                _lru_order.append(workspace)
                return _instances[workspace]

            # Evict if at capacity
            if len(_instances) >= max_pool_size:
                oldest = _lru_order.pop(0)
                await _instances[oldest].finalize_storages()
                del _instances[oldest]

            # Create and initialize
            instance = LightRAG(workspace=workspace, ...)
            await instance.initialize_storages()
            _instances[workspace] = instance
            _lru_order.append(workspace)
            return instance

Alternative Considered: Use async_lru library or cachetools.TTLCache
- Rejected Because: Adds external dependency; simple dict+lock is sufficient and well-understood
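The findings mention `collections.OrderedDict` as an option for LRU tracking. The sketch below shows that variant as a runnable example. `FakeRAG` is a hypothetical stand-in for `LightRAG` that only records lifecycle calls, and `WorkspacePool` is an illustrative name rather than the module's actual class.

```python
import asyncio
from collections import OrderedDict


class FakeRAG:
    """Hypothetical stand-in for LightRAG; records lifecycle calls only."""

    def __init__(self, workspace: str) -> None:
        self.workspace = workspace
        self.initialized = False
        self.finalized = False

    async def initialize_storages(self) -> None:
        self.initialized = True

    async def finalize_storages(self) -> None:
        self.finalized = True


class WorkspacePool:
    """Lock-protected pool; the OrderedDict keeps least recent entries first."""

    def __init__(self, max_size: int) -> None:
        self._max_size = max_size
        self._instances: OrderedDict[str, FakeRAG] = OrderedDict()
        self._lock = asyncio.Lock()

    async def get(self, workspace: str) -> FakeRAG:
        async with self._lock:
            if workspace in self._instances:
                self._instances.move_to_end(workspace)  # mark as most recent
                return self._instances[workspace]
            if len(self._instances) >= self._max_size:
                # Pop the least recently used entry and release its resources.
                _, oldest = self._instances.popitem(last=False)
                await oldest.finalize_storages()
            instance = FakeRAG(workspace)
            await instance.initialize_storages()
            self._instances[workspace] = instance
            return instance

    def workspaces(self) -> list[str]:
        """Return pooled workspace names, least recently used first."""
        return list(self._instances)


async def demo() -> tuple[list[str], bool]:
    pool = WorkspacePool(max_size=2)  # created inside the running loop
    await pool.get("a")
    b = await pool.get("b")
    await pool.get("a")  # "a" becomes most recently used
    await pool.get("c")  # pool is full, so "b" is evicted
    return pool.workspaces(), b.finalized


order, b_finalized = asyncio.run(demo())
print(order, b_finalized)
```

One trade-off worth noting: a single lock held across initialization means a slow workspace start-up delays requests for every other workspace. A per-workspace lock would be one possible refinement if that becomes a bottleneck.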
4. Header Routing Strategy
Decision: LIGHTRAG-WORKSPACE primary, X-Workspace-ID fallback
Findings:
- Custom headers conventionally use `X-` prefix, but this is deprecated per RFC 6648
- Product-specific headers (e.g., `LIGHTRAG-WORKSPACE`) are clearer and recommended
- Fallback to common convention (`X-Workspace-ID`) aids adoption
Implementation:

    def get_workspace_from_request(request: Request) -> str | None:
        workspace = request.headers.get("LIGHTRAG-WORKSPACE", "").strip()
        if not workspace:
            workspace = request.headers.get("X-Workspace-ID", "").strip()
        return workspace or None

5. Configuration Schema
Decision: Three new environment variables
| Variable | Type | Default | Description |
|---|---|---|---|
| `LIGHTRAG_DEFAULT_WORKSPACE` | str | `""` (from `WORKSPACE`) | Default workspace when no header |
| `LIGHTRAG_ALLOW_DEFAULT_WORKSPACE` | bool | `true` | If false, reject requests without header |
| `LIGHTRAG_MAX_WORKSPACES_IN_POOL` | int | `50` | Maximum concurrent workspace instances |
Rationale:
- `LIGHTRAG_` prefix namespaces new vars to avoid conflicts
- `ALLOW_DEFAULT_WORKSPACE=false` enables strict multi-tenant mode
- Default pool size of 50 balances memory vs. reinitialization overhead
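As an illustration of how these three variables might be read, the sketch below loads them into a small settings object. The `WorkspaceSettings` and `load_settings` names, and the set of strings accepted as "true", are assumptions made for this example; the actual parsing in the server may differ.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WorkspaceSettings:
    default_workspace: str
    allow_default_workspace: bool
    max_workspaces_in_pool: int


# Assumed spellings for a true boolean; anything else counts as false.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> WorkspaceSettings:
    """Read the multi-workspace settings, applying the documented defaults."""
    # LIGHTRAG_DEFAULT_WORKSPACE falls back to the legacy WORKSPACE variable.
    default_ws = env.get("LIGHTRAG_DEFAULT_WORKSPACE") or env.get("WORKSPACE", "")
    raw_allow = env.get("LIGHTRAG_ALLOW_DEFAULT_WORKSPACE", "true")
    allow = raw_allow.strip().lower() in _TRUE_VALUES
    max_pool = int(env.get("LIGHTRAG_MAX_WORKSPACES_IN_POOL", "50"))
    return WorkspaceSettings(default_ws, allow, max_pool)


strict = load_settings({
    "LIGHTRAG_ALLOW_DEFAULT_WORKSPACE": "false",
    "LIGHTRAG_MAX_WORKSPACES_IN_POOL": "5",
})
print(strict)
```

Passing the environment as a mapping keeps the loader easy to exercise in tests without mutating `os.environ`.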
6. Workspace Identifier Validation
Decision: Alphanumeric, hyphens, underscores; 1-64 characters
Findings:
- Must be safe for filesystem paths (workspace creates subdirectories)
- Must be safe for database keys (used in storage namespacing)
- Must prevent injection attacks (path traversal, SQL injection)
Validation Regex: ^[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}$
- Starts with alphanumeric (prevents hidden directories like `.hidden`)
- Allows hyphens and underscores for readability
- Max 64 chars (reasonable for identifiers, fits in most DB column sizes)
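A minimal sketch of the check, assuming a helper named `is_valid_workspace` (hypothetical). It uses `re.fullmatch` instead of `re.match` with `^...$`, because in Python `$` also matches just before a trailing newline, which would let `"tenant\n"` through.

```python
import re

# Same character classes and length bounds as the regex above; fullmatch()
# anchors both ends, so the explicit ^ and $ are not needed.
WORKSPACE_RE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}")


def is_valid_workspace(identifier: str) -> bool:
    """Return True when the identifier is 1-64 safe characters."""
    return WORKSPACE_RE.fullmatch(identifier) is not None


for candidate in ["tenant-a", "team_42", ".hidden", "../etc", "", "a" * 65]:
    print(repr(candidate[:12]), is_valid_workspace(candidate))
```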
7. Error Handling
Decision: Return 400 for missing/invalid workspace; 503 for initialization failures
| Scenario | HTTP Status | Error Message |
|---|---|---|
| Missing header, default disabled | 400 | Missing LIGHTRAG-WORKSPACE header |
| Invalid workspace identifier | 400 | Invalid workspace identifier: must be alphanumeric... |
| Workspace initialization fails | 503 | Failed to initialize workspace: {details} |
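The table can be read as a small decision procedure. The sketch below is framework-free so it runs without FastAPI: `WorkspaceError`, `resolve_workspace`, and `initialize_or_503` are hypothetical names, and in the real server the error would presumably be translated into an HTTP response by the route dependency.

```python
from __future__ import annotations

import re
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")
_VALID = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}")


class WorkspaceError(Exception):
    """Carries the HTTP status and message from the error table."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status
        self.message = message


def resolve_workspace(
    headers: Mapping[str, str], default: str, allow_default: bool
) -> str:
    """Pick the workspace for a request or raise a 400-style error."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    workspace = lowered.get("lightrag-workspace") or lowered.get("x-workspace-id")
    if not workspace:
        if not allow_default:
            raise WorkspaceError(400, "Missing LIGHTRAG-WORKSPACE header")
        return default
    if not _VALID.fullmatch(workspace):
        raise WorkspaceError(
            400, "Invalid workspace identifier: must be alphanumeric..."
        )
    return workspace


def initialize_or_503(init: Callable[[], T]) -> T:
    """Run an initializer and map any failure to a 503-style error."""
    try:
        return init()
    except Exception as exc:
        raise WorkspaceError(503, f"Failed to initialize workspace: {exc}") from exc


print(resolve_workspace({"X-Workspace-ID": "tenant_b"}, "base", False))
```

Note that this sketch returns the configured default without validating it, on the assumption that operator-supplied configuration is trusted.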
8. Logging Strategy
Decision: Log workspace identifier at INFO level; never log credentials
Implementation:
- Log workspace on request: `logger.info(f"Request to workspace: {workspace}")`
- Log pool events: `logger.info(f"Initialized workspace: {workspace}")`
- Log evictions: `logger.info(f"Evicted workspace from pool: {workspace}")`
- NEVER log: API keys, storage credentials, auth tokens
9. Test Strategy
Decision: Pytest with markers following existing patterns
Test Categories:
- Unit tests (`@pytest.mark.offline`): Workspace resolution, validation, pool logic
- Integration tests (`@pytest.mark.integration`): Full request flow with mock LLM/embedding
- Backward compatibility tests (`@pytest.mark.offline`): Single-workspace mode unchanged
Key Test Scenarios:
- Two workspaces → ingest document in A → query from B returns nothing
- No header + `ALLOW_DEFAULT_WORKSPACE=true` → uses default
- No header + `ALLOW_DEFAULT_WORKSPACE=false` → returns 400
- Pool at capacity → evicts LRU → new workspace initializes
Resolved Questions
| Question | Resolution |
|---|---|
| How to handle concurrent init of same workspace? | asyncio.Lock ensures single initialization; others wait |
| Should evicted workspace finalize storage? | Yes, call finalize_storages() to release resources |
| How to share config between instances? | Clone config; only workspace differs per instance |
| Where to put pool management code? | New module workspace_manager.py |
Next Steps
- Create `data-model.md` with entity definitions
- Document contracts (no new API endpoints; header-based routing is transparent)
- Create `quickstart.md` for multi-workspace deployment