LightRAG/docs/archives/action_plan/02-implementation-plan.md
Raphael MANSUY 2b292d4924
2025-12-04 18:09:15 +08:00


# Multi-Tenancy Implementation Plan
**Goal**: Upgrade LightRAG to a production-grade multi-tenant architecture in which tenant isolation is enforced at every storage layer.
## Phase 1: Tenant Identification & Middleware
- [ ] **Step 1.1**: Create `lightrag/api/middleware/tenant.py`.
  - Implement `TenantMiddleware` to extract the tenant from the subdomain (optional) and the JWT.
  - Use Redis to cache the `subdomain -> tenant_id` resolution.
  - Set `request.state.tenant_id`.
- [ ] **Step 1.2**: Update `lightrag/api/dependencies.py`.
  - Update `get_tenant_context` to read from `request.state`.
  - Ignore the `X-Tenant-ID` header when the subdomain or JWT identifies the tenant, so that the middleware-resolved value is the single source of truth.
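
The resolution logic in Step 1.1 can be sketched without any web framework. This is a minimal illustration, not the planned `TenantMiddleware`: the function names, the `tenant_id` claim name, the `base_domain` parameter, and the plain mapping used in place of Redis are all assumptions. It also assumes the JWT has already been verified and decoded into `claims`.

```python
from typing import Callable, Mapping, MutableMapping, Optional


class TenantResolutionError(Exception):
    """No tenant could be determined, or the two sources disagree."""


def subdomain_of(host: str, base_domain: str) -> Optional[str]:
    """Return the part of `host` in front of `base_domain`, if any."""
    hostname = host.split(":", 1)[0].lower()  # drop an optional port
    suffix = "." + base_domain.lower()
    if not hostname.endswith(suffix):
        return None
    return hostname[: -len(suffix)] or None


def resolve_tenant_id(
    host: str,
    claims: Mapping[str, str],
    base_domain: str,
    lookup: Callable[[str], Optional[str]],
    cache: MutableMapping[str, Optional[str]],
) -> str:
    """Combine the subdomain and the JWT claim into one tenant id."""
    from_subdomain = None
    subdomain = subdomain_of(host, base_domain)
    if subdomain is not None:
        if subdomain not in cache:  # `cache` is a stand-in for Redis
            cache[subdomain] = lookup(subdomain)
        from_subdomain = cache[subdomain]

    from_jwt = claims.get("tenant_id")  # claim name is an assumption
    if from_subdomain and from_jwt and from_subdomain != from_jwt:
        # Refuse the request instead of silently preferring one source.
        raise TenantResolutionError("subdomain and JWT name different tenants")

    tenant_id = from_jwt or from_subdomain
    if not tenant_id:
        raise TenantResolutionError("no tenant in subdomain or JWT")
    return tenant_id
```

A middleware built on this would store the returned value in `request.state.tenant_id` and translate `TenantResolutionError` into an HTTP error response.
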
## Phase 2: PostgreSQL Row-Level Security (RLS)
- [ ] **Step 2.1**: Update `lightrag/kg/postgres_tenant_support.py`.
  - Add SQL to enable RLS on tables: `ALTER TABLE ... ENABLE ROW LEVEL SECURITY`.
  - Add SQL to create policies: `CREATE POLICY ... USING (tenant_id = current_setting('app.tenant_id'))`. `current_setting` returns `text`, so cast the result if `tenant_id` has another type.
- [ ] **Step 2.2**: Update Database Connection Logic.
  - In `lightrag/kg/postgres_impl.py` (or equivalent), ensure `app.tenant_id` is set for each session/connection.
  - Use `SET LOCAL app.tenant_id = ...` at the start of each transaction. The setting reverts at commit or rollback, so a pooled connection does not carry it over to the next tenant.
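
The statements from Steps 2.1 and 2.2 could be generated along these lines. This is a sketch under assumptions: the helper name, the default policy name, and the `$1` placeholder style are hypothetical, and `tenant_id` is assumed to be a `text` column. Two PostgreSQL details shape it. The table owner bypasses RLS unless `FORCE ROW LEVEL SECURITY` is also applied, and `SET LOCAL` does not accept bind parameters, so the sketch uses `set_config(..., true)`, which has the same transaction-local effect.

```python
import re
from typing import List

# Table and policy names cannot be bound as parameters, so validate them
# before interpolating them into DDL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Transaction-local equivalent of `SET LOCAL app.tenant_id = ...` that
# accepts the tenant id as a bind parameter (placeholder style assumed).
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', $1, true)"


def rls_statements(table: str, policy: str = "tenant_isolation") -> List[str]:
    """Return the DDL that puts `table` under a tenant-scoped RLS policy."""
    for name in (table, policy):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    predicate = "tenant_id = current_setting('app.tenant_id')"
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # Without FORCE, the table owner is exempt from the policy.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        # USING filters reads; WITH CHECK rejects writes for other tenants.
        f"CREATE POLICY {policy} ON {table} "
        f"USING ({predicate}) WITH CHECK ({predicate})",
    ]
```

Each transaction would run `SET_TENANT_SQL` with the resolved tenant id before any other statement. If a query runs before the setting exists, `current_setting('app.tenant_id')` raises an error, which fails closed.
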
## Phase 3: MongoDB Strict Scoping
- [ ] **Step 3.1**: Create `lightrag/kg/mongo_repo.py`.
  - Implement `MongoTenantRepo` class.
  - It should take `tenant_id` in `__init__`.
  - Override `find`, `find_one`, `insert_one`, etc., to automatically inject `tenant_id`.
- [ ] **Step 3.2**: Refactor `lightrag/kg/mongo_impl.py`.
  - Use `MongoTenantRepo` instead of raw `motor` collection.
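
A minimal sketch of the repository from Step 3.1 follows. It assumes a Motor-style collection, where `find` returns a cursor synchronously and `find_one` and `insert_one` are coroutines. The constructor signature beyond `tenant_id` and the helper `_scoped` are assumptions, and only the three methods named above are shown.

```python
from typing import Any, Dict, Optional


class MongoTenantRepo:
    """Wrap a collection so every query and document carries the tenant id."""

    def __init__(self, collection: Any, tenant_id: str) -> None:
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._collection = collection
        self._tenant_id = tenant_id

    def _scoped(self, data: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        # The tenant key is written last, so a caller-supplied `tenant_id`
        # cannot override it. Top-level keys are ANDed by MongoDB, so any
        # `$or` in the caller's query is still restricted to this tenant.
        return {**(data or {}), "tenant_id": self._tenant_id}

    def find(self, query: Optional[Dict[str, Any]] = None, *args: Any, **kwargs: Any) -> Any:
        return self._collection.find(self._scoped(query), *args, **kwargs)

    async def find_one(self, query: Optional[Dict[str, Any]] = None, *args: Any, **kwargs: Any) -> Any:
        return await self._collection.find_one(self._scoped(query), *args, **kwargs)

    async def insert_one(self, document: Dict[str, Any], *args: Any, **kwargs: Any) -> Any:
        return await self._collection.insert_one(self._scoped(document), *args, **kwargs)
```

Because the raw collection is held in a private attribute, code in `mongo_impl.py` that only receives the repository has no unscoped path to the data. Update, delete, and aggregation methods would need the same treatment.
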
## Phase 4: Graph Database Session Wrapper (Neo4j, Memgraph)
- [ ] **Step 4.1**: Create `lightrag/kg/graph_session.py`.
  - Implement `GraphTenantSession` abstract base class.
  - Implement `Neo4jTenantSession` and `MemgraphTenantSession`.
  - Wrap `run` method to inject `tenant_id` parameter and append `WHERE n.tenant_id = $tenant_id` if missing (or rely on strict parameterized queries).
- [ ] **Step 4.2**: Refactor `lightrag/kg/neo4j_impl.py` and `memgraph_impl.py`.
  - Use `GraphTenantSession`.
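
The sketch below illustrates the strict, parameterized variant of Step 4.1: it injects the `tenant_id` parameter and rejects any query that never references `$tenant_id`, instead of rewriting Cypher text to append a `WHERE` clause. The `_execute` hook and the constructor signatures are assumptions, and the wrapped session is treated as a synchronous object with a `run(query, parameters)` method. An async driver would need `async` counterparts.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class GraphTenantSession(ABC):
    """Base class that forces every query to be scoped by `$tenant_id`."""

    def __init__(self, tenant_id: str) -> None:
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._tenant_id = tenant_id

    @abstractmethod
    def _execute(self, query: str, parameters: Dict[str, Any]) -> Any:
        """Send the query to the concrete driver session."""

    def run(self, query: str, parameters: Optional[Dict[str, Any]] = None) -> Any:
        # Strict mode: a query that never mentions the parameter cannot be
        # tenant-scoped, so it is refused rather than patched.
        if "$tenant_id" not in query:
            raise ValueError("query must filter on $tenant_id")
        params = dict(parameters or {})
        params["tenant_id"] = self._tenant_id  # overrides any caller value
        return self._execute(query, params)


class Neo4jTenantSession(GraphTenantSession):
    """Delegates to a driver session exposing `run(query, parameters)`."""

    def __init__(self, session: Any, tenant_id: str) -> None:
        super().__init__(tenant_id)
        self._session = session

    def _execute(self, query: str, parameters: Dict[str, Any]) -> Any:
        return self._session.run(query, parameters)
```

The substring check is only a guard against forgetting the filter. It does not prove that every matched node is constrained by the parameter, so query review or tests are still needed.
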
## Phase 5: Vector Database Strict Scoping
- [ ] **Step 5.1**: Create `lightrag/kg/vector_repo.py`.
  - Implement `VectorTenantRepo` abstract base class.
  - Implement specific repositories for Qdrant, Milvus, FAISS, Nano.
    - **Qdrant**: Automatically add `must` filter for `tenant_id` and `kb_id` to all searches.
    - **Milvus**: Automatically append `tenant_id == "..."` to expressions.
    - **FAISS**: Manage tenant-specific indices (e.g., `index_tenant_kb`) to avoid scanning all vectors.
    - **Nano**: Enforce metadata filtering.
- [ ] **Step 5.2**: Refactor Vector Implementations.
  - Update `qdrant_impl.py`, `milvus_impl.py`, `faiss_impl.py`, `nano_vector_db_impl.py` to use the new repositories.
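
The scoping rules for three of the backends in Step 5.1 can be expressed as small pure functions, sketched here. The function names are hypothetical. The Qdrant filter is built as a plain dictionary in the shape of Qdrant's JSON filter syntax rather than with client model classes, and identifiers are restricted to letters, digits, and hyphens as an assumption that keeps the Milvus expression and the FAISS index name unambiguous.

```python
import re
from typing import Any, Dict, Optional

# No quotes (safe inside a Milvus string literal) and no underscores
# (so `index_<tenant>_<kb>` cannot be split in two different ways).
_SAFE_ID = re.compile(r"[A-Za-z0-9-]+")


def _check_id(value: str) -> str:
    if not _SAFE_ID.fullmatch(value):
        raise ValueError(f"unsafe identifier: {value!r}")
    return value


def qdrant_scoped_filter(
    tenant_id: str, kb_id: str, extra: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Add `must` conditions for tenant and knowledge base to a filter."""
    scoped = dict(extra or {})  # copy, so the caller's filter is untouched
    scoped["must"] = list(scoped.get("must", [])) + [
        {"key": "tenant_id", "match": {"value": tenant_id}},
        {"key": "kb_id", "match": {"value": kb_id}},
    ]
    return scoped


def milvus_scoped_expr(tenant_id: str, expr: Optional[str] = None) -> str:
    """AND the tenant clause onto an optional caller expression."""
    clause = f'tenant_id == "{_check_id(tenant_id)}"'
    # Parentheses keep an `or` in the caller's expression from escaping
    # the tenant restriction.
    return f"({expr}) and {clause}" if expr else clause


def faiss_index_name(tenant_id: str, kb_id: str) -> str:
    """One index per tenant and knowledge base, following `index_tenant_kb`."""
    return f"index_{_check_id(tenant_id)}_{_check_id(kb_id)}"
```

For example, `milvus_scoped_expr("t-1", "year > 2020")` yields `(year > 2020) and tenant_id == "t-1"`.
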
## Phase 6: Redis Strict Prefixing
- [ ] **Step 6.1**: Enforce `RedisTenantNamespace`.
  - Ensure all Redis interactions in `lightrag/kg/redis_impl.py` use the namespace wrapper.
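
The interface of `RedisTenantNamespace` is not shown in this plan, so the following only illustrates the prefixing idea with a hypothetical `TenantKeyspace` helper. The `tenant:<id>:` key layout and the allowed character set are assumptions.

```python
import re

# Colons and glob characters are excluded so that one tenant's prefix can
# never be a prefix of, or a pattern matching, another tenant's keys.
_SAFE_TENANT = re.compile(r"[A-Za-z0-9_-]+")


class TenantKeyspace:
    """Builds Redis key names that are confined to one tenant's prefix."""

    def __init__(self, tenant_id: str) -> None:
        if not _SAFE_TENANT.fullmatch(tenant_id):
            raise ValueError(f"unsafe tenant id: {tenant_id!r}")
        self._prefix = f"tenant:{tenant_id}:"

    def key(self, name: str) -> str:
        """Full key for a logical name such as `doc:42`."""
        return self._prefix + name

    def scan_pattern(self, pattern: str = "*") -> str:
        """Pattern for SCAN/KEYS that cannot match other tenants' keys."""
        return self._prefix + pattern

    def owns(self, full_key: str) -> bool:
        """True if `full_key` lies inside this tenant's prefix."""
        return full_key.startswith(self._prefix)
```

The trailing colon in the prefix matters: without it, the keys of tenant `t-10` would also match the prefix of tenant `t-1`.
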
## Phase 7: Verification
- [ ] **Step 7.1**: Create tests in `tests/test_multi_tenant_security.py`.
  - Test RLS: Try to access another tenant's data via raw SQL.
  - Test Middleware: Verify subdomain resolution.
  - Test Isolation: Verify data separation across all backends (SQL, NoSQL, Graph, Vector).
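
One possible shape for the isolation test is a single backend-agnostic check that each storage adapter must pass. The sketch below is an assumption about structure, not the planned test file: `check_isolation` and the `put`/`get`/`keys` interface are hypothetical, and `InMemoryStore` is a toy stand-in for a real backend.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class InMemoryStore:
    """Toy tenant-scoped store standing in for a real backend."""

    def __init__(self, rows: Dict[Tuple[str, str], str], tenant_id: str) -> None:
        self._rows = rows  # shared by all tenants, like one database
        self._tenant_id = tenant_id

    def put(self, key: str, value: str) -> None:
        self._rows[(self._tenant_id, key)] = value

    def get(self, key: str) -> Optional[str]:
        return self._rows.get((self._tenant_id, key))

    def keys(self) -> List[str]:
        return [k for (tenant, k) in self._rows if tenant == self._tenant_id]


def check_isolation(make_store: Callable[[str], Any]) -> None:
    """Write as one tenant and verify that another tenant sees nothing."""
    store_a = make_store("tenant-a")
    store_b = make_store("tenant-b")

    store_a.put("doc-1", "secret of A")
    assert store_a.get("doc-1") == "secret of A"
    assert store_b.get("doc-1") is None        # no read across tenants
    assert "doc-1" not in store_b.keys()       # no listing across tenants

    store_b.put("doc-1", "owned by B")
    assert store_a.get("doc-1") == "secret of A"  # no overwrite across tenants


# Usage: the same check runs against any factory that returns a scoped store.
shared_rows: Dict[Tuple[str, str], str] = {}
check_isolation(lambda tenant_id: InMemoryStore(shared_rows, tenant_id))
```

In the real suite, the factory would wrap each backend adapter (SQL, NoSQL, graph, vector) so that every one of them is held to the same assertions.
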