* feat: Implement multi-tenant architecture with tenant and knowledge base models
  - Added data models for tenants, knowledge bases, and related configurations.
  - Introduced role and permission management for users in the multi-tenant system.
  - Created a service layer for managing tenants and knowledge bases, including CRUD operations.
  - Developed a tenant-aware instance manager for LightRAG with caching and isolation features.
  - Added a migration script to transition existing workspace-based deployments to the new multi-tenant architecture.
* chore: ignore lightrag/api/webui/assets/ directory
* chore: stop tracking lightrag/api/webui/assets (ignore in .gitignore)
* feat: Initialize LightRAG Multi-Tenant Stack with PostgreSQL
  - Added README.md for project overview, setup instructions, and architecture details.
  - Created docker-compose.yml to define services: PostgreSQL, Redis, LightRAG API, and Web UI.
  - Introduced env.example for environment variable configuration.
  - Implemented init-postgres.sql for PostgreSQL schema initialization with multi-tenant support.
  - Added reproduce_issue.py for testing default tenant access via API.
* feat: Enhance TenantSelector and update related components for improved multi-tenant support
* feat: Enhance testing capabilities and update documentation
  - Updated Makefile to include new test commands for various modes (compatibility, isolation, multi-tenant, security, coverage, and dry-run).
  - Modified API health check endpoint in Makefile to reflect new port configuration.
  - Updated QUICK_START.md and README.md to reflect changes in service URLs and ports.
  - Added environment variables for testing modes in env.example.
  - Introduced run_all_tests.sh script to automate testing across different modes.
  - Created conftest.py for pytest configuration, including database fixtures and mock services.
  - Implemented database helper functions for streamlined database operations in tests.
  - Added test collection hooks to skip tests based on the current MULTITENANT_MODE.
* feat: Implement multi-tenant support with demo mode enabled by default
  - Added multi-tenant configuration to the environment and Docker setup.
  - Created pre-configured demo tenants (acme-corp and techstart) for testing.
  - Updated API endpoints to support tenant-specific data access.
  - Enhanced Makefile commands for better service management and database operations.
  - Introduced user-tenant membership system with role-based access control.
  - Added comprehensive documentation for multi-tenant setup and usage.
  - Fixed issues with document visibility in multi-tenant environments.
  - Implemented necessary database migrations for user memberships and legacy support.
* feat(audit): Add final audit report for multi-tenant implementation
  - Documented overall assessment, architecture overview, test results, security findings, and recommendations.
  - Included detailed findings on critical security issues and architectural concerns.

  fix(security): Implement security fixes based on audit findings
  - Removed global RAG fallback and enforced strict tenant context.
  - Configured super-admin access and required user authentication for tenant access.
  - Cleared localStorage on logout and improved error handling in WebUI.

  chore(logs): Create task logs for audit and security fixes implementation
  - Documented actions, decisions, and next steps for both audit and security fixes.
  - Summarized test results and remaining recommendations.

  chore(scripts): Enhance development stack management scripts
  - Added scripts for cleaning, starting, and stopping the development stack.
  - Improved output messages and ensured graceful shutdown of services.

  feat(starter): Initialize PostgreSQL with AGE extension support
  - Created initialization scripts for PostgreSQL extensions including uuid-ossp, vector, and AGE.
  - Ensured successful installation and verification of extensions.
* feat: Implement auto-select for first tenant and KB on initial load in WebUI
  - Removed WEBUI_INITIAL_STATE_FIX.md as the issue is resolved.
  - Added useTenantInitialization hook to automatically select the first available tenant and KB on app load.
  - Integrated the new hook into the Root component of the WebUI.
  - Updated RetrievalTesting component to ensure a KB is selected before allowing user interaction.
  - Created end-to-end tests for multi-tenant isolation and real service interactions.
  - Added scripts for starting, stopping, and cleaning the development stack.
  - Enhanced API and tenant routes to support tenant-specific pipeline status initialization.
  - Updated constants for backend URL to reflect the correct port.
  - Improved error handling and logging in various components.
* feat: Add multi-tenant support with enhanced E2E testing scripts and client functionality
* update client
* Add integration and unit tests for multi-tenant API, models, security, and storage
  - Implement integration tests for tenant and knowledge base management endpoints in `test_tenant_api_routes.py`.
  - Create unit tests for tenant isolation, model validation, and role permissions in `test_tenant_models.py`.
  - Add security tests to enforce role-based permissions and context validation in `test_tenant_security.py`.
  - Develop tests for tenant-aware storage operations and context isolation in `test_tenant_storage_phase3.py`.
* feat(e2e): Implement OpenAI model support and database reset functionality
* Add comprehensive test suite for gpt-5-nano compatibility
  - Introduced tests for parameter normalization, embeddings, and entity extraction.
  - Implemented direct API testing for gpt-5-nano.
  - Validated .env configuration loading and OpenAI API connectivity.
  - Analyzed reasoning token overhead with various token limits.
  - Documented test procedures and expected outcomes in README files.
  - Ensured all tests pass for production readiness.
* kg(postgres_impl): ensure AGE extension is loaded in session and configure graph initialization
* dev: add hybrid dev helper scripts, Makefile, docker-compose.dev-db and local development docs
* feat(dev): add dev helper scripts and local development documentation for hybrid setup
* feat(multi-tenant): add detailed specifications and logs for multi-tenant improvements, including UX, backend handling, and ingestion pipeline
* feat(migration): add generated tenant/kb columns, indexes, triggers; drop unused tables; update schema and docs
* test(backward-compat): adapt tests to new StorageNameSpace/TenantService APIs (use concrete dummy storages)
* chore: multi-tenant and UX updates (docs, webui, storage, tenant service adjustments)
* tests: stabilize integration tests + skip external services; fix multi-tenant API behavior and idempotency
  - gpt5_nano_compatibility: add pytest-asyncio markers, skip when OPENAI key missing, prevent module-level asyncio.run collection, add conftest
  - Ollama tests: add server availability check and skip markers; avoid pytest collection warnings by renaming helper classes
  - Graph storage tests: rename interactive test functions to avoid pytest collection
  - Document & Tenant routes: support external_ids for idempotency; ensure HTTPExceptions are re-raised
  - LightRAG core: support external_ids in apipeline_enqueue_documents and idempotent logic
  - Tests updated to match API changes (tenant routes & document routes)
  - Add logs and scripts for inspection and audit
## Multi-tenant UX & Backend Improvements (v1)

This document describes a set of concrete, testable improvements to multi-tenant behavior across UI, routing, backend APIs, ingestion pipeline, testing, and documentation. The goal is to make tenant switching predictable, bookmarkable, efficient, and well-tested.

### Scope / Goals
- Provide a clear, improved multi-tenant selector UX for first-time users and returning users.
- Keep UI state serializable for bookmarking and sharing, but do NOT expose tenant identifiers in the URL for security. Tenant context will be provided by the `X-Tenant-ID` header; share/bookmark behavior should use tenant-aware server-side snapshots or short-lived tokens for cross-user sharing within the same tenant.
- Ensure backend APIs and data model support efficient tenant-scoped retrievals at scale.
- Make the ingestion pipeline tenant-aware and robust, including logging and error handling.
- Add automated tests (unit, integration, e2e) that cover tenant switching and state preservation.
- Update developer and user documentation describing the behaviour and configuration.

### UX / Frontend Behaviour

- Multi-tenant landing: refine the `Multi tenant selection` page (image `assets/multi_tenant-view.png`) with clearer tenant cards, a searchable list, and a persisted "last selected tenant" hint.

- Per-tenant state preservation:
  - For every major page (Documents, Knowledge Graph, Retrieval, Chat/Conversations, API) maintain a per-tenant state object containing: `currentKB`, `page`, `pageSize`, `filters`, `sort`, `viewMode` (list/card), and any UI-specific settings.
  - When switching tenants in the UI, the application restores the previously saved state for that tenant and route.

- Per-KB state:
  - When a tenant has multiple KBs, switching KBs within a tab should preserve page/filter/sort for that KB as well. The currently selected KB must be persisted as part of the tenant+route state.

- URL encoding (bookmarkable & shareable):
  - For security, tenant identifiers MUST NOT be included in browser URLs or route paths. Tenant context is supplied by the `X-Tenant-ID` header and validated by the backend.
  - Routes should therefore be tenant-agnostic and only describe UI state, e.g. `/documents?kb=:kbId&page=3&pageSize=25&filters=status:active,owner:me&sort=created_desc`.
  - Examples (tenant provided via header):
    - Documents tab for KB `backup`: `/documents?kb=backup&page=3&pageSize=25&filters=status:active` (valid when the `X-Tenant-ID` header identifies the tenant)
    - Knowledge Graph for KB `master`: `/graph?kb=master&view=graph&filters=entityType:company`
  - Because URLs are tenant-agnostic, sharing a raw URL does not guarantee the same tenant-scoped view across users. To enable secure sharing/bookmarking across users in the same tenant, implement a server-side snapshot/share-token (opaque id) that is tenant-scoped and must be accessed with a matching `X-Tenant-ID` header.

- State storage strategy (frontend):
  - Primary: URL (query parameters). Stores route-level UI settings (page, pageSize, filters, sort, viewMode) but MUST NOT include tenant-identifying data, so URLs remain tenant-agnostic.
  - Secondary: sessionStorage (per browser session) for quick restores when switching between tenants without navigation (faster UX). Key format: `lightrag:tenant:<tenantId>:route:<routeName>`, storing a compact JSON of the last state.
  - Tertiary: in-memory store for fast runtime access.
  - Rules: the URL overrides sessionStorage; sessionStorage is used only when the URL does not provide that particular piece of state. When storing per-tenant state in sessionStorage, the key MUST include the same opaque tenant id that the client sends as `X-Tenant-ID`, for example `lightrag:tenant:<tenantId>:route:<routeName>`. Never expose that tenant id in shared URLs.

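The precedence rule above (URL first, sessionStorage as a fallback) can be sketched as follows. This is a minimal illustration and not the WebUI's actual code: `RouteState`, `KeyValueStore`, `MemoryStore`, `toQuery`, `fromQuery`, and `resolveState` are hypothetical names, and the in-memory store stands in for `window.sessionStorage` so the sketch also runs outside a browser.

```typescript
// Hypothetical per-route UI state; field names follow the query parameters above.
interface RouteState {
  kb?: string;
  page?: number;
  pageSize?: number;
  filters?: string;
  sort?: string;
  viewMode?: string;
}

// Minimal storage contract; window.sessionStorage satisfies it in a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in (assumption) so the sketch runs without a browser.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

const STRING_FIELDS = ["kb", "filters", "sort", "viewMode"] as const;
const NUMBER_FIELDS = ["page", "pageSize"] as const;
const QUERY_FIELDS = ["kb", "page", "pageSize", "filters", "sort", "viewMode"] as const;

// Key format from the spec: lightrag:tenant:<tenantId>:route:<routeName>
function storageKey(tenantId: string, routeName: string): string {
  return `lightrag:tenant:${tenantId}:route:${routeName}`;
}

// Serialize UI state to a tenant-agnostic query string (it never sees a tenant id).
function toQuery(state: RouteState): string {
  const params = new URLSearchParams();
  for (const field of QUERY_FIELDS) {
    const value = state[field];
    if (value !== undefined) params.set(field, String(value));
  }
  return params.toString();
}

// Parse a query string; missing or malformed values are simply left unset.
function fromQuery(query: string): RouteState {
  const params = new URLSearchParams(query);
  const state: RouteState = {};
  for (const field of STRING_FIELDS) {
    const value = params.get(field);
    if (value !== null) state[field] = value;
  }
  for (const field of NUMBER_FIELDS) {
    const value = Number(params.get(field));
    if (Number.isInteger(value) && value > 0) state[field] = value;
  }
  return state;
}

// URL values win field by field; the stored state only fills the gaps.
function resolveState(
  query: string,
  store: KeyValueStore,
  tenantId: string,
  routeName: string
): RouteState {
  let saved: RouteState = {};
  const raw = store.getItem(storageKey(tenantId, routeName));
  if (raw !== null) {
    try {
      saved = JSON.parse(raw) as RouteState;
    } catch {
      saved = {}; // corrupt entry: ignore it rather than break the page
    }
  }
  return { ...saved, ...fromQuery(query) };
}
```

Merging per field (rather than treating the URL as all-or-nothing) is one reading of "sessionStorage is used only when the URL does not provide that particular piece of state"; a stricter all-or-nothing rule would be an equally valid design choice.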
### Frontend Implementation Notes
- Centralize tenant+route state handling in a single client-side module (e.g., `tenantStateManager`) that exposes:
  - `getState(tenantId, routeName)`
  - `setState(tenantId, routeName, state)`
  - `hydrateFromURL()` and `syncToURL(routeName)`. URL sync is intentionally tenant-agnostic. When reading or writing per-tenant session or in-memory storage, the runtime must provide the `tenantId` (the value sent as `X-Tenant-ID`, or taken from auth claims) to scope keys appropriately.
  - `onTenantSwitch(oldTenant, newTenant)` hook to trigger restore and UI re-render.
- Debounce URL syncing for rapidly changing state (e.g., typing in filters) to avoid flooding the browser history.
- When navigating programmatically (e.g., tenant card click), use `history.replace` for initial load and `history.push` for explicit user navigation.

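A skeleton of such a module might look like the sketch below. It is an assumption-laden outline, not the project's implementation: the class layout, the `subscribe` method, and `createMemoryStore` are hypothetical, and only `getState`, `setState`, and `onTenantSwitch` come from the list above. URL hydration and debouncing are left out to keep the example short.

```typescript
// Illustrative state shape; a real manager would use the app's own types.
type RouteState = Record<string, string | number>;

// Minimal storage contract; window.sessionStorage satisfies it in a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type TenantSwitchListener = (oldTenant: string | null, newTenant: string) => void;

// Hypothetical helper: an in-memory store for tests and non-browser runs.
function createMemoryStore(): KeyValueStore {
  const items = new Map<string, string>();
  return {
    getItem: (key) => {
      const value = items.get(key);
      return value === undefined ? null : value;
    },
    setItem: (key, value) => {
      items.set(key, value);
    },
  };
}

class TenantStateManager {
  private readonly store: KeyValueStore;
  private readonly memory = new Map<string, RouteState>();
  private readonly listeners = new Set<TenantSwitchListener>();

  constructor(store: KeyValueStore) {
    this.store = store;
  }

  private key(tenantId: string, routeName: string): string {
    return `lightrag:tenant:${tenantId}:route:${routeName}`;
  }

  // In-memory first (tertiary tier), then the session store (secondary tier).
  getState(tenantId: string, routeName: string): RouteState | undefined {
    const key = this.key(tenantId, routeName);
    const cached = this.memory.get(key);
    if (cached !== undefined) return cached;
    const raw = this.store.getItem(key);
    if (raw === null) return undefined;
    try {
      const parsed = JSON.parse(raw) as RouteState;
      this.memory.set(key, parsed);
      return parsed;
    } catch {
      return undefined; // unreadable entry: behave as if nothing was saved
    }
  }

  setState(tenantId: string, routeName: string, state: RouteState): void {
    const key = this.key(tenantId, routeName);
    this.memory.set(key, state);
    this.store.setItem(key, JSON.stringify(state));
  }

  // Register a listener; the returned function removes it again.
  subscribe(listener: TenantSwitchListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Called by the tenant selector; listeners restore state and re-render.
  onTenantSwitch(oldTenant: string | null, newTenant: string): void {
    this.listeners.forEach((listener) => listener(oldTenant, newTenant));
  }
}
```

A page component would typically subscribe once, call `getState(newTenant, routeName)` inside the listener, and fall back to defaults when it returns `undefined`.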
### Routing / API Contract (Frontend <-> Backend)

All APIs that return tenant-scoped resources must derive tenant context from a secure source: the `X-Tenant-ID` header or an Authorization token's tenant claim. The frontend must NOT encode tenant identifiers into the URL path or request body for normal user flows (server-side validation is required when admin operations accept tenant IDs in the body).

- Suggested REST endpoints (examples):
  - `GET /api/documents?kb=:kbId&page=3&pageSize=25&filters=...` with header `X-Tenant-ID: <tenantId>`
  - `GET /api/graph?kb=:kbId&query=...` with header `X-Tenant-ID: <tenantId>`
  - `POST /api/ingest` with header `X-Tenant-ID: <tenantId>`; payloads must include `kb` and optional `external_id` for dedup/idempotency.
- Ensure APIs return pagination metadata and any applied-filter echo to help the UI render consistent state.

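To illustrate the "secure source" rule, here is a minimal sketch of validating the `X-Tenant-ID` header against claims from an already-verified token. The project's backend is Python (`lightrag/api/dependencies.py`); this TypeScript version only shows the logic. The `AuthClaims` layout, the `resolveTenant` name, and the choice of 400 and 403 status codes are assumptions.

```typescript
// Hypothetical claim layout: tenants the authenticated user is a member of.
interface AuthClaims {
  sub: string;
  tenantIds: string[];
}

type TenantResolution =
  | { ok: true; tenantId: string }
  | { ok: false; status: 400 | 403; detail: string };

// Resolve the tenant for a request from its headers, then check it against
// the verified token claims. The header alone is never trusted.
function resolveTenant(
  headers: Record<string, string | undefined>,
  claims: AuthClaims
): TenantResolution {
  // HTTP header names are case-insensitive, so match on the lowercased name.
  let headerValue: string | undefined;
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() === "x-tenant-id") headerValue = headers[name];
  }

  const tenantId = (headerValue ?? "").trim();
  if (tenantId === "") {
    return { ok: false, status: 400, detail: "X-Tenant-ID header is required" };
  }
  if (claims.tenantIds.indexOf(tenantId) === -1) {
    return { ok: false, status: 403, detail: "Tenant not permitted for this user" };
  }
  return { ok: true, tenantId };
}
```

Every tenant-scoped handler would call this once and pass the resulting `tenantId` down, so later layers never read a tenant id from the URL or the request body.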
### Reality check: what I found in the repo
- The project already implements header-based tenant scoping across the stack, so the `X-Tenant-ID` / `X-KB-ID` approach in this spec is consistent with the codebase.
  - Frontend (WebUI): the client adds tenant and KB headers from localStorage using an Axios interceptor in `lightrag_webui/src/api/client.ts` (and built dist assets). The WebUI stores selection objects in `localStorage` keys like `SELECTED_TENANT` and `SELECTED_KB` and the interceptor injects `X-Tenant-ID` and `X-KB-ID` into requests.
  - Hooks/API clients: `lightrag_webui/src/hooks/useTenantContext.ts` and `lightrag_webui/src/api/tenant.ts` call APIs with `X-Tenant-ID` headers when appropriate.
  - Backend: `lightrag/api/dependencies.py` (and the built library under `build/lib/lightrag/api/dependencies.py`) already reads `X-Tenant-ID` and falls back to token/subdomain logic in some helper methods. There are explicit failure logs and behaviors when headers are missing.
  - Ingestion & tests: e2e scripts and tests (e.g., `e2e/client.py`, `tests/e2e_real_service/test_api_isolation.py`) already call ingestion and queries with `X-Tenant-ID` and `X-KB-ID` headers. The project's starter docs and scripts also show curl examples with `X-Tenant-ID` usage.

### Pragmatic conclusions from the audit
- This spec is realistic and practical: the codebase already uses header-based tenancy and local client-side tenant selection (X-Tenant-ID/X-KB-ID), so the required architectural changes are incremental rather than wholesale.
- Minimal gaps to implement the spec:
  - Frontend already injects headers via an Axios interceptor. The main work is adding a structured, test-covered `tenantStateManager` that:
    - Re-uses existing `localStorage` keys (SELECTED_TENANT / SELECTED_KB) in a secure way, or migrates to sessionStorage depending on retention needs.
    - Serializes UI state to tenant-agnostic URLs (page, filters, sort) while persisting tenant-scoped state keyed by `X-Tenant-ID` in sessionStorage.
    - Integrates with the existing Axios interceptor (`lightrag_webui/src/api/client.ts`) so requests continue to receive `X-Tenant-ID`/`X-KB-ID` automatically.
  - Backend already supports header-based tenant resolution (see `lightrag/api/dependencies.py` and `lightrag/api/routers/tenant_routes.py`), so most API work will be validation, tests, and any new snapshot/share-token endpoints.
  - Ingestion is already exercised in e2e tests; ensure that ingestion endpoints require and validate `X-Tenant-ID` and honor `external_id` dedup keys.
- Security note: localStorage is currently used to hold selected tenant/KB objects. That is acceptable with opaque tenant IDs and server validation, but localStorage is readable by any JS running in the page, so avoid putting sensitive info in it and never serialize tenant IDs into shareable URLs. Prefer server-side, tenant-scoped snapshot tokens for cross-user sharing/bookmarking.

### Low-effort next steps based on repository reality
- Implement `tenantStateManager` in the WebUI that integrates with the `lightrag_webui/src/api/client.ts` interceptor and `SELECTED_TENANT/SELECTED_KB` storage.
- Add unit tests for the manager and end-to-end tests that simulate header swaps by changing `X-Tenant-ID` in test clients (`e2e/client.py`, `tests/e2e_real_service/*`).
- Add server-side snapshot/share-token endpoints (tenant-scoped) and tests showing snapshot tokens only work when `X-Tenant-ID` is present and matches.

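The snapshot/share-token idea can be sketched as a small tenant-scoped store. Everything here is hypothetical (the `SnapshotStore` class, its method names, the TTL handling, and the injected token generator); a real endpoint would persist records server-side and generate tokens with a cryptographically secure random source.

```typescript
interface SnapshotRecord {
  tenantId: string;
  path: string; // tenant-agnostic route plus query, e.g. "/documents?kb=backup&page=3"
  expiresAt: number; // epoch milliseconds
}

class SnapshotStore {
  private readonly records = new Map<string, SnapshotRecord>();
  private readonly newToken: () => string;

  // The token generator is injected; production code should use a CSPRNG.
  constructor(newToken: () => string) {
    this.newToken = newToken;
  }

  // Save a view for sharing and return an opaque token that carries no tenant id.
  create(tenantId: string, path: string, ttlMs: number, now: number = Date.now()): string {
    const token = this.newToken();
    this.records.set(token, { tenantId, path, expiresAt: now + ttlMs });
    return token;
  }

  // Returns the saved path only when the caller's tenant matches the snapshot's.
  // Unknown, foreign, and expired tokens all yield null, so a caller from
  // another tenant cannot learn whether a token exists.
  resolve(token: string, tenantId: string, now: number = Date.now()): string | null {
    const record = this.records.get(token);
    if (record === undefined || record.tenantId !== tenantId) return null;
    if (now >= record.expiresAt) {
      this.records.delete(token);
      return null;
    }
    return record.path;
  }
}
```

The `tenantId` passed to `resolve` is meant to be the value already validated from `X-Tenant-ID`, which gives the "present and matches" behaviour the tests in the last bullet would assert.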
### Backend & Database Recommendations

- Tenant isolation:
  - Prefer logical isolation with a `tenant_id` column on tenant-scoped tables (documents, document_chunks, embeddings, graph_nodes, graph_edges). The `tenant_id` stored in the DB can be an internal opaque id (UUID or numeric internal id) distinct from any user-facing identifier; do not expose internal tenant identifiers in URLs or client-side tokens.
  - Consider partitioning or schema separation for very large tenants (sharding or a separate DB per tenant); document the migration path in the rollout plan.

- Indexing & query optimizations:
  - Indexes: `(tenant_id, kb_id, created_at)`, `(tenant_id, kb_id, status)`, and any commonly used filterable fields.
  - Use covering indexes for frequent queries to avoid unnecessary lookups.
  - For embedding search: keep tenant_id + kb_id as part of the vector index metadata for tenant-scoped nearest-neighbor searches.

- API performance:
  - Use LIMIT/OFFSET carefully; for deep pagination over large result sets, consider keyset (cursor-based) pagination.
  - Add a short-lived server-side cache for tenant-scoped metadata (KB list, tenant settings) with invalidation on write.

- Security & multi-tenancy:
  - Enforce tenant authorization on every API endpoint. Never rely only on a frontend-provided tenantId; validate it against the auth token.
  - Write audit logs for cross-tenant access attempts.

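The keyset pagination idea is shown below over an in-memory array, as a rough sketch rather than a storage implementation. The `Doc` shape, the cursor layout, and the `keysetPage` function are assumptions. In PostgreSQL the same page could be expressed along the lines of `WHERE tenant_id = $1 AND kb_id = $2 AND (created_at, id) < ($3, $4) ORDER BY created_at DESC, id DESC LIMIT $5`, which an index starting with `(tenant_id, kb_id, created_at)` can serve.

```typescript
interface Doc {
  tenantId: string;
  kbId: string;
  id: string;
  createdAt: number;
}

// The cursor is the sort key of the last row on the previous page.
interface Cursor {
  createdAt: number;
  id: string;
}

interface Page {
  items: Doc[];
  nextCursor: Cursor | null; // null when there are no further rows
}

// Newest first; id breaks ties so the ordering is total and stable.
function compareDesc(a: Doc, b: Doc): number {
  if (a.createdAt !== b.createdAt) return b.createdAt - a.createdAt;
  if (a.id === b.id) return 0;
  return a.id < b.id ? 1 : -1;
}

function keysetPage(
  rows: Doc[],
  tenantId: string,
  kbId: string,
  pageSize: number,
  after: Cursor | null
): Page {
  // Tenant and KB scoping come first, mirroring the leading index columns.
  const scoped = rows
    .filter((row) => row.tenantId === tenantId && row.kbId === kbId)
    .sort(compareDesc);

  // Keep only rows strictly after the cursor instead of skipping by OFFSET.
  const remaining =
    after === null
      ? scoped
      : scoped.filter(
          (row) =>
            row.createdAt < after.createdAt ||
            (row.createdAt === after.createdAt && row.id < after.id)
        );

  const items = remaining.slice(0, pageSize);
  const last = items[items.length - 1];
  const nextCursor =
    remaining.length > pageSize && last !== undefined
      ? { createdAt: last.createdAt, id: last.id }
      : null;
  return { items, nextCursor };
}
```

Unlike OFFSET, the cost of fetching page N does not grow with N, and rows inserted between requests do not shift later pages.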
### Ingestion Pipeline (tenant-aware)

- Contract:
  - Ingestion API must NOT accept untrusted `tenant_id` values in the request body. Tenant context must be derived from the `X-Tenant-ID` header or an authenticated token claim. Only use `tenant_id` from the body in special admin paths with strict server-side validation.
  - Each ingested object/document must be stored with `tenant_id` and `kb_id` metadata.

- Validation & idempotency:
  - Support an optional `external_id` for dedup / idempotency keys so re-sending the same document won't create duplicates.
  - Validate ownership and size limits per tenant; reject with clear error codes (400, 409).

- Error handling & logging:
  - Structured logs must include `tenant_id`, `kb_id`, `ingestion_job_id`, and `step` to allow tracing; redact or obfuscate any tenant metadata when exporting logs to public destinations.
  - Pipeline must surface per-tenant errors to a UI/inbox or to a retry queue; a failure for one tenant must not crash the global pipeline.

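The idempotency rule can be sketched with an in-memory service. This is only an outline under stated assumptions: `IngestionService`, its result shape, and the decision to return the existing document id on a repeat submission (rather than, say, a 409) are all hypothetical. The tenant id is a separate argument because it is expected to come from the validated `X-Tenant-ID` header, never from the payload.

```typescript
// Request body as described in the contract: no tenant_id field on purpose.
interface IngestRequest {
  kb: string;
  external_id?: string;
  content: string;
}

interface StoredDocument {
  docId: string;
  tenantId: string;
  kbId: string;
  content: string;
}

type IngestResult =
  | { ok: true; created: boolean; docId: string }
  | { ok: false; status: 400; detail: string };

class IngestionService {
  private readonly documents: StoredDocument[] = [];
  // Dedup index: JSON-encoded [tenantId, kb, external_id] -> docId.
  private readonly idempotencyIndex = new Map<string, string>();
  private nextId = 1;

  ingest(tenantId: string, request: IngestRequest): IngestResult {
    if (request.kb.trim() === "") {
      return { ok: false, status: 400, detail: "kb is required" };
    }

    // The key is scoped by tenant and KB, so the same external_id used by two
    // tenants never collides. JSON encoding avoids delimiter ambiguity.
    const key =
      request.external_id === undefined
        ? null
        : JSON.stringify([tenantId, request.kb, request.external_id]);

    if (key !== null) {
      const existing = this.idempotencyIndex.get(key);
      if (existing !== undefined) {
        return { ok: true, created: false, docId: existing };
      }
    }

    const docId = `doc-${this.nextId++}`;
    this.documents.push({ docId, tenantId, kbId: request.kb, content: request.content });
    if (key !== null) this.idempotencyIndex.set(key, docId);
    return { ok: true, created: true, docId };
  }

  // Reads are scoped the same way as writes.
  list(tenantId: string, kbId: string): StoredDocument[] {
    return this.documents.filter((doc) => doc.tenantId === tenantId && doc.kbId === kbId);
  }
}
```

Requests without an `external_id` are always stored as new documents here; whether to fall back to content hashing in that case is a separate design decision.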
### Tests (what to add)

- Unit tests:
  - `tenantStateManager` serialization/hydration to/from URL and sessionStorage. Verify sessionStorage keys are tenant-scoped using the tenant id sent as `X-Tenant-ID`, and ensure the URL remains tenant-agnostic.
  - Verify that the API layer requires a tenant id and validates it against the token.

- Integration tests:
  - Backend endpoints: queries filtered by the tenant derived from the `X-Tenant-ID` header return only that tenant's data, and requests are rejected when `X-Tenant-ID` is absent or does not match the authenticated identity.
  - Pagination & filters: verify results and metadata for various page sizes and deep pages.

- E2E tests:
  - Scenario: with `X-Tenant-ID=A` open the UI and set filters + go to page 3 -> switch context to `X-Tenant-ID=B` (or sign in as the other tenant) and set filters + page 1 -> switch back to `X-Tenant-ID=A` and verify state restored (page 3, filters active).
  - Scenario: open a tenant-agnostic bookmarked URL under `X-Tenant-ID` header A and verify the UI loads the correct tenant-scoped state. Verify accessing the same URL under a different `X-Tenant-ID` returns data scoped to that new tenant.
  - Scenario: ingest documents for multiple tenants and verify they appear in the correct tenant/KB only.

### Acceptance Criteria

- UX: Tenant selector shows last selected tenant; tenant switch restores previously set page, filters and KB selection.
- URL + Security: Browser URL must NOT contain tenant identifiers. URL changes reproduce view settings, but reproducing tenant-scoped data requires a matching `X-Tenant-ID` header. To enable secure cross-user sharing/bookmarks within the same tenant, implement server-side snapshot/share-token endpoints that generate an opaque token requiring `X-Tenant-ID` on access.
- Backend: Tenant-scoped API endpoints enforce tenant isolation and return consistent pagination metadata.
- Ingestion: Documents ingested with `tenant_id` are only visible to that tenant; pipeline logs include tenant info and provide idempotency.
- Tests: Unit, integration, and e2e tests covering tenant switching, URL bookmarking, and ingestion behavior are added and passing.

### Developer Notes & Rollout

- Backwards compatibility:
  - For existing URLs that contain tenant identifiers, provide server-side redirects and a transition UI. Move away from route-based tenant identifiers toward header-based tenant context; log usage during the transition window to discover and convert bookmarks.

- Migration steps:
  - Add the required DB indexes described above and monitor slow queries.
  - Deploy backend changes behind a feature flag; run e2e tests in staging.

- Monitoring:
  - Add dashboards for per-tenant request latency, ingestion failure rates, and cache hit ratios.

### Documentation

- Update docs with:
  - `docs/0001-multi-tenant-architecture.md` (or add new `0004`): architecture overview and tenant isolation recommendations.
  - `docs/LOCAL_DEVELOPMENT.md` section describing how to run local multi-tenant ingestion tests and how to simulate multiple tenants.
  - UI guide: how to bookmark and share tenant-scoped views without exposing tenant identifiers; document the server-side snapshot/share-token approach and how shared links are tenant-scoped and validated using `X-Tenant-ID`.

### Implementation checklist (developer friendly)

- [ ] Implement `tenantStateManager` frontend module and integrate into router.
- [ ] Update React/Vue components on Documents/Graph/Chat to serialize state to URL and sessionStorage.
- [ ] Add or verify that backend endpoints accept and validate `tenant_id` and `kb_id`.
- [ ] Add DB indexes and consider partitioning plan for large tenants.
- [ ] Update ingestion API to require/validate tenant context and add idempotency support.
- [ ] Add unit/integration/e2e tests described above.
- [ ] Update docs and add runbook for rollout.

### Open questions / decisions to make

- URL length vs. complexity: how many filters do we serialize in the querystring? Consider compact encoding (base64 JSON) for complex filter payloads.
- Deep pagination strategy: default to offset-based for small result sets, but enable cursor-based for large queries.

### Notes
- Keep URL design consistent across all tabs and DO NOT include tenant identifiers in routes. Use `X-Tenant-ID` header for tenant context. Provide server-side snapshots for safe cross-user sharing and bookmarking.
- Prioritize correctness and security (tenant validation) over saving developer time.

If you want, I can now open a PR that implements the `tenantStateManager` skeleton and updates the Documents page routing to the new URL format.