logs: add multi-tenant audit & document count bugfix log (2025-01-06)

This commit is contained in:
Raphaël MANSUY 2025-12-05 14:15:50 +08:00
parent 30787c0d2d
commit 83e7f6a236

View file

@ -0,0 +1,85 @@
# Task Log: Multi-Tenant Implementation Audit & Document Count Bug Fix
**Date:** 2025-01-06 16:45
**Mode:** beastmode
**Status:** ✅ COMPLETED
---
## Actions Performed
1. **Fixed KB document count bug** - Modified `tenant_service.py` to compute document counts via LEFT JOINs from `lightrag_doc_full`, `lightrag_vdb_entity`, `lightrag_vdb_relation` tables
2. **Verified fix in browser** - UI now correctly shows "1 docs" for TechStart/Main KB
3. **Audited multi-tenant storage implementations** - Reviewed all tenant support modules in `lightrag/kg/`
4. **Created new test file** - `tests/test_tenant_kb_document_count.py` with 7 comprehensive tests
5. **Ran test suite** - 93 multi-tenant tests passing
---
## Decisions Made
1. Used workspace pattern `{tenant_id}:{kb_id}` for document queries (consistent with existing codebase)
2. Used LEFT JOIN instead of subqueries for better performance on count aggregation
3. Defaulted counts to 0 when no documents/entities/relationships exist (graceful handling of NULL)
4. Mocked database layer in new tests to avoid integration dependencies
---
## Key Changes
### Modified Files
| File | Change |
|------|--------|
| `lightrag/services/tenant_service.py` | Fixed `list_knowledge_bases()` and `get_knowledge_base()` to compute document/entity/relationship counts |
### Created Files
| File | Purpose |
|------|---------|
| `tests/test_tenant_kb_document_count.py` | Unit tests for document count functionality |
---
## Test Results
| Test Suite | Tests | Status |
|------------|-------|--------|
| `test_tenant_models.py` | 16 | ✅ PASSED |
| `test_tenant_storage_phase3.py` | 22 | ✅ PASSED |
| `test_multi_tenant_backends.py` | 37 | ✅ PASSED |
| `test_workspace_isolation.py` | 11 | ✅ PASSED |
| `test_tenant_kb_document_count.py` | 7 | ✅ PASSED |
| **TOTAL** | **93** | ✅ PASSED |
---
## Lessons/Insights
1. **Table names matter** - Initial fix used wrong table names (`lightrag_vdb` vs `lightrag_vdb_entity`); discovered via server error logs
2. **Workspace format** - Multi-tenant isolation uses `{tenant_id}:{kb_id}` composite key stored in `workspace` column
3. **Existing test coverage** - Multi-tenant implementation already had 86+ comprehensive tests
4. **API test isolation** - Some API route tests fail collection due to argparse conflicts; need to be run separately or with proper config initialization
---
## Storage Backends Reviewed
All storage backends have comprehensive multi-tenant support:
| Backend | Tenant Support Module | Isolation Method |
|---------|----------------------|------------------|
| PostgreSQL | `postgres_tenant_support.py` | `TenantSQLBuilder`, workspace column |
| MongoDB | `mongo_tenant_support.py` | `tenant_id`/`kb_id` fields, compound indexes |
| Redis | `redis_tenant_support.py` | Key prefixing `{tenant}:{kb}:key` |
| Neo4j | `neo4j_tenant_support.py` | Node properties, tenant constraints |
| Milvus/Qdrant | `vector_tenant_support.py` | Metadata fields, filter expressions |
---
## Next Steps (Optional)
1. Fix API test isolation by mocking `parse_args()` in test setup
2. Consider caching document counts for performance at scale
3. Add integration tests that run against real PostgreSQL database